Casey Bodley [Fri, 26 Feb 2016 21:23:54 +0000 (16:23 -0500)]
rgw: use current period for InitSyncStatus
the InitSyncStatus coroutine records the position to start incremental
sync after finishing a full sync. this should be the master's marker
from the current period, rather than its oldest log period
this also adds a check to run_sync() that restarts a full sync if it
sees that our sync period is behind the master's oldest log period
Casey Bodley [Fri, 26 Feb 2016 17:28:41 +0000 (12:28 -0500)]
rgw: meta log rest handlers avoid get_log()
RGWMetadataManager::get_log() will allocate a log and keep it in memory.
this could lead to a potential denial of service by making requests with
lots of different period ids
RGWMetadataLog is effectively stateless (the only state is a set of
modified_shards, which are not touched by any of the rest api calls), so
we can use a temporary instead of calling get_log()
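A minimal sketch of the memory concern described above. `MetaLog`, `LogManager`, `handle_cached`, `handle_temporary` and the shard object naming are hypothetical stand-ins for illustration, not the real RGWMetadataManager or RGWMetadataLog interfaces. It only shows why a cache keyed by a client-supplied period id can grow without bound, while a temporary cannot.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a log handle whose only input is the period id.
struct MetaLog {
  std::string period;
  explicit MetaLog(std::string p) : period(std::move(p)) {}
  // Made-up naming scheme, used only so the handlers have something to return.
  std::string shard_oid(int shard) const {
    return "meta.log." + period + "." + std::to_string(shard);
  }
};

// Hypothetical manager that keeps every log it hands out in memory.
class LogManager {
  std::map<std::string, MetaLog> logs_;
public:
  MetaLog* get_log(const std::string& period) {
    auto it = logs_.find(period);
    if (it == logs_.end())
      it = logs_.emplace(period, MetaLog(period)).first;
    return &it->second;
  }
  std::size_t cached() const { return logs_.size(); }
};

// Handler that goes through the cache: one new entry per distinct period id.
std::string handle_cached(LogManager& mgr, const std::string& period, int shard) {
  return mgr.get_log(period)->shard_oid(shard);
}

// Handler that builds a temporary: nothing outlives the request.
std::string handle_temporary(const std::string& period, int shard) {
  MetaLog log(period);
  return log.shard_oid(shard);
}
```

Both handlers return the same result for the same input; only the first one leaves state behind for every period id a client sends.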
Yehuda Sadeh [Thu, 3 Mar 2016 22:18:25 +0000 (14:18 -0800)]
Merge pull request #7786 from ceph/wip-rgw-indexless
rgw: indexless buckets (Yehuda Sadeh)
- can define a policy, for which buckets are indexless
- users can then create buckets under the specified placement target
- indexless buckets will not be synced across zones
- does not work with (s3) versioned buckets
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Piotr Dałek [Thu, 3 Mar 2016 10:30:53 +0000 (11:30 +0100)]
common/obj_bencher.cc: make verify error fatal
When run without "--no-verify", all verification errors are noted,
but they are reported only to cerr, so automated testing ignores
them. Make seq_read_bench and rand_read_bench return -EIO on any
verification error, which is in turn returned to the caller.
Fixes: #14971
Signed-off-by: Piotr Dałek <piotr.dalek@ts.fujitsu.com>
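The behavior described in this commit can be sketched roughly as follows. `verify_reads` is a hypothetical helper, not the actual seq_read_bench or rand_read_bench code. It assumes each object's expected and actual contents are available as strings, and shows a verification pass that still logs to cerr but also reports failure through its return value.

```cpp
#include <cerrno>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical verification pass: logs each mismatch to cerr and returns
// -EIO if any object failed verification, 0 otherwise.
int verify_reads(const std::vector<std::string>& expected,
                 const std::vector<std::string>& actual) {
  int errors = 0;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    // A missing object counts as a verification failure too.
    if (i >= actual.size() || expected[i] != actual[i]) {
      std::cerr << "object " << i << " failed verification" << std::endl;
      ++errors;
    }
  }
  return errors > 0 ? -EIO : 0;
}
```

A caller such as a test harness can then check the return code instead of scraping stderr.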
Piotr Dałek [Wed, 2 Mar 2016 12:22:38 +0000 (13:22 +0100)]
PGMonitor: unconfuse object count skew message
"Pool <pool> has too few pgs" is okay assuming it does not take other
pools into account. And since it does, it is confusing in the following
scenario:
1. Create two pools, one with small pg count and one with large
pg count
2. Put a whole lot of objects in smaller pool, resulting in "too few
pgs" warning on that pool, which is expected behavior.
3. Put a whole lot of objects in larger pool, warning goes away.
Suddenly smaller pool has plenty of PGs?
The current message suggests adding more nodes (or PGs) to the pool, when
actually it's warning about significantly more objects in that
particular pool than in the other pools.
Signed-off-by: Piotr Dałek <piotr.dalek@ts.fujitsu.com>
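The scenario above can be reproduced with a small sketch of a skew check. This is an assumption about the general shape of the check (a pool's objects per PG compared against the cluster-wide average), not the PGMonitor code itself. `PoolStats`, `object_skew`, `warns_on_skew` and the threshold of 10 are hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-pool counters.
struct PoolStats {
  std::uint64_t objects;
  std::uint64_t pg_num;
};

// Ratio of this pool's objects-per-PG to the average over all pools.
double object_skew(const PoolStats& pool, const std::vector<PoolStats>& all_pools) {
  std::uint64_t total_objects = 0;
  std::uint64_t total_pgs = 0;
  for (const PoolStats& p : all_pools) {
    total_objects += p.objects;
    total_pgs += p.pg_num;
  }
  assert(pool.pg_num > 0 && total_pgs > 0);
  if (total_objects == 0) return 0.0;  // empty cluster: nothing to compare
  const double cluster_avg = static_cast<double>(total_objects) / total_pgs;
  const double pool_avg = static_cast<double>(pool.objects) / pool.pg_num;
  return pool_avg / cluster_avg;
}

// The warning depends on the other pools, not only on this pool's PG count.
bool warns_on_skew(const PoolStats& pool, const std::vector<PoolStats>& all_pools,
                   double max_skew) {
  return object_skew(pool, all_pools) > max_skew;
}
```

With an 8-PG pool holding 100000 objects next to an empty 1024-PG pool, the ratio is roughly 129 and the check fires. Once the larger pool is filled to the same density (12800000 objects), the ratio drops to 1 and the warning disappears, even though the smaller pool has not changed.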
Jianpeng Ma [Thu, 3 Mar 2016 13:46:55 +0000 (21:46 +0800)]
os/bluestore/BlueStore: Don't leak trim overlay data before write.
Suppose: bluestore_overlay_max_length=bluestore_min_alloc_size;
bluestore_overlay_max = 2;
For the following ops:
write(off=0, len=4096) --> written into the overlay
write(off=4096, len=4096) --> written into the overlay
write(off=0, len=bluestore_min_alloc_size) --> because overlay_map.size()
>= 2, it allocates an extent.
It should trim the overlay data (0, 4096) and (4096, 4096), and then
write(0, bluestore_min_alloc_size).
But the original code doesn't trim the overlay data.
As a result, a later read returns the original data rather than the new data.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
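A toy model of the stale-read problem described above. `ToyObject` and its members are hypothetical and much simpler than BlueStore: it assumes reads layer overlay entries on top of extent data, and it only trims overlay entries that lie fully inside the written range. Sizes are shrunk to a few bytes so the effect is easy to see.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical object: allocated extent data plus small overlay writes
// (offset -> bytes) that take precedence on read.
struct ToyObject {
  std::string extent;
  std::map<std::size_t, std::string> overlay;

  void write_overlay(std::size_t off, const std::string& data) {
    overlay[off] = data;
  }

  // Drop overlay entries fully covered by [off, off + len).
  void trim_overlay(std::size_t off, std::size_t len) {
    for (auto it = overlay.begin(); it != overlay.end();) {
      const bool inside =
          it->first >= off && it->first + it->second.size() <= off + len;
      it = inside ? overlay.erase(it) : std::next(it);
    }
  }

  // Write directly to the extent, optionally trimming the overlay first.
  void write_extent(std::size_t off, const std::string& data, bool trim) {
    if (trim) trim_overlay(off, data.size());
    if (extent.size() < off + data.size()) extent.resize(off + data.size(), '\0');
    extent.replace(off, data.size(), data);
  }

  // Read extent bytes, then apply any overlapping overlay bytes on top.
  std::string read(std::size_t off, std::size_t len) const {
    std::string out = off < extent.size() ? extent.substr(off, len) : std::string();
    out.resize(len, '\0');
    for (const auto& kv : overlay) {
      for (std::size_t i = 0; i < kv.second.size(); ++i) {
        const std::size_t pos = kv.first + i;
        if (pos >= off && pos < off + len) out[pos - off] = kv.second[i];
      }
    }
    return out;
  }
};
```

Without the trim, two overlay writes followed by a full-extent write still read back the overlay contents; with the trim, the read returns the newly written data.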
Jianpeng Ma [Thu, 3 Mar 2016 10:49:28 +0000 (18:49 +0800)]
os/bluestore/BlueStore: Fix bug when calc offset & end whether locate in the a extent.
Suppose: bluestore_overlay_max_length == bluestore_min_alloc_size
The original code checks whether the written range lies within a single
extent with:
(offset / min_alloc_size) == (offset + length) / min_alloc_size
This places the case offset=0 and length=min_alloc_size in two
different extents.
In fact, this range lies within a single extent.
Using end = offset + length - 1 in the comparison fixes this.
Fixes: #14954
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
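The off-by-one in this commit can be checked in isolation. The two functions below are hypothetical helpers written for illustration, not the BlueStore code. They compare the extent index of the first byte with that of `offset + length` (the old check) and with that of the last byte written (the corrected check).

```cpp
#include <cassert>
#include <cstdint>

// Old check: uses the one-past-the-end position, so a write that ends exactly
// on an extent boundary appears to span two extents.
bool same_extent_buggy(std::uint64_t offset, std::uint64_t length,
                       std::uint64_t min_alloc_size) {
  assert(min_alloc_size > 0);
  return offset / min_alloc_size == (offset + length) / min_alloc_size;
}

// Corrected check: uses the last byte actually written.
bool same_extent_fixed(std::uint64_t offset, std::uint64_t length,
                       std::uint64_t min_alloc_size) {
  assert(min_alloc_size > 0 && length > 0);
  const std::uint64_t end = offset + length - 1;
  return offset / min_alloc_size == end / min_alloc_size;
}
```

For offset=0 and length=min_alloc_size, the old check computes extent 0 for the start and extent 1 for the end, while the corrected check computes extent 0 for both.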
Piotr Dałek [Thu, 3 Mar 2016 10:22:57 +0000 (11:22 +0100)]
common/obj_bencher.cc: use more readable constant instead of magic number
When clean_up_slow() fails, it returns "-5" which is equal to -EIO.
Change it in source, so it's not confusing for someone who does not
remember all error codes (functionality remains the same).
Signed-off-by: Piotr Dałek <piotr.dalek@ts.fujitsu.com>
Adam Kupczyk [Wed, 2 Mar 2016 11:31:01 +0000 (12:31 +0100)]
[MON] Fixed calculation of %USED. Now it shows (space used by all replicas)/(raw space available on OSDs). Before, it was (size of pool)/(raw space available on OSDs).
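A rough sketch of the difference between the two formulas. The function names are hypothetical, and it assumes a replicated pool where "space used by all replicas" equals the pool's logical size multiplied by its replica count; erasure-coded pools would need a different factor.

```cpp
#include <cstdint>

// Old formula: logical pool size over raw space.
double percent_used_old(std::uint64_t pool_bytes, std::uint64_t raw_bytes) {
  if (raw_bytes == 0) return 0.0;
  return 100.0 * static_cast<double>(pool_bytes) / static_cast<double>(raw_bytes);
}

// New formula: space consumed by all replicas over raw space.
// Assumption: every object is stored `replicas` times in full.
double percent_used_new(std::uint64_t pool_bytes, unsigned replicas,
                        std::uint64_t raw_bytes) {
  if (raw_bytes == 0) return 0.0;
  return 100.0 * static_cast<double>(pool_bytes) * replicas /
         static_cast<double>(raw_bytes);
}
```

For a pool holding 100 bytes with 3 replicas on 1000 bytes of raw space, the old formula reports 10% while the new one reports 30%.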
Nathan Cutler [Tue, 1 Mar 2016 20:25:11 +0000 (21:25 +0100)]
RPM: move scriptlets from ceph to ceph-base
This addresses the following RPMLINT errors:
ceph-base.x86_64: E: library-without-ldconfig-postun (Badness:
300) /usr/lib64/libosd_tp.so.1.0.0
ceph-base.x86_64: E: library-without-ldconfig-postun (Badness:
300) /usr/lib64/libos_tp.so.1.0.0
This package contains a library and provides no %postun scriptlet
containing a call to ldconfig.
ceph-base.x86_64: E: library-without-ldconfig-postin (Badness:
300) /usr/lib64/libosd_tp.so.1.0.0
ceph-base.x86_64: E: library-without-ldconfig-postin (Badness:
300) /usr/lib64/libos_tp.so.1.0.0
This package contains a library and provides no %post scriptlet
containing a call to ldconfig.
Sage Weil [Mon, 1 Feb 2016 18:01:32 +0000 (13:01 -0500)]
mon/MDSMonitor: prevent pool 0 from being used as a data pool
Pool 0 means no change or default in the legacy ceph_file_layout in the
layout ioctl and file create arguments. Prevent it from being used to avoid
putting users in an awkward situation later.
Sage Weil [Tue, 12 Jan 2016 14:57:06 +0000 (09:57 -0500)]
fs_types: file_layout_t: convert pool -1 (undefined) to 0 in legacy encoding
Old code assumes that fl_pg_pool == 0 means the pool is not defined, while
file_layout_t uses -1. Translate between the two.
Note that this means a valid file_layout_t with pool_id == 0 cannot be
accurately translated to a legacy file_layout_t. That is somewhat
unavoidable, and should not be a problem since real clusters create 'rbd'
as pool 0 and it does not use any file layouts.
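The translation described above can be sketched as a pair of helpers. The names `to_legacy_pool` and `from_legacy_pool` are hypothetical, and the 32-bit unsigned legacy field is an assumption. The sketch is not the actual file_layout_t encoding code, but it shows why pool 0 cannot round-trip.

```cpp
#include <cstdint>

// New-style layouts use -1 for "undefined"; legacy layouts use 0.
std::uint32_t to_legacy_pool(std::int64_t pool_id) {
  if (pool_id < 0) return 0;  // undefined -> legacy "not defined"
  return static_cast<std::uint32_t>(pool_id);
}

// Legacy 0 is read back as "undefined", so a real pool 0 is lost here.
std::int64_t from_legacy_pool(std::uint32_t fl_pg_pool) {
  if (fl_pg_pool == 0) return -1;
  return static_cast<std::int64_t>(fl_pg_pool);
}
```

Any pool id greater than 0 survives a round trip through the legacy form; both -1 and 0 come back as -1.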