Yehuda Sadeh [Mon, 16 May 2016 21:35:12 +0000 (14:35 -0700)]
rgw: keep track of written_objs correctly
Fixes: http://tracker.ceph.com/issues/15886
Only add a rados object to the written_objs list if the write
was successful. Otherwise, if the write is canceled for some
reason, we'd remove an object that we didn't write to. This was
a problem in a case where there are multiple writes that went to
the same part. The second writer should fail the write, since
we do an exclusive write. However, we added the object's name
to the written_objs list anyway, which was a real problem when
the old processor was disposed (as it was clearing the objects).
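
The bookkeeping rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual rgw code: FakeStore, Processor, exclusive_write and cancel are hypothetical names, and only written_objs comes from the commit message.

```cpp
#include <cassert>
#include <cerrno>
#include <set>
#include <string>

// Hypothetical stand-in for an object store with exclusive-create writes.
struct FakeStore {
  std::set<std::string> objects;

  // Fails with -EEXIST when the object already exists (exclusive write).
  int exclusive_write(const std::string& name) {
    if (!objects.insert(name).second) return -EEXIST;
    return 0;
  }
  void remove(const std::string& name) { objects.erase(name); }
};

// Hypothetical processor that tracks what it wrote so it can clean up.
struct Processor {
  FakeStore& store;
  std::set<std::string> written_objs;

  explicit Processor(FakeStore& s) : store(s) {}

  int write(const std::string& name) {
    assert(!name.empty());
    int r = store.exclusive_write(name);
    if (r < 0) return r;        // failed write: do not track the object
    written_objs.insert(name);  // track only after a successful write
    return 0;
  }

  // On cancel, remove only the objects this processor actually wrote.
  void cancel() {
    for (const auto& name : written_objs) store.remove(name);
    written_objs.clear();
  }
};
```

With this ordering, a second writer that loses the exclusive write can be canceled without removing the first writer's object.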
Samuel Just [Thu, 10 Mar 2016 23:19:15 +0000 (15:19 -0800)]
LFNIndex::lfn_translate: consider alt attr as well
If the file has an alt attr, there are two possible matching
ghobjects. We want to make sure we choose the right one for
the short name we have. If we don't, a split while there are
two objects linking to the same inode will result in one of
the links being orphaned in the source directory, resulting
in #14766.
Vitja Makarov [Wed, 17 Feb 2016 10:46:18 +0000 (13:46 +0300)]
hammer: rgw: S3: set EncodingType in ListBucketResult
Signed-off-by: Victor Makarov <vitja.makarov@gmail.com>
(cherry picked from commit d2e281d2beb0a49aae0fd939f9387cb2af2692c8)
X-Github-PR: 7712
Backport: hammer
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
src/rgw/rgw_bucket.cc
1. Do not use the rgw_user structure, and remove the tenant parameter, as described below
2. user_id is not used, so just remove the line
3. Instead of system_obj_set_attr, use the set_attr method
Backport Change:
We do not use the rgw_user structure and we remove the `tenant` parameter
because this feature has not been introduced in the hammer version.
The rgw multi-tenant feature was introduced in pr#6784 (https://github.com/ceph/ceph/pull/6784)
This feature is supported in v10.0.2 and later versions.
Vicente Cheng [Tue, 9 Feb 2016 20:03:24 +0000 (12:03 -0800)]
rgw: user quota may not adjust on bucket removal
Description:
If the user/admin removes a bucket using --force/--purge-objects options with s3cmd/radosgw-admin respectively, the user stats will continue to reflect the deleted objects for quota purposes, and there seems to be no way to reset them. User stats need to be sync'ed prior to bucket removal.
Solution:
Sync user stats before removing a bucket.
src/rgw/rgw_op.cc
Reorder the check sequence and replace some op_ret with ret
Backport Change:
We remove the `tenant` parameter because this feature has not been introduced in the hammer version.
The rgw multi-tenant feature was introduced in pr#6784 (https://github.com/ceph/ceph/pull/6784)
This feature is supported in v10.0.2 and later versions.
ceph.spec.in: disable lttng and babeltrace explicitly
before this change, we do not package tracepoint probe shared libraries
on rhel7. but the "configure" script enables them if lttng is detected. and
rpm complains at seeing installed but not packaged files. as EPEL-7 now
includes lttng-ust-devel and libbabeltrace-devel, we'd better
BuildRequire them, and build with them unless disabled otherwise. so in
this change
* make "lttng" an rpm build option enabled by default
* BuildRequire lttng-ust-devel and libbabeltrace-devel if the
"lttng" option is enabled
* --without-lttng --without-babeltrace if the "lttng" option is disabled
hammer: monclient: avoid key renew storm on clock skew
Refreshing rotating keys too often is a symptom of a clock skew, try to
detect it and don't cause extra problems:
* MonClient::_check_auth_rotating:
- detect and report premature keys expiration due to a time skew
- rate limit refreshing the keys to avoid excessive RAM and CPU usage
(both by OSD in question and monitors which have to process a lot
of auth messages)
* MonClient::wait_auth_rotating: wait for valid (not expired) keys
* OSD::init(): bail out after 10 attempts to obtain the rotating keys
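
A minimal sketch of the rate-limiting idea from the list above, assuming a simple minimum-interval policy. RefreshLimiter and should_refresh are hypothetical names and do not reflect the actual MonClient code.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical limiter: permits a key refresh only if at least
// min_interval has elapsed since the last permitted refresh.
class RefreshLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  explicit RefreshLimiter(std::chrono::seconds min_interval)
      : min_interval_(min_interval) {
    assert(min_interval.count() >= 0);
  }

  // Returns true (and records the time) when a refresh is allowed now.
  bool should_refresh(Clock::time_point now) {
    if (has_last_ && now - last_ < min_interval_) {
      return false;  // too soon: skip to avoid a renew storm
    }
    last_ = now;
    has_last_ = true;
    return true;
  }

 private:
  std::chrono::seconds min_interval_;
  Clock::time_point last_{};
  bool has_last_ = false;
};
```

Passing the current time in as a parameter keeps the sketch deterministic; a real caller would pass RefreshLimiter::Clock::now().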
the gmt_hitset is enabled by default in the ctor of pg_pool_t, this
is intentional. because we want to remove this setting and make
gmt_hitset=true as a default in future. but this forces us to
disable it explicitly when preparing a new pool if any OSD does
not support gmt hitset.
Kefu Chai [Fri, 5 Jun 2015 13:06:48 +0000 (21:06 +0800)]
osd: use GMT time for the object name of hitsets
* bump the encoding version of pg_hit_set_info_t to 2, so we can
tell if the corresponding hit_set is named using localtime or
GMT
* bump the encoding version of pg_pool_t to 20, so we can know
if a pool is using GMT to name the hit_set archive or not. and
we can tell if the current cluster allows OSDs that do not support
GMT mode or not.
* add an option named `osd_pool_use_gmt_hitset`. if enabled,
the cluster will try to use GMT mode when creating a new pool
if all the up OSDs support GMT mode. if any of the
pools in the cluster is using GMT mode, then only OSDs
supporting GMT mode are allowed to join the cluster.
Conflicts:
src/include/ceph_features.h
src/osd/ReplicatedPG.cc
src/osd/osd_types.cc
src/osd/osd_types.h
fill pg_pool_t with default settings in master branch.
test/bufferlist: do not expect !is_page_aligned() after unaligned rebuild
if the size of a bufferlist is page aligned we allocate page aligned
memory chunk for it when rebuild() is called. otherwise we just call
the plain new() to allocate new memory chunk for holding the continuous
buffer. but we should not expect that `new` allocator always returns
unaligned memory chunks. instead, it *could* return page aligned
memory chunk as long as the allocator feels appropriate. so, the
`EXPECT_FALSE(bl.is_page_aligned())` after the `rebuild()` call is
removed.
Sage Weil [Tue, 6 Oct 2015 18:35:35 +0000 (14:35 -0400)]
osd/PG: fix generate_past_intervals
We may be only calculating older past intervals and have a valid
history.same_interval_since value, in which case the local
same_interval_since value will end at the newest old interval we had to
generate.
mon: Monitor: get rid of weighted clock skew reports
By weighting the reports we were making it really hard to get rid of a
clock skew warning once the cause had been fixed.
Instead, as soon as we get a clean bill of health, let's run a new round
as soon as possible and ascertain whether that was a transient fix or
for realsies. That should be better than the alternative of waiting for
an hour or something (for a large enough skew) for the warning to go
away - and with it, the admin's sanity ("WHAT AM I DOING WRONG???").
When in the presence of a clock skew, adjust the checking interval
according to how many rounds have gone by since the last clean check.
If a skew is detected, instead of waiting an additional 300 seconds we
will perform the check more frequently, gradually backing off the
frequency if the skew is still in place (up to a maximum of
'mon_timecheck_interval', default: 300s). This will help with transient
skews.
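
The back-off described above might look roughly like the sketch below. The linear growth per round and the skew_interval parameter are assumptions made for illustration; the message only states that the frequency backs off gradually, up to a maximum of 'mon_timecheck_interval'.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: seconds to wait before the next time check.
// rounds_since_clean: rounds run since the last clean check (0 = no skew).
// skew_interval:      assumed short base interval used while a skew persists.
// max_interval:       upper bound, e.g. mon_timecheck_interval (default 300s).
double next_timecheck_interval(int rounds_since_clean,
                               double skew_interval,
                               double max_interval) {
  assert(skew_interval > 0 && max_interval > 0);
  if (rounds_since_clean <= 0) {
    return max_interval;  // no skew detected: use the regular interval
  }
  // Check often at first, then back off linearly up to the maximum.
  return std::min(max_interval, skew_interval * rounds_since_clean);
}
```

With an assumed skew_interval of 30 and max_interval of 300, the waits would grow 30, 60, 90 and so on, settling at 300.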
Conflicts:
src/common/config_opts.h
Merge the change line.
src/mon/Monitor.h
handle_timecheck_leader(MonOpRequestRef op) was replaced with handle_timecheck_leader(MTimeCheck *m)
also for handle_timecheck_peon and handle_timecheck.
Dan Mick [Thu, 26 Nov 2015 03:20:51 +0000 (19:20 -0800)]
test/librados/test.cc: clean up EC pools' crush rules too
SetUp was adding an erasure-coded pool, which automatically adds
a new crush rule named after the pool, but only removing the
pool. Remove the crush rule as well.
Sage Weil [Thu, 10 Mar 2016 13:28:59 +0000 (08:28 -0500)]
osd/MonCommand: add/fix up 'osd [test-]reweight-by-{pg,utilization}'
- show before/after pg placement stats
- add test- variants that don't do anything
- only allow --no-increasing on the -utilization versions (where
it won't conflict with the optional pool list and confuse the
arg parsing)
Dan van der Ster [Fri, 26 Feb 2016 20:52:41 +0000 (21:52 +0100)]
osd: add sure and no-increasing options to reweight-by-*
Add a --no-increasing option to reweight-by-* which can be used to only decrease
OSD weights without increasing any. This is useful for example if you need to
urgently lower the weight of nearly full OSDs.
Also add a --yes-i-really-mean-it confirmation to reweight-by-*.
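
As a rough illustration of the --no-increasing semantics described above, the helper below keeps the current weight whenever the proposed weight would be higher. The function name and signature are hypothetical and are not taken from the Ceph source.

```cpp
#include <cassert>

// Hypothetical helper: pick the weight to apply to one OSD.
// With no_increasing set, a proposed increase is ignored.
double clamp_reweight(double current_weight,
                      double proposed_weight,
                      bool no_increasing) {
  assert(current_weight >= 0.0 && proposed_weight >= 0.0);
  if (no_increasing && proposed_weight > current_weight) {
    return current_weight;  // never raise the weight in this mode
  }
  return proposed_weight;
}
```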
Jason Dillaman [Wed, 9 Mar 2016 23:00:04 +0000 (18:00 -0500)]
librbd: complete cache reads on cache's dedicated thread
If a snapshot is created out-of-band, the next IO will result in the
cache being flushed. If pending writeback data performs a copy-on-write,
the read from the parent will be blocked.
Sage Weil [Mon, 16 Nov 2015 16:32:34 +0000 (11:32 -0500)]
osdc/Objecter: call notify completion only once
If we race with a reconnect we could get a second notify message
before the notify linger op is torn down. Ensure we only ever
call the notify completion once to prevent a segfault.
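
One common way to guarantee a completion fires only once is an atomic "already fired" flag, sketched below. OnceCompletion is a hypothetical wrapper for illustration and is not the Objecter's actual mechanism.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical wrapper that invokes its callback at most once, even if
// complete() is reached twice (e.g. a duplicate notify after a reconnect).
class OnceCompletion {
 public:
  explicit OnceCompletion(std::function<void(int)> cb) : cb_(std::move(cb)) {
    assert(static_cast<bool>(cb_));
  }

  // Returns true if the callback ran, false if it had already fired.
  bool complete(int result) {
    if (fired_.exchange(true)) {
      return false;  // a previous call already completed; drop this one
    }
    cb_(result);
    return true;
  }

 private:
  std::function<void(int)> cb_;
  std::atomic<bool> fired_{false};
};
```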
Brad Hubbard [Fri, 4 Mar 2016 03:06:47 +0000 (13:06 +1000)]
tests: Add TEST_no_segfault_for_bad_keyring to test/mon/misc.sh
94da46b6e31cac206cb32fc5bd3159209ee25e8c adds
TEST_no_segfault_for_bad_keyring which requires changes to run
in hammer since test/mon/misc.sh is not written to run multiple tests in
succession in the hammer version.
Dunrong Huang [Wed, 25 Nov 2015 11:03:03 +0000 (19:03 +0800)]
auth: fix a crash issue due to CryptoHandler::create() failed
In this case (e.g. the user passes a wrong key), attempts to call CryptoKey.ckh will lead to a segfault.
This patch fixes a crash like the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffed10e700 (LWP 25051)]
0x00007ffff59896c6 in CryptoKey::encrypt (this=0x7fffed10d4f0, cct=0x555555829c30, in=..., out=..., error=0x7fffed10d440) at auth/cephx/../Crypto.h:110
110 return ckh->encrypt(in, out, error);
(gdb) bt
at auth/cephx/../Crypto.h:110
at auth/cephx/CephxProtocol.h:464
Piotr Dałek [Thu, 3 Mar 2016 10:30:53 +0000 (11:30 +0100)]
common/obj_bencher.cc: make verify error fatal
When run without "--no-verify", all verification errors are noted,
but they are not forwarded/reported anywhere else but to cerr, which
will cause automated testing to ignore them. Make seq_read_bench and
rand_read_bench return -EIO on any verification error which will,
in turn, return it back to caller.
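
The change in behavior can be sketched as follows: verification errors are still logged to cerr, but the function also returns -EIO so the caller can fail. verify_reads is a hypothetical helper, not the actual seq_read_bench or rand_read_bench code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical check: compare each object read back against what was
// expected, log mismatches, and report failure through the return value.
int verify_reads(const std::vector<std::string>& expected,
                 const std::vector<std::string>& actual) {
  assert(expected.size() == actual.size());
  int errors = 0;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (expected[i] != actual[i]) {
      std::cerr << "object " << i << " failed verification" << std::endl;
      ++errors;
    }
  }
  // Previously such errors would only reach cerr; returning -EIO lets
  // automated tests notice them.
  return errors > 0 ? -EIO : 0;
}
```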