Casey Bodley [Thu, 5 Oct 2017 20:39:30 +0000 (16:39 -0400)]
rgw: RGWUser::init no longer overwrites user_id
If an admin op specifies a user_id and does not find a user with that
id, but does find a user based on a later field (email, access key,
etc.), RGWUser::user_id will be overwritten with the existing user's id.
When this happens on 'radosgw-admin user create', RGWUser::execute_add()
will modify that existing user, instead of trying to create a new user
with the given user_id (and failing due to the conflicting email,
access key, etc.).
By preserving the original user_id (when specified), this uid conflict
is detected in RGWUser::check_op() and a "user id mismatch" error is
returned.
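The lookup order described above can be modelled in a few lines. This is a simplified sketch, not the RGW code: the real logic is C++ in RGWUser::init and RGWUser::check_op, and the USERS store, find_user, check_create and UserIdMismatch names are hypothetical.

```python
# Hypothetical, simplified model of the lookup described above; the real
# logic lives in C++ (RGWUser::init and RGWUser::check_op).
class UserIdMismatch(Exception):
    pass


# Toy user store: uid -> record (assumed layout, for illustration only).
USERS = {"alice": {"email": "alice@example.com"}}


def find_user(uid=None, email=None):
    """Return the uid of an existing user, trying uid first, then email."""
    if uid is not None and uid in USERS:
        return uid
    if email is not None:
        for existing_uid, record in USERS.items():
            if record["email"] == email:
                return existing_uid
    return None


def check_create(requested_uid, email):
    """Reject a create when a later field matches a different user."""
    found_uid = find_user(uid=requested_uid, email=email)
    # The requested uid is kept as given rather than replaced by found_uid,
    # so the mismatch can be detected here.
    if found_uid is not None and requested_uid and found_uid != requested_uid:
        raise UserIdMismatch(
            "user id mismatch: %s != %s" % (requested_uid, found_uid))
    return requested_uid
```

With this model, check_create("bob", "alice@example.com") raises UserIdMismatch instead of silently operating on "alice".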
Kefu Chai [Fri, 22 Dec 2017 14:42:16 +0000 (22:42 +0800)]
install-deps.sh: update g++ symlink also
we need to update g++ symlink also, if it points to the wrong version
http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master, as master builds
boost from source, so it's not cherry-picked from master.
Kefu Chai [Wed, 13 Dec 2017 05:36:54 +0000 (13:36 +0800)]
install-deps.sh: point gcc to the one shipped by distro
Defining a struct in a method is legal in C++11, but it causes an internal
compiler error due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82155
if we are using GCC-7. So we need to either work around it in our source
code by moving the struct definition out of the member method, or revert
to a GCC without this bug. But if we go with the first route, the jewel
build still fails, because GCC-7 starts to use the new CXX11 ABI, which
is not compatible with the libboost we use in jewel. The libboost was
still built with the old ABI for backward compatibility. So let's just
fix install-deps.sh to point gcc to the original one.
See: http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master, as master builds
boost from source, so it's not cherry-picked from master.
This is needed for jewel-x point to point upgrade because earlier point
releases can't handle our ec profiles with ruleset-* (later ones can) and
the test races with the mon upgrades.
Nathan Cutler [Tue, 21 Nov 2017 10:36:02 +0000 (11:36 +0100)]
tests: ceph-disk: ignore E722 in flake8 test
Very old, and very new, versions of flake8 treat E722 as an error:
flake8 runtests: commands[0] | flake8 --ignore=H105,H405,E127 ceph_disk tests
ceph_disk/main.py:1575:9: E722 do not use bare except'
ceph_disk/main.py:1582:9: E722 do not use bare except'
ceph_disk/main.py:3252:5: E722 do not use bare except'
ceph_disk/main.py:3288:21: E722 do not use bare except'
ceph_disk/main.py:3296:17: E722 do not use bare except'
ceph_disk/main.py:4358:5: E722 do not use bare except'
tests/test_main.py:26:1: E722 do not use bare except'
ERROR: InvocationError: '/opt/j/ws/mkck/src/ceph-disk/.tox/flake8/bin/flake8 --ignore=H105,H405,E127 ceph_disk tests'
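For reference, E722 is the check that reports an `except:` clause with no exception class. The sketch below shows the flagged pattern next to a narrower alternative; the function names are hypothetical and are not taken from ceph_disk/main.py. The commit itself only adds E722 to the ignore list and does not rewrite the handlers.

```python
def parse_int_bare(text):
    """Pattern that E722 reports: a bare 'except' clause."""
    try:
        return int(text)
    except:  # noqa: E722 (also catches KeyboardInterrupt and SystemExit)
        return None


def parse_int_narrow(text):
    """Alternative that names the exceptions it expects."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None
```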
Conflicts: remove bluestore.yaml, as jewel does not support it, and
remove links to objectstore from places where the tests do not exist in
jewel yet, for instance, qa/suites/mgr/basic.
liuchang0812 [Fri, 30 Jun 2017 12:56:04 +0000 (20:56 +0800)]
osd: new command compact via tell/daemon
A user can manually compact an OSD's omap as follows:
1. ceph tell osd.id compact
2. ceph daemon osd.id compact
User requests will be impacted during compaction.
Fixes: http://tracker.ceph.com/issues/19592
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
(cherry picked from commit b4ad4297652df2f6ebfadcdededc7a47607ab534)
Conflicts:
src/osd/OSD.cc
Removed all admin socket register and unregister commands
which are not part of this backport
Changed admin_command to command variable because in jewel
we use command variable.
Conflicts:
src/os/bluestore/BlueStore.h
Removed declarations which are not part of this backport
inject_data_error()
inject_mdata_error()
debug_data_eio()
debug_mdata_eio()
debug_obj_on_delete()
src/os/filestore/FileStore.h
Removed declarations which are not part of this backport
set<ghobject_t> data_error_set
set<ghobject_t> mdata_error_set
inject_data_error() override
inject_mdata_error() override
Kefu Chai [Sat, 7 Oct 2017 14:15:11 +0000 (22:15 +0800)]
ceph-disk: retry on OSError
we are likely to
1) create partition, for instance, sdc1
2) partprobe sdc
3) udevadm settle
4) check the device by its path: /dev/sdc1
but there is a chance that the uevent sent from the kernel fails to reach
udev before we call "udevadm", hence "/dev/sdc1" does not exist even after
"udevadm settle" returns. so we retry in case of OSError here.
Conflicts:
src/ceph-disk/ceph_disk/main.py: jewel does not have PROCDIR,
so resolve it by using '/proc'. also, in jewel, unmount() does not
have `do_rm` parameter, so do not handle it.
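The retry on OSError described in this commit can be sketched as a small helper. This is a minimal illustration, not the ceph-disk implementation: retry_on_oserror, stat_device and the attempts/delay defaults are all assumptions.

```python
import os
import time


def retry_on_oserror(func, attempts=10, delay=0.2):
    """Call func(), retrying when it raises OSError.

    attempts and delay are illustrative defaults, not values from ceph-disk.
    """
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)


def stat_device(path):
    # os.stat raises OSError (FileNotFoundError) while the node is missing.
    return os.stat(path)
```

A caller would wrap the device check, for example `retry_on_oserror(lambda: stat_device("/dev/sdc1"))`, so that a late uevent only delays the check instead of failing it.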
Conflicts:
qa/objectstore/filestore-btrfs.yaml: we add some notes in this
file in master, but didn't backport the commit(s) adding these notes to
jewel. we are removing this file anyway. so who cares!
Conflicts:
qa/suites/fs/recovery/xfs.yaml: in master, this file is factored
into a facet: a/suites/fs/recovery/fs/xfs.yaml, but in jewel, it is still
a plain xfs.yaml. but it's good enough for us, as what we need is just
xfs.
Matt Benjamin [Mon, 2 Oct 2017 15:49:05 +0000 (11:49 -0400)]
radosgw: fix awsv4 header line sort order.
The awsv4 signature calculation includes a list of header lines, which
are supposed to be sorted. The existing code sorts by header name, but
it appears that in fact it is necessary to sort the whole header *line*,
not just the field name. Sorting by just the field name usually works,
but not always. The s3-tests teuthology suite includes
s3tests.functional.test_s3.test_object_header_acl_grants
s3tests.functional.test_s3.test_bucket_header_acl_grants
which include the following header lines,
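A minimal sketch of why the two orderings can differ. The header names and values here are assumptions chosen for illustration, not the lines used by the tests; what matters is that one header name is a prefix of another, so the character after the shorter name (':') is compared against the next character of the longer name ('-').

```python
# Hypothetical header values; only the prefix relationship between the
# two names matters for the ordering.
headers = {
    "x-amz-grant-read": "id=EXAMPLE1",
    "x-amz-grant-read-acp": "id=EXAMPLE2",
}


def by_name(hdrs):
    """Sort by field name only, then render 'name:value' lines."""
    return ["%s:%s" % (k, hdrs[k]) for k in sorted(hdrs)]


def by_line(hdrs):
    """Render 'name:value' lines first, then sort the whole lines."""
    return sorted("%s:%s" % (k, v) for k, v in hdrs.items())
```

Since '-' sorts before ':' in ASCII, by_line puts the x-amz-grant-read-acp line first, while by_name puts x-amz-grant-read first.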
tests - Added suite to test upgraded clients against jewel ceph clusters
Replaces https://github.com/ceph/ceph/pull/17981
We need to run this suite using suite-branch option in
order to use jewel workloads against ceph cluster luminous+ branches
Added 'libcephfs1' to exclude_packages in upgrade_workload
tests: use special branch of ceph/s3-tests with pre-10.2.10
Jewel v10.2.10 introduces a fix for S3 ACL code, for which a new test was added
to ceph/s3-tests.git (ceph-jewel branch). Since the jewel point-to-point-x
upgrade test runs s3-tests on 10.2.7, modify the test to use a special
ceph/s3-tests branch (ceph-jewel-10-2-7) that omits the new test.
Sage Weil [Fri, 3 Mar 2017 03:20:08 +0000 (21:20 -0600)]
osdc/Objecter: resend RWORDERED ops on full
Our condition for respecting the FULL flag is complex, and involves
the WRITE | RWORDERED flags vs the FULL_FORCE | FULL_TRY flags. Previously,
we could block a read because of RWORDERED but not resend it later.
Fix by capturing the complex condition in a respects_full() bool and using
it both for the blocking-on-send and resending-on-possibly-notfull-later
checks.
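The idea of a single predicate shared by both checks can be sketched as follows. This is an illustration in Python, not the Objecter code: the flag bit values are placeholders, the exact flag combination is an assumption inferred from the description above, and the two helper functions are hypothetical.

```python
# Flag bit values are placeholders for illustration, not Ceph's constants.
WRITE = 0x1
RWORDERED = 0x2
FULL_TRY = 0x4
FULL_FORCE = 0x8


def respects_full(flags):
    """True if an op with these flags should wait while the target is full.

    Assumed rule: WRITE or RWORDERED makes the op subject to FULL, unless
    FULL_TRY or FULL_FORCE is also set.
    """
    subject_to_full = bool(flags & (WRITE | RWORDERED))
    overrides_full = bool(flags & (FULL_TRY | FULL_FORCE))
    return subject_to_full and not overrides_full


def should_block_on_send(flags, is_full):
    # Hypothetical helper: the send path consults the shared predicate.
    return is_full and respects_full(flags)


def should_resend_when_not_full(flags, was_blocked):
    # Hypothetical helper: the resend path consults the same predicate, so
    # an op that was blocked is also eligible for resend.
    return was_blocked and respects_full(flags)
```

Because both paths call respects_full(), a read carrying RWORDERED that was blocked on send is also picked up by the resend check.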
Additionally, the ceph-{fuse,mds,mon,osd,radosgw,rbd-mirror}
targets have WantedBy=multi-user.target. This gives the
following behaviour:
- `systemctl {start,stop,restart}` of any target will restart
all dependent services (e.g.: `systemctl restart ceph.target`
will restart all services; `systemctl restart ceph-mon.target`
will restart all the mons, and so forth).
- `systemctl {enable,disable}` for the second level targets
(ceph-mon.target etc.) will cause dependent services to come
up on boot, or not (of course the individual services can
be enabled or disabled as well - for a service to start
on boot, both the service and its target must be enabled;
disabling either will cause the service to be disabled).
- `systemctl {enable,disable} ceph.target` has no effect on
whether or not services come up at boot; if the second level
targets and services are enabled, they'll start regardless of
whether ceph.target is enabled. This is due to the second
level targets all having WantedBy=multi-user.target.
- The OSDs will always start regardless of ceph-osd.target
(unless they are explicitly masked), thanks to udev magic.
So far, so good. Except, several users have encountered
services not starting with the following error:
Failed to start ceph-osd@5.service: Transaction order is
cyclic. See system logs for details.
I've not been able to reproduce this myself in such a way as to
cause OSDs to fail to start, but I *have* managed to get systemd
into that same confused state, as follows:
- Disable ceph.target, ceph-mon.target, ceph-osd.target,
ceph-mon@$(hostname).service and all ceph-osd instances.
- Re-enable all of the above.
At this point, everything is fine, but if I then subsequently
disable ceph.target, *then* try `systemctl restart ceph.target`,
I get "Failed to restart ceph.target: Transaction order is cyclic.
See system logs for details."
Explicitly adding Before=ceph.target to each second level target
prevents systemd from becoming confused in this situation.
David Zafman [Tue, 15 Aug 2017 21:45:13 +0000 (14:45 -0700)]
osd: Fixes for osd_scrub_during_recovery handling
Fixes: http://tracker.ceph.com/issues/18206 Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 367c32c69a512d2bea85a9b3860ec28bb4433750)
Conflicts:
src/osd/OSD.cc (trivial)
src/osd/PG.cc (trivial)
src/test/osd/osd-recovery-scrub.sh (moved from qa/standalone/scrub/osd-recovery-scrub.sh)
Fixes to osd-recovery-scrub.sh for Jewel compatibility
src/osd/OSD.h (Jewel only - moved is_recovery_active() to OSDService)
src/test/Makefile.am (Jewel only - add test to make check)
src/test/osd/CMakeLists.txt (Jewel only - add test to make check)
lu.shasha [Tue, 27 Jun 2017 02:53:30 +0000 (10:53 +0800)]
rgw: fix radosgw-admin data sync run crash
If the sync thread has run before and data sync init is then run, sync_status still remains in the rados pool. So whether or not sync_status exists, if the state is StateInit, sync_status.sync_info.num_shards should be updated.