git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
8 years agoRevert "use the create option during instantiation" 13106/head
Vasu Kulkarni [Thu, 26 Jan 2017 21:21:30 +0000 (13:21 -0800)]
Revert "use the create option during instantiation"

jewel cephfs still uses old Filesystem initialization method

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
8 years agouse dev option instead of dev-commit
Vasu Kulkarni [Thu, 15 Dec 2016 22:11:00 +0000 (14:11 -0800)]
use dev option instead of dev-commit

Fixes: http://tracker.ceph.com/issues/18736
Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
8 years agoMerge pull request #13103 from dillaman/wip-18672 12424/head 13134/head
Jason Dillaman [Wed, 25 Jan 2017 15:40:25 +0000 (10:40 -0500)]
Merge pull request #13103 from dillaman/wip-18672

jewel: qa/workunits/rbd: use more recent qemu-iotests that support Xenial

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
8 years agoqa/workunits/rbd: use more recent qemu-iotests that support Xenial 13103/head
Jason Dillaman [Mon, 5 Dec 2016 18:46:02 +0000 (13:46 -0500)]
qa/workunits/rbd: use more recent qemu-iotests that support Xenial

Fixes: http://tracker.ceph.com/issues/18149
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 4314cb945a9c2296e2f7cd357b09015777f233c0)

8 years agoqa/workunits/rbd: removed qemu-iotest case 077
Jason Dillaman [Wed, 7 Dec 2016 14:59:39 +0000 (09:59 -0500)]
qa/workunits/rbd: removed qemu-iotest case 077

The test case is not stable due to racing console output. This
results in spurious failures.

Fixes: http://tracker.ceph.com/issues/10773
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2c70df978d605a45ff81971b86f5afbefbdaabb6)

8 years agoMerge pull request #12137 from jcsp/wip-17974
John Spray [Wed, 25 Jan 2017 13:57:17 +0000 (14:57 +0100)]
Merge pull request #12137 from jcsp/wip-17974

jewel: client: fix stale entries in command table

8 years agoMerge pull request #12686 from SUSE/wip-18272-jewel
John Spray [Wed, 25 Jan 2017 13:56:24 +0000 (14:56 +0100)]
Merge pull request #12686 from SUSE/wip-18272-jewel

jewel: tests: Workunits needlessly wget from git.ceph.com

8 years agoMerge pull request #12836 from SUSE/wip-18462-jewel
John Spray [Wed, 25 Jan 2017 13:56:03 +0000 (14:56 +0100)]
Merge pull request #12836 from SUSE/wip-18462-jewel

jewel: Decode errors on backtrace will crash MDS

8 years agoMerge pull request #13023 from SUSE/wip-18603-jewel
John Spray [Wed, 25 Jan 2017 13:55:46 +0000 (14:55 +0100)]
Merge pull request #13023 from SUSE/wip-18603-jewel

jewel: cephfs test failures (ceph.com/qa is broken, should be download.ceph.com/qa)

8 years agoMerge pull request #12155 from dachary/wip-17956-jewel
John Spray [Wed, 25 Jan 2017 13:55:28 +0000 (14:55 +0100)]
Merge pull request #12155 from dachary/wip-17956-jewel

jewel: Clients without pool-changing caps shouldn't be allowed to change pool_namespace

8 years agoMerge pull request #12325 from dachary/wip-18026-jewel
John Spray [Wed, 25 Jan 2017 13:55:11 +0000 (14:55 +0100)]
Merge pull request #12325 from dachary/wip-18026-jewel

jewel: ceph_volume_client.py : Error: Can't handle arrays of non-strings

8 years agoMerge pull request #13060 from asheplyakov/jewel-bp-18615
John Spray [Wed, 25 Jan 2017 13:54:51 +0000 (14:54 +0100)]
Merge pull request #13060 from asheplyakov/jewel-bp-18615

jewel: mds: fix null pointer dereference in Locker::handle_client_caps

8 years agoMerge pull request #11656 from ajarr/wip-17705-jewel
John Spray [Wed, 25 Jan 2017 13:54:35 +0000 (14:54 +0100)]
Merge pull request #11656 from ajarr/wip-17705-jewel

jewel: ceph_volume_client: fix recovery from partial auth update

8 years agoMerge pull request #12154 from dachary/wip-18008-jewel
John Spray [Wed, 25 Jan 2017 13:54:06 +0000 (14:54 +0100)]
Merge pull request #12154 from dachary/wip-18008-jewel

jewel: Cannot create deep directories when caps contain path=/somepath

8 years agoMerge pull request #13085 from jcsp/wip-18361-jewel
John Spray [Wed, 25 Jan 2017 13:53:45 +0000 (14:53 +0100)]
Merge pull request #13085 from jcsp/wip-18361-jewel

jewel: client: populate metadata during mount

8 years agoqa/tasks/cephfs/filesystem.py: backport _write_data_xattr() function 12836/head
Nathan Cutler [Tue, 24 Jan 2017 14:49:24 +0000 (15:49 +0100)]
qa/tasks/cephfs/filesystem.py: backport _write_data_xattr() function

This is a partial manual backport of 5f77f09b019b607b84e6a8f89ce19065383ca108

It is needed by test_corrupt_backtrace() in qa/tasks/cephfs/test_damage.py

Signed-off-by: Nathan Cutler <ncutler@suse.com>
8 years agoclient: populate metadata during mount 13085/head
John Spray [Fri, 13 Jan 2017 00:30:28 +0000 (00:30 +0000)]
client: populate metadata during mount

This way we avoid having to over-write the "root"
metadata during mount, and any user-set overrides (such
as bad values injected by tests) will survive.

Because Client instances may also open sessions without
mounting to send commands, add a call into populate_metadata
from mds_command as well.

Fixes: http://tracker.ceph.com/issues/18361
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 1dbff09ad553f9ff07f4f4217ba7ece6c2cdc5d2)

8 years agomds: fix null pointer dereference in Locker::handle_client_caps 13060/head
Yan, Zheng [Fri, 6 Jan 2017 07:42:52 +0000 (15:42 +0800)]
mds: fix null pointer dereference in Locker::handle_client_caps

Locker::handle_client_caps delays processing a cap message if the
corresponding inode is freezing or frozen. By the time the message is
processed, the client may have already closed the session.

Fixes: http://tracker.ceph.com/issues/18306
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit e281a0b9c1fdeaf09f1b01f34cecd62e4f49d02e)

8 years agoqa: update remaining ceph.com to download.ceph.com 13023/head
John Spray [Tue, 17 Jan 2017 16:12:46 +0000 (17:12 +0100)]
qa: update remaining ceph.com to download.ceph.com

Fixes: http://tracker.ceph.com/issues/18574
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 549d993d3fd8ffffa280ed4a64aca41d1c6f2da1)

Conflicts:
qa/tasks/cram.py (trivial resolution)

8 years agoMerge pull request #12766 from jtlayton/wip-18408-jewel
Nathan Cutler [Fri, 20 Jan 2017 14:50:16 +0000 (15:50 +0100)]
Merge pull request #12766 from jtlayton/wip-18408-jewel

client: Fix lookup of "/.." in jewel

Reviewed-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
8 years agoMerge pull request #12147 from dachary/wip-18007-jewel
Loic Dachary [Fri, 20 Jan 2017 11:31:26 +0000 (12:31 +0100)]
Merge pull request #12147 from dachary/wip-18007-jewel

jewel: ceph-disk: ceph-disk@.service races with ceph-osd@.service

Reviewed-by: Nathan Cutler <ncutler@suse.cz>
8 years agoMerge pull request #12983 from ceph/wip-cherry-pick-4vasu
vasukulkarni [Wed, 18 Jan 2017 20:43:34 +0000 (12:43 -0800)]
Merge pull request #12983 from ceph/wip-cherry-pick-4vasu

qa: Wip cherry pick https://github.com/ceph/ceph/pull/12969

8 years agoAdd ceph-create-keys to explicitly create admin/bootstrap keys 12983/head
Vasu Kulkarni [Tue, 10 Jan 2017 00:45:01 +0000 (16:45 -0800)]
Add ceph-create-keys to explicitly create admin/bootstrap keys

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
(cherry picked from commit 68f9b7eb3c0548c88650f67fb72c6ff9bc0f3ead)

8 years agoRemove debug overrides
Vasu Kulkarni [Tue, 10 Jan 2017 01:59:20 +0000 (17:59 -0800)]
Remove debug overrides

The high debug level for mon/osd is causing remoto to hang during get key

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
(cherry picked from commit f7dcc74cd3f119a2f65584fdb544c08d115f8c39)

8 years agouse the create option during instantiation
Vasu Kulkarni [Tue, 10 Jan 2017 23:43:12 +0000 (15:43 -0800)]
use the create option during instantiation

Signed-off-by: Vasu Kulkarni <vasu@redhat.com>
(cherry picked from commit be836bb30960000468c79e08fb416ceefd79d7db)

8 years agoMerge pull request #12210 from ddiss/tracker18049_ceph_disk_trigger_flock_timeout_jewel
Loic Dachary [Wed, 18 Jan 2017 16:12:54 +0000 (17:12 +0100)]
Merge pull request #12210 from ddiss/tracker18049_ceph_disk_trigger_flock_timeout_jewel

jewel: systemd/ceph-disk: reduce ceph-disk flock contention

Reviewed-by: Nathan Cutler <ncutler@suse.cz>
8 years agoMerge pull request #12959 from SUSE/wip-18545-jewel
Jason Dillaman [Tue, 17 Jan 2017 13:41:25 +0000 (08:41 -0500)]
Merge pull request #12959 from SUSE/wip-18545-jewel

jewel: [teuthology] update Ubuntu image url after ceph.com refactor

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
8 years agoqa/tasks/qemu: update default image url after ceph.com redesign 12959/head
Jason Dillaman [Tue, 17 Jan 2017 03:12:51 +0000 (22:12 -0500)]
qa/tasks/qemu: update default image url after ceph.com redesign

Fixes: http://tracker.ceph.com/issues/18542
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 6d17befb3bbc3d83c9d23d763ad95e1e7b2e4be0)

8 years agotest_volume_client: remove superfluous arguments 11656/head
Ramana Raja [Tue, 11 Oct 2016 08:48:29 +0000 (14:18 +0530)]
test_volume_client: remove superfluous arguments

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit bb60e01904187db417e8c7d6e57401823a0072fd)

8 years agotest_volume_client: check volume size
Ramana Raja [Tue, 11 Oct 2016 08:10:43 +0000 (13:40 +0530)]
test_volume_client: check volume size

Check that the total size shown by the df output of a mounted volume
is the same as the volume size and the quota set on the volume.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 91c74f4778ce5433968226345ffe26e876eb56a7)

8 years agotasks/cephfs: test recovery of partial auth update
Ramana Raja [Tue, 6 Sep 2016 12:01:04 +0000 (17:31 +0530)]
tasks/cephfs: test recovery of partial auth update

... in ceph_volume_client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit f0134a3db576282ed05d4b94b969b9593297669d)

8 years agoceph_volume_client: fix partial auth recovery
Ramana Raja [Tue, 4 Oct 2016 08:25:46 +0000 (13:55 +0530)]
ceph_volume_client: fix partial auth recovery

... for volumes whose group_id is None.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 0ab8badcf3ffe685135af17dc28b238f6e686922)

8 years agoceph_volume_client: check if volume metadata is empty
Ramana Raja [Wed, 28 Sep 2016 08:36:54 +0000 (14:06 +0530)]
ceph_volume_client: check if volume metadata is empty

... when recovering from partial auth updates.

Auth update happens in the following order:
auth metadata update, volume metadata update, and then Ceph auth
update.

A partial auth update can happen such that auth metadata is updated,
but the volume metadata isn't updated and is empty, and the auth
update did not propagate to Ceph. When recovering from such a
scenario, check if volume metadata is empty and if so remove the
partial auth update info in auth metadata.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit a95de7882cdf70e04e3c918ff41fc690d0d9bda3)
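
The recovery check described above can be sketched roughly as follows. This is a minimal illustration, not the actual ceph_volume_client code: the function name, the dictionary layout, and the "dirty" flag are assumptions made for the example.

```python
def recover_partial_auth(auth_meta, volume_meta, volume_key):
    """Drop a half-applied auth entry when the volume metadata was never written.

    All names and the dict layout here are hypothetical.
    Returns True if a stale entry was removed.
    """
    volumes = auth_meta.get("volumes", {})
    entry = volumes.get(volume_key)
    if entry is None or not entry.get("dirty"):
        return False  # no update was in flight for this volume
    if volume_meta:
        return False  # volume metadata exists; this case is handled elsewhere
    # Auth metadata was updated but the volume metadata is still empty, so the
    # update never reached Ceph: discard the partial record.
    del volumes[volume_key]
    return True
```

Calling it with an empty volume metadata dict and an auth entry marked dirty removes that entry; any other combination leaves the auth metadata untouched.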

8 years agoceph_volume_client: fix _recover_auth_meta() method
Ramana Raja [Tue, 4 Oct 2016 11:20:13 +0000 (16:50 +0530)]
ceph_volume_client: fix _recover_auth_meta() method

It needs to be an instance method.

Fixes: http://tracker.ceph.com/issues/17216
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 675cb91b68c1b54698708d604253ab9d1b2abdec)

8 years agoMerge pull request #12745 from SUSE/wip-18386-jewel
Loic Dachary [Fri, 13 Jan 2017 10:10:39 +0000 (11:10 +0100)]
Merge pull request #12745 from SUSE/wip-18386-jewel

jewel: tests: use ceph-jewel branch for s3tests

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #12912 from liewegas/wip-workunits-jewel
Josh Durgin [Thu, 12 Jan 2017 21:58:42 +0000 (13:58 -0800)]
Merge pull request #12912 from liewegas/wip-workunits-jewel

qa/tasks/workunits: backport misc fixes to jewel

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
8 years agoqa/tasks/workunit: clear clone dir before retrying checkout 12912/head
Sage Weil [Thu, 22 Dec 2016 18:05:22 +0000 (13:05 -0500)]
qa/tasks/workunit: clear clone dir before retrying checkout

If we checkout ceph-ci.git, and don't find a branch,
we'll try again from ceph.git. But the checkout will
already exist and the clone will fail, so we'll still
fail to find the branch.

The same can happen if a previous workunit task already
checked out the repo.

Fix by removing the repo before checkout (the first and
second times).  Note that this may break if there are
multiple workunit tasks running in parallel on the same
role.  That is already racy, so if it's happening, we'll
want to switch to using a truly unique clonedir for each
instantiation.

Fixes: http://tracker.ceph.com/issues/18336
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2a7013cd5a033c5be43350505d75f088e831e201)
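
The retry logic in this commit message can be illustrated with a small sketch. It is not the code from qa/tasks/workunit.py: `checkout_with_fallback` and `fake_clone` are hypothetical names, and the clone step is a stand-in that only mimics git's refusal to clone into an existing directory.

```python
import os
import shutil
import tempfile


def checkout_with_fallback(repos, clonedir, clone):
    """Try each repo in order, clearing clonedir before every attempt."""
    for repo in repos:
        # Remove leftovers from an earlier attempt (or an earlier task),
        # otherwise the next clone would fail on the existing directory.
        shutil.rmtree(clonedir, ignore_errors=True)
        try:
            clone(repo, clonedir)
            return repo
        except RuntimeError:
            continue
    raise RuntimeError("branch not found in any of: %s" % ", ".join(repos))


def fake_clone(repo, dest):
    """Hypothetical stand-in for clone + checkout of a branch."""
    if os.path.exists(dest):
        raise RuntimeError("destination path already exists")
    os.makedirs(dest)
    if repo == "ceph-ci.git":
        raise RuntimeError("branch not found")


workdir = tempfile.mkdtemp()
clonedir = os.path.join(workdir, "clone")
chosen = checkout_with_fallback(["ceph-ci.git", "ceph.git"], clonedir, fake_clone)
print(chosen)
shutil.rmtree(workdir)
```

Without the `rmtree` call, the second attempt would hit the directory left behind by the first and fail for the wrong reason, which is the behavior the commit describes. As the message notes, parallel tasks sharing one clone directory would still race.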

8 years agoqa/tasks/workunit: retry on ceph.git if checkout fails
Sage Weil [Fri, 16 Dec 2016 20:06:16 +0000 (15:06 -0500)]
qa/tasks/workunit: retry on ceph.git if checkout fails

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 72d73b8c8836ae35c518fa09f44805a74038f02a)

8 years agoqa/tasks/workunit.py: add CEPH_BASE env var
Sage Weil [Thu, 15 Dec 2016 18:26:14 +0000 (13:26 -0500)]
qa/tasks/workunit.py: add CEPH_BASE env var

Root of git checkout

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 27b8eac24922f8b4bd065e6e7f0bc8e2ba37b5d5)

8 years agoqa/tasks/workunit: leave workunits inside git checkout
Sage Weil [Thu, 15 Dec 2016 18:25:23 +0000 (13:25 -0500)]
qa/tasks/workunit: leave workunits inside git checkout

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4602884ab8f5a256d13091f7239d938990482d95)

8 years agoMerge pull request #12791 from athanatos/wip-15943-jewel
Loic Dachary [Thu, 12 Jan 2017 06:29:34 +0000 (07:29 +0100)]
Merge pull request #12791 from athanatos/wip-15943-jewel

jewel: crash adding snap to purged_snaps in ReplicatedPG::WaitingOnReplicas (part 2)

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #12868 from athanatos/wip-17899-jewel
Samuel Just [Wed, 11 Jan 2017 00:25:18 +0000 (16:25 -0800)]
Merge pull request #12868 from athanatos/wip-17899-jewel

OSDMonitor: only reject MOSDBoot based on up_from if inst matches

Reviewed-by: Sage Weil <sage@redhat.com>
8 years agoqa/tasks: add test_corrupt_backtrace
John Spray [Thu, 5 Jan 2017 13:40:41 +0000 (13:40 +0000)]
qa/tasks: add test_corrupt_backtrace

Validate that we get EIO and a damage table entry
when seeing a decode error on a backtrace.

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 5f6cdab80f6e2f09af5783c8f616d8ddd6d9f428)

8 years agomds: check for errors decoding backtraces
John Spray [Tue, 20 Dec 2016 18:04:47 +0000 (18:04 +0000)]
mds: check for errors decoding backtraces

Fixes: http://tracker.ceph.com/issues/18311
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 6f489c74ac0040631fde0ceb0926cbab24d3ad55)

8 years agoPG: fix cached_removed_snaps bug in PGPool::update after map gap 12791/head
Samuel Just [Mon, 12 Dec 2016 18:35:38 +0000 (10:35 -0800)]
PG: fix cached_removed_snaps bug in PGPool::update after map gap

5798fb3bf6d726d14a9c5cb99dc5902eba5b878a actually made 15943 worse
by always creating an out-of-date cached_removed_snaps value after
a map gap rather than only in the case where the first map after
the gap did not remove any snapshots.

Introduced: 5798fb3bf6d726d14a9c5cb99dc5902eba5b878a
Fixes: http://tracker.ceph.com/issues/15943
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 5642e7e1b3bb6ffceddacd2f4030eb13a17fcccc)

8 years agoqa/config/rados.yaml: enable osd_debug_verify_cached_snaps
Samuel Just [Wed, 14 Dec 2016 23:48:59 +0000 (15:48 -0800)]
qa/config/rados.yaml: enable osd_debug_verify_cached_snaps

Also, make map gaps more likely.

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit d4b6615a49e4635113f9ba900e9c57147b224357)

8 years agoPG::handle_advance_map: add debugging option to verify cached_removed_snaps
Samuel Just [Mon, 12 Dec 2016 18:33:13 +0000 (10:33 -0800)]
PG::handle_advance_map: add debugging option to verify cached_removed_snaps

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit aeb8fef92469831d94f06db457a4ba15b5b0e3c5)

8 years agoclient: don't use special faked-up inode for /.. 12766/head
Jeff Layton [Tue, 3 Jan 2017 17:56:51 +0000 (12:56 -0500)]
client: don't use special faked-up inode for /..

The CEPH_INO_DOTDOT thing is quite strange. Under most OS (Linux
included), the parent of the root is itself. IOW, at the root, '.' and
'..' refer to the same inode.

Change the ceph client to do the same, as this allows users to get
valid stat info for '..', as well as eliminating some special-casing.

Also in several places, we're checking dn_set.empty as an indicator
of being the root. While that is true for the root, it's also true
for unlinked directories.

This patch treats them the same. An unlinked directory will
be reparented to itself, effectively acting as a root of its own.

Fixes: http://tracker.ceph.com/issues/18408
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 30d4ca01db0de9a1e12658793ba9bf9faf0331dd)
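
The local-filesystem behavior that the commit message refers to can be checked with a short sketch. It assumes a POSIX system and only inspects the local root, not a CephFS mount; `parent_is_self` is a name chosen for the example.

```python
import os


def parent_is_self(path):
    """Return True when 'path/..' resolves to the same inode as 'path'."""
    here = os.stat(path)
    parent = os.stat(os.path.join(path, ".."))
    # Comparing device and inode numbers identifies the same file object.
    return (here.st_dev, here.st_ino) == (parent.st_dev, parent.st_ino)


print(parent_is_self("/"))
```

On Linux, the root directory is its own parent, so the check holds for "/" and fails for an ordinary subdirectory.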

8 years agotests: rbd/test_lock_fence.sh: fix rbdrw.py relative path 12686/head
Nathan Cutler [Mon, 2 Jan 2017 21:49:13 +0000 (22:49 +0100)]
tests: rbd/test_lock_fence.sh: fix rbdrw.py relative path

This commit fixes a regression introduced by
cf294777ea92f0911813a7132068584d4f73a65a

Fixes: http://tracker.ceph.com/issues/18388
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 91231de16dbe4d0e493ec617165a2b38078d122b)

8 years agotests: use ceph-jewel branch for s3tests 12745/head
Orit Wasserman [Mon, 4 Jan 2016 09:03:08 +0000 (10:03 +0100)]
tests: use ceph-jewel branch for s3tests

Signed-off-by: Nathan Cutler <ncutler@suse.com>
8 years agoqa/tasks/workunit: clear clone dir before retrying checkout
Sage Weil [Thu, 22 Dec 2016 18:05:22 +0000 (13:05 -0500)]
qa/tasks/workunit: clear clone dir before retrying checkout

If we checkout ceph-ci.git, and don't find a branch,
we'll try again from ceph.git. But the checkout will
already exist and the clone will fail, so we'll still
fail to find the branch.

The same can happen if a previous workunit task already
checked out the repo.

Fix by removing the repo before checkout (the first and
second times).  Note that this may break if there are
multiple workunit tasks running in parallel on the same
role.  That is already racy, so if it's happening, we'll
want to switch to using a truly unique clonedir for each
instantiation.

Fixes: http://tracker.ceph.com/issues/18336
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2a7013cd5a033c5be43350505d75f088e831e201)

8 years agoqa/tasks/workunit: retry on ceph.git if checkout fails
Sage Weil [Fri, 16 Dec 2016 20:06:16 +0000 (15:06 -0500)]
qa/tasks/workunit: retry on ceph.git if checkout fails

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 72d73b8c8836ae35c518fa09f44805a74038f02a)

8 years agoqa/workunits: include extension for nose tests
Sage Weil [Mon, 19 Dec 2016 19:08:11 +0000 (14:08 -0500)]
qa/workunits: include extension for nose tests

When you have a relative path you have to include the extension.
Weird.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5666fd61d6dbd40be1d79354227cabd562e829ea)
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Conflicts:
qa/workunits/rados/test_python.sh (nosetests instead of nose)

8 years agoqa/workunits: use relative path instead of wget from git
Sage Weil [Thu, 15 Dec 2016 20:10:28 +0000 (15:10 -0500)]
qa/workunits: use relative path instead of wget from git

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit cf294777ea92f0911813a7132068584d4f73a65a)

Conflicts:
qa/workunits/rados/test_python.sh (nosetests instead of nose)

8 years agoqa/tasks/workunit.py: add CEPH_BASE env var
Sage Weil [Thu, 15 Dec 2016 18:26:14 +0000 (13:26 -0500)]
qa/tasks/workunit.py: add CEPH_BASE env var

Root of git checkout

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 27b8eac24922f8b4bd065e6e7f0bc8e2ba37b5d5)

8 years agoqa/tasks/workunit: leave workunits inside git checkout
Sage Weil [Thu, 15 Dec 2016 18:25:23 +0000 (13:25 -0500)]
qa/tasks/workunit: leave workunits inside git checkout

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4602884ab8f5a256d13091f7239d938990482d95)

8 years agoMerge remote-tracking branch 'ceph/jewel-next' into jewel
Loic Dachary [Wed, 21 Dec 2016 23:18:11 +0000 (00:18 +0100)]
Merge remote-tracking branch 'ceph/jewel-next' into jewel

8 years agoMerge pull request #12591 from jtlayton/wip-18308-jewel
jtlayton [Wed, 21 Dec 2016 14:18:18 +0000 (09:18 -0500)]
Merge pull request #12591 from jtlayton/wip-18308-jewel

Clear setuid bits on ownership changes

8 years agoMerge branch 'jewel' into wip-18308-jewel 12591/head
jtlayton [Tue, 20 Dec 2016 20:36:39 +0000 (15:36 -0500)]
Merge branch 'jewel' into wip-18308-jewel

8 years agoMerge pull request #12592 from jtlayton/wip-18307-jewel
jtlayton [Tue, 20 Dec 2016 20:35:54 +0000 (15:35 -0500)]
Merge pull request #12592 from jtlayton/wip-18307-jewel

Fix mount root for ceph_mount users and change tarball format

8 years agoceph_disk: fix a jewel checkin test break 12592/head
Jeff Layton [Tue, 20 Dec 2016 19:44:04 +0000 (14:44 -0500)]
ceph_disk: fix a jewel checkin test break

Silly python:

    ceph_disk/main.py:173:1: E305 expected 2 blank lines after class or function definition, found 1
    ceph_disk/main.py:5011:1: E305 expected 2 blank lines after class or function definition, found 1

Signed-off-by: Jeff Layton <jlayton@redhat.com>
8 years agoautomake: convert to tar-pax
Jeff Layton [Tue, 20 Dec 2016 16:54:25 +0000 (11:54 -0500)]
automake: convert to tar-pax

We hit some recent build issues with the merge of ceph-qa-suite into
the main repo. The ustar format barfs on >100 character symlink
paths.

Convert to using "tar-pax" which should make it use the posix format.
Any build machine that we're reasonably targeting should support it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
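
The ustar limit mentioned above can be reproduced with Python's standard `tarfile` module. This is only an illustration of the format difference, not part of the automake change; `can_store_symlink` is a hypothetical helper.

```python
import io
import tarfile


def can_store_symlink(link_target, fmt):
    """Try to archive a symlink with the given target in the given tar format."""
    info = tarfile.TarInfo(name="link")
    info.type = tarfile.SYMTYPE
    info.linkname = link_target
    buf = io.BytesIO()
    try:
        with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
            tar.addfile(info)
    except ValueError:
        # tarfile rejects link targets that do not fit the ustar header field.
        return False
    return True


long_target = "a/" * 75  # 150 characters, over the 100-character field
print(can_store_symlink(long_target, tarfile.USTAR_FORMAT))
print(can_store_symlink(long_target, tarfile.PAX_FORMAT))
```

The pax format stores the long target in an extended header record, which is why switching formats avoids the failure.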
8 years agoclient: drop setuid/setgid bits on ownership change
Jeff Layton [Tue, 20 Dec 2016 13:17:21 +0000 (08:17 -0500)]
client: drop setuid/setgid bits on ownership change

When we hold exclusive auth caps, then the client is responsible for
handling changes to the mode. Make sure we remove any setuid/setgid
bits on an ownership change.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 18d2499d6c85a10b4b54f3b8c335cddf86c4588f)

8 years agomds: clear setuid/setgid bits on ownership changes
Jeff Layton [Tue, 20 Dec 2016 13:16:43 +0000 (08:16 -0500)]
mds: clear setuid/setgid bits on ownership changes

If we get an ownership change, POSIX mandates that you clear the
setuid and setgid bits unless you are "appropriately privileged", in
which case the OS is allowed to leave them intact.

Linux however always clears those bits, regardless of the process
privileges, as that makes it simpler to close some potential races.
Have ceph do the same.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 6da72500882d9749cb2be6eaa2568e6fe6e5ff4d)
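
As a rough illustration of the mode change described in this commit message, the bit manipulation looks like the sketch below. It follows the message's description (always clear both bits) and does not model any kernel special cases or the actual MDS code; `mode_after_chown` is a hypothetical name.

```python
import stat


def mode_after_chown(mode):
    """Return the mode with the setuid and setgid bits cleared."""
    return mode & ~(stat.S_ISUID | stat.S_ISGID)


print(oct(mode_after_chown(0o6755)))
```

All other bits, including the sticky bit and the ordinary permission bits, pass through unchanged.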

8 years agoclient: set metadata["root"] from mount method when it's called with a pathname
Jeff Layton [Tue, 20 Dec 2016 13:07:23 +0000 (08:07 -0500)]
client: set metadata["root"] from mount method when it's called with a pathname

Currently, we only set the root properly from the config file or the
--client_metadata command line option. If a userland client program
tries to call ceph_mount with a pathname, it's not being properly
set.

Since we already hold the mutex, we can just update it directly.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 9f8810008c82eebe6e354e7e321e33a3dcba8407)

8 years agoMerge pull request #12454 from liewegas/qa-suite-jewel
Sage Weil [Wed, 14 Dec 2016 17:39:56 +0000 (11:39 -0600)]
Merge pull request #12454 from liewegas/qa-suite-jewel

jewel: merge ceph-qa-suite

8 years agomerge ceph-qa-suite 12454/head
Sage Weil [Wed, 14 Dec 2016 17:29:59 +0000 (11:29 -0600)]
merge ceph-qa-suite

8 years agomove ceph-qa-suite dirs into qa/
Sage Weil [Wed, 14 Dec 2016 17:29:55 +0000 (11:29 -0600)]
move ceph-qa-suite dirs into qa/

8 years agoRevert "tasks/workunit.py: depth 1 clone"
Sage Weil [Wed, 14 Dec 2016 17:27:58 +0000 (12:27 -0500)]
Revert "tasks/workunit.py: depth 1 clone"

This reverts commit e6f61ea9f19d0f1fad4a6547775fa80616eeeb89.

8 years agotasks/workunit.py: depth 1 clone
Sage Weil [Wed, 14 Dec 2016 17:19:44 +0000 (12:19 -0500)]
tasks/workunit.py: depth 1 clone

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4faf77a649cb3f8ddf497ca81937b3dbf63a18dc)

8 years agotasks/workunit: remove kludge to use git.ceph.com
Sage Weil [Wed, 14 Dec 2016 17:18:29 +0000 (12:18 -0500)]
tasks/workunit: remove kludge to use git.ceph.com

This was hard-coded to ceph.git (almost) and breaks when
you specify --ceph-repo.  Remove it entirely.  We'll see if
github.com is better at handling our load than it used to
be!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 159c455a0326eef2c017b3e3cf510f918b5ec76c)

8 years agotasks/ceph: restore context of osd mount path before mkfs
Kefu Chai [Fri, 9 Dec 2016 18:36:52 +0000 (02:36 +0800)]
tasks/ceph: restore context of osd mount path before mkfs

All newly created files and directories under the mount dir inherit the
SELinux type of their parent directory, so we need to set it before
mkfs.

Fixes: http://tracker.ceph.com/issues/16800
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 53225d5272a1d35d4183fcfa55a139f55f77e122)

8 years ago10.2.5 v10.2.5
Jenkins Build Slave User [Fri, 9 Dec 2016 20:08:24 +0000 (20:08 +0000)]
10.2.5

8 years agoMerge pull request #11865 from dachary/wip-17710-jewel
Yehuda Sadeh [Thu, 8 Dec 2016 19:22:16 +0000 (11:22 -0800)]
Merge pull request #11865 from dachary/wip-17710-jewel

jewel: multisite: race between ReadSyncStatus and InitSyncStatus leads to EIO errors

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
8 years agoMerge pull request #12376 from liewegas/wip-msgr-eagain-loop-jewel
Samuel Just [Thu, 8 Dec 2016 15:55:27 +0000 (07:55 -0800)]
Merge pull request #12376 from liewegas/wip-msgr-eagain-loop-jewel

msg/simple/Pipe: avoid returning 0 on poll timeout

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
8 years agomsg/simple/Pipe: avoid returning 0 on poll timeout 12376/head
Sage Weil [Thu, 8 Dec 2016 00:25:55 +0000 (18:25 -0600)]
msg/simple/Pipe: avoid returning 0 on poll timeout

If poll times out it will return 0 (no data to read on socket).  In
165e5abdbf6311974d4001e43982b83d06f9e0cc we changed tcp_read_wait from
returning -1 to returning -errno, which means we return 0 instead of -1
in this case.

This makes tcp_read() get into an infinite loop by repeatedly trying to
read from the socket and getting EAGAIN.

Fix by explicitly checking for a 0 return from poll(2) and returning
EAGAIN in that case.

Fixes: http://tracker.ceph.com/issues/18184
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6c3d015c6854a12cda40673848813d968ff6afae)
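
The fix described above, treating a poll timeout as an explicit EAGAIN rather than a 0 return, can be sketched in Python. This is not the C++ code from msg/simple/Pipe: `read_wait` is a hypothetical function, the -EIO return for other poll events is an assumption of the example, and `select.poll` requires a POSIX platform.

```python
import errno
import select
import socket


def read_wait(sock, timeout_ms):
    """Return 0 when sock is readable, or a negative errno value otherwise."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    events = poller.poll(timeout_ms)
    if not events:
        # Timeout: no data on the socket. Report it explicitly so the caller
        # does not mistake it for "ready" and spin on a read that cannot succeed.
        return -errno.EAGAIN
    _fd, mask = events[0]
    if mask & select.POLLIN:
        return 0
    return -errno.EIO  # assumption: treat any other event as an error


left, right = socket.socketpair()
timed_out = read_wait(left, 10)   # nothing has been sent yet
right.send(b"x")
ready = read_wait(left, 1000)     # one byte is now pending
left.close()
right.close()
```

The key point is that an empty result from poll is handled as its own case instead of falling through to the success path.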

8 years agoMerge pull request #12033 from dachary/wip-17926-jewel
Loic Dachary [Tue, 6 Dec 2016 14:54:45 +0000 (15:54 +0100)]
Merge pull request #12033 from dachary/wip-17926-jewel

jewel: ceph-disk --dmcrypt create must not require admin key

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #11968 from ddiss/jewel_next_flush_evict_snaps
Loic Dachary [Tue, 6 Dec 2016 08:47:07 +0000 (09:47 +0100)]
Merge pull request #11968 from ddiss/jewel_next_flush_evict_snaps

jewel: tools: snapshotted RBD extent objects can't be manually evicted from a cache tier

Reviewed-by: Kefu Chai <kchai@redhat.com>
8 years agoMerge pull request #12151 from dachary/wip-18011-jewel
Loic Dachary [Tue, 6 Dec 2016 08:46:01 +0000 (09:46 +0100)]
Merge pull request #12151 from dachary/wip-18011-jewel

jewel: test fails due to The UNIX domain socket path

Reviewed-by: Kefu Chai <kchai@redhat.com>
8 years agoMerge pull request #12296 from SUSE/wip-18133-jewel
Loic Dachary [Tue, 6 Dec 2016 08:45:26 +0000 (09:45 +0100)]
Merge pull request #12296 from SUSE/wip-18133-jewel

jewel: build/ops: fix undefined crypto references with --with-xio

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years ago10.2.4 v10.2.4
Jenkins Build Slave User [Mon, 5 Dec 2016 22:15:20 +0000 (22:15 +0000)]
10.2.4

8 years agoMerge pull request #11997 from Abhishekvrshny/wip-17876-jewel
Sage Weil [Mon, 5 Dec 2016 19:01:55 +0000 (14:01 -0500)]
Merge pull request #11997 from Abhishekvrshny/wip-17876-jewel

jewel: osd: update_log_missing does not order correctly with osd_ops

Reviewed-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #11944 from SUSE/wip-17866-jewel
Sage Weil [Mon, 5 Dec 2016 19:01:03 +0000 (14:01 -0500)]
Merge pull request #11944 from SUSE/wip-17866-jewel

jewel: osd: Add config option to disable new scrubs during recovery

Reviewed-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #11672 from linuxbox2/jewel-17663
Yehuda Sadeh [Mon, 5 Dec 2016 18:09:30 +0000 (10:09 -0800)]
Merge pull request #11672 from linuxbox2/jewel-17663

jewel: rgw_rest_s3:  apply missed base64 try-catch

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
8 years agoMerge pull request #11953 from SUSE/wip-17885-jewel
Loic Dachary [Mon, 5 Dec 2016 17:47:28 +0000 (18:47 +0100)]
Merge pull request #11953 from SUSE/wip-17885-jewel

jewel: test: temporarily disable fork()'ing tests

Reviewed-by: John Spray <john.spray@redhat.com>
8 years agoMerge pull request #11884 from SUSE/wip-17754-jewel
Loic Dachary [Mon, 5 Dec 2016 16:39:50 +0000 (17:39 +0100)]
Merge pull request #11884 from SUSE/wip-17754-jewel

jewel: ceph-create-keys loops forever

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #11529 from SUSE/wip-17600-jewel
Loic Dachary [Mon, 5 Dec 2016 16:38:50 +0000 (17:38 +0100)]
Merge pull request #11529 from SUSE/wip-17600-jewel

jewel: common: Improve linux dcache hash algorithm

Reviewed-by: Kefu Chai <kchai@redhat.com>
8 years agoceph_volume_client: set an existing auth ID's default mon caps 12325/head
Ramana Raja [Fri, 11 Nov 2016 13:12:40 +0000 (18:42 +0530)]
ceph_volume_client: set an existing auth ID's default mon caps

... as 'allow r' (the minimum mon caps required to access a share)
when:

* authorizing the auth ID to access a volume.

* deauthorizing the auth ID to access a volume, but the auth ID is
  authorized to access other volumes.

In both the above cases, the ceph_volume_client previously tried to
set the mon caps of the auth ID to an invalid value, None.

Fixes: http://tracker.ceph.com/issues/17800
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit b0fa3a403373e4312fd805ab7653f055f4933eae)

8 years agoMerge pull request #12167 from liewegas/wip-osdmap-encoding-jewel
Loic Dachary [Mon, 5 Dec 2016 13:50:23 +0000 (14:50 +0100)]
Merge pull request #12167 from liewegas/wip-osdmap-encoding-jewel

jewel: osd: condition OSDMap encoding on features

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoceph-disk: enable --runtime ceph-osd systemd units 12147/head
Loic Dachary [Wed, 30 Nov 2016 23:28:32 +0000 (00:28 +0100)]
ceph-disk: enable --runtime ceph-osd systemd units

If ceph-osd@.service is enabled for a given device (say /dev/sdb1 for
osd.3) the ceph-osd@3.service will race with ceph-disk@dev-sdb1.service
at boot time.

Enabling ceph-osd@3.service is not necessary at boot time because

   ceph-disk@dev-sdb1.service

calls

   ceph-disk activate /dev/sdb1

which calls

   systemctl start ceph-osd@3

The systemctl enable/disable ceph-osd@.service called by ceph-disk
activate is changed to add the --runtime option so that ceph-osd units
are lost after a reboot. They are recreated when ceph-disk activate is
called at boot time so that:

   systemctl stop ceph

knows which ceph-osd@.service to stop when a script or sysadmin wants
to stop all ceph services.

Before enabling ceph-osd@.service (which happens at every boot),
make sure the permanent enablement in /etc/systemd is removed so that
only the one added by systemctl enable --runtime in /run/systemd
remains. This lets an existing cluster be upgraded without making
matters worse than before, which would happen if ceph-disk@.service
raced against two ceph-osd@.service units (one in /etc/systemd and one
in /run/systemd).

Fixes: http://tracker.ceph.com/issues/17889
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 539385b143feee3905dceaf7a8faaced42f2d3c6)
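
The enable sequence described above can be sketched as a pair of systemctl invocations. This is an illustration under assumptions, not the ceph-disk implementation: the function name `runtime_enable_commands` is hypothetical, and the sketch only builds the argument lists rather than running them.

```python
def runtime_enable_commands(osd_id):
    """Build the systemctl calls for enabling an OSD unit at runtime only.

    Hypothetical helper: the first command drops any permanent enablement
    (under /etc/systemd), the second re-enables the unit with --runtime so
    the enablement lives under /run/systemd and is gone after a reboot.
    """
    unit = "ceph-osd@{}".format(osd_id)
    return [
        ["systemctl", "disable", unit],
        ["systemctl", "enable", "--runtime", unit],
    ]
```

Each list could be passed to `subprocess.check_call` by a caller running as root; keeping command construction separate from execution makes the ordering easy to inspect.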

8 years ago build/ops: restart ceph-osd@.service after 20s instead of 100ms
Loic Dachary [Wed, 30 Nov 2016 16:33:54 +0000 (17:33 +0100)]
build/ops: restart ceph-osd@.service after 20s instead of 100ms

Rather than the default 100ms pause before trying to restart an OSD, wait
20 seconds, and retry 30 times instead of 3. There is no scenario in which
restarting an OSD almost immediately after it failed would get a better
result.

It is possible that a failure to start is due to a race with another
systemd unit at boot time. For instance if ceph-disk@.service is
delayed, it may start after the OSD that needs it. A long pause may give
the racing service enough time to complete and the next attempt to start
the OSD may succeed.

This is not a sound way to resolve a race; it only makes the OSD boot
process less sensitive to it. In the example above, the proper fix is to
enable --runtime ceph-osd@.service so that it cannot race at boot time.

The wait delay should not be on the order of minutes, so that the current
runtime behavior is preserved. For instance, if an OSD is killed or fails
and restarts only after 10 minutes, it will be marked down by the ceph
cluster. Such a change would not break things, but it is significant and
should be avoided.

Refs: http://tracker.ceph.com/issues/17889

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit b3887379d6dde3b5a44f2e84cf917f4f0a0cb120)
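
The retry policy in this commit is applied by systemd itself; the following Python sketch only models it to show the effect of the two numbers. The function name `start_with_retries` and the injectable `sleep` parameter are assumptions made for illustration, and "30 retries" is interpreted here as 30 attempts in total.

```python
import time


def start_with_retries(start, delay=20.0, attempts=30, sleep=time.sleep):
    """Call start() until it returns True or the attempts run out.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed. A pause of `delay` seconds separates attempts, giving
    a racing service time to finish before the next try.
    """
    for attempt in range(1, attempts + 1):
        if start():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

With a 20 second pause and 30 attempts, retrying spans roughly ten minutes, whereas 3 attempts spaced 100ms apart are exhausted in well under a second.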

8 years ago ceph-disk: trigger must ensure device ownership
Loic Dachary [Tue, 22 Nov 2016 14:26:18 +0000 (15:26 +0100)]
ceph-disk: trigger must ensure device ownership

The udev rules that set the owner/group of the OSD devices are racing
with 50-udev-default.rules and, depending on which udev event fires last,
ownership may not be as expected.

Since ceph-disk trigger --sync runs as root and always happens after
dm/lvm/filesystem units are complete and before activation, it is a good
time to set the ownership of the device.

It does not eliminate all races: a script running after systemd
local-fs.target and firing a udev event may create a situation where the
permissions of the device are temporarily reverted while the activation
is running.

Fixes: http://tracker.ceph.com/issues/17813
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 72f0b2aa1eb4b7b2a2222c2847d26f99400a8374)
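
A minimal sketch of "set the ownership if it is not already as expected" is shown below. It is not the ceph-disk code: the helper name `ensure_ownership` is hypothetical, the caller is assumed to have already resolved the numeric uid/gid of the intended owner, and the `chown` parameter exists only so the logic can be exercised without root.

```python
import os


def ensure_ownership(path, uid, gid, chown=os.chown):
    """Make sure `path` is owned by uid:gid.

    Returns True if ownership had to be changed, False if it was already
    correct. Changing ownership normally requires running as root.
    """
    st = os.stat(path)
    if st.st_uid == uid and st.st_gid == gid:
        return False
    chown(path, uid, gid)
    return True
```

As the commit message notes, a check like this narrows the race but cannot close it: a later udev event may still revert the ownership.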

8 years ago ceph-disk: systemd unit must run after local-fs.target
Loic Dachary [Tue, 22 Nov 2016 13:45:45 +0000 (14:45 +0100)]
ceph-disk: systemd unit must run after local-fs.target

A ceph udev action may be triggered before the local file systems are
mounted because there is no ordering in udev. The ceph udev action
delegates asynchronously to systemd via ceph-disk@.service which will
fail if (for instance) the LVM partition required to mount /var/lib/ceph
is not available yet. The systemd unit will retry a few times but will
eventually fail permanently. The sysadmin can run systemctl reset-failed
at a later time and it will succeed.

Add a dependency to ceph-disk@.service so that it waits until the local
file systems are mounted:

After=local-fs.target

Since local-fs.target depends on lvm, it will wait until the lvm
partition (as well as any dm devices) is ready and mounted before
attempting to activate the OSD. It may still fail because the
corresponding journal/data partition is not ready yet (which is
expected) but it will no longer fail because the lvm/filesystems/dm are
not ready.

Fixes: http://tracker.ceph.com/issues/17889
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit d954de5546ea34a07c1e4234b07c1cef6ab74463)

8 years ago Merge pull request #1290 from SUSE/wip-18014-jewel
Nathan Cutler [Sun, 4 Dec 2016 10:46:45 +0000 (11:46 +0100)]
Merge pull request #1290 from SUSE/wip-18014-jewel

thrashosds: try ceph-objectstore-tool for 10 minutes

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years ago thrashosds: try ceph-objectstore-tool for 10 minutes
Nathan Cutler [Thu, 24 Nov 2016 10:25:35 +0000 (11:25 +0100)]
thrashosds: try ceph-objectstore-tool for 10 minutes

If ceph-objectstore-tool binary is not present, it's likely because we're in
the middle of an upgrade. Do not try to run the binary until we verify that
it's really present. If it is absent, spend up to 10 minutes waiting for it to
appear.

Before this patch there was quite a large window for a race to occur. This
patch doesn't entirely eliminate it, but drastically reduces it.

Fixes: http://tracker.ceph.com/issues/18014
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 862b47faac1fc9f05ee3322ee4b65cf3d3d666c5)
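
The wait described in this commit amounts to polling for a file with a deadline. The sketch below shows one way to write such a loop; it is not the thrashosds code. The function name `wait_for_binary`, the 10 second polling interval, and the injectable `exists`, `sleep` and `clock` parameters are assumptions; only the 10 minute limit comes from the commit message.

```python
import os
import time


def wait_for_binary(path, timeout=600.0, interval=10.0,
                    exists=os.path.exists, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True as soon as the binary is present, False if it is still
    missing once the deadline has passed.
    """
    deadline = clock() + timeout
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would run the tool only when this returns True. The binary can still disappear between the check and the invocation, which is why the race is reduced rather than eliminated.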

8 years ago build/ops: fix undefined crypto references with --with-xio 12296/head
Nathan Cutler [Sat, 3 Dec 2016 12:29:56 +0000 (13:29 +0100)]
build/ops: fix undefined crypto references with --with-xio

When built with --with-xio (and only then), the RPM build fails due to
undefined references to various symbols starting with "PK11_" in
./.libs/libcommon.a(Crypto.o) in several of the unit tests.

Fixes: http://tracker.ceph.com/issues/18133
Signed-off-by: Nathan Cutler <ncutler@suse.com>
8 years ago Merge pull request #12067 from SUSE/wip-17953-jewel
Loic Dachary [Sat, 3 Dec 2016 09:57:18 +0000 (10:57 +0100)]
Merge pull request #12067 from SUSE/wip-17953-jewel

jewel: mon: OSDMonitor: only reject MOSDBoot based on up_from if inst matches

Reviewed-by: Samuel Just <sjust@redhat.com>
8 years ago Merge pull request #1297 from ceph/wip-14.04
Zack Cerza [Fri, 2 Dec 2016 20:25:32 +0000 (13:25 -0700)]
Merge pull request #1297 from ceph/wip-14.04

suites/rados: s/trusty/"14.04"/

8 years ago OSDMonitor: only reject MOSDBoot based on up_from if inst matches 12067/head
Samuel Just [Mon, 14 Nov 2016 19:50:23 +0000 (11:50 -0800)]
OSDMonitor: only reject MOSDBoot based on up_from if inst matches

If the osd actually restarts, there is no guarantee that the epoch will
advance past up_from.  If the inst is different, it can't really be a
dup.  At worst, it might be a queued MOSDBoot from a previous inst, but
in that case, the real inst would see itself marked up, and then back
down causing it to try booting again.

Fixes: http://tracker.ceph.com/issues/17899
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 033ad5b46c0492134e72a8372e44e3ef1358d2df)
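
The rule in the subject line can be expressed as a small predicate. The real check lives in the C++ OSDMonitor; this Python sketch is only a model of it. The names `OsdInfo` and `is_stale_boot`, the use of a plain string for the inst, and the strict `>` comparison are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record of what the monitor knows about an OSD.
OsdInfo = namedtuple("OsdInfo", ["up_from", "inst"])


def is_stale_boot(recorded, boot_epoch, boot_inst):
    """Decide whether a boot message should be rejected as a duplicate.

    The message is rejected only if it predates the recorded up_from
    AND comes from the same inst. A different inst cannot be a
    duplicate, even when the epoch has not advanced past up_from.
    """
    return recorded.up_from > boot_epoch and recorded.inst == boot_inst
```

Under this model, a restarted OSD that presents a new inst is never rejected on the basis of up_from alone.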

8 years ago Merge pull request #12207 from jdurgin/wip-librados-setxattr-overload-jewel
Josh Durgin [Fri, 2 Dec 2016 16:16:27 +0000 (08:16 -0800)]
Merge pull request #12207 from jdurgin/wip-librados-setxattr-overload-jewel

librados: remove new setxattr overload to avoid breaking the C++ ABI

Reviewed-by: Sage Weil <sage@redhat.com>