git.apps.os.sepia.ceph.com Git - ceph.git/log
Alfredo Deza [Thu, 18 Oct 2018 14:39:55 +0000 (11:39 -0300)]
rpm: require e2fsprogs, needed for ceph-disk when using dmcrypt
Signed-off-by: Alfredo Deza <adeza@redhat.com>
Guillaume Abrioux [Wed, 11 Apr 2018 10:05:33 +0000 (12:05 +0200)]
specs: require of e2fsprogs
In ceph/ceph-container we realized that `e2fsprogs` isn't installed in the
CentOS container image because ceph has no dependency on it.
As a consequence, deploying a containerized cluster with dmcrypt fails
when using the CentOS image.
Typical error encountered:
```
......
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda5 uuid path is /sys/dev/block/8:5/dm/uuid
populate: Creating lockbox fs on %s: mkfs -t ext4 /dev/sda5
command_check_call: Running command: /usr/sbin/mkfs -t ext4 /dev/sda5
mkfs.ext4: No such file or directory
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
......
```
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit
a99177834120e7a2c4592054f6a8b8736e0ffb92 )
Andrew Schoen [Tue, 16 Oct 2018 19:40:41 +0000 (14:40 -0500)]
Merge pull request #24589 from ceph/backport-luminous-24404
luminous: ceph-volume: make `lvm batch` idempotent
Reviewed-by: Andrew Schoen <aschoen@redhat.com>
Andrew Schoen [Tue, 16 Oct 2018 18:13:59 +0000 (13:13 -0500)]
Merge pull request #24451 from alfredodeza/luminous-wip-rm24795
luminous: ceph-volume lvm.prepare update help to indicate partitions are needed, not devices
Reviewed-by: Andrew Schoen <aschoen@redhat.com>
Andrew Schoen [Wed, 10 Oct 2018 19:28:29 +0000 (15:28 -0400)]
ceph-volume: extracts batch.filter_devices from Batch._get_strategy
This allows us to easily provide tests for that method.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
df7ef5383bb476ee020b898cd2e8fbce044fc07e )
Andrew Schoen [Wed, 10 Oct 2018 18:05:25 +0000 (14:05 -0400)]
ceph-volume: failing to get block db size from conf logs an exception
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
172d4af9b97d8d64ce35dc95efb72d9f190bc170 )
Andrew Schoen [Tue, 9 Oct 2018 18:05:54 +0000 (14:05 -0400)]
ceph-volume: when all devices are filtered exit gracefully
Even if all devices are filtered we want to return a 0 exit code and
make sure the json reporting still works.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
aa4fcd602f8b1a1b17bc12e59334b19508f97d6d )
Andrew Schoen [Mon, 8 Oct 2018 13:57:07 +0000 (09:57 -0400)]
ceph-volume: filter devices used by journals/block.db
If, after filtering the data/block devices, there is only
one device left, it cannot be used if it is an SSD and
has previously been used as a journal or block.db.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
fc9a10e54813a402ecd1d823ea5a33f85e8eb963 )
Andrew Schoen [Mon, 8 Oct 2018 13:39:05 +0000 (09:39 -0400)]
ceph-volume: add rotational property to Device class
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
7f6bfaaf0cf30a56481d6d67f2fa2a7e785cc344 )
Andrew Schoen [Fri, 5 Oct 2018 21:18:48 +0000 (16:18 -0500)]
ceph-volume: add info about filtered devices to batch pretty reports
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
62426db799c54436c50b5751c8e72b3dbd710a2a )
Andrew Schoen [Fri, 5 Oct 2018 15:45:35 +0000 (10:45 -0500)]
ceph-volume: remove the used_by_ceph key in the json output
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
3b6d82afe7ee83283b25cefda00c7a59cf5144af )
Andrew Schoen [Fri, 5 Oct 2018 15:39:54 +0000 (10:39 -0500)]
ceph-volume: fix idempotency checks for lvm batch tests
The mixed type tests will change strategy after the idempotency test so
we need to handle that in test playbook.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
1143482061da50910f4f883aafdbe0a0e4269f39 )
Andrew Schoen [Fri, 5 Oct 2018 15:38:11 +0000 (10:38 -0500)]
ceph-volume: fix bluestore strategy json reporting and type
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
b3cf90604d31709996538e612b1767d6fd4da8b5 )
Andrew Schoen [Thu, 4 Oct 2018 17:47:48 +0000 (12:47 -0500)]
ceph-volume: ignore failure to load ceph configuration for block.db size
If we fail to load a ceph configuration file when trying to get the
block.db size then just use defaults instead of throwing an error.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
a7ee36ca92b6592b6b3e218252c6a4c30416591b )
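The fallback described in the commit above can be sketched roughly as follows. This is an illustration only, not ceph-volume's code: `block_db_size`, `DEFAULT_BLOCK_DB_SIZE` and the `[osd]` section lookup are assumptions made for the example.

```python
import configparser

DEFAULT_BLOCK_DB_SIZE = 0   # hypothetical sentinel meaning "let the tool decide"


def block_db_size(conf_path):
    """Read bluestore_block_db_size from conf_path, falling back to the default."""
    parser = configparser.ConfigParser()
    try:
        if not parser.read(conf_path):      # read() returns the files it parsed
            return DEFAULT_BLOCK_DB_SIZE
        return parser.getint("osd", "bluestore_block_db_size")
    except (configparser.Error, ValueError):
        # Missing section/option or an unparsable file: use defaults, do not raise.
        return DEFAULT_BLOCK_DB_SIZE
```

A missing or broken configuration file therefore yields the default instead of an exception.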
Andrew Schoen [Thu, 4 Oct 2018 16:54:06 +0000 (11:54 -0500)]
ceph-volume: fix strategy comparison in 'lvm batch'
This also fixes some small json reporting issues with the
filestore MixedType strategy
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
1dd15025bfd52af9c31cd281f92d743b9ca0eeb8 )
Andrew Schoen [Wed, 3 Oct 2018 20:01:08 +0000 (15:01 -0500)]
ceph-volume: raise a non zero exit code if strategy changes with batch
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
9dfc00f8e10e10fbdc52c44259596c6a96a90edd )
Andrew Schoen [Wed, 3 Oct 2018 17:13:27 +0000 (12:13 -0500)]
ceph-volume: add functional tests to ensure lvm batch is idempotent
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
9752d03cc16664cddd9ac7741284a2fb5b31f0e7 )
Andrew Schoen [Wed, 3 Oct 2018 15:19:45 +0000 (10:19 -0500)]
ceph-volume: add tests for util.device.Device.used_by_ceph
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
a28e6531e47d60d9eb7f62b67f122578daa2a683 )
Andrew Schoen [Tue, 2 Oct 2018 20:23:39 +0000 (15:23 -0500)]
ceph-volume: update tests to account for filtered_devices in batch
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
36396229d3b8d20862565b35d431ff22fa92cd1c )
Andrew Schoen [Tue, 2 Oct 2018 20:08:10 +0000 (15:08 -0500)]
ceph-volume: consider block and data devices used_by_ceph
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
67512530116c16d072c91867a259aeb429d32ff6 )
Andrew Schoen [Tue, 2 Oct 2018 14:48:27 +0000 (09:48 -0500)]
ceph-volume: add filtered_devices and used_by_ceph to all batch reports
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
0718d2e2663c8e807dbb5171a58759c76ed92c08 )
Andrew Schoen [Thu, 27 Sep 2018 20:22:17 +0000 (15:22 -0500)]
ceph-volume: pick strategy for batch with only the unused devices
This will pick a strategy, filter out any devices already used by
ceph, and then pick a strategy again. If the strategy has changed, the
call should error; if the strategy is the same, proceed. If there are no
unused devices then the command is a noop.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
4529f2d6053b0b07583a3d7501f4e05e08cad385 )
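A minimal sketch of the pick, filter, pick-again flow described in the commit above. The `Device` dataclass, `pick_strategy` and `plan_batch` are hypothetical stand-ins, and the rule "mixed rotational and non-rotational devices means MixedType" is an assumption chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    path: str
    rotational: bool
    used_by_ceph: bool = False


def pick_strategy(devices):
    """Hypothetical chooser: mixed media -> 'MixedType', otherwise 'SingleType'."""
    kinds = {d.rotational for d in devices}
    return "MixedType" if len(kinds) > 1 else "SingleType"


def plan_batch(devices):
    """Return (strategy, unused_devices); strategy is None when nothing is left to do."""
    initial = pick_strategy(devices)
    unused = [d for d in devices if not d.used_by_ceph]
    if not unused:
        return None, []                      # no unused devices: a no-op
    final = pick_strategy(unused)
    if final != initial:
        raise RuntimeError(f"strategy changed from {initial} to {final}")
    return final, unused
```

Running the same plan twice on an already deployed host hits the no-op branch, which is what makes the call idempotent in this sketch.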
Andrew Schoen [Thu, 27 Sep 2018 13:55:20 +0000 (08:55 -0500)]
ceph-volume: adds a 'changed' key to lvm batch --report
This will indicate if the command would result in any OSDs being created
or not. Other tooling can use that key for idempotency checks.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
10f1d577d4c4e66c77046fc3b274d6653af99586 )
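External tooling could consume the 'changed' key along these lines. Only the key name comes from the commit above; the rest of the report shape and the helper name `would_create_osds` are assumptions.

```python
import json


def would_create_osds(report_text):
    """Return True when the JSON report says the run would create OSDs."""
    report = json.loads(report_text)
    # Treat a missing key as "changed" so callers fail safe; this default is an assumption.
    return bool(report.get("changed", True))
```

A caller would skip the real deployment step whenever this returns False.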
Andrew Schoen [Wed, 26 Sep 2018 21:07:30 +0000 (16:07 -0500)]
ceph-volume: adds used_by_ceph to filestore singletype batch report
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
48d10c9ccf13ba7f42750b37d26aeb28f6b1c606 )
Andrew Schoen [Wed, 26 Sep 2018 21:01:30 +0000 (16:01 -0500)]
ceph-volume: adds a used_by_ceph property to the Device class
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
9d49a3708e34bccd183d5c32d022c36d8b118b42 )
Andrew Schoen [Wed, 26 Sep 2018 20:53:26 +0000 (15:53 -0500)]
ceph-volume: adds a lvs property to the Device class
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
3dae3247adc1b96ca688ec81cec03180d5943823 )
Andrew Schoen [Wed, 26 Sep 2018 19:29:41 +0000 (14:29 -0500)]
ceph-volume: add vg_name to the Device class
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit
619810c0ef48c1f19db8821a33f9c48cc11a0161 )
Abhishek L [Mon, 15 Oct 2018 15:13:52 +0000 (17:13 +0200)]
Merge pull request #24527 from theanalyst/wip-luminous-36382
luminous: rgw: resharding produces invalid values of bucket stats
Reviewed-By: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 10 Oct 2018 18:53:24 +0000 (14:53 -0400)]
Merge pull request #24396 from smithfarm/wip-26932-luminous
luminous: osd: scrub livelock
Reviewed-by: Neha Ojha <nojha@redhat.com>
Abhishek Lekshmanan [Fri, 5 Oct 2018 09:19:18 +0000 (11:19 +0200)]
rgw: copy actual stats from the source shards during reshard
Currently we don't copy the actual_stats field during reshard, so resharded
buckets show a size_utilized of 0. A subsequent object removal then subtracts
the object size from that 0, and the size utilized shows up as a large uint64_t
value. Copy the size_actual from the source object in both cls and in
reshard_process. This fixes new buckets; existing buckets will still have to go
through a bucket check --fix for their
stats to be corrected.
Fixes: http://tracker.ceph.com/issues/36290
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit
beb90638ae3d5329653b61bae0d6714796c41d04 )
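The large values mentioned in the commit above follow from unsigned 64-bit wraparound. A small illustration, with `sub_u64` and the 4096-byte object size chosen purely for the example:

```python
UINT64_MASK = (1 << 64) - 1


def sub_u64(total, delta):
    """Subtract as an unsigned 64-bit counter would, wrapping on underflow."""
    return (total - delta) & UINT64_MASK


size_utilized = 0                 # stat that was not copied during reshard
object_size = 4096                # hypothetical object being removed
print(sub_u64(size_utilized, object_size))  # 18446744073709547520
```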
Neha Ojha [Tue, 9 Oct 2018 01:01:58 +0000 (18:01 -0700)]
Merge pull request #24479 from neha-ojha/wip-36347-luminous
qa/suites/rados/upgrade/jewel-x-singleton: exclude python3-rados, python3-cephfs
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Patrick Donnelly [Mon, 8 Oct 2018 20:26:07 +0000 (13:26 -0700)]
Merge PR #24403 into luminous
* refs/pull/24403/head:
qa: add timeout to cleaning up workunit sandbox
qa: cleanup workunit dir for each unit
qa: add timeout to kclient umount
qa: do not cleanup sandbox on error
qa: use default timeout in fs workunits
qa: use sudo to cleanup workspace
qa: cleanup parallel execution of fsstress
qa/workunit: implement cleanup option
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Neha Ojha [Mon, 8 Oct 2018 19:12:39 +0000 (15:12 -0400)]
qa/suites/rados/upgrade/jewel-x-singleton: exclude python3-rados, python3-cephfs
This fix goes directly into the luminous branch since these packages do not need
to be installed on jewel, when upgrading to luminous.
Fixes: https://tracker.ceph.com/issues/36347
Signed-off-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:57:44 +0000 (13:57 -0700)]
Merge pull request #24410 from smithfarm/wip-36196-luminous
luminous: mds: internal op missing events time 'throttled', 'all_read', 'dispatched'
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:57:18 +0000 (13:57 -0700)]
Merge pull request #24421 from vshankar/wip-35937
luminous: mds: track average session uptime
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 9 Aug 2018 13:33:42 +0000 (08:33 -0500)]
osd: vary tick interval +/- 5% to avoid scrub livelocks
If you have two pgs that need to scrub on two OSDs, each the primary
for one pg and the replica for the other, you can end up in a livelock:
- both osds locally reserve a scrub slot
- both osds send a scrub schedule request
- both scrub requests are rejected
- both osds wait exactly 1 second
- repeat
Seems a bit unlikely, but I've seen test cases where it goes on for more than
an hour.
Fixes: http://tracker.ceph.com/issues/26890
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
2011377c379c9d53a3a0a693a7874fc330278898 )
Conflicts:
src/osd/OSD.cc
- luminous does not have src/include/random.h; use #include <random>
instead, seeding with whoami so each OSD gets a different series
of pseudo-random numbers
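The idea in the commit above, a tick interval varied by +/- 5% with a per-OSD seed, can be sketched as below. This is not the OSD.cc change itself; `make_tick_source` and its parameters are hypothetical names for the illustration.

```python
import random


def make_tick_source(osd_id, base_interval=1.0, jitter=0.05):
    """Return a function yielding tick delays within +/- jitter of base_interval."""
    rng = random.Random(osd_id)   # per-OSD seed so each daemon gets its own series

    def next_delay():
        return base_interval * (1.0 + rng.uniform(-jitter, jitter))

    return next_delay
```

Because two OSDs no longer wait exactly the same time, their rejected scrub reservations stop colliding in lockstep.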
Yuri Weinstein [Fri, 5 Oct 2018 20:12:15 +0000 (13:12 -0700)]
Merge pull request #23483 from pdvian/wip-26840-luminous
luminous: librados application's symbol could conflict with the libceph-common
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:11:45 +0000 (13:11 -0700)]
Merge pull request #24405 from dillaman/wip-36143-luminous
luminous: librbd: blacklisted client might not notice it lost the lock
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:11:15 +0000 (13:11 -0700)]
Merge pull request #24415 from dillaman/wip-36224-luminous
luminous: librbd: object map improperly flagged as invalidated
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:10:38 +0000 (13:10 -0700)]
Merge pull request #24419 from pdvian/wip-36157-luminous
luminous: msg: ceph_abort() when there are enough accepter errors in msg server
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 5 Oct 2018 20:09:56 +0000 (13:09 -0700)]
Merge pull request #24424 from theanalyst/wip-luminous-36311
luminous: multi-site: object name should be urlencoded when we put it into ES
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Alfredo Deza [Fri, 5 Oct 2018 14:29:58 +0000 (10:29 -0400)]
ceph-volume lvm.prepare update help to remove old basic usage example
Signed-off-by: Alfredo Deza <adeza@redhat.com>
Alfredo Deza [Wed, 3 Oct 2018 12:11:58 +0000 (08:11 -0400)]
ceph-volume lvm.prepare update help to indicate partitions are needed, not devices
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit
d31dd95b95445825f5e6669dd4ecb3118b09fdcf )
Jeffrey Zhang [Tue, 3 Apr 2018 07:03:22 +0000 (15:03 +0800)]
fix typo in ceph-volume lvm prepare help
Signed-off-by: Jeffrey Zhang <zhang.lei.fly@gmail.com>
(cherry picked from commit
d65b8844d16d71df01b57f368badc100db505506 )
Alfredo Deza [Tue, 13 Mar 2018 19:30:22 +0000 (15:30 -0400)]
ceph-volume lvm.prepare simplify help menu with bluestore default flags
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit
b7140d23741db33b68444f29a744f552f355a6f5 )
Alfredo Deza [Tue, 13 Mar 2018 19:28:28 +0000 (15:28 -0400)]
ceph-volume lvm.create simplify help menu with bluestore default flags
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit
2d1b1918285c941ca7b1953d77d0fc465f60a0c5 )
Alfredo Deza [Tue, 13 Mar 2018 19:26:46 +0000 (15:26 -0400)]
doc/ceph-volume document multipath support
Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit
1dca5eac386ff2f67f2449700cd4bf27b81dafda )
Sage Weil [Thu, 9 Aug 2018 13:22:05 +0000 (08:22 -0500)]
osd: tick at OSD_TICK_INTERVAL, not heartbeat interval
Heartbeat interval is not related!
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit
7d76458354661f7575c4a2cae251a9b828513580 )
Patrick Donnelly [Sun, 30 Sep 2018 00:37:12 +0000 (17:37 -0700)]
qa: add timeout to cleaning up workunit sandbox
If there is a bug preventing rm from completing, the workunit will get stuck.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
3a10d74f3aa4901dd9edffc0061992073ae67085 )
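Bounding a cleanup command with a timeout, as the commit above does for the workunit sandbox, might look like this in plain Python. `run_with_timeout` is a hypothetical helper and does not reflect the teuthology API.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None   # subprocess.run kills the child before re-raising


# Example: a command that finishes well inside the limit.
status = run_with_timeout([sys.executable, "-c", "pass"], 30)
```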
Patrick Donnelly [Mon, 24 Sep 2018 18:29:10 +0000 (11:29 -0700)]
qa: cleanup workunit dir for each unit
This was wrongly dropped and moved to the finalizer.
Introduced-by: de824f74dd8ac909e47335ccd53d7a085e388e41
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
70844f3f55004024a747854013a1efb409705d81 )
Patrick Donnelly [Sun, 30 Sep 2018 00:34:37 +0000 (17:34 -0700)]
qa: add timeout to kclient umount
Otherwise QA sits forever waiting for the kclient to umount when there is a
problem.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
7a64eb9dfb908a1a8e5d2b0dcaa7ca9df52a9ab1 )
Patrick Donnelly [Wed, 26 Sep 2018 14:38:58 +0000 (07:38 -0700)]
qa: do not cleanup sandbox on error
Otherwise the command will hang if the mount is broken.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
d4b8f94cf8d95ebb277b550fc6ebc3468052a39c )
Patrick Donnelly [Mon, 1 Oct 2018 01:10:05 +0000 (18:10 -0700)]
qa: use default timeout in fs workunits
Six hours is unnecessarily long.
Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
bdd2ddcfd862b65dfd73bc1ea09b0ad07040d445 )
Patrick Donnelly [Mon, 24 Sep 2018 18:02:49 +0000 (11:02 -0700)]
qa: use sudo to cleanup workspace
Files in scratch_tmp may not be owned by ubuntu.
Fixes: http://tracker.ceph.com/issues/36165
Introduced-by: de824f74dd8ac909e47335ccd53d7a085e388e41
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
1eaf78a75498d0f739b40bf310d036c851465fad )
Patrick Donnelly [Tue, 18 Sep 2018 21:57:05 +0000 (14:57 -0700)]
qa: cleanup parallel execution of fsstress
Two instances of fsstress clobber each other. Just build it in the local sandbox.
Fixes: http://tracker.ceph.com/issues/24177
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
de824f74dd8ac909e47335ccd53d7a085e388e41 )
Nathan Cutler [Wed, 3 Oct 2018 19:13:11 +0000 (21:13 +0200)]
qa/workunit: implement cleanup option
This is a partial backport of
91942df5a690809ed872f5aa8c35b56e8048e485
just to get the workunit.py changes.
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Yuri Weinstein [Thu, 4 Oct 2018 21:49:20 +0000 (14:49 -0700)]
Merge pull request #24327 from smithfarm/wip-24478-luminous
luminous: read object attrs failed at EC recovery
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 21:48:34 +0000 (14:48 -0700)]
Merge pull request #24395 from smithfarm/wip-25145-luminous
luminous: mon: Automatically set expected_num_objects for new pools with >=100 PGs per OSD
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 21:48:02 +0000 (14:48 -0700)]
Merge pull request #24397 from smithfarm/wip-36137-luminous
luminous: rgw: multisite: update index segfault on shutdown/realm reload
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 21:47:37 +0000 (14:47 -0700)]
Merge pull request #24398 from smithfarm/wip-36202-luminous
luminous: multisite: intermittent test_bucket_index_log_trim failures
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 21:15:45 +0000 (14:15 -0700)]
Merge pull request #24393 from smithfarm/wip-23998-luminous
luminous: osd/EC: slow/hung ops in multimds suite test
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 4 Oct 2018 20:26:26 +0000 (13:26 -0700)]
Merge PR #24375 into luminous
* refs/pull/24375/head:
mds: use monotonic waits in Beacon
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:07:50 +0000 (13:07 -0700)]
Merge pull request #24391 from smithfarm/wip-24630-luminous
luminous: cls/rgw: don't assert in decode_list_index_key()
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:07:12 +0000 (13:07 -0700)]
Merge pull request #24387 from pdvian/wip-36126-luminous
luminous: msg/async: clean up local buffers on dispatch
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Ricardo Dias <rdias@suse.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:05:54 +0000 (13:05 -0700)]
Merge pull request #24389 from pdvian/wip-36128-luminous
luminous: rgw: abort_bucket_multiparts() ignores individual NoSuchUpload errors
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:02:11 +0000 (13:02 -0700)]
Merge pull request #24316 from smithfarm/wip-26979-luminous
luminous: multisite: intermittent failures in test_bucket_sync_disable_enable
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:01:46 +0000 (13:01 -0700)]
Merge pull request #24317 from smithfarm/wip-35703-luminous
luminous: multisite: out of order updates to sync status markers
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:01:22 +0000 (13:01 -0700)]
Merge pull request #24318 from smithfarm/wip-35980-luminous
luminous: multisite: data sync error repo processing does not back off on empty
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 20:00:53 +0000 (13:00 -0700)]
Merge pull request #24361 from pdvian/wip-36124-luminous
luminous: rgw: fix chunked-encoding for chunks >1MiB
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 15:23:53 +0000 (08:23 -0700)]
Merge pull request #24123 from pdvian/wip-35713-luminous
luminous: librbd: ensure exclusive lock acquired when removing sync point snaps…
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Yuri Weinstein [Thu, 4 Oct 2018 15:23:00 +0000 (08:23 -0700)]
Merge pull request #24320 from smithfarm/wip-36119-luminous
luminous: [rbd-mirror] failed assertion when updating mirror status
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Yuri Weinstein [Thu, 4 Oct 2018 15:21:26 +0000 (08:21 -0700)]
Merge pull request #24390 from smithfarm/wip-24946-luminous
luminous: librbd: image create request should validate data pool for self-managed snapshot support
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Jason Dillaman [Mon, 24 Sep 2018 19:07:15 +0000 (15:07 -0400)]
librbd: keep IO blocked until after snapshot object map created
The IO was being unblocked before object map was created, allowing
a potential copyup request to fail to update a still-to-be-created
object map.
Fixes: http://tracker.ceph.com/issues/24516
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
1e874403bf861cb8b74261308d8b73434cf90341 )
Conflicts:
src/librbd/object_map/SnapshotCreateRequest.cc: trivial resolution
src/librbd/operation/SnapshotCreateRequest.cc: trivial resolution
Jason Dillaman [Mon, 24 Sep 2018 18:45:09 +0000 (14:45 -0400)]
librbd: do not invalidate object map if update races with copyup
The copyup state machine needs to iterate over all object maps to update
the existence for the object. If a snapshot is being removed concurrently,
it's possible to invalidate the object map for the image.
Fixes: http://tracker.ceph.com/issues/24516
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
5a1cb469879157297ab456261f9335d8b855684f )
Conflicts:
src/librbd/ObjectMap.cc: trivial resolution
src/librbd/ObjectMap.h: trivial resolution
src/librbd/deep_copy/ObjectCopyRequest.cc: moved to rbd-mirror image sync
src/librbd/io/CopyupRequest.cc: trivial resolution
src/test/librbd/deep_copy/test_mock_ObjectCopyRequest.cc: moved to rbd-mirror image sync
src/test/librbd/test_mock_ObjectMap.cc: trivial resolution
Jason Dillaman [Fri, 14 Sep 2018 15:46:13 +0000 (11:46 -0400)]
librbd: do not invalidate object map when attempting to delete non-existent snapshot
If duplicate snapshot remove requests are received by the lock owner from a peer
client, the first request will remove the object map. If the second request
arrives while the first is in-progress, it will again attempt to remove the
object map but fail to load it since it's already been deleted. This incorrectly
results in the next object map being flagged as invalid.
Fixes: http://tracker.ceph.com/issues/24516
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
0a31c55ea83d85da88c7586c9a8fa8d6ec6618a7 )
Conflicts:
src/librbd/object_map/SnapshotRemoveRequest.cc: trivial resolution
Chang Liu [Mon, 5 Mar 2018 07:46:43 +0000 (15:46 +0800)]
rgw: url_encode key name and instance in es sync module
Objects whose names contain spaces or other special characters
can't be synced to ES correctly. We need to url_encode the name when
we send an HTTP request to ES.
Fixes: http://tracker.ceph.com/issues/23216
Signed-off-by: Chang Liu <liuchang0812@gmail.com>
(cherry picked from commit
13978bb28b7be809033bf24550b21ed2713ddc9b )
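For illustration, percent-encoding an object name before placing it in a request path looks like this with Python's standard library. `encode_key` is a hypothetical helper; the rgw code uses its own url_encode.

```python
from urllib.parse import quote


def encode_key(name):
    """Percent-encode an object name so it is safe inside a URL path segment."""
    # safe="" also encodes "/", keeping a key such as "a/b" in one segment.
    return quote(name, safe="")


print(encode_key("my file+v1.txt"))  # my%20file%2Bv1.txt
```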
Venky Shankar [Mon, 30 Jul 2018 05:47:02 +0000 (01:47 -0400)]
mds: include session uptime when displaying session list
Fixes: http://tracker.ceph.com/issues/35937
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit
b23a204cdde2bc5f34304cca3f1bac3496cf7a41 )
Venky Shankar [Tue, 24 Jul 2018 03:47:02 +0000 (23:47 -0400)]
mds: track average uptime of sessions
Average session age math improvements by Patrick.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit
d2627b98d0c1477d664d00384ef033d323b26957 )
Conflicts:
src/mds/SessionMap.h
root [Mon, 30 Jul 2018 01:29:48 +0000 (21:29 -0400)]
msg: ceph_abort() when there are enough accepter errors in msg server
In some extreme cases (we have met one in our production cluster), when the Accepter thread breaks out, new clients cannot connect to the osd. Because the former heartbeat connections are already established, other osds cannot detect the failure and notify the monitor to mark the failed osd down.
With this patch, when there are enough abnormal communication errors, we just ceph_abort so that the osd goes down quickly and other osds can notify the monitor to mark the failed osd down.
Signed-off-by: penglaiyxy@gmail.com <penglaiyxy@gmail.com>
(cherry picked from commit
00e0ab407b2e9659d9121be1217e95c8117c411e )
Conflicts:
src/common/legacy_config_opts.h : Resolved for ms_max_accept_failures
src/common/options.cc : Resolved for ms_max_accept_failures
src/msg/async/AsyncMessenger.cc : Resolved in accept
src/msg/simple/Accepter.cc : Resolved in entry
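The threshold behaviour described in the commit above can be sketched as a small counter. Only the option name ms_max_accept_failures comes from the conflict notes; the class, the reset-on-success rule and the example limit are assumptions.

```python
class AcceptFailureCounter:
    """Count consecutive accept errors and signal when the limit is reached."""

    def __init__(self, max_failures):
        self.max_failures = max_failures   # stands in for ms_max_accept_failures
        self.failures = 0

    def record(self, ok):
        """Record one accept attempt; return True when the daemon should abort."""
        if ok:
            self.failures = 0              # assumption: a success resets the count
            return False
        self.failures += 1
        return self.failures >= self.max_failures
```

When `record` returns True the caller would abort, so that peers notice the missing heartbeats and report the daemon down.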
Jason Dillaman [Fri, 14 Sep 2018 15:21:28 +0000 (11:21 -0400)]
librbd: converted object map snapshot remove state machine to new style
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
58770188ab57a53b786cf616ccfbf6acfcdc115a )
Conflicts:
src/librbd/object_map/SnapshotRemoveRequest.cc: trivial resolution
src/librbd/object_map/SnapshotRemoveRequest.h: trivial resolution
Jason Dillaman [Fri, 14 Sep 2018 13:59:35 +0000 (09:59 -0400)]
librbd: test_flags helper should require snap id parameter
The HEAD and snapshots have potentially different flag states
since object maps get invalidated per revision.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
862082792d9c2ff23823e46937b7de9a42830cfd )
Conflicts:
src/librbd/ObjectMap.cc: trivial resolution
src/librbd/operation/SnapshotRemoveRequest.cc: trivial resolution
src/test/librbd/test_DeepCopy.cc: DNE
src/test/librbd/test_Migration.cc: DNE
Yanhu Cao [Wed, 19 Sep 2018 02:32:48 +0000 (10:32 +0800)]
mds/MDCache: fix mds internal op missing events time
Fixes: http://tracker.ceph.com/issues/36114
Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
(cherry picked from commit
bd6ae6f4e29ac79e5e07373f52099338e6ab5416 )
Yuri Weinstein [Wed, 3 Oct 2018 19:49:51 +0000 (12:49 -0700)]
Merge pull request #23877 from smithfarm/wip-24842-luminous
luminous: qa: move mds/client config to qa from teuthology ceph.conf.template
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Wed, 3 Oct 2018 19:46:27 +0000 (12:46 -0700)]
Merge pull request #24086 from batrick/i35976
luminous: mds: configurable timeout for client eviction
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Yuri Weinstein [Wed, 3 Oct 2018 19:45:22 +0000 (12:45 -0700)]
Merge pull request #24376 from smithfarm/wip-35939-luminous
luminous: client: statfs inode count odd
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Wed, 3 Oct 2018 19:44:19 +0000 (12:44 -0700)]
Merge pull request #24378 from smithfarm/wip-36135-luminous
luminous: mds: rctime may go back
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Fri, 17 Aug 2018 22:03:56 +0000 (15:03 -0700)]
mds: use monotonic waits in Beacon
This guarantees that the sender thread cannot be disrupted by system clock
changes. This commit also simplifies the sender thread by manually managing the
thread and avoiding unnecessary context creation.
Fixes: http://tracker.ceph.com/issues/26962
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit
a5fc29b95281c6ca58c9177c665c379846beb4b3 )
Conflicts:
src/mds/Beacon.cc
- g_conf->foo instead of g_conf()->foo
- boost::string_view instead of std::string_view
- always specify template type std::unique_lock<std::mutex>
src/mds/Beacon.h
- time::min() instead of clock::zero()
- always specify template type std::unique_lock<std::mutex>
- std::chrono::seconds instead of "1s" in std::chrono_literals namespace
(which is a C++14ism)
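The point of a monotonic wait, as in the commit above, is that the deadline does not move when the wall clock is adjusted. A Python analogue, with `wait_until` as a hypothetical helper rather than anything from Beacon.cc:

```python
import threading
import time


def wait_until(event, timeout_s):
    """Wait for event up to timeout_s seconds measured on the monotonic clock."""
    deadline = time.monotonic() + timeout_s
    while not event.is_set():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False          # timed out without the event being set
        event.wait(remaining)
    return True
```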
Jason Dillaman [Thu, 6 Sep 2018 21:08:12 +0000 (17:08 -0400)]
librbd: use the correct error code when the exclusive lock isn't locked
If the client is currently blacklisted, use -EBLACKLISTED, otherwise
use -EROFS.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
e8eee15518facf562adf1aaba02d3a9523cdd2c3 )
Conflicts:
src/librbd/ExclusiveLock.cc: trivial resolution
src/librbd/image/RemoveRequest.cc: trivial resolution
src/test/rbd_mirror/image_deleter/test_mock_SnapshotPurgeRequest.cc: DNE
src/tools/rbd_mirror/image_deleter/SnapshotPurgeRequest.cc: DNE
src/tools/rbd_mirror/image_deleter/SnapshotPurgeRequest.h: DNE
src/librbd/DeepCopyRequest.cc: (see below)
src/librbd/deep_copy/ObjectCopyRequest.cc: (see below)
src/librbd/deep_copy/ObjectCopyRequest.h: (see below)
src/librbd/deep_copy/SetHeadRequest.cc: (see below)
src/librbd/deep_copy/SetHeadRequest.h: (see below)
src/librbd/deep_copy/SnapshotCopyRequest.cc: (see below)
src/librbd/deep_copy/SnapshotCopyRequest.h: (see below)
src/librbd/deep_copy/SnapshotCreateRequest.cc: (see below)
src/librbd/deep_copy/SnapshotCreateRequest.h: (see below)
src/test/librbd/deep_copy/test_mock_ObjectCopyRequest.cc: (see below)
src/test/librbd/deep_copy/test_mock_SetHeadRequest.cc: (see below)
src/test/librbd/deep_copy/test_mock_SnapshotCopyRequest.cc: (see below)
src/test/librbd/deep_copy/test_mock_SnapshotCreateRequest.cc: (see below)
src/test/librbd/test_mock_DeepCopyRequest.cc
- deep-copy related files were originally derived from rbd-mirror
equivalents. Similar modifications were made to the associated
rbd-mirror files.
Alfredo Deza [Wed, 3 Oct 2018 15:27:26 +0000 (11:27 -0400)]
Merge pull request #24382 from alfredodeza/luminous-rm36247
luminous: ceph-volume: skip processing devices that don't exist when scanning system disks
Reviewed-by: Andrew Schoen <aschoen@redhat.com>
Jason Dillaman [Thu, 6 Sep 2018 21:15:50 +0000 (17:15 -0400)]
librbd: helper to retrieve the correct error code for read-only op
When the exclusive lock is unlocked, the error code should be
-EBLACKLISTED when the client is blacklisted, otherwise -EROFS.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
a84fbb2565fb603ea809487d920461d14442d188 )
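The selection rule stated in the commit above can be written as a tiny helper. This sketch assumes EBLACKLISTED is an alias for ESHUTDOWN, which should be checked against the Ceph tree in use; `read_only_error_code` is a hypothetical name.

```python
import errno

# Assumption: Ceph maps EBLACKLISTED onto ESHUTDOWN; confirm against the tree in use.
EBLACKLISTED = errno.ESHUTDOWN


def read_only_error_code(is_blacklisted):
    """Pick the errno for an op rejected because the exclusive lock is not held."""
    return -EBLACKLISTED if is_blacklisted else -errno.EROFS
```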
Jason Dillaman [Thu, 6 Sep 2018 17:38:17 +0000 (13:38 -0400)]
librbd: reacquire lock should properly handle failed watcher
If the watch has been lost, assume the lock has been lost but attempt
to reacquire it if and when the watch is re-established.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
2057d99f451e3007d4fd05a88faa968319d0ba90 )
Conflicts:
src/librbd/ManagedLock.cc: trivial resolution
Jason Dillaman [Thu, 30 Aug 2018 19:12:27 +0000 (15:12 -0400)]
librbd: assume lock is unlocked if blacklisted or object deleted
This will ensure that it's possible to potentially re-acquire the
lock should the blacklist expire before the image is closed.
Fixes: http://tracker.ceph.com/issues/34534
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
60064f68f5dd2bbf5fbab95564fa522335091f4a )
Jason Dillaman [Thu, 6 Sep 2018 13:44:59 +0000 (09:44 -0400)]
librbd: watcher should internally track blacklisted state
Since it will periodically attempt to re-acquire the watch,
it will know when the RADOS client has been blacklisted and
when the blacklist has been removed.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
9ea94f284061849e452dd61c8f89ecca18642b0d )
Conflicts:
src/librbd/Watcher.cc: trivial resolution
src/test/librbd/mock/MockImageWatcher.h: trivial resolution
Jason Dillaman [Thu, 30 Aug 2018 20:51:10 +0000 (16:51 -0400)]
librbd: attempt to recover lost image watcher upon all failures
For example, if an image is blacklisted and the blacklist eventually
expires, the image should recover its watch.
Fixes: http://tracker.ceph.com/issues/34534
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
23b7447f6be87a14f84664f29431d2fdd2af4512 )
Conflicts:
src/librbd/watcher/RewatchRequest.cc: trivial resolution
src/test/librbd/CMakeLists.txt: trivial resolution
src/test/librbd/test_mock_Watcher.cc: trivial resolution
Jason Dillaman [Thu, 31 May 2018 18:09:30 +0000 (14:09 -0400)]
rbd-mirror: attempt to re-acquire leader lock if watcher recovered
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
69645f5433ce48281d3c6b70d979356c7ede2f88 )
(cherry picked from commit
a44e583fda52edceb0b20d78f1683a14d0e00f7b )
Jason Dillaman [Thu, 31 May 2018 18:04:19 +0000 (14:04 -0400)]
librbd: ensure managed lock can shut down if stuck waiting for register
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
cb6712b0d9d5bccadb23a0e011eef05cf4d92280 )
(cherry picked from commit
f1c0bda32f0d15c0b0808ec1ef4ccfaee8d177b0 )
Song Shun [Tue, 10 Apr 2018 05:41:18 +0000 (13:41 +0800)]
librbd: fix rbd close race with rewatch
fix rbd close race with rewatch
Signed-off-by: Song Shun <song.shun3@zte.com.cn>
(cherry picked from commit
8b833a293eac54fd3d38f12660d856ecc310d805 )
Conflicts:
src/librbd/Watcher.cc: trivial resolution
Mykola Golub [Tue, 13 Feb 2018 12:20:09 +0000 (14:20 +0200)]
librbd: potential race in RewatchRequest when resetting watch_handle
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit
f5c02adfdbf5d9da0186fd494ee33c469445be83 )
Casey Bodley [Thu, 20 Sep 2018 15:37:06 +0000 (11:37 -0400)]
rgw: remove BucketChangeObserver from data sync thread
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
f05db89637d280505321708683182f0f2c886208 )
Conflicts:
src/rgw/rgw_data_sync.cc
src/rgw/rgw_data_sync.h
- argument lists are different in luminous, compared to master
Casey Bodley [Thu, 20 Sep 2018 15:34:42 +0000 (11:34 -0400)]
rgw: add BucketChangeObserver to RGWDataChangesLog
this means that BucketTrimManager will track active buckets based on
local changes, rather than changes in remote datalogs or error repos
Fixes: http://tracker.ceph.com/issues/36034
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
f3c258c49ff6899433e742b10554c83413d64a8a )