git.apps.os.sepia.ceph.com Git - ceph.git/log
Jan Fajerski [Thu, 8 Oct 2020 06:45:26 +0000 (08:45 +0200)]
ceph-volume: don't exit before empty report can be printed
get_plan() called exit in case of an empty plan. This prevented a report
being printed under these circumstances. Avoid exit in this case. Also
adds tests to ensure an empty report is printed.
Fixes: https://tracker.ceph.com/issues/47760
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
0cc5604843b215709a681fa402145c9fa403b1dd )
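The empty-report fix in the commit above can be illustrated with a minimal sketch. The names `get_plan` and `render_report` here are hypothetical stand-ins, not ceph-volume's actual code; the sketch only shows why returning an empty plan, rather than exiting, lets the caller still emit a valid report.

```python
import json


def get_plan(devices):
    """Hypothetical planner: return a (possibly empty) list instead of exiting."""
    # Returning [] rather than calling sys.exit() keeps the caller in control.
    return [{"data": dev} for dev in devices]


def render_report(plan, fmt="json"):
    """Render the plan; an empty plan still yields valid output."""
    if fmt == "json":
        return json.dumps(plan)
    return "\n".join(str(osd) for osd in plan) or "no OSDs planned"


empty_report = render_report(get_plan([]))
```

With this shape, a consumer that parses the JSON report gets an empty list instead of no output at all.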
Jan Fajerski [Wed, 7 Oct 2020 07:45:42 +0000 (09:45 +0200)]
PendingReleaseNotes: add note about batch refactor
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
485f0d797e4b21ffb7ac742f0949e8c0a23d43f2 )
Conflicts:
PendingReleaseNotes
Sort new entry under >=15.2.6 heading
Jan Fajerski [Sat, 3 Oct 2020 07:40:33 +0000 (09:40 +0200)]
ceph-volume batch: return valid empty json reports
Fixes: https://tracker.ceph.com/issues/47729
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
ab59269a6ca5bb80c28e94beef0338f23fc10fff )
Jan Fajerski [Mon, 5 Oct 2020 10:56:26 +0000 (12:56 +0200)]
ceph-volume: pass filter_for_batch as keyword argument
This PR also removes an unused ctor argument in the Devices class.
Fixes: 7d168ad7bdbb6d6d5231a4ae540ab03040b49a38
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
c5a711e5b6499915b7c2a7b7869f890fa7dc7e2d )
Jan Fajerski [Fri, 2 Oct 2020 10:08:26 +0000 (12:08 +0200)]
doc: drop references to drive_groups in batch doc
The drive_groups code is orchestrator specific and not present in
nautilus.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Jan Fajerski [Tue, 8 Sep 2020 12:11:15 +0000 (14:11 +0200)]
idempotency must result in the same outcome
...not should
Co-authored-by: Joshua Schmid <jschmid@suse.de>
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
fcacd0b96ab195e939f6f879b0a0362a06385f9a )
Jan Fajerski [Mon, 29 Jun 2020 15:42:26 +0000 (17:42 +0200)]
doc: update ceph-volume lvm batch docs
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
7695d1ec539f54b7b7cea8d19925f0c320223f03 )
Jan Fajerski [Fri, 25 Sep 2020 09:35:19 +0000 (11:35 +0200)]
ceph-volume batch: fix very_fast_allocation plan and add tests
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
8178d5c48ac1a7f3915f0003abab6d625385bd78 )
Jan Fajerski [Wed, 16 Sep 2020 13:43:00 +0000 (15:43 +0200)]
ceph-volume: batch: call the right prepare method
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
e75ef77f23ae07463510ec213ac4007f29cbe2da )
Jan Fajerski [Fri, 11 Sep 2020 14:35:00 +0000 (16:35 +0200)]
ceph-volume inventory: add option to filter unwanted devices
Some devices should never be passed to the batch subcommand. For now this
includes devices that have a partition or are mounted on the machine.
One goal is to filter out the root device, so that it is not included in a
batch command and does not contribute to its implicit sizing calculation.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
7d168ad7bdbb6d6d5231a4ae540ab03040b49a38 )
- removed the lsmdisk import from src/ceph-volume/ceph_volume/util/device.py
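The filtering rule described in the commit above can be sketched as follows. The `Dev` record and the `filter_for_batch` function are assumptions made for illustration (the real Device class tracks far more state); only the rule itself, dropping devices that have partitions or are mounted, comes from the commit message.

```python
from collections import namedtuple

# Hypothetical device record used only for this sketch.
Dev = namedtuple("Dev", ["path", "has_partitions", "is_mounted"])


def filter_for_batch(devices):
    """Drop devices that have partitions or are mounted."""
    return [d for d in devices if not (d.has_partitions or d.is_mounted)]


devices = [
    Dev("/dev/sda", has_partitions=True, is_mounted=True),   # e.g. the root device
    Dev("/dev/sdb", has_partitions=False, is_mounted=False),
]
usable = filter_for_batch(devices)
```

Because the root device is removed before planning, its capacity never enters the implicit sizing calculation.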
Jan Fajerski [Fri, 11 Sep 2020 08:36:43 +0000 (10:36 +0200)]
ceph-volume: address review comments
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
6f1592a1146529d352184c795aae8ce12f66e554 )
Jan Fajerski [Thu, 10 Sep 2020 14:45:34 +0000 (16:45 +0200)]
ceph-volume: batch: fix size retrieval for lvs
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
24e4aa1296608ef861d4ea4b6a1892246a53ef76 )
Jan Fajerski [Wed, 9 Sep 2020 11:04:14 +0000 (13:04 +0200)]
ceph-volume: include encryption in batch report
Fixes: https://tracker.ceph.com/issues/44783
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
fce184cf9b2d8f15543e1adee49e1fe6cc17437d )
Jan Fajerski [Wed, 9 Sep 2020 07:41:15 +0000 (09:41 +0200)]
ceph-volume lvm batch: use namedtuple instead of tuple
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
98c991fc6fd17b18d5bfbebe4b8febe5ff8fa2f0 )
Jan Fajerski [Tue, 8 Sep 2020 14:53:53 +0000 (16:53 +0200)]
ceph-volume: address review comments, mostly tidying, clarification
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
d0735ce1c90c952a6d2e1b805c1326d13ff7b06c )
Jan Fajerski [Mon, 7 Sep 2020 12:54:40 +0000 (14:54 +0200)]
ceph-volume: batch test should pass --journal-devices with filestore
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
55ba8630176041dce898e8b979b5849f13e01ca5 )
Jan Fajerski [Mon, 7 Sep 2020 12:54:01 +0000 (14:54 +0200)]
ceph-volume: make --journal optional, add --journal-slots
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
7f766846328aac82d75175ed2c1c0bf3438a99e0 )
Jan Fajerski [Fri, 26 Jun 2020 11:34:01 +0000 (13:34 +0200)]
ceph-volume batch: add deprecation warning for auto behaviour
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
1239f77c8871a45e21bd1e3e81be8bb8854f24da )
Jan Fajerski [Tue, 23 Jun 2020 14:58:46 +0000 (16:58 +0200)]
ceph-volume batch: add ceph.conf mocking to pass tests
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
eef9dc7a1da6d5dde0d1b02b71301c1d7b7926a9 )
Jan Fajerski [Fri, 19 Jun 2020 10:58:17 +0000 (12:58 +0200)]
ceph-volume batch: use disk.Size for size args
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
0bc7f7424cdd0a2c5f2cd777467814dbb3959fb4 )
Jan Fajerski [Fri, 19 Jun 2020 09:22:28 +0000 (11:22 +0200)]
ceph-volume batch: Fix osd_ids passing and improve plan formatting
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
2124aa57b355fb29dcfe13c643bc78b4011f15a6 )
Jan Fajerski [Tue, 9 Jun 2020 14:40:46 +0000 (16:40 +0200)]
ceph-volume batch: track rel_size in percent, more tests
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
2327e92abae74518d463a55ef4d42dbb816c9200 )
Jan Fajerski [Wed, 29 Apr 2020 05:47:18 +0000 (07:47 +0200)]
ceph-volume batch: improve backwards compatibility
This restores legacy batch behavior, adds some initial tests, and
adjusts existing tests to the changes.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
a23a02df02ec4a8f65df0864f3224fb311d25b11 )
Jan Fajerski [Mon, 27 Apr 2020 10:26:20 +0000 (12:26 +0200)]
ceph-volume: batch - enable legacy auto behaviour
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
d32e0e4320b54302ab989f0a93b57a0404e2094b )
Jan Fajerski [Mon, 27 Apr 2020 09:47:04 +0000 (11:47 +0200)]
ceph-volume: batch - major refactor
This completely refactors the batch code in order to make use of the
create/prepare code path for creating OSDs instead of having a second
code path doing this. This not only eases the maintenance burden but
also adds various features and fixes bugs. This subcommand can now
handle LVs, replace OSDs, reuse VGs and has a better notion of
idempotency.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
b0b797363fd66baa40eb54cf35dd6cfd11150be9 )
Jan Fajerski [Mon, 27 Apr 2020 09:35:51 +0000 (11:35 +0200)]
ceph-volume: Device - available_lvm if 10 extents are free.
This changes the available_lvm notion to only require 10 free extents
instead of 5GB.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
6cb0841658ae45f23c2457a8f6a489457012d93e )
Jan Fajerski [Mon, 27 Apr 2020 09:34:19 +0000 (11:34 +0200)]
ceph-volume: Device - add vg_free property
This new property returns the free space in any VGs present. If no VGs
are on the device we project how much space a VG will have.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
b34f130f30daeca034b6b2365cc5a832ba8faa56 )
Jan Fajerski [Mon, 27 Apr 2020 09:27:08 +0000 (11:27 +0200)]
ceph-volume: prepare/create - size args as Size class
This adds the disk.Size class as the type for all size-related arguments. We
often create this class from args like this anyway, and it enables users
to pass not only bytes but also strings like 50G.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
2ff37321640a614fd965d7df6e20d8f9d1430fa3 )
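As a rough illustration of the commit above, the sketch below shows how a size argument could accept either raw bytes or a string such as 50G. `parse_size` is a hypothetical helper, not the disk.Size API, and the use of binary multiples (1024) is an assumption.

```python
import re

# Assumed binary multiples; the real class may define units differently.
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value):
    """Return a size in bytes from an int or a string such as '50G'."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([BKMGT])?B?", value.strip().upper())
    if match is None:
        raise ValueError("unrecognised size: %r" % value)
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit or "B"])
```

A callable like this can be handed to argparse through the `type=` keyword of `add_argument`, so the conversion happens once at parse time.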
Jan Fajerski [Mon, 27 Apr 2020 09:45:26 +0000 (11:45 +0200)]
ceph-volume: disk.Size - add cast to bool
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
e7cdeab9dee2684c06b3543a482dd44a1db83c16 )
Jan Fajerski [Mon, 27 Apr 2020 09:21:37 +0000 (11:21 +0200)]
ceph-volume: api/lvm - add VolumeGroup.free_percent property
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
bad54e97817dfef0e7dfcb834753cbb728c3de46 )
Jan Fajerski [Mon, 27 Apr 2020 09:44:56 +0000 (11:44 +0200)]
ceph-volume: util.device - add vg_free_percent property
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
f48d225454e3ec347952d85262aa03b71bfb9111 )
Jan Fajerski [Mon, 27 Apr 2020 09:44:20 +0000 (11:44 +0200)]
ceph-volume: api/lvm - query LV units in bytes
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
de7d67603a135825d8b4e37a06dd4b6a12dbcf1c )
Jan Fajerski [Tue, 14 Apr 2020 13:34:30 +0000 (15:34 +0200)]
ceph-volume: lvm/common - refactor common arg specification
This makes it easier to create valid Namespace objects/arg lists when
programmatically calling create/prepare.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
fa62a7bf5f926db17dd2d7685878c175fc23500e )
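One way to picture the refactor in the commit above is a single argument specification that feeds both the command-line parser and programmatic callers. The `COMMON_ARGS` table and both helper functions are hypothetical; only the goal, building valid Namespace objects without going through the command line, comes from the commit message.

```python
import argparse

# Hypothetical shared specification: flag -> argparse keyword arguments.
COMMON_ARGS = {
    "--data": {"dest": "data", "default": None, "help": "OSD data device"},
    "--crush-device-class": {"dest": "crush_device_class", "default": None,
                             "help": "CRUSH device class"},
    "--dmcrypt": {"dest": "dmcrypt", "action": "store_true",
                  "help": "enable encryption"},
}


def build_parser(spec):
    """Create a parser from the shared specification."""
    parser = argparse.ArgumentParser(prog="example")
    for flag, kwargs in spec.items():
        parser.add_argument(flag, **kwargs)
    return parser


def default_namespace(spec, **overrides):
    """Build a Namespace with every known dest filled in, without parsing argv."""
    values = {}
    for kwargs in spec.values():
        is_flag = kwargs.get("action") == "store_true"
        values[kwargs["dest"]] = kwargs.get("default", False if is_flag else None)
    values.update(overrides)
    return argparse.Namespace(**values)
```

A caller such as a batch planner could then use `default_namespace(COMMON_ARGS, data="/dev/sdb")` and obtain the same attributes the parser would have produced.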
Jan Fajerski [Wed, 26 Feb 2020 14:36:53 +0000 (15:36 +0100)]
ceph-volume: batch: fix argument help message
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit
51ca694bbe1552fd52b0385c43aad7e28fc5626f )
Jan Fajerski [Fri, 2 Oct 2020 07:20:18 +0000 (09:20 +0200)]
Merge pull request #37413 from jan--f/wip-47650-nautilus
nautilus: [ceph-volume]: remove unneeded call to get_devices()
Jan Fajerski [Fri, 2 Oct 2020 07:19:46 +0000 (09:19 +0200)]
Merge pull request #37377 from shyukri/wip-47283-nautilus
nautilus: ceph-volume: fix journal size argument not working
Patrick Donnelly [Thu, 1 Oct 2020 20:07:03 +0000 (13:07 -0700)]
Merge PR #37508 into nautilus
* refs/pull/37508/head:
nautilus: qa: stop using kclient testing branch builds
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:54:25 +0000 (09:54 -0700)]
Merge pull request #37478 from smithfarm/wip-47345-nautilus
nautilus: qa/*/mon/mon-last-epoch-clean.sh: mark osd out instead of down
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:53:56 +0000 (09:53 -0700)]
Merge pull request #37477 from smithfarm/wip-47250-nautilus
nautilus: tools/osdmaptool.cc: add ability to clean_temps
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:53:20 +0000 (09:53 -0700)]
Merge pull request #37476 from smithfarm/wip-46965-nautilus
nautilus: mgr: decrease pool stats if pg was removed
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:52:50 +0000 (09:52 -0700)]
Merge pull request #37475 from smithfarm/wip-46935-nautilus
nautilus: tools/rados: Set locator key when exporting or importing a pool
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:52:22 +0000 (09:52 -0700)]
Merge pull request #37474 from smithfarm/wip-46738-nautilus
nautilus: mon: fix the 'Error ERANGE' message when conf "osd_objectstore" is filestore
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:51:53 +0000 (09:51 -0700)]
Merge pull request #37473 from smithfarm/wip-46710-nautilus
nautilus: osd/PeeringState: prevent peer's num_objects going negative
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:51:20 +0000 (09:51 -0700)]
Merge pull request #37470 from smithfarm/wip-46262-nautilus
nautilus: common, osd: add sanity checks around osd_scrub_max_preemptions
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:38:39 +0000 (09:38 -0700)]
Merge pull request #37471 from smithfarm/wip-46461-nautilus
nautilus: pybind/mgr/balancer: use "==" and "!=" for comparing str
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 16:37:57 +0000 (09:37 -0700)]
Merge pull request #37447 from badone/wip-nautilus-ca-ansible-to-2.9
nautilus: qa/ceph-ansible: Bump required ansible to 2.9
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Ramana Raja [Thu, 1 Oct 2020 16:35:30 +0000 (22:05 +0530)]
nautilus: qa: stop using kclient testing branch builds
... in kcephfs and multimds suites.
This is a nautilus only fix. In master and octopus we still use
kclient testing branch builds.
Fixes: https://tracker.ceph.com/issues/47642
Signed-off-by: Ramana Raja <rraja@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 15:06:07 +0000 (08:06 -0700)]
Merge pull request #37469 from smithfarm/wip-47459-nautilus
nautilus: qa/workunits/mon: fixed excessively large pool PG count
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Yuri Weinstein [Thu, 1 Oct 2020 15:05:17 +0000 (08:05 -0700)]
Merge pull request #37468 from smithfarm/wip-47417-nautilus
nautilus: rbd: include RADOS namespace in krbd symlinks
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:30:32 +0000 (10:30 -0700)]
Merge pull request #37467 from smithfarm/wip-47322-nautilus
nautilus: rgw: replace '+' with "%20" in canonical query string for s3 v4 auth.
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:30:07 +0000 (10:30 -0700)]
Merge pull request #37465 from smithfarm/wip-47318-nautilus
nautilus: rgw: Expiration days can't be zero and transition days can be zero
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:29:39 +0000 (10:29 -0700)]
Merge pull request #37464 from smithfarm/wip-47315-nautilus
nautilus: rgw: radosgw-admin: period pull command is not always a raw_storage_op
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:29:01 +0000 (10:29 -0700)]
Merge pull request #37463 from smithfarm/wip-46956-nautilus
nautilus: rgw: fix shutdown crash in RGWAsyncReadMDLogEntries
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:28:30 +0000 (10:28 -0700)]
Merge pull request #37462 from smithfarm/wip-46950-nautilus
nautilus: rgw/cls: preserve olh entry's name on last unlink
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:28:05 +0000 (10:28 -0700)]
Merge pull request #37461 from smithfarm/wip-46930-nautilus
nautilus: rgw: Empty reqs_change_state queue before unregistered_reqs
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:27:37 +0000 (10:27 -0700)]
Merge pull request #37460 from smithfarm/wip-46594-nautilus
nautilus: rgw: add negative cache to the system object
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:26:42 +0000 (10:26 -0700)]
Merge pull request #37459 from smithfarm/wip-47320-nautilus
nautilus: rgw: RGWObjVersionTracker tracks version over increments
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Wed, 30 Sep 2020 17:25:44 +0000 (10:25 -0700)]
Merge pull request #37438 from Vicente-Cheng/wip-47347-nautilus
nautilus: rgw: Swift API anonymous access should 401
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Neha Ojha [Tue, 29 Sep 2020 23:51:44 +0000 (16:51 -0700)]
Merge pull request #37472 from smithfarm/wip-46587-nautilus
nautilus: doc/rados: Fix osd_scrub_during_recovery default value
Reviewed-by: Neha Ojha <nojha@redhat.com>
wangyunqing [Thu, 16 Jul 2020 07:06:53 +0000 (15:06 +0800)]
mon: fix the 'Error ERANGE' message when conf "osd_objectstore" is filestore
Fixes: https://tracker.ceph.com/issues/37532
Signed-off-by: wangyunqing <wangyunqing@inspur.com>
(cherry picked from commit
4155a79f76b177ada79af746de4448773e07584a )
Conflicts:
src/mon/OSDMonitor.cc
- in nautilus, "cmd_getval()" needs cct as first argument
Neha Ojha [Fri, 4 Sep 2020 21:51:50 +0000 (21:51 +0000)]
qa/*/mon/mon-last-epoch-clean.sh: mark osd out instead of down
The test should mark the OSD out to check if only "in" OSDs are considered by
the osdmap trimming logic.
Fixes: https://tracker.ceph.com/issues/47309
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
21c08f0be2e048edd2d3ce7ca803f94a6d32f97c )
Neha Ojha [Wed, 26 Aug 2020 18:40:01 +0000 (18:40 +0000)]
test/cli/osdmaptool/help.t: add clean-temps
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
762966a08ab31234148fe181eb2c107ae5266906 )
Neha Ojha [Wed, 26 Aug 2020 18:37:58 +0000 (18:37 +0000)]
doc: add clean-temps to osdmaptool.rst
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
78224c21d24d68d471eb266a37624d00d1483ed8 )
Neha Ojha [Wed, 26 Aug 2020 18:08:03 +0000 (18:08 +0000)]
tools/osdmaptool.cc: add ability to clean_temps
This is particularly useful for debugging purposes when clean_temps()
takes an abnormally high amount of time due to flaws in crush rules etc.
Fixes: https://tracker.ceph.com/issues/47159
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit
ab54d3821a61a4ff1ea9973c5f31ee86868b3009 )
Casey Bodley [Mon, 31 Aug 2020 15:19:34 +0000 (11:19 -0400)]
radosgw-admin: period pull command is not always a raw_storage_op
if a --url is given, 'period pull' does not depend on any zone/period
configuration and can be a raw_storage_op. if we get a --remote instead,
we do need to initialize the zone/period configuration to find the
correct endpoint/access keys
Fixes: https://tracker.ceph.com/issues/47217
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
2b44a9d060d33dca9768c758e1908365488aac2a )
Conflicts:
src/rgw/rgw_admin.cc
- nautilus has different (but unrelatedly so) raw_storage_ops_list
Aleksei Gutikov [Mon, 30 Mar 2020 12:27:45 +0000 (15:27 +0300)]
mgr: decrease pool stats if pg was removed
After a merge of placement groups, the resulting pg contains
objects from itself and from the merged one.
PGMap::apply_incremental treats this growth as a pool stats delta,
but forgets to decrease the stats for the removed pg.
Fixes: https://tracker.ceph.com/issues/44815
Signed-off-by: Aleksei Gutikov <aleksey.gutikov@synesis.ru>
(cherry picked from commit
6090acdae4495e11f117df2330b579744eeada2a )
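The accounting problem in the commit above can be shown with a small model. This is a simplified sketch with a hypothetical `apply_incremental` function that tracks only object counts; it is not the PGMap code.

```python
def apply_incremental(pg_objects, pool_objects, updates, removed):
    """Apply per-PG object counts and keep the pool total consistent.

    pg_objects maps pgid -> object count and is mutated in place.
    """
    for pgid, new_count in updates.items():
        # Growth or shrinkage of a surviving PG is a delta on the pool total.
        pool_objects += new_count - pg_objects.get(pgid, 0)
        pg_objects[pgid] = new_count
    for pgid in removed:
        # The step the commit adds: a removed PG must give back its share.
        pool_objects -= pg_objects.pop(pgid, 0)
    return pool_objects


pgs = {"1.0": 100, "1.1": 80}
# PG 1.1 is merged into 1.0, which now holds the objects of both.
total = apply_incremental(pgs, 180, updates={"1.0": 180}, removed=["1.1"])
```

Without the second loop the pool total would count the merged objects twice.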
Iain Buclaw [Wed, 4 Mar 2020 14:32:43 +0000 (15:32 +0100)]
tools/rados: Set locator key when exporting or importing a pool
Fixes the following error when exporting a pool that contains objects
with a locator key set:
error getting xattr set [object name]: (2) No such file or directory
error from export: (2) No such file or directory
Fixes: https://tracker.ceph.com/issues/46824
Signed-off-by: Iain Buclaw <iain.buclaw@dunnhumby.com>
(cherry picked from commit
ecb2df9177f30213b3ff48e76e6fe31fcbb430a7 )
xie xingguo [Fri, 24 Jul 2020 01:57:40 +0000 (09:57 +0800)]
osd/PeeringState: prevent peer's num_objects going negative
Saw it in a teuthology run:
-5645> 2020-07-20 04:34:32.067 7f351e329700 5 osd.5 pg_epoch: 667 ... exit Started/Primary/Active/Backfilling
-5642> 2020-07-20 04:34:32.067 7f351e329700 5 osd.5 pg_epoch: 667 ... enter Started/Primary/Active/Recovered
-5633> 2020-07-20 04:34:32.067 7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 5 primary objects 0 missing 0
-5632> 2020-07-20 04:34:32.067 7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 3 objects -1 missing 1
-5631> 2020-07-20 04:34:32.067 7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 6 objects 0 missing 0
This will crash the choose_acting() procedure as it will mistakenly
think that peer 3 should continue to perform asynchronous recovery
(e.g., due to num_objects_missing = 1) in contrast to fully
backfill-recovered.
While I did not dig into the real cause, there are a couple of
possible explanations of how num_objects can be off. I think that
if a roll forward or log replay could delete something twice, maybe
there would be an undercount. Or maybe something as simple as a
corruption.
Since _update_calc_stats() is going to fix num_objects_missing
for that peer anyway, let's make sure it always starts with a
clean state.
Fixes: https://tracker.ceph.com/issues/46705
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
10eff2567971ca57b1e821f704de490add021c8e )
Conflicts:
src/osd/PeeringState.cc
- file does not exist in nautilus; made the change manually in
src/osd/PG.cc instead
Benoît Knecht [Tue, 14 Jul 2020 11:50:28 +0000 (13:50 +0200)]
doc/rados: Fix osd_scrub_during_recovery default value
Since 8dca17c, `osd_scrub_during_recovery` defaults to `false`, but the
documentation was still stating that its default value is `true`.
Fixes: https://tracker.ceph.com/issues/46531
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
(cherry picked from commit
535b103d1848f8b5322af0815e1bf163267d7f2a )
Kefu Chai [Mon, 6 Jul 2020 11:16:00 +0000 (19:16 +0800)]
pybind/mgr/balancer: use "==" and "!=" for comparing str
we cannot assume that two objects with the same value share the same
identity in Python.
also silences warnings like:
balancer/module.py:473: SyntaxWarning: "is" with a literal. Did you mean "=="?
if pool_ids is '':
Fixes: https://tracker.ceph.com/issues/46406
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit
65dc464977b4c2ea5ab19b8d9d0904ab31f08b94 )
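The difference between equality and identity that the commit above relies on can be seen in a short example. `no_pools_selected` is a hypothetical helper written for this sketch, not the balancer module's code.

```python
a = "pool"
b = "".join(["po", "ol"])   # equal value, but built at run time

same_value = (a == b)       # compares contents
same_object = (a is b)      # compares identity; an interpreter implementation detail


def no_pools_selected(pool_ids):
    """Compare by value, which is what the check actually means."""
    return pool_ids == ""
```

Only the `==` comparison is guaranteed by the language; whether two equal strings happen to be the same object depends on interning and should not be relied on.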
xie xingguo [Tue, 16 Jun 2020 02:08:32 +0000 (10:08 +0800)]
common, osd: add sanity checks around osd_scrub_max_preemptions
to limit maximum preempt_divisor we can use when backing off the
chunky-scrub range on preempting.
Otherwise large osd_scrub_max_preemptions values (i.e., >= 32)
would cause preempt_divisor overflow, hence the dreaded
“divide by zero error”.
Fixes: https://tracker.ceph.com/issues/46024
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit
ae05de3e9b2e9868216e5168e50dfcb5074684cb )
Conflicts:
src/osd/PG.cc
- git got confused about where the change was/is
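The overflow described in the commit above can be reproduced with a small model. The sketch assumes that the divisor doubles on every preemption and is stored in a 32-bit unsigned integer; both are assumptions for illustration, and the clamp limit of 30 is likewise illustrative rather than taken from the source.

```python
def divisor_after(preemptions, width=32):
    """Model a fixed-width divisor that doubles once per preemption."""
    mask = (1 << width) - 1
    divisor = 1
    for _ in range(preemptions):
        divisor = (divisor * 2) & mask   # wrap like a fixed-width integer
    return divisor


def clamp_preemptions(value, limit=30):
    """Keep the configured maximum inside a range that cannot wrap to zero."""
    return max(0, min(value, limit))
```

Under these assumptions the divisor is still non-zero after 31 doublings and wraps to zero on the 32nd, which is where a division by it would fail.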
Jason Dillaman [Mon, 14 Sep 2020 12:58:52 +0000 (08:58 -0400)]
qa/workunits/mon: fixed excessively large pool PG count
Fixes: https://tracker.ceph.com/issues/47405
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit
2fa9442dcc7a0448ab7e3588f82f93ca2e55d686 )
Ilya Dryomov [Mon, 7 Sep 2020 14:51:22 +0000 (16:51 +0200)]
qa: add test for krbd symlinks created by udev
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit
7ccd2c0dcee175e4c5a03985f43e9259a7e4dbd4 )
Ilya Dryomov [Mon, 7 Sep 2020 16:39:22 +0000 (18:39 +0200)]
rbd: include RADOS namespace in krbd symlinks
Fixes: https://tracker.ceph.com/issues/40247
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit
0b5c11ff30dbb79690e47d5285f197f677e11bf7 )
yuliyang_yewu [Wed, 22 Jul 2020 02:05:17 +0000 (10:05 +0800)]
rgw: replace '+' with "%20" in canonical query string for s3
v4 auth.
fix https://tracker.ceph.com/issues/45983
Signed-off-by: yuliyang_yewu <yuliyang_yewu@cmss.chinamobile.com>
(cherry picked from commit
9002be34aa8524816708db4f3429bfe8634b776a )
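The encoding rule behind the commit above can be sketched with the standard library. `canonical_query_value` is a hypothetical helper, and the decode-then-encode approach is an assumption about one reasonable way to normalise the value, not a description of the RGW code.

```python
from urllib.parse import quote, unquote_plus


def canonical_query_value(raw_value):
    """Normalise a query value so that a space is always written as %20."""
    decoded = unquote_plus(raw_value)        # '+' and '%20' both decode to a space
    # Leave only the unreserved characters unescaped.
    return quote(decoded, safe="-_.~")


with_plus = canonical_query_value("my+file.txt")
with_pct = canonical_query_value("my%20file.txt")
```

Both spellings of the same request produce an identical canonical string, so client and server compute the same signature.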
zhang Shaowen [Fri, 18 Oct 2019 01:17:32 +0000 (09:17 +0800)]
rgw: transition days can be zero in transition check
Signed-off-by: zhang Shaowen <zhangshaowen@cmss.chinamobile.com>
(cherry picked from commit
191fc25a97cb748f1105a463e9772194ba724d97 )
zhang Shaowen [Sat, 12 Oct 2019 09:59:23 +0000 (17:59 +0800)]
rgw: Expiration days can't be zero and transition days can be zero
Signed-off-by: zhang Shaowen <zhangshaowen@cmss.chinamobile.com>
(cherry picked from commit
b471fd07aa7d9ca7868572688baf89ca5a295e6f )
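The rule stated in the two commits above fits in a few lines. `validate_lifecycle_days` and its error strings are hypothetical and only express the constraint from the commit titles.

```python
def validate_lifecycle_days(expiration_days=None, transition_days=None):
    """Return a list of problems; an empty list means the values are acceptable."""
    errors = []
    if expiration_days is not None and expiration_days <= 0:
        errors.append("expiration days must be a positive integer")
    if transition_days is not None and transition_days < 0:
        errors.append("transition days must not be negative")
    return errors
```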
Casey Bodley [Fri, 29 May 2020 16:31:16 +0000 (12:31 -0400)]
rgw: fix shutdown crash in RGWAsyncReadMDLogEntries
RGWAsyncReadMDLogEntries must not store pointers into coroutine memory,
because it's not guaranteed to outlive our call. store these by-value
instead, and have RGWReadMDLogEntriesCR::request_complete() copy/move
them back on completion
Fixes: https://tracker.ceph.com/issues/45771
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
13bf06dbe961132ca99f470ac026674e45fecc38 )
Conflicts:
src/rgw/rgw_sync.cc
- difference in RGWAsyncReadMDLogEntries() argument list: in nautilus,
"rgw::sal::RGWRadosStore *_store" becomes just plain "RGWRados
*_store"
Casey Bodley [Fri, 10 Jul 2020 16:38:06 +0000 (12:38 -0400)]
cls/rgw: preserve olh entry's name on last unlink
When rgw_bucket_unlink_instance removes the last instance of a name, it
also clears the value of rgw_bucket_olh_entry.key. However, bucket index
resharding uses this key when choosing its shard placement, so an empty
key causes all of these olh entries to be misplaced in shard 0. After
reshard, all of the olh recovery/cleanup logic would be sent to the
correct shard, and these misplaced olh entries would never be cleaned
up.
Preserving the key's name on last unlink allows the olh entry to be
resharded correctly and cleaned up normally.
Fixes: https://tracker.ceph.com/issues/46456
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
acf8f3cad9f55e34c703fdaef684853a3fb3b369 )
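Why an empty key sends every entry to the same shard can be shown with a toy placement function. `shard_for` uses CRC32 purely as a stand-in; the hash RGW actually uses is not shown in the source and is not implied here.

```python
import zlib


def shard_for(key, num_shards):
    """Hypothetical placement: hash of the key modulo the shard count."""
    return zlib.crc32(key.encode()) % num_shards


cleared = shard_for("", 11)                 # key wiped on last unlink
preserved = shard_for("photos/cat.jpg", 11) # key kept, placement follows the name
```

With this stand-in hash the empty key always maps to shard 0, while a preserved name is placed the same way as every other entry for that object.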
Soumya Koduri [Tue, 16 Jun 2020 12:40:08 +0000 (18:10 +0530)]
rgw: Empty reqs_change_state queue before unregistered_reqs
In RGWHTTPManager::manage_pending_request(), before unregistering
or unlinking the http requests, empty the reqs_change_state list
to avoid use after free.
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit
b210437e5f28d53e770dd6938ce3c4be443da055 )
Casey Bodley [Tue, 8 Sep 2020 19:27:55 +0000 (15:27 -0400)]
rgw: ObjectCache::put() clears stale objv
if an existing object is cached with an object version, but it's
mutated without updating that version number, clear the OBJV flag so
that later cache reads asking for an object version result in a miss and
re-read the version from the osd
Fixes: https://tracker.ceph.com/issues/47306
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
cf531cdd5e655a033f47e04b7dda81435a90271d )
Casey Bodley [Thu, 6 Aug 2020 16:57:13 +0000 (12:57 -0400)]
rgw: system object cache tracks version over increments
instead of checking write_version before the write (which doesn't take
cls_version_inc() into account), check read_version after apply_write()
has been called. only cache the result if we got a read_version != 0
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
ad326ffc3fba865d8c426de4be0193172b7688b7 )
Conflicts:
src/rgw/services/svc_sys_obj_cache.cc
- in nautilus, the functions do not take an optional_yield argument
Casey Bodley [Tue, 4 Aug 2020 19:03:35 +0000 (15:03 -0400)]
rgw: RGWObjVersionTracker tracks read version over increments
when no write_version is given, cls_version_inc() is used to increment
the version so other writers can use cls_version_check() to detect races
however, apply_write() will clear its cached read_version, which means
that later writes can no longer use cls_version_check() to detect other
racing writers
in cases where cls_version_inc() is used AND we know the previous version,
we can increment the cached read_version and preserve the ability to use
cls_version_check(). we know the previous version if we provided a valid
read_version to cls_version_check() and it succeeded
Fixes: https://tracker.ceph.com/issues/46849
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit
298d721c1d3d8d430bfbc9e0ef5db7b44cdbe017 )
Conflicts:
src/rgw/rgw_rados.cc
- nautilus does not have "RGWObjState::RGWObjState()"
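The behaviour described in the commit above can be modelled with a small class. `VersionTracker` is a simplified sketch: the attribute and method names echo the commit message, but the logic, including what happens when an explicit write version is given, is an assumption rather than the RGWObjVersionTracker implementation.

```python
class VersionTracker:
    """Toy model of a cached read version that survives increments."""

    def __init__(self, read_version=0):
        self.read_version = read_version   # 0 means "version unknown"

    def apply_write(self, write_version=None, check_succeeded=False):
        if write_version is not None:
            # Assumed: an explicit write version becomes the known version.
            self.read_version = write_version
        elif check_succeeded and self.read_version != 0:
            # The version was incremented from a known value: follow it.
            self.read_version += 1
        else:
            # Previous version unknown: later writes cannot do a version check.
            self.read_version = 0


tracker = VersionTracker(read_version=5)
tracker.apply_write(check_succeeded=True)
```

After the write the tracker holds 6, so the next write can still use a version check to detect a racing writer.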
Or Friedmann [Wed, 24 Jun 2020 12:55:20 +0000 (15:55 +0300)]
rgw: add negative cache to the system object
add negative cache to the system object
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
Fixes: https://tracker.ceph.com/issues/45816
(cherry picked from commit
0900bd8cf90babd6fafcb398854a4ddb071a27ee )
Conflicts:
src/rgw/services/svc_sys_obj_cache.cc
- RGWSI_SysObj_Core::raw_stat() takes different arguments in nautilus
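A negative cache, as named in the commit above, remembers that a lookup found nothing so that repeated requests for a missing object do not reach the backend. The class below is a generic sketch with hypothetical names; it does not reflect the system object cache's real interface.

```python
class NegativeCachingStore:
    """Hypothetical read-through cache that also remembers misses."""

    _MISSING = object()

    def __init__(self, backend):
        self.backend = backend      # any dict standing in for the OSD
        self.cache = {}
        self.backend_reads = 0

    def get(self, name):
        if name not in self.cache:
            self.backend_reads += 1
            self.cache[name] = self.backend.get(name, self._MISSING)
        value = self.cache[name]
        return None if value is self._MISSING else value

    def put(self, name, value):
        self.backend[name] = value
        self.cache[name] = value    # a write replaces any cached miss
```

The important detail is the `put` path: a cached miss must be replaced when the object is created, otherwise readers would keep seeing "not found".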
Yuri Weinstein [Tue, 29 Sep 2020 15:16:08 +0000 (08:16 -0700)]
Merge pull request #37181 from callithea/wip-47411-nautilus
nautilus: mgr: don't update pending service map epoch on receiving map from mon
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Brad Hubbard [Mon, 28 Sep 2020 23:42:19 +0000 (09:42 +1000)]
nautilus: qa/ceph-ansible: Bump required ansible to 2.9
https://github.com/ceph/ceph-ansible/commit/fd0b9491b60303a5d27f79226f30d51c438de2ff
requires us to move to 2.9.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Yuri Weinstein [Mon, 28 Sep 2020 21:02:24 +0000 (14:02 -0700)]
Merge pull request #37379 from shyukri/wip-46939-nautilus
nautilus: qa/tasks/ragweed: always set ragweed_repo
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Yuri Weinstein [Mon, 28 Sep 2020 21:01:17 +0000 (14:01 -0700)]
Merge pull request #37280 from ceph/nautilus-bucket-list-perf
nautilus: mgr/dashboard: fix perf. issue when listing large amounts of buckets
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Yuri Weinstein [Mon, 28 Sep 2020 21:00:25 +0000 (14:00 -0700)]
Merge pull request #36909 from ifed01/wip-ifed-fix-bluefs-sel-nautilus
nautilus: bluestore/bluefs: make accounting resilient to unlock()
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Yuri Weinstein [Mon, 28 Sep 2020 16:31:04 +0000 (09:31 -0700)]
Merge pull request #37378 from shyukri/wip-47244-nautilus
nautilus: rgw: Add bucket name to bucket stats error logging
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Matthew Oliver [Thu, 9 Jul 2020 06:13:05 +0000 (06:13 +0000)]
rgw: Swift API anonymous access should 401
There was a previous patch to fix this, but it turns out that it only fixed it
for the Swift V1 auth. And it actually broke keystone because it didn't
take into account the idiosyncrasies of multi-tenancy, which resulted in
incorrect behaviour for keystone. Worse, because it didn't take
tenants properly into account, keystone ACLs were broken.
This patch reworks and simplifies the original patch to work for both
auths. It even extends the ThirdPartyAccountApplier to check for an ANON
user and properly scope it to a tenant.
Fixes: https://tracker.ceph.com/issues/46295
Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit
67081098dc2dddd80d52d5acd166e68954cae618 )
Conflicts:
src/rgw/rgw_swift_auth.h
- only need to modify the user related code to rgw_user construct
Yuri Weinstein [Mon, 28 Sep 2020 14:51:30 +0000 (07:51 -0700)]
Merge pull request #36704 from ShyamsundarR/wip-47013-nautilus
nautilus: mon: store mon updates in ceph context for future MonMap instantiation
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Laura Paduano [Mon, 28 Sep 2020 13:28:38 +0000 (15:28 +0200)]
Merge pull request #37306 from p-se/wip-47546-nautilus
nautilus: mgr/dashboard: Fix many-to-many issue in host-details Grafana dashboard
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Yuri Weinstein [Fri, 25 Sep 2020 15:46:31 +0000 (08:46 -0700)]
Merge pull request #37307 from rhcs-dashboard/wip-47303-nautilus
nautilus: mgr/dashboard: REST API returns 500 when no Content-Type is specified
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Ilya Dryomov [Fri, 25 Sep 2020 15:36:26 +0000 (17:36 +0200)]
Merge pull request #37407 from idryomov/wip-krbd-read-only-override-nautilus
nautilus: rbd: make common options override krbd-specific options
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Marc Gariepy [Tue, 22 Sep 2020 13:01:33 +0000 (09:01 -0400)]
ceph-volume: remove unneeded call to get_devices()
there is no need to probe the device to generate the argparse help
message.
also removing the test on the function as it's not there anymore.
Signed-off-by: Marc Gariepy <gariepy.marc@gmail.com>
Fixes: https://tracker.ceph.com/issues/47502
(cherry picked from commit
5c6f66166a7afad87627032cafdc5c4f11f94eac )
Ilya Dryomov [Fri, 25 Sep 2020 07:55:04 +0000 (09:55 +0200)]
rbd: make common options override krbd-specific options
ceph-csi has added support for passing custom map and unmap options via
mapOptions and unmapOptions storage class parameters. However, it also
uses --read-only for implementing ROX (ReadOnlyMany) PVs. If the user
supplies "mapOptions: rw", they will get around the intended read-only
restriction (at least on the block device).
ceph-csi could be patched to use "-o ro", but it actually makes sense
for common options to win over device type-specific equivalents.
Fixes: https://tracker.ceph.com/issues/47625
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit
a107c47360ecdb8c09768ca9eab2341100245711 )
Conflicts:
src/tools/rbd/action/Kernel.cc [ snapshot quiesce support and commit
34f539d8af33 ("rbd: delay parsing of default kernel map options") not in nautilus ]
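The precedence rule from the commit above, where a common option such as --read-only wins over a device-specific equivalent, can be sketched as follows. `effective_map_options` is a hypothetical helper operating on a comma-separated option string; it is not the rbd tool's code.

```python
def effective_map_options(map_options, read_only=False):
    """Merge krbd-style map options with the common read-only flag."""
    opts = [o for o in map_options.split(",") if o]
    if read_only:
        # The common flag wins: drop any conflicting rw/ro and force ro.
        opts = [o for o in opts if o not in ("rw", "ro")]
        opts.append("ro")
    return ",".join(opts)


forced = effective_map_options("rw,noshare", read_only=True)
```

A user-supplied "rw" can no longer undo a read-only mapping requested by the caller, which is the case the commit message describes for ROX volumes.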
Alfonso Martínez [Mon, 21 Sep 2020 13:59:38 +0000 (15:59 +0200)]
nautilus: mgr/dashboard: fix perf. issue when listing large amounts of buckets
NOTE: Due to base code divergence between master (pacific) & nautilus,
this is a dedicated fix for nautilus.
Fixes: https://tracker.ceph.com/issues/47618
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Yuri Weinstein [Thu, 24 Sep 2020 22:25:55 +0000 (15:25 -0700)]
Merge pull request #37254 from bstillwell/wip-47425-nautilus
nautilus: compressor: Add a config option to specify Zstd compression level
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuri Weinstein [Thu, 24 Sep 2020 22:25:32 +0000 (15:25 -0700)]
Merge pull request #37269 from badone/wip-nautilus-enable-mgr-client-debug
nautilus: tests/qa: Enable debug_client for mgr tests
Reviewed-by: Josh Durgin <jdurgin@redhat.com>