git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Wed, 4 Dec 2019 14:52:28 +0000 (08:52 -0600)]
mon: fix mon_sync_max_payload_size type
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 0de99152cc9d25e4c473dc7ea03445ce7db3bb5a)
Sage Weil [Mon, 2 Dec 2019 13:43:54 +0000 (07:43 -0600)]
mon: cap keys in mon_sync messages
The previous cap was set at 1 MB. However, a user was experiencing mon
timeouts while syncing the purged_snap_epoch * keys, which are ~20 bytes
each. Reducing the max payload to 64K resolved the problem; that works out
to (very!) roughly 1500 keys per message. Set our limit a bit higher than
that since we just made this quite a bit more efficient. Most of the time
the keys are larger than 20 bytes and we wouldn't hit the key limit, but
having one ensures that we won't burn too much CPU in one go when we do
have lots of these little keys.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4d6c7e349b6749a45ef3ad239113e191b2c1d96a)
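A hedged sketch of the dual cap described above -- close a chunk on either a byte budget or a key count, so a flood of tiny keys cannot make a single message arbitrarily expensive. This is illustrative Python with made-up names, not the monitor's actual C++ sync path.
```
def chunk_keys(items, max_bytes=64 * 1024, max_keys=2000):
    """Yield lists of (key, value) pairs, capped by total size and key count."""
    chunk, size = [], 0
    for key, value in items:
        entry_size = len(key) + len(value)
        # Close the current chunk if adding this entry would exceed either cap.
        if chunk and (size + entry_size > max_bytes or len(chunk) >= max_keys):
            yield chunk
            chunk, size = [], 0
        chunk.append((key, value))
        size += entry_size
    if chunk:
        yield chunk


# ~28-byte keys with empty values: the key-count cap, not the byte cap, kicks in.
tiny = ((("purged_snap_%016d" % i).encode(), b"") for i in range(5000))
print([len(c) for c in chunk_keys(tiny)])  # [2000, 2000, 1000]
```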
Sage Weil [Tue, 12 Nov 2019 20:51:41 +0000 (14:51 -0600)]
mon/MonitorDBStore: improve get_chunk_tx limits
The old version was horribly inefficient in that it would reencode the
transaction on every iteration.
Instead, estimate the size as if we had added the item and stop if it looks
like it will go over. This isn't super precise, but it's close enough, since the
limits are approximate.
Drop the single-use helper since it only makes the code harder to
follow.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 83b2ada9e935ae764be5649acee6ee02e4cb935f)
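To illustrate the approach described above -- keep a running size estimate and stop before an item would push the transaction over the limit, instead of re-encoding on every iteration -- here is a small Python sketch. The class and the per-op overhead constant are invented for the example and do not mirror MonitorDBStore's real Transaction.
```
class TxSketch:
    """Toy transaction that tracks an approximate encoded size incrementally."""

    def __init__(self):
        self.ops = []
        self.approx_size = 0

    def estimate_put(self, prefix, key, value):
        # Rough per-op estimate: payload bytes plus a little framing overhead.
        return len(prefix) + len(key) + len(value) + 16

    def put(self, prefix, key, value):
        self.ops.append(("put", prefix, key, value))
        self.approx_size += self.estimate_put(prefix, key, value)


def build_chunk(items, max_bytes=1024):
    """Add items until the *estimated* size would exceed max_bytes."""
    tx = TxSketch()
    for prefix, key, value in items:
        if tx.ops and tx.approx_size + tx.estimate_put(prefix, key, value) > max_bytes:
            break  # not exact, but the limits are approximate by design
        tx.put(prefix, key, value)
    return tx


tx = build_chunk(("paxos", "key_%d" % i, "x" * 50) for i in range(100))
print(len(tx.ops), tx.approx_size)
```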
Sage Weil [Tue, 12 Nov 2019 20:46:13 +0000 (14:46 -0600)]
mon/MonitorDBStore: better size estimation for Transaction
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5103b2563db771eb2f3d4a37f51c8eb40b4e188f)
Sage Weil [Tue, 24 Sep 2019 13:30:17 +0000 (08:30 -0500)]
mon/MonitorDBStore: use const string& for args throughout
const bufferlist& too
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit d4fdd5a1e667338d6bde0ca8d66e8432ddfddfdd)
Sage Weil [Mon, 23 Sep 2019 15:53:27 +0000 (10:53 -0500)]
mon/MonitorDBStore: add erase_range() op to transaction
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6900420d4e510e6b9e798384f70a2cb9631964db)
Lenz Grimmer [Thu, 5 Mar 2020 12:03:33 +0000 (13:03 +0100)]
Merge pull request #33697 from rhcs-dashboard/wip-44372-nautilus
nautilus: mgr/dashboard: Fixes rbd image 'purge trash' button & modal text
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Nathan Cutler [Thu, 5 Mar 2020 08:53:43 +0000 (09:53 +0100)]
Merge pull request #33688 from badone/wip-nautilus-fix-run-tox-pythonpath-failure
nautilus: mgr/run-tox-tests: Fix issue with PYTHONPATH
Reviewed-by: Kefu Chai <kchai@redhat.com>
Brad Hubbard [Tue, 3 Mar 2020 05:58:35 +0000 (15:58 +1000)]
mgr/run-tox-tests: Fix issue with PYTHONPATH
Something changed recently on Bionic which caused tox to fail when
PYTHONPATH is a relative path. For some reason the path is mangled by
the time it gets to pytest so we need to ensure we are using an absolute
path. This seems to be nautilus specific, at least ATM.
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
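A small sketch of the kind of fix described above: make every PYTHONPATH entry absolute before the test runner sees it. This is illustrative Python, not the actual run-tox script.
```
import os


def absolutize_pythonpath(env=None):
    """Return a copy of the environment with each PYTHONPATH entry made absolute."""
    env = dict(env if env is not None else os.environ)
    raw = env.get("PYTHONPATH", "")
    if raw:
        parts = [os.path.abspath(p) for p in raw.split(os.pathsep) if p]
        env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


# A relative entry such as "./src/pybind" becomes an absolute path.
print(absolutize_pythonpath({"PYTHONPATH": "./src/pybind"})["PYTHONPATH"])
```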
Yuri Weinstein [Wed, 4 Mar 2020 23:27:40 +0000 (15:27 -0800)]
Merge pull request #33533 from trociny/wip-44263-nautilus
nautilus: rbd-mirror: improve detection of blacklisted state
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Sébastien Han [Wed, 4 Mar 2020 13:25:23 +0000 (14:25 +0100)]
Merge pull request #33428 from leseb/bkp-33371
nautilus: ceph-volume: silence 'ceph-bluestore-tool' failures
Yuri Weinstein [Tue, 3 Mar 2020 19:12:45 +0000 (11:12 -0800)]
Merge pull request #33340 from tpsilva/wip-44129-nautilus
nautilus: rgw: make max_connections configurable in beast
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:11:42 +0000 (11:11 -0800)]
Merge pull request #33274 from smithfarm/wip-43852-nautilus
nautilus: test: Fix race with osd restart and doing a scrub
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:09:23 +0000 (11:09 -0800)]
Merge pull request #33271 from smithfarm/wip-43999-nautilus
nautilus: rgw: MultipartObjectProcessor supports stripe size > chunk size
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:08:49 +0000 (11:08 -0800)]
Merge pull request #33267 from smithfarm/wip-43855-nautilus
nautilus: rgw: fix SignatureDoesNotMatch when use ipv6 address in s3 client
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:08:20 +0000 (11:08 -0800)]
Merge pull request #33266 from smithfarm/wip-43851-nautilus
nautilus: rgw: Fix dynamic resharding not working for empty zonegroup in period
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:07:38 +0000 (11:07 -0800)]
Merge pull request #33265 from smithfarm/wip-43848-nautilus
nautilus: rgw: Fix upload part copy range able to get almost any string
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Tue, 3 Mar 2020 19:05:33 +0000 (11:05 -0800)]
Merge pull request #33270 from smithfarm/wip-43923-nautilus
nautilus: rgw multisite: enforce spawn window for incremental data sync
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
anurag [Thu, 6 Feb 2020 08:27:46 +0000 (13:57 +0530)]
mgr/dashboard: Fixes rbd image 'purge trash' button & modal text
Fixes: https://tracker.ceph.com/issues/43801
Signed-off-by: anurag <abandhu@redhat.com>
(cherry picked from commit 803a0e2599f76a0a5894ba857bdeb57b6641a5ad)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-trash-list/rbd-trash-list.component.html
- nautilus uses button class="btn btn-sm btn-default btn-label" instead of class="btn btn-light"
Yuri Weinstein [Mon, 2 Mar 2020 23:30:01 +0000 (15:30 -0800)]
Merge pull request #33268 from smithfarm/wip-43878-nautilus
nautilus: rgw: when you abort a multipart upload request, the quota may be not updated
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Yuri Weinstein [Mon, 2 Mar 2020 23:28:35 +0000 (15:28 -0800)]
Merge pull request #33273 from smithfarm/wip-44038-nautilus
nautilus: rgw: fix rgw crash when duration is invalid in sts request
Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>
Jenkins Build Slave User [Mon, 2 Mar 2020 17:49:20 +0000 (17:49 +0000)]
14.2.8
Patrick Donnelly [Thu, 27 Feb 2020 20:18:40 +0000 (12:18 -0800)]
Merge PR #33569 into nautilus
* refs/pull/33569/head:
mgr/volumes: unregister job upon async threads exception
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Venky Shankar [Wed, 26 Feb 2020 04:52:37 +0000 (23:52 -0500)]
mgr/volumes: unregister job upon async threads exception
If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.
Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.
Fixes: http://tracker.ceph.com/issues/44315
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 46476ef2e290bd15af2fec2410cb4f3f86b27cd2)
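The failure mode above is the classic missing-cleanup pattern: the job stays registered after an exception, so later scans skip it. A hedged Python sketch with hypothetical names -- not the mgr/volumes code itself -- of "always unregister, even on error":
```
import threading

_registered = set()
_lock = threading.Lock()


def register(job):
    with _lock:
        _registered.add(job)


def unregister(job):
    with _lock:
        _registered.discard(job)


def run_async_job(job, work):
    """Run one async job, always unregistering it, even if work() raises."""
    register(job)
    try:
        work(job)
    except Exception:
        # A temporary failure is swallowed here; because the job is
        # unregistered below, the next scan will pick it up again.
        pass
    finally:
        unregister(job)


def flaky(job):
    raise RuntimeError("transient failure while purging %s" % job)


run_async_job("trash/subvol_1", flaky)
print(_registered)  # set(): the entry can be retried on the next scan
```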
Yuri Weinstein [Tue, 25 Feb 2020 19:05:39 +0000 (11:05 -0800)]
Merge pull request #33346 from yaarith/backport-nautilus-pr-32903
nautilus: mgr/devicehealth: fix telemetry stops sending device reports after 48 hours
Reviewed-by: Sage Weil <sage@redhat.com>
Mykola Golub [Wed, 19 Feb 2020 10:17:08 +0000 (10:17 +0000)]
rbd-mirror: improve detection of blacklisted state
Fixes: https://tracker.ceph.com/issues/44159
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit cfb4f423a42d0265cb78ebb4eb8cc6924d6f45fa)
Conflicts:
src/tools/rbd_mirror/InstanceReplayer.h (ceph::mutex vs Mutex)
src/tools/rbd_mirror/NamespaceReplayer.cc (does not exist)
src/tools/rbd_mirror/PoolReplayer.cc (code from NamespaceReplayer is here)
src/test/rbd_mirror/test_mock_PoolReplayer.cc (according to the PoolReplayer.cc changes)
Patrick Donnelly [Tue, 25 Feb 2020 04:26:30 +0000 (20:26 -0800)]
Merge PR #33526 into nautilus
* refs/pull/33526/head:
test: verify purge queue w/ large number of subvolumes
test: pass timeout argument to mount::wait_for_dir_empty()
mgr/volumes: access volume in lockless mode when fetching async job
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Venky Shankar [Wed, 19 Feb 2020 14:19:31 +0000 (09:19 -0500)]
test: verify purge queue w/ large number of subvolumes
Fixes: http://tracker.ceph.com/issues/44282
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 92b20089369b0d549c8c337a60bb93ae24c7b66a)
Venky Shankar [Mon, 24 Feb 2020 07:27:25 +0000 (02:27 -0500)]
test: pass timeout argument to mount::wait_for_dir_empty()
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 5ec09a228ca2b726da3ce79fc01e07911051326a)
Venky Shankar [Wed, 19 Feb 2020 12:31:40 +0000 (07:31 -0500)]
mgr/volumes: access volume in lockless mode when fetching async job
Saw a deadlock when deleting a lot of subvolumes -- purge threads were
stuck acquiring the global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) while a purge thread is just about to scan the trash
directory for entries.
For the fix, purge threads fetch entries by accessing the volume
in lockless mode. This is safe from a functionality point of view, as
the rename and directory scan are handled correctly by the filesystem.
Worst case, the purge thread picks up the trash entry on the next
scan, so a stale trash entry is never left behind.
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 808a1ce1f96f6dfd3472156ce5087372da4c1314)
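A hedged sketch of the lockless scan described above: list the trash directory without taking any global volume lock, and if a concurrent rename makes an entry vanish between listing and use, simply return nothing and let the next scan retry. Names and layout are invented for the example.
```
import os
import tempfile


def next_trash_entry(trash_dir):
    """Pick one trash entry without holding a global lock; worst case we
    return None and a later scan picks the entry up -- never a stale entry."""
    try:
        entries = sorted(os.listdir(trash_dir))
    except FileNotFoundError:
        return None
    for name in entries:
        path = os.path.join(trash_dir, name)
        if os.path.exists(path):  # may have been renamed away concurrently
            return path
    return None


trash = tempfile.mkdtemp()
open(os.path.join(trash, "subvol_1"), "w").close()
print(next_trash_entry(trash))
```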
Patrick Donnelly [Mon, 24 Feb 2020 15:28:23 +0000 (07:28 -0800)]
Merge PR #33498 into nautilus
* refs/pull/33498/head:
mgr: drop reference to msg on return
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Patrick Donnelly [Sun, 23 Feb 2020 23:27:34 +0000 (15:27 -0800)]
mgr: drop reference to msg on return
Caused by backport commit cb48be5a69fe6482cbe3bff1b53ba090e077de0d, which
did not account for the explicit drop of the message reference (present
only in Nautilus and earlier).
Fixes: https://tracker.ceph.com/issues/44245
Fixes: cb48be5a69fe6482cbe3bff1b53ba090e077de0d
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Fri, 21 Feb 2020 22:13:27 +0000 (14:13 -0800)]
Merge pull request #33470 from neha-ojha/wip-mgsr2-order-nautilus
nautilus: qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering
Neha Ojha [Thu, 20 Feb 2020 02:11:26 +0000 (18:11 -0800)]
qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering
This was done for octopus in 8283ea9f587fb1136b4e9fa9d8a0435852fb948a, but not for nautilus.
Signed-off-by: Neha Ojha <nojha@redhat.com>
Sébastien Han [Wed, 19 Feb 2020 15:03:02 +0000 (16:03 +0100)]
ceph-volume: silence 'ceph-bluestore-tool' failures
If 'ceph-bluestore-tool' fails on a device, the json output of the list
command will be messed up. Ignoring stderr of that command fixes this.
Signed-off-by: Sébastien Han <seb@redhat.com>
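A minimal sketch of the approach above: run a tool whose stdout is JSON with stderr discarded, so per-device failures cannot corrupt the parsed output. Illustrative Python, not ceph-volume's actual code.
```
import json
import subprocess


def run_json(cmd):
    """Run cmd, ignore its stderr, and parse its stdout as JSON."""
    out = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # errors printed per-device are dropped
        check=False,
    ).stdout
    return json.loads(out or b"{}")


# The child writes noise to stderr, yet the JSON on stdout stays parseable.
child = "import sys; sys.stderr.write('boom\\n'); print('{\"ok\": true}')"
print(run_json(["python3", "-c", child]))
```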
Yuri Weinstein [Tue, 18 Feb 2020 21:26:18 +0000 (13:26 -0800)]
Merge pull request #33378 from badone/wip-badone-testing
nautilus: qa/ceph-ansible: ansible-version and ceph_ansible
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Brad Hubbard [Wed, 5 Feb 2020 06:46:29 +0000 (16:46 +1000)]
nautilus: qa/ceph-ansible: ansible-version and ceph_ansible
Upgrade to 2.8.1 and stable-4.0 respectively
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
Yaarit Hatuka [Mon, 27 Jan 2020 13:57:55 +0000 (08:57 -0500)]
mgr/devicehealth: fix telemetry stops sending device reports after 48 hours
The telemetry module fetches device metrics that were scraped in the last
"telemetry interval" * 2 (48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample, but because it fetches in ascending order (from oldest to
newest) it was breaking on the very first one it received whenever that
one was older than the interval above. We need to pass min_sample to
get_omap_vals() so it starts fetching from that value.
Fixes: https://tracker.ceph.com/issues/43837
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 5f7e4a980a73e8cacb2c9bde47d822a32fb8c440)
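The bug described above comes down to where iteration starts. A toy Python sketch (a sorted dict standing in for the omap, not the real get_omap_vals() call) of "start from min_sample" rather than "start from the oldest key and break":
```
import bisect
from datetime import datetime, timedelta

# Toy omap: timestamp-keyed device metric samples; keys sort chronologically.
samples = {
    "2020-02-20 00:00:00": {"temp_C": 31},
    "2020-02-24 00:00:00": {"temp_C": 33},
    "2020-02-25 12:00:00": {"temp_C": 34},
}


def get_recent(samples, min_sample):
    """Start iterating at min_sample instead of at the oldest key.

    The buggy pattern walked oldest-first and broke on the first key older
    than min_sample -- always the very first key if any old data exists."""
    keys = sorted(samples)
    start = bisect.bisect_left(keys, min_sample)
    return {k: samples[k] for k in keys[start:]}


min_sample = (datetime(2020, 2, 26) - timedelta(hours=48)).strftime("%Y-%m-%d %H:%M:%S")
print(get_recent(samples, min_sample))  # only the samples from the last 48 hours
```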
Sage Weil [Fri, 4 Oct 2019 20:03:02 +0000 (15:03 -0500)]
mgr/devicehealth: factor _get_device_metrics out of show_device_metrics
Add the min_sample lower-bound argument too
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7be5c1323b3814e2634d5cd66d45cab5a77df680)
Conflicts: had to be backported to enable backporting of
https://github.com/ceph/ceph/pull/32903
Backport tracker: https://tracker.ceph.com/issues/43873
Tiago Pasqualini [Fri, 31 Jan 2020 18:22:19 +0000 (15:22 -0300)]
rgw: make max_connections configurable in beast
Beast frontend currently accepts a hardcoded number of connections
that is defined by boost::asio::socket_base::max_connections. This
commit makes it configurable via a 'max_connections' config option
on rgw frontend.
Fixes: https://tracker.ceph.com/issues/43952
Signed-off-by: Tiago Pasqualini <tiago.pasqualini@canonical.com>
(cherry picked from commit d6dada5bcb356abaef8d9237ceca8f42d4fcfb74)
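For context, the option named in the commit is set on the beast line of the rgw frontend configuration; an illustrative ceph.conf fragment (section name and values are examples only) might look like:
```
[client.rgw.gateway1]
    rgw frontends = beast port=8080 max_connections=1024
```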
Yuri Weinstein [Fri, 14 Feb 2020 17:23:51 +0000 (09:23 -0800)]
Merge pull request #33278 from smithfarm/wip-44085-nautilus
nautilus: ceph-monstore-tool: correct the key for storing mgr_command_descs
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 14 Feb 2020 17:23:24 +0000 (09:23 -0800)]
Merge pull request #33277 from smithfarm/wip-43722-nautilus
nautilus: common/bl: fix the dangling last_p issue.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 14 Feb 2020 17:22:55 +0000 (09:22 -0800)]
Merge pull request #33276 from smithfarm/wip-44082-nautilus
nautilus: qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 14 Feb 2020 17:22:04 +0000 (09:22 -0800)]
Merge pull request #32844 from smithfarm/wip-43239-nautilus
nautilus: mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
Jan Fajerski [Fri, 14 Feb 2020 17:00:46 +0000 (18:00 +0100)]
Merge pull request #33337 from jan--f/wip-44153-nautilus
nautilus: ceph-volume: don't remove vg twice when zapping filestore
Jan Fajerski [Fri, 14 Feb 2020 17:00:20 +0000 (18:00 +0100)]
Merge pull request #33334 from jan--f/wip-44152-nautilus
nautilus: ceph-volume: pass journal_size as Size not string
Jan Fajerski [Fri, 14 Feb 2020 13:10:36 +0000 (14:10 +0100)]
ceph-volume: don't remove vg twice when zapping filestore
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Fixes: https://tracker.ceph.com/issues/44149
(cherry picked from commit bccdf6eafaf851d5092bb99d61edd44cd36d9dd2)
Jan Fajerski [Fri, 14 Feb 2020 11:50:47 +0000 (12:50 +0100)]
ceph-volume: pass journal_size as Size not string
Fixes: https://tracker.ceph.com/issues/44148
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 49f6e6d600aae6310f941c6635408d496b0ff2b9)
Jan Fajerski [Fri, 14 Feb 2020 10:41:40 +0000 (11:41 +0100)]
Merge pull request #33301 from jan--f/wip-43871-nautilus-failed-cp
nautilus: ceph-volume: batch bluestore fix create_lvs call
Jan Fajerski [Fri, 14 Feb 2020 09:45:27 +0000 (10:45 +0100)]
Merge pull request #33297 from jan--f/wip-44135-nautilus
nautilus: ceph-volume: avoid calling zap_lv with a LV-less VG
Jan Fajerski [Tue, 28 Jan 2020 08:25:39 +0000 (09:25 +0100)]
ceph-volume: batch bluestore fix create_lvs call
Fixes: https://tracker.ceph.com/issues/43844
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit df18497bc9eaf1922e5c885e8cc124e439c59364)
Jan Fajerski [Thu, 13 Feb 2020 16:09:44 +0000 (17:09 +0100)]
ceph-volume: avoid calling zap_lv with a LV-less VG
Fixes: https://tracker.ceph.com/issues/44125
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit ad0dea53b8585d8397bf3069b1b39f13b6e0a8ce)
Yuri Weinstein [Thu, 13 Feb 2020 22:25:53 +0000 (14:25 -0800)]
Merge pull request #33151 from shyukri/wip-43877-nautilus
nautilus: rgw: fix one part of the bulk delete(RGWDeleteMultiObj_ObjStore_S3) fails but no error messages
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 22:25:13 +0000 (14:25 -0800)]
Merge pull request #33149 from shyukri/wip-43874-nautilus
nautilus: rgw: maybe coredump when reload operator happened
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 22:21:55 +0000 (14:21 -0800)]
Merge pull request #33008 from smithfarm/wip-43922-nautilus
nautilus: rgw_file: avoid string::front() on empty path
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:15:09 +0000 (12:15 -0800)]
Merge pull request #33152 from shyukri/wip-43879-nautilus
nautilus: mon: Don't put session during feature change
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:14:43 +0000 (12:14 -0800)]
Merge pull request #33095 from k0ste/wip-43979-nautilus
nautilus: mgr/telemetry: check get_metadata return val
Reviewed-by: David Zafman <dzafman@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:12:53 +0000 (12:12 -0800)]
Merge pull request #32908 from smithfarm/wip-43821-nautilus
nautilus: mon/Session: only index osd ids >= 0
Yuri Weinstein [Thu, 13 Feb 2020 20:11:01 +0000 (12:11 -0800)]
Merge pull request #33170 from k0ste/wip-43727-nautilus
nautilus: mgr/pg_autoscaler: calculate pool_pg_target using pool size
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:10:28 +0000 (12:10 -0800)]
Merge pull request #33168 from k0ste/wip-44057-nautilus
nautilus: mgr/telemetry: split entity_name only once (handle ids with dots)
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:09:56 +0000 (12:09 -0800)]
Merge pull request #33157 from shyukri/wip-43924-nautilus
nautilus: mgr/prometheus: report per-pool pg states
Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:09:22 +0000 (12:09 -0800)]
Merge pull request #33155 from shyukri/wip-43916-nautilus
nautilus: mon/ConfigMonitor: only propose if leader
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:08:34 +0000 (12:08 -0800)]
Merge pull request #33147 from shyukri/wip-43989-nautilus
nautilus: osd: Allow 64-char hostname to be added as the "host" in CRUSH
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:07:53 +0000 (12:07 -0800)]
Merge pull request #33142 from shyukri/wip-44000-nautilus
nautilus: mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:07:07 +0000 (12:07 -0800)]
Merge pull request #33082 from k0ste/wip-43974-nautilus
nautilus: mgr/telemetry: anonymizing smartctl report itself
Reviewed-by: Sage Weil <sage@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 20:06:36 +0000 (12:06 -0800)]
Merge pull request #32948 from yaarith/wip-telemetry-serial-nautilus-yh
nautilus: mgr/telemetry: fix device serial number anonymization
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 19:35:30 +0000 (11:35 -0800)]
Merge pull request #33007 from smithfarm/wip-43928-nautilus
nautilus: mon: elector: return after triggering a new election
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 19:31:42 +0000 (11:31 -0800)]
Merge pull request #32931 from smithfarm/wip-43819-nautilus
nautilus: mgr/pg_autoscaler: default to pg_num[_min] = 32
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Yuri Weinstein [Thu, 13 Feb 2020 19:29:36 +0000 (11:29 -0800)]
Merge pull request #32905 from smithfarm/wip-43731-nautilus
nautilus: crush/CrushWrapper: behave with empty weight vector
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Thu, 13 Feb 2020 19:28:54 +0000 (11:28 -0800)]
Merge pull request #32856 from zhengchengyao/nautilus_no_mon_update
nautilus: mon/ConfigMonitor: fix handling of NO_MON_UPDATE settings
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Kefu Chai [Mon, 10 Feb 2020 09:36:04 +0000 (17:36 +0800)]
doc: update mondb recovery script
to note that we also need to add mgr's key to monitor's keyring
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 75f4765f2ffe795dba85540b8aa1675ba9de28e4)
Kefu Chai [Mon, 10 Feb 2020 09:33:26 +0000 (17:33 +0800)]
ceph-monstore-tool: correct the key for storing mgr_command_descs
Fixes: https://tracker.ceph.com/issues/43582
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a5bfeca64f3204142fda5320b4fd403df4b5f532)
Kefu Chai [Mon, 10 Feb 2020 08:27:22 +0000 (16:27 +0800)]
ceph-monstore-tool: rename mon-ids in initial monmap
When ceph-mon starts, it checks whether it is listed in the monmap; if
not, it complains
```
no public_addr or public_network specified, and mon.a not present in
monmap or ceph.conf.
```
and then bails out. Normally the monitor will rename itself in the
monmap when performing "mkfs", but in our case we are merely using the
"mkfs" monmap to pass in the monmap built by ceph-monstore-tool, and we
don't actually go through the "mkfs" process, so ceph-mon won't do the
rename when booting up.
With this change, the user is allowed to specify the mon-ids on the
command line when rebuilding the mon db; the default mon-ids are
a,b,c,... if not specified.
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4b3df5a850db054928f9fcc6ef0a160a05a2ffa9)
Radoslaw Zarzynski [Thu, 16 Jan 2020 12:17:41 +0000 (13:17 +0100)]
common/bl: fix the dangling last_p issue.
Fixes: https://tracker.ceph.com/issues/43646
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 8198332f3afb5de748dfa6e3349fefbf7e9ed137)
Conflicts:
src/test/bufferlist.cc
Sage Weil [Sun, 9 Feb 2020 19:40:46 +0000 (13:40 -0600)]
qa/suites/rados/multimon/tasks/mon_clock_with_skews: whitelist MOST_DOWN
The skewed clock makes some mons miss elections.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 08b6a2bc008c006ff815dcb85bfb24b73072c7ab)
Sage Weil [Sun, 9 Feb 2020 16:55:03 +0000 (10:55 -0600)]
qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc
Fixes: https://tracker.ceph.com/issues/43889
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9f2a854b175f156d4ab7fba955aff515052c9d93)
David Zafman [Fri, 6 Dec 2019 20:44:57 +0000 (12:44 -0800)]
test: Improve races by using kill_daemons, which waits for OSDs to terminate
osd-backfill-space.sh: More sleep time to make sure the backfill gets started
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 676d882649405c4a0db432a886a2be475d9a45a9)
David Zafman [Fri, 6 Dec 2019 17:01:41 +0000 (09:01 -0800)]
test: run-standalone.sh: Only run execs in the subdirectories of qa/standalone
This will ignore scripts placed at the qa/standalone level, though
I'm not sure if we should be putting any tests there. It does
allow support scripts like ceph-helper.sh to be present without
modifying run-standalone.sh to ignore them.
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 8d3cfc6bc5d979c7398e3561c26a71116079d371)
David Zafman [Thu, 5 Dec 2019 23:13:31 +0000 (15:13 -0800)]
test: Use activate_osd() when restarting OSDs
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 43f6218993bac14d0b01e1da5c14840433bae12b)
David Zafman [Thu, 5 Dec 2019 17:48:09 +0000 (09:48 -0800)]
test: osd-scrub-snaps.sh: Fix race with osd restart and doing a scrub
Fixes: https://tracker.ceph.com/issues/43150
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit cca541d0f9a945a3e3ae2247bab238d7d4cea335)
yuliyang [Mon, 9 Dec 2019 12:23:15 +0000 (20:23 +0800)]
rgw: fix rgw crash when duration is invalid in sts request
Fixes: https://tracker.ceph.com/issues/43018
Signed-off-by: yuliyang <yuliyang@cmss.chinamobile.com>
(cherry picked from commit 064d16f6659d190d6196e2bb26605caac6d0786a)
Casey Bodley [Thu, 30 Jan 2020 20:17:30 +0000 (15:17 -0500)]
qa/rgw: test with non-default rgw-obj-stripe-size
each job will select one of the striping strategies at random
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit d486b5bc455d220dac1cc2fbf38317c2369fff38)
Casey Bodley [Thu, 30 Jan 2020 20:11:42 +0000 (15:11 -0500)]
rgw: MultipartObjectProcessor supports stripe size > chunk size
the head object for a multipart part should contain the entire stripe,
unlike a normal object where the head only contains the first chunk of
data (because it has to be written atomically)
Fixes: https://tracker.ceph.com/issues/42669
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 39d1ad6713aaa56307c6710c2cb46ff0b4254b8b)
Casey Bodley [Tue, 7 Jan 2020 18:30:51 +0000 (13:30 -0500)]
rgw: remove spawned_keys filter from incremental data sync
the spawned_keys filtering is valid "as long as we don't yield",
according to code comments. however, proper enforcement of the
spawn window necessitates yielding when we exceed that window
the key-based filtering provided by spawned_keys is actually already
satisfied by the call to marker_tracker->index_key_to_marker(), which
also takes completions (either from try_update_high_marker() or
finish()) into account
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3e2795fd8f14afdab909c3a15b65f47f390b2230)
Casey Bodley [Tue, 7 Jan 2020 18:28:19 +0000 (13:28 -0500)]
rgw: incremental data sync respects spawn window
RGWReadRemoteDataLogShardCR will fetch up to 1000 entries. in order for
the spawn window to apply correctly, it has to be enforced inside the
loop over those entries
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 94a3affe7c7fc0a64e5b86f675326d6aee4e9b7e)
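A hedged sketch of "enforce the window inside the loop": the real code uses RGW coroutines, so the plain Python threads below only mirror the shape of the fix -- never let more than spawn_window handlers be in flight while walking the fetched entries.
```
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def sync_entries(entries, handle, spawn_window=4):
    """Process entries with at most spawn_window handlers in flight."""
    pending = set()
    with ThreadPoolExecutor(max_workers=spawn_window) as pool:
        for entry in entries:
            if len(pending) >= spawn_window:
                # The window check lives inside the loop over the (up to
                # ~1000) fetched entries, so one batch can never exceed it.
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
            pending.add(pool.submit(handle, entry))
        wait(pending)


sync_entries(range(20), lambda e: e * e)
print("done")
```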
Richard Bai(白学余) [Tue, 13 Aug 2019 12:56:55 +0000 (20:56 +0800)]
rgw: when you abort a multipart upload request, the quota will not update
Fixes: https://tracker.ceph.com/issues/41606
Signed-off-by: Richard Bai(白学余) <baixueyu@inspur.com>
(cherry picked from commit f61fda5d1b572828750d53b487e0aa7cfabdf2a8)
yuliyang [Tue, 8 Oct 2019 05:30:08 +0000 (13:30 +0800)]
rgw: fix SignatureDoesNotMatch when use ipv6 address in s3 client
fix: https://tracker.ceph.com/issues/42218
Signed-off-by: yuliyang <yuliyang@cmss.chinamobile.com>
(cherry picked from commit 1039ed8f5173cac1e1d476e5b26911dccba0a203)
ofriedma [Tue, 3 Dec 2019 14:11:35 +0000 (16:11 +0200)]
rgw: Fix dynamic resharding not working for empty zonegroup in period
Sometimes, when a cluster has been upgraded from jewel, the period's zonegroup can be empty, so no dynamic resharding happens.
This change fixes that by returning true when there are fewer than one (i.e. zero) zonegroups in the period.
Fixes: https://tracker.ceph.com/issues/43188
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
(cherry picked from commit a76e4393728c3e74a943b635d2ac0652e0cc092a)
Or Friedmann [Sun, 5 Jan 2020 16:07:42 +0000 (18:07 +0200)]
rgw: Fix upload part copy range able to get almost any string
Fix upload part copy range able to get almost any string
This PR adds more checking of the HTTP_X_AMZ_COPY_SOURCE_RANGE header.
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
(cherry picked from commit 139495052ae3c87458ccd428f27657465a589201)
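The commit does not spell out the exact rules, so this is only a sketch of strict parsing of a 'bytes=<first>-<last>' copy-source range, rejecting the near-arbitrary strings the bug allowed; the grammar rgw actually accepts may differ.
```
import re

_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d+)$")


def parse_copy_source_range(value):
    """Accept only 'bytes=<first>-<last>' with first <= last."""
    m = _RANGE_RE.match(value or "")
    if not m:
        raise ValueError("InvalidRange: malformed copy source range")
    first, last = int(m.group(1)), int(m.group(2))
    if first > last:
        raise ValueError("InvalidRange: first byte is after last byte")
    return first, last


print(parse_copy_source_range("bytes=0-1048575"))
for bad in ("bytes=abc", "0-10", "bytes=5-1"):
    try:
        parse_copy_source_range(bad)
    except ValueError as err:
        print(bad, "->", err)
```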
Jan Fajerski [Thu, 13 Feb 2020 12:35:30 +0000 (13:35 +0100)]
Merge pull request #33254 from jan--f/wip-44112-nautilus
nautilus: ceph-volume: use get_device_vgs in has_common_vg
Jan Fajerski [Thu, 13 Feb 2020 12:35:17 +0000 (13:35 +0100)]
Merge pull request #33253 from jan--f/wip-44109-nautilus
nautilus: ceph-volume: fix is_ceph_device for lvm batch
Jan Fajerski [Thu, 13 Feb 2020 07:16:19 +0000 (08:16 +0100)]
Merge pull request #33240 from jan--f/wip-44035-nautilus
nautilus: ceph-volume: finer grained availability notion in inventory.
Jan Fajerski [Thu, 13 Feb 2020 07:15:06 +0000 (08:15 +0100)]
Merge pull request #33239 from jan--f/wip-43984-nautilus
nautilus: ceph-volume: fix has_bluestore_label() function
Jan Fajerski [Wed, 12 Feb 2020 13:47:37 +0000 (14:47 +0100)]
ceph-volume: use get_device_vgs in has_common_vg
Fixes: https://tracker.ceph.com/issues/44099
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 2c5a8c3b4066dd2aca47d719c9723850ce5f96fc)
Jan Fajerski [Wed, 12 Feb 2020 15:49:30 +0000 (16:49 +0100)]
ceph-volume: add is_ceph_device unit tests
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 60d80636e4708761287197c534347f82e307c603)
Dimitri Savineau [Tue, 11 Feb 2020 21:53:55 +0000 (16:53 -0500)]
ceph-volume: fix is_ceph_device for lvm batch
This is a regression introduced by 634a709.
The lvm batch command fails to prepare the OSDs on the created LVs.
When using lvm batch, the LVs/VGs are created prior to the OSD prepare.
During that creation, multiple tags are set with a null value:
$ lvs -o lv_tags --noheadings
ceph.cluster_fsid=null,ceph.osd_fsid=null,ceph.osd_id=null,ceph.type=null
Since we call is_ceph_device, which returns True if the ceph.osd_id LVM
tag exists but doesn't check its value, we raise an exception.
When the tag value is set to 'null' we can consider that the device
isn't part of the ceph cluster (because it is not yet prepared).
Closes: https://tracker.ceph.com/issues/44069
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a82582364c7b65a4a5e2673e3886acd6d2066130)
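A minimal sketch of the check described above: a device only counts as a ceph device if the ceph.osd_id tag exists and is not the 'null' placeholder written by lvm batch. Illustrative Python; the real check lives in ceph-volume's lvm code.
```
def is_ceph_device(lv_tags):
    """Return True only if the LV carries a real (non-'null') ceph.osd_id tag."""
    osd_id = lv_tags.get("ceph.osd_id")
    return osd_id is not None and osd_id != "null"


print(is_ceph_device({"ceph.osd_id": "3"}))     # True: an already prepared OSD
print(is_ceph_device({"ceph.osd_id": "null"}))  # False: LV freshly created by batch
print(is_ceph_device({}))                       # False: no ceph tags at all
```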
Jan Fajerski [Thu, 13 Feb 2020 07:11:01 +0000 (08:11 +0100)]
Merge pull request #33242 from jan--f/wip-44047-nautilus
nautilus: ceph-volume: skip osd creation when already done
Yuri Weinstein [Wed, 12 Feb 2020 19:38:04 +0000 (11:38 -0800)]
Merge pull request #32919 from smithfarm/wip-43780-nautilus
nautilus: cephfs: qa: ignore trimmed cache items for dead cache drop
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Yuri Weinstein [Wed, 12 Feb 2020 18:46:14 +0000 (10:46 -0800)]
Merge pull request #33183 from smithfarm/wip-43846-nautilus
nautilus: rgw: update the hash source for multipart entries during resharding
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Yuri Weinstein [Wed, 12 Feb 2020 18:44:48 +0000 (10:44 -0800)]
Merge pull request #33115 from batrick/i43790
nautilus: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
Reviewed-by: Ramana Raja <rraja@redhat.com>