git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
5 years ago14.2.9 v14.2.9
Jenkins Build Slave User [Thu, 9 Apr 2020 16:17:28 +0000 (16:17 +0000)]
14.2.9

5 years agorgw: reject control characters in response-header actions
Robin H. Johnson [Fri, 27 Mar 2020 19:48:13 +0000 (20:48 +0100)]
rgw: reject control characters in response-header actions

S3 GetObject permits overriding response header values, but those inputs
need to be validated to ensure only characters that are valid in an HTTP
header value are present.
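
A minimal sketch of this kind of check, written in Python rather than the
C++ used by rgw. The name `is_valid_header_value` and the exact policy
(reject 0x00-0x1F except horizontal tab, plus 0x7F) are assumptions made for
illustration; they are not taken from this commit.

```python
def is_valid_header_value(value: str) -> bool:
    """Return True if value contains no control characters.

    Hypothetical policy: reject 0x00-0x1F (except horizontal tab) and 0x7F.
    This covers CR and LF, the characters used for header injection.
    """
    for ch in value:
        code = ord(ch)
        if (code < 0x20 and ch != "\t") or code == 0x7F:
            return False
    return True
```

With this policy a value such as "attachment; filename=a.txt" passes, while
any value containing CR or LF is rejected before it reaches the response.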

Credit: Initial vulnerability discovery by William Bowling (@wcbowling)
Credit: Further vulnerability discovery by Robin H. Johnson <rjohnson@digitalocean.com>
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
5 years agorgw: EPERM to ERR_INVALID_REQUEST
Abhishek Lekshmanan [Fri, 27 Mar 2020 18:29:01 +0000 (19:29 +0100)]
rgw: EPERM to ERR_INVALID_REQUEST

As per Robin's comments and S3 spec

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
5 years agorgw: reject unauthenticated response-header actions
Matt Benjamin [Fri, 27 Mar 2020 17:13:48 +0000 (18:13 +0100)]
rgw: reject unauthenticated response-header actions

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit d8dd5e513c0c62bbd7d3044d7e2eddcd897bd400)

5 years agomsg/async/crypto_onwire: fix endianness of nonce_t
Ilya Dryomov [Fri, 6 Mar 2020 19:16:45 +0000 (20:16 +0100)]
msg/async/crypto_onwire: fix endianness of nonce_t

As an AES-GCM IV, nonce_t is implicitly shared between server and
client.  Currently, if their endianness doesn't match, they are unable
to communicate in secure mode because each gets its own idea of what
the next nonce should be after the counter is incremented.

Several RFCs state that the nonce counter should be BE, but since we
use LE for everything on-disk and on-wire, make it LE.
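
The mismatch can be illustrated with a short Python sketch. It assumes a
12-byte nonce whose first 4 bytes are the counter (the layout given in the
ProtocolV2 commit message in this log); `next_nonce` is a hypothetical
helper, not a function from the Ceph tree.

```python
import struct


def next_nonce(nonce: bytes, little_endian: bool = True) -> bytes:
    """Increment the 32-bit counter at the start of a 12-byte nonce."""
    assert len(nonce) == 12
    fmt = "<I" if little_endian else ">I"
    (counter,) = struct.unpack(fmt, nonce[:4])
    counter = (counter + 1) & 0xFFFFFFFF  # 32-bit wraparound
    return struct.pack(fmt, counter) + nonce[4:]


# Both peers start from the same bytes but read the counter differently.
start = bytes.fromhex("ff000000") + b"saltsalt"
le = next_nonce(start, little_endian=True)
be = next_nonce(start, little_endian=False)
```

Starting from counter bytes ff 00 00 00, the LE peer produces 00 01 00 00
and the BE peer produces ff 00 00 01, so the two sides disagree on the next
nonce after a single frame.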

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agomsg/async/ProtocolV2: avoid AES-GCM nonce reuse vulnerabilities
Ilya Dryomov [Fri, 6 Mar 2020 19:16:45 +0000 (20:16 +0100)]
msg/async/ProtocolV2: avoid AES-GCM nonce reuse vulnerabilities

The secure mode uses AES-128-GCM with 96-bit nonces consisting of a
32-bit counter followed by a 64-bit salt.  The counter is incremented
after processing each frame; the salt is fixed for the duration of
the session.  Both are initialized from the session key generated
during session negotiation, so the counter starts with essentially
a random value.  It is allowed to wrap, and, after 2**32 frames, it
repeats, resulting in nonce reuse (the actual sequence numbers that
the messenger works with are 64-bit, so the session continues on).

Because of how GCM works, this completely breaks both confidentiality
and integrity aspects of the secure mode.  A single nonce reuse reveals
the XOR of two plaintexts and almost completely reveals the subkey
used for producing authentication tags.  After a few nonces get used
twice, all confidentiality and integrity goes out the window and the
attacker can potentially encrypt-authenticate plaintext of their
choice.

We can't easily change the nonce format to extend the counter to
64 bits (and possibly XOR it with a longer salt).  Instead, just
remember the initial nonce and cut the session before it repeats,
forcing renegotiation.
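
The "remember the initial nonce and cut the session" idea can be sketched
as follows. This is an illustrative Python model, not the ProtocolV2 code:
`NonceCounter`, `NonceExhausted` and the `bits` parameter (there only so the
wraparound can be exercised with a small counter) are all hypothetical.

```python
class NonceExhausted(Exception):
    """Raised when the next counter value would repeat the initial one."""


class NonceCounter:
    def __init__(self, initial: int, bits: int = 32):
        self.mask = (1 << bits) - 1
        self.initial = initial & self.mask  # remembered for reuse detection
        self.current = self.initial
        self.started = False

    def next(self) -> int:
        """Return the counter for the next frame, refusing to reuse a value."""
        if self.started and self.current == self.initial:
            # The counter has wrapped all the way around: the caller
            # should drop the session and renegotiate a new key.
            raise NonceExhausted("counter wrapped; renegotiate session")
        value = self.current
        self.current = (self.current + 1) & self.mask
        self.started = True
        return value
```

With `bits=3` and an initial value of 6, the counter hands out 6, 7, 0, 1,
2, 3, 4, 5 and then raises instead of returning 6 a second time.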

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Conflicts:
src/msg/async/ProtocolV2.cc [ context: commit 697aafa2aad2
  ("msg/async/ProtocolV2: remove unused parameter") not in
  nautilus ]
src/msg/async/ProtocolV2.h [ context: commit ed3ec4c01d17
  ("msg: Build target 'common' without using namespace in
  headers") not in nautilus ]

5 years agomsg/async: rename outcoming_bl -> outgoing_bl in AsyncConnection.
Radoslaw Zarzynski [Thu, 3 Oct 2019 13:39:15 +0000 (15:39 +0200)]
msg/async: rename outcoming_bl -> outgoing_bl in AsyncConnection.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 7997a3ea193344ea9782d2594cca295ac5bdf59d)

5 years ago14.2.8 40148/head v14.2.8
Jenkins Build Slave User [Mon, 2 Mar 2020 17:49:20 +0000 (17:49 +0000)]
14.2.8

5 years agoMerge PR #33569 into nautilus
Patrick Donnelly [Thu, 27 Feb 2020 20:18:40 +0000 (12:18 -0800)]
Merge PR #33569 into nautilus

* refs/pull/33569/head:
mgr/volumes: unregister job upon async threads exception

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agomgr/volumes: unregister job upon async threads exception 33569/head
Venky Shankar [Wed, 26 Feb 2020 04:52:37 +0000 (23:52 -0500)]
mgr/volumes: unregister job upon async threads exception

If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.

Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.
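
A rough Python sketch of the pattern, assuming a simple in-memory registry.
`JobTracker` and `run_job` are hypothetical names used for illustration and
do not mirror the mgr/volumes classes.

```python
class JobTracker:
    """Tracks jobs currently claimed by a worker thread."""

    def __init__(self):
        self.in_progress = set()

    def register(self, job):
        self.in_progress.add(job)

    def unregister(self, job):
        self.in_progress.discard(job)


def run_job(tracker, job, work) -> bool:
    """Run work(job); always unregister the job, even if work raises."""
    tracker.register(job)
    try:
        work(job)
        return True
    except Exception:
        # Temporary failure: report it, but leave the job eligible
        # for a later scan instead of keeping it registered forever.
        return False
    finally:
        tracker.unregister(job)
```

Because the unregister call sits in `finally`, a job whose worker raised is
no longer marked as in progress and can be picked up on the next scan.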

Fixes: http://tracker.ceph.com/issues/44315
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 46476ef2e290bd15af2fec2410cb4f3f86b27cd2)

5 years agoMerge pull request #33346 from yaarith/backport-nautilus-pr-32903
Yuri Weinstein [Tue, 25 Feb 2020 19:05:39 +0000 (11:05 -0800)]
Merge pull request #33346 from yaarith/backport-nautilus-pr-32903

nautilus: mgr/devicehealth: fix telemetry stops sending device reports after 48 hours

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge PR #33526 into nautilus
Patrick Donnelly [Tue, 25 Feb 2020 04:26:30 +0000 (20:26 -0800)]
Merge PR #33526 into nautilus

* refs/pull/33526/head:
test: verify purge queue w/ large number of subvolumes
test: pass timeout argument to mount::wait_for_dir_empty()
mgr/volumes: access volume in lockless mode when fetching async job

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agotest: verify purge queue w/ large number of subvolumes 33526/head
Venky Shankar [Wed, 19 Feb 2020 14:19:31 +0000 (09:19 -0500)]
test: verify purge queue w/ large number of subvolumes

Fixes: http://tracker.ceph.com/issues/44282
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 92b20089369b0d549c8c337a60bb93ae24c7b66a)

5 years agotest: pass timeout argument to mount::wait_for_dir_empty()
Venky Shankar [Mon, 24 Feb 2020 07:27:25 +0000 (02:27 -0500)]
test: pass timeout argument to mount::wait_for_dir_empty()

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 5ec09a228ca2b726da3ce79fc01e07911051326a)

5 years agomgr/volumes: access volume in lockless mode when fetching async job
Venky Shankar [Wed, 19 Feb 2020 12:31:40 +0000 (07:31 -0500)]
mgr/volumes: access volume in lockless mode when fetching async job

Saw a deadlock when deleting a lot of subvolumes -- purge threads were
stuck acquiring the global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) and a purge thread is just about to scan the trash
directory for entries.

For the fix, purge threads fetch entries by accessing the volume
in lockless mode. This is safe from a functionality point of view, as
the rename and directory scan are correctly handled by the filesystem.
In the worst case, the purge thread picks up the trash entry on the
next scan, so no stale trash entry is left behind.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 808a1ce1f96f6dfd3472156ce5087372da4c1314)

5 years agoMerge PR #33498 into nautilus
Patrick Donnelly [Mon, 24 Feb 2020 15:28:23 +0000 (07:28 -0800)]
Merge PR #33498 into nautilus

* refs/pull/33498/head:
mgr: drop reference to msg on return

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agomgr: drop reference to msg on return 33498/head
Patrick Donnelly [Sun, 23 Feb 2020 23:27:34 +0000 (15:27 -0800)]
mgr: drop reference to msg on return

Caused by backport commit cb48be5a69fe6482cbe3bff1b53ba090e077de0d which
did not account for the explicit drop of the message reference, only in
Nautilus-.

Fixes: https://tracker.ceph.com/issues/44245
Fixes: cb48be5a69fe6482cbe3bff1b53ba090e077de0d
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #33470 from neha-ojha/wip-mgsr2-order-nautilus
Yuri Weinstein [Fri, 21 Feb 2020 22:13:27 +0000 (14:13 -0800)]
Merge pull request #33470 from neha-ojha/wip-mgsr2-order-nautilus

nautilus: qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering

5 years agoqa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering 33470/head
Neha Ojha [Thu, 20 Feb 2020 02:11:26 +0000 (18:11 -0800)]
qa/suites/upgrade/mimic-x/stress-split: fix msgr2 vs nautilus ordering

This was done for octopus in 8283ea9f587fb1136b4e9fa9d8a0435852fb948a,
but not for nautilus

Signed-off-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #33378 from badone/wip-badone-testing
Yuri Weinstein [Tue, 18 Feb 2020 21:26:18 +0000 (13:26 -0800)]
Merge pull request #33378 from badone/wip-badone-testing

nautilus: qa/ceph-ansible: ansible-version and ceph_ansible

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
5 years agonautilus: qa/ceph-ansible: ansible-version and ceph_ansible 33378/head
Brad Hubbard [Wed, 5 Feb 2020 06:46:29 +0000 (16:46 +1000)]
nautilus: qa/ceph-ansible: ansible-version and ceph_ansible

Upgrade to 2.8.1 and stable-4.0 respectively

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
5 years agomgr/devicehealth: fix telemetry stops sending device reports after 48 hours 33346/head
Yaarit Hatuka [Mon, 27 Jan 2020 13:57:55 +0000 (08:57 -0500)]
mgr/devicehealth: fix telemetry stops sending device reports after 48 hours

Telemetry module fetches device metrics which were scraped in the last
"telemetry interval"*2 (=48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample. But because it fetched in ascending order (from oldest to
newest) it was breaking on the first one it received, if it was older
than the interval above. We need to pass min_sample to get_omap_vals()
so it will start fetching from that value.
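
The ordering problem can be reproduced with plain Python lists standing in
for the omap keys. `fetch_buggy` and `fetch_fixed` are hypothetical helpers,
and the timestamp-style key format is an assumption for the example; this
sketch does not call the real omap API.

```python
from bisect import bisect_left


def fetch_buggy(samples, min_sample):
    """Iterate oldest-to-newest and stop at the first key older than min_sample."""
    result = []
    for key in sorted(samples):
        if key < min_sample:
            break  # in ascending order this fires on the very first old key
        result.append(key)
    return result


def fetch_fixed(samples, min_sample):
    """Start iterating at min_sample instead of at the oldest key."""
    keys = sorted(samples)
    return keys[bisect_left(keys, min_sample):]
```

As soon as one stored sample is older than the cutoff, `fetch_buggy` returns
nothing at all, whereas `fetch_fixed` still returns every sample at or after
the cutoff.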

Fixes: https://tracker.ceph.com/issues/43837
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 5f7e4a980a73e8cacb2c9bde47d822a32fb8c440)

5 years agomgr/devicehealth: factor _get_device_metrics out of show_device_metrics
Sage Weil [Fri, 4 Oct 2019 20:03:02 +0000 (15:03 -0500)]
mgr/devicehealth: factor _get_device_metrics out of show_device_metrics

Add the min_sample lower-bound argument too

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7be5c1323b3814e2634d5cd66d45cab5a77df680)
Conflicts: had to be backported to enable backporting of
https://github.com/ceph/ceph/pull/32903
Backport tracker: https://tracker.ceph.com/issues/43873

5 years agoMerge pull request #33278 from smithfarm/wip-44085-nautilus
Yuri Weinstein [Fri, 14 Feb 2020 17:23:51 +0000 (09:23 -0800)]
Merge pull request #33278 from smithfarm/wip-44085-nautilus

nautilus: ceph-monstore-tool: correct the key for storing mgr_command_descs

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #33277 from smithfarm/wip-43722-nautilus
Yuri Weinstein [Fri, 14 Feb 2020 17:23:24 +0000 (09:23 -0800)]
Merge pull request #33277 from smithfarm/wip-43722-nautilus

nautilus: common/bl: fix the dangling last_p issue.

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #33276 from smithfarm/wip-44082-nautilus
Yuri Weinstein [Fri, 14 Feb 2020 17:22:55 +0000 (09:22 -0800)]
Merge pull request #33276 from smithfarm/wip-44082-nautilus

nautilus: qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #32844 from smithfarm/wip-43239-nautilus
Yuri Weinstein [Fri, 14 Feb 2020 17:22:04 +0000 (09:22 -0800)]
Merge pull request #32844 from smithfarm/wip-43239-nautilus

nautilus: mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
5 years agoMerge pull request #33337 from jan--f/wip-44153-nautilus
Jan Fajerski [Fri, 14 Feb 2020 17:00:46 +0000 (18:00 +0100)]
Merge pull request #33337 from jan--f/wip-44153-nautilus

nautilus: ceph-volume: don't remove vg twice when zapping filestore

5 years agoMerge pull request #33334 from jan--f/wip-44152-nautilus
Jan Fajerski [Fri, 14 Feb 2020 17:00:20 +0000 (18:00 +0100)]
Merge pull request #33334 from jan--f/wip-44152-nautilus

nautilus: ceph-volume: pass journal_size as Size not string

5 years agoceph-volume: don't remove vg twice when zapping filestore 33337/head
Jan Fajerski [Fri, 14 Feb 2020 13:10:36 +0000 (14:10 +0100)]
ceph-volume: don't remove vg twice when zapping filestore

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Fixes: https://tracker.ceph.com/issues/44149
(cherry picked from commit bccdf6eafaf851d5092bb99d61edd44cd36d9dd2)

5 years agoceph-volume: pass journal_size as Size not string 33334/head
Jan Fajerski [Fri, 14 Feb 2020 11:50:47 +0000 (12:50 +0100)]
ceph-volume: pass journal_size as Size not string

Fixes: https://tracker.ceph.com/issues/44148
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 49f6e6d600aae6310f941c6635408d496b0ff2b9)

5 years agoMerge pull request #33301 from jan--f/wip-43871-nautilus-failed-cp
Jan Fajerski [Fri, 14 Feb 2020 10:41:40 +0000 (11:41 +0100)]
Merge pull request #33301 from jan--f/wip-43871-nautilus-failed-cp

nautilus: ceph-volume: batch bluestore fix create_lvs call

5 years agoMerge pull request #33297 from jan--f/wip-44135-nautilus
Jan Fajerski [Fri, 14 Feb 2020 09:45:27 +0000 (10:45 +0100)]
Merge pull request #33297 from jan--f/wip-44135-nautilus

nautilus: ceph-volume: avoid calling zap_lv with a LV-less VG

5 years agoceph-volume: batch bluestore fix create_lvs call 33301/head
Jan Fajerski [Tue, 28 Jan 2020 08:25:39 +0000 (09:25 +0100)]
ceph-volume: batch bluestore fix create_lvs call

Fixes: https://tracker.ceph.com/issues/43844
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit df18497bc9eaf1922e5c885e8cc124e439c59364)

5 years agoceph-volume: avoid calling zap_lv with a LV-less VG 33297/head
Jan Fajerski [Thu, 13 Feb 2020 16:09:44 +0000 (17:09 +0100)]
ceph-volume: avoid calling zap_lv with a LV-less VG

Fixes: https://tracker.ceph.com/issues/44125
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit ad0dea53b8585d8397bf3069b1b39f13b6e0a8ce)

5 years agoMerge pull request #33151 from shyukri/wip-43877-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 22:25:53 +0000 (14:25 -0800)]
Merge pull request #33151 from shyukri/wip-43877-nautilus

nautilus: rgw: fix one part of the bulk delete(RGWDeleteMultiObj_ObjStore_S3) fails but no error messages

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #33149 from shyukri/wip-43874-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 22:25:13 +0000 (14:25 -0800)]
Merge pull request #33149 from shyukri/wip-43874-nautilus

nautilus: rgw: maybe coredump when reload operator happened

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #33008 from smithfarm/wip-43922-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 22:21:55 +0000 (14:21 -0800)]
Merge pull request #33008 from smithfarm/wip-43922-nautilus

nautilus: rgw_file: avoid string::front() on empty path

Reviewed-by: Casey Bodley <cbodley@redhat.com>
5 years agoMerge pull request #33152 from shyukri/wip-43879-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:15:09 +0000 (12:15 -0800)]
Merge pull request #33152 from shyukri/wip-43879-nautilus

nautilus: mon: Don't put session during feature change

Reviewed-by: David Zafman <dzafman@redhat.com>
5 years agoMerge pull request #33095 from k0ste/wip-43979-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:14:43 +0000 (12:14 -0800)]
Merge pull request #33095 from k0ste/wip-43979-nautilus

nautilus: mgr/telemetry: check get_metadata return val

Reviewed-by: David Zafman <dzafman@redhat.com>
5 years agoMerge pull request #32908 from smithfarm/wip-43821-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:12:53 +0000 (12:12 -0800)]
Merge pull request #32908 from smithfarm/wip-43821-nautilus

nautilus: mon/Session: only index osd ids >= 0

5 years agoMerge pull request #33170 from k0ste/wip-43727-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:11:01 +0000 (12:11 -0800)]
Merge pull request #33170 from k0ste/wip-43727-nautilus

nautilus: mgr/pg_autoscaler: calculate pool_pg_target using pool size

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #33168 from k0ste/wip-44057-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:10:28 +0000 (12:10 -0800)]
Merge pull request #33168 from k0ste/wip-44057-nautilus

nautilus: mgr/telemetry: split entity_name only once (handle ids with dots)

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #33157 from shyukri/wip-43924-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:09:56 +0000 (12:09 -0800)]
Merge pull request #33157 from shyukri/wip-43924-nautilus

nautilus: mgr/prometheus: report per-pool pg states

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
5 years agoMerge pull request #33155 from shyukri/wip-43916-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:09:22 +0000 (12:09 -0800)]
Merge pull request #33155 from shyukri/wip-43916-nautilus

nautilus: mon/ConfigMonitor: only propose if leader

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #33147 from shyukri/wip-43989-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:08:34 +0000 (12:08 -0800)]
Merge pull request #33147 from shyukri/wip-43989-nautilus

nautilus: osd: Allow 64-char hostname to be added as the "host" in CRUSH

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #33142 from shyukri/wip-44000-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:07:53 +0000 (12:07 -0800)]
Merge pull request #33142 from shyukri/wip-44000-nautilus

nautilus: mon/MgrMonitor.cc: warn about missing mgr in a cluster with osds

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #33082 from k0ste/wip-43974-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 20:07:07 +0000 (12:07 -0800)]
Merge pull request #33082 from k0ste/wip-43974-nautilus

nautilus: mgr/telemetry: anonymizing smartctl report itself

Reviewed-by: Sage Weil <sage@redhat.com>
5 years agoMerge pull request #32948 from yaarith/wip-telemetry-serial-nautilus-yh
Yuri Weinstein [Thu, 13 Feb 2020 20:06:36 +0000 (12:06 -0800)]
Merge pull request #32948 from yaarith/wip-telemetry-serial-nautilus-yh

nautilus: mgr/telemetry: fix device serial number anonymization

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
5 years agoMerge pull request #33007 from smithfarm/wip-43928-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 19:35:30 +0000 (11:35 -0800)]
Merge pull request #33007 from smithfarm/wip-43928-nautilus

nautilus: mon: elector: return after triggering a new election

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
5 years agoMerge pull request #32931 from smithfarm/wip-43819-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 19:31:42 +0000 (11:31 -0800)]
Merge pull request #32931 from smithfarm/wip-43819-nautilus

nautilus: mgr/pg_autoscaler: default to pg_num[_min] = 32

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
5 years agoMerge pull request #32905 from smithfarm/wip-43731-nautilus
Yuri Weinstein [Thu, 13 Feb 2020 19:29:36 +0000 (11:29 -0800)]
Merge pull request #32905 from smithfarm/wip-43731-nautilus

nautilus: crush/CrushWrapper: behave with empty weight vector

Reviewed-by: Kefu Chai <kchai@redhat.com>
5 years agoMerge pull request #32856 from zhengchengyao/nautilus_no_mon_update
Yuri Weinstein [Thu, 13 Feb 2020 19:28:54 +0000 (11:28 -0800)]
Merge pull request #32856 from zhengchengyao/nautilus_no_mon_update

nautilus: mon/ConfigMonitor: fix handling of NO_MON_UPDATE settings

Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>
5 years agodoc: update mondb recovery script 33278/head
Kefu Chai [Mon, 10 Feb 2020 09:36:04 +0000 (17:36 +0800)]
doc: update mondb recovery script

to note that we also need to add mgr's key to monitor's keyring

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 75f4765f2ffe795dba85540b8aa1675ba9de28e4)

5 years agoceph-monstore-tool: correct the key for storing mgr_command_descs
Kefu Chai [Mon, 10 Feb 2020 09:33:26 +0000 (17:33 +0800)]
ceph-monstore-tool: correct the key for storing mgr_command_descs

Fixes: https://tracker.ceph.com/issues/43582
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a5bfeca64f3204142fda5320b4fd403df4b5f532)

5 years agoceph-monstore-tool: rename mon-ids in initial monmap
Kefu Chai [Mon, 10 Feb 2020 08:27:22 +0000 (16:27 +0800)]
ceph-monstore-tool: rename mon-ids in initial monmap

When ceph-mon starts, it checks whether it is listed in the monmap; if
not, it complains
```
no public_addr or public_network specified, and mon.a not present in
monmap or ceph.conf.
```
and then bails out. Normally, the monitor renames itself in the
monmap when performing "mkfs", but in our case we are merely using the
"mkfs" monmap to pass in the monmap built by ceph-monstore-tool, and
we don't actually go through the "mkfs" process. So ceph-mon won't
rename itself when booting up.

With this change, the user can specify the mon-ids on the command line
when rebuilding the mondb; the default mon-ids are a,b,c,... if not
specified.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4b3df5a850db054928f9fcc6ef0a160a05a2ffa9)

5 years agocommon/bl: fix the dangling last_p issue. 33277/head
Radoslaw Zarzynski [Thu, 16 Jan 2020 12:17:41 +0000 (13:17 +0100)]
common/bl: fix the dangling last_p issue.

Fixes: https://tracker.ceph.com/issues/43646
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
(cherry picked from commit 8198332f3afb5de748dfa6e3349fefbf7e9ed137)

Conflicts:
src/test/bufferlist.cc

5 years agoqa/suites/rados/multimon/tasks/mon_clock_with_skews: whitelist MOST_DOWN 33276/head
Sage Weil [Sun, 9 Feb 2020 19:40:46 +0000 (13:40 -0600)]
qa/suites/rados/multimon/tasks/mon_clock_with_skews: whitelist MOST_DOWN

The skewed clock makes some mons miss elections.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 08b6a2bc008c006ff815dcb85bfb24b73072c7ab)

5 years agoqa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc
Sage Weil [Sun, 9 Feb 2020 16:55:03 +0000 (10:55 -0600)]
qa/suites/rados/multimon/tasks/mon_clock_with_skews: disable ntpd etc

Fixes: https://tracker.ceph.com/issues/43889
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9f2a854b175f156d4ab7fba955aff515052c9d93)

5 years agoMerge pull request #33254 from jan--f/wip-44112-nautilus
Jan Fajerski [Thu, 13 Feb 2020 12:35:30 +0000 (13:35 +0100)]
Merge pull request #33254 from jan--f/wip-44112-nautilus

nautilus: ceph-volume: use get_device_vgs in has_common_vg

5 years agoMerge pull request #33253 from jan--f/wip-44109-nautilus
Jan Fajerski [Thu, 13 Feb 2020 12:35:17 +0000 (13:35 +0100)]
Merge pull request #33253 from jan--f/wip-44109-nautilus

nautilus: ceph-volume: fix is_ceph_device for lvm batch

5 years agoMerge pull request #33240 from jan--f/wip-44035-nautilus
Jan Fajerski [Thu, 13 Feb 2020 07:16:19 +0000 (08:16 +0100)]
Merge pull request #33240 from jan--f/wip-44035-nautilus

nautilus: ceph-volume: finer grained availability notion in inventory.

5 years agoMerge pull request #33239 from jan--f/wip-43984-nautilus
Jan Fajerski [Thu, 13 Feb 2020 07:15:06 +0000 (08:15 +0100)]
Merge pull request #33239 from jan--f/wip-43984-nautilus

nautilus: ceph-volume: fix has_bluestore_label() function

5 years agoceph-volume: use get_device_vgs in has_common_vg 33254/head
Jan Fajerski [Wed, 12 Feb 2020 13:47:37 +0000 (14:47 +0100)]
ceph-volume: use get_device_vgs in has_common_vg

Fixes: https://tracker.ceph.com/issues/44099
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 2c5a8c3b4066dd2aca47d719c9723850ce5f96fc)

5 years agoceph-volume: add is_ceph_device unit tests 33253/head
Jan Fajerski [Wed, 12 Feb 2020 15:49:30 +0000 (16:49 +0100)]
ceph-volume: add is_ceph_device unit tests

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 60d80636e4708761287197c534347f82e307c603)

5 years agoceph-volume: fix is_ceph_device for lvm batch
Dimitri Savineau [Tue, 11 Feb 2020 21:53:55 +0000 (16:53 -0500)]
ceph-volume: fix is_ceph_device for lvm batch

This is a regression introduced by 634a709

The lvm batch command fails to prepare the OSDs on the created LV.
When using lvm batch, the LV/VG are created prior to the OSD prepare.
During that creation, multiple tags are set with a null value.

$ lvs -o lv_tags --noheadings
  ceph.cluster_fsid=null,ceph.osd_fsid=null,ceph.osd_id=null,ceph.type=null

Because is_ceph_device returns True whenever the ceph.osd_id LVM tag
exists, without testing its value, we raise an exception.

When the tag value is set to 'null', we can consider that the device
isn't part of the ceph cluster (because it is not yet prepared).
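
A simplified Python sketch of the intended check. It operates on a plain
dict of tags parsed from the `lvs -o lv_tags` output shown above;
`parse_lv_tags` is a hypothetical helper, and this `is_ceph_device` is a
stand-in with an assumed signature, not the ceph-volume implementation.

```python
def parse_lv_tags(raw: str) -> dict:
    """Parse a comma-separated 'key=value' lv_tags string into a dict."""
    tags = {}
    for item in raw.strip().split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            tags[key] = value
    return tags


def is_ceph_device(tags: dict) -> bool:
    """Treat the LV as a Ceph device only if ceph.osd_id is set and not 'null'."""
    osd_id = tags.get("ceph.osd_id")
    return osd_id is not None and osd_id != "null"
```

An LV freshly created by lvm batch, whose tags are all 'null', is therefore
not reported as a Ceph device, while an LV with a real osd id still is.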

Closes: https://tracker.ceph.com/issues/44069
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a82582364c7b65a4a5e2673e3886acd6d2066130)

5 years agoMerge pull request #33242 from jan--f/wip-44047-nautilus
Jan Fajerski [Thu, 13 Feb 2020 07:11:01 +0000 (08:11 +0100)]
Merge pull request #33242 from jan--f/wip-44047-nautilus

nautilus: ceph-volume: skip osd creation when already done

5 years agoMerge pull request #32919 from smithfarm/wip-43780-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 19:38:04 +0000 (11:38 -0800)]
Merge pull request #32919 from smithfarm/wip-43780-nautilus

nautilus: cephfs: qa: ignore trimmed cache items for dead cache drop

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #33183 from smithfarm/wip-43846-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:46:14 +0000 (10:46 -0800)]
Merge pull request #33183 from smithfarm/wip-43846-nautilus

nautilus: rgw: update the hash source for multipart entries during resharding

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
5 years agoMerge pull request #33115 from batrick/i43790
Yuri Weinstein [Wed, 12 Feb 2020 18:44:48 +0000 (10:44 -0800)]
Merge pull request #33115 from batrick/i43790

nautilus: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 years agoMerge pull request #32921 from smithfarm/wip-43784-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:44:15 +0000 (10:44 -0800)]
Merge pull request #32921 from smithfarm/wip-43784-nautilus

nautilus: mds/OpenFileTable: match MAX_ITEMS_PER_OBJ to osd_deep_scrub_large_omap_object_key_threshold

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #32918 from smithfarm/wip-43777-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:42:51 +0000 (10:42 -0800)]
Merge pull request #32918 from smithfarm/wip-43777-nautilus

nautilus: cephfs: qa: save MDS epoch barrier

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #32917 from smithfarm/wip-43733-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:41:36 +0000 (10:41 -0800)]
Merge pull request #32917 from smithfarm/wip-43733-nautilus

nautilus: cephfs: qa: ignore slow ops for ffsb workunit

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
5 years agoMerge pull request #32756 from batrick/i43347
Yuri Weinstein [Wed, 12 Feb 2020 18:40:39 +0000 (10:40 -0800)]
Merge pull request #32756 from batrick/i43347

nautilus: mds: fix assert(omap_num_objs <= MAX_OBJECTS) of OpenFileTable

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 years agoMerge pull request #31905 from batrick/i43046
Yuri Weinstein [Wed, 12 Feb 2020 18:40:03 +0000 (10:40 -0800)]
Merge pull request #31905 from batrick/i43046

nautilus: mgr: "mds metadata" to setup new DaemonState races with fsmap

Reviewed-by: Ramana Raja <rraja@redhat.com>
5 years agoMerge pull request #32916 from smithfarm/wip-43729-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:39:01 +0000 (10:39 -0800)]
Merge pull request #32916 from smithfarm/wip-43729-nautilus

nautilus: cephfs: client: Add is_dir() check before changing directory

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
5 years agoMerge pull request #32807 from smithfarm/wip-43770-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:36:51 +0000 (10:36 -0800)]
Merge pull request #32807 from smithfarm/wip-43770-nautilus

nautilus: mount.ceph: remove arbitrary limit on size of name= option

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
5 years agoMerge pull request #32910 from smithfarm/wip-43503-nautilus
Yuri Weinstein [Wed, 12 Feb 2020 18:36:04 +0000 (10:36 -0800)]
Merge pull request #32910 from smithfarm/wip-43503-nautilus

nautilus: mount.ceph: give a hint message when no mds is up or cluster is laggy

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
5 years agoMerge pull request #33122 from ajarr/wip-ajarr-mgr-volumes-nautilus
Ramana Raja [Wed, 12 Feb 2020 16:20:23 +0000 (21:50 +0530)]
Merge pull request #33122 from ajarr/wip-ajarr-mgr-volumes-nautilus

mgr/volumes: misc fix and feature enhancements

Reviewed-by: Venky Shankar <vshankar@redhat.com>
5 years agoqa/standalone/misc/ok-to-stop: improve test 32844/head
Sage Weil [Mon, 20 Jan 2020 19:24:12 +0000 (13:24 -0600)]
qa/standalone/misc/ok-to-stop: improve test

Make sure PGs peer (simply flushing state to mon isn't enough).

Fixes: https://tracker.ceph.com/issues/43721
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 76ea774c109cc565c0b85feed40f7f29105029d3)

5 years agoqa/standalone/ceph-helpers: add wait_for_peered
Sage Weil [Mon, 20 Jan 2020 19:23:56 +0000 (13:23 -0600)]
qa/standalone/ceph-helpers: add wait_for_peered

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 78ec6aec90c7d0eb8b017c7f3d34e376b7f6713f)

5 years agomgr/DaemonServer: fix 'osd ok-to-stop' for EC pools
Sage Weil [Thu, 5 Dec 2019 18:59:31 +0000 (12:59 -0600)]
mgr/DaemonServer: fix 'osd ok-to-stop' for EC pools

We need to account for CRUSH_ITEM_NONE entries in the
EC PG acting set.
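
A much-simplified Python sketch of why the placeholder entries matter. The
function `pg_ok_after_stop` and its `min_size` comparison are assumptions
for illustration; the actual DaemonServer logic is more involved.
CRUSH_ITEM_NONE is the 0x7fffffff placeholder that marks an unfilled slot
in an EC acting set.

```python
CRUSH_ITEM_NONE = 0x7FFFFFFF  # placeholder for an unfilled shard slot


def pg_ok_after_stop(acting, to_stop, min_size) -> bool:
    """Return True if enough real OSDs remain after stopping to_stop.

    Slots holding CRUSH_ITEM_NONE are not OSDs and must not be counted.
    """
    remaining = [
        osd for osd in acting
        if osd != CRUSH_ITEM_NONE and osd not in to_stop
    ]
    return len(remaining) >= min_size
```

For an acting set of [0, NONE, 2, 3] with min_size 3, stopping osd.3 leaves
only two real OSDs, so the answer is no; counting the NONE slot as an OSD
would wrongly report three.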

Fixes: https://tracker.ceph.com/issues/43151
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 66690ea3143ac5097f7c7f3118f2d00fed30cc4b)

Conflicts:
        qa/standalone/misc/ok-to-stop.sh
- nautilus "ceph osd pool create" CLI command takes a pg_num argument

5 years agoceph-volume: add unit test test_safe_prepare_osd_already_created 33242/head
Guillaume Abrioux [Fri, 7 Feb 2020 14:22:46 +0000 (15:22 +0100)]
ceph-volume: add unit test test_safe_prepare_osd_already_created

This commit adds a new unit test
`test_safe_prepare_osd_already_created()` to verify that
`RuntimeError` is raised when `is_ceph_device()` returns `True`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ccf92d718d6f363f6ca3b7e8499b68beb8b30c06)

5 years agoceph-volume: skip osd creation when already done
Guillaume Abrioux [Wed, 5 Feb 2020 16:48:22 +0000 (17:48 +0100)]
ceph-volume: skip osd creation when already done

When rerunning ceph-volume lvm create on a device already prepared and
activated, ceph-volume should skip the creation.

This is a regression introduced by bb4de1a3fc238eaf9f717dc59c6bdf338ef6d657

Fixes: https://tracker.ceph.com/issues/43981
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 634a709b9c6802c5b12e2d45c2f43181b297adfb)

5 years agoceph-volume: add available property in target specific flavors 33240/head
Jan Fajerski [Mon, 6 Jan 2020 17:02:57 +0000 (18:02 +0100)]
ceph-volume: add available property in target specific flavors

This adds two properties available_[lvm,raw] to device (and thus inventory).
The goal is to have different notions of availability based on the
intended use case. For example, finding LVM structures makes a drive
unavailable for the raw mode, but it might still be available for the lvm mode.
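
A minimal sketch of the idea, assuming a simplified device object: the class
below and its `has_lvm_structures` and `in_use` attributes are hypothetical
stand-ins, not the ceph-volume implementation. Only the two property names
come from this commit.

```python
class Device:
    """Hypothetical stand-in for an inventory device (not the ceph-volume class)."""

    def __init__(self, path, has_lvm_structures=False, in_use=False):
        self.path = path
        self.has_lvm_structures = has_lvm_structures
        self.in_use = in_use

    @property
    def available_lvm(self):
        # Assumption: existing LVM structures alone do not rule out lvm mode.
        return not self.in_use

    @property
    def available_raw(self):
        # Assumption: raw mode needs a device with no LVM structures on it.
        return not self.in_use and not self.has_lvm_structures
```

A drive carrying LVM structures would then report `available_lvm` as true and
`available_raw` as false.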

Fixes: https://tracker.ceph.com/issues/43400
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 233ccff24006082766b52a94b7c46cdf5b7cd929)

5 years agoceph-volume: remove stderr in has_bluestore_label() 33239/head
Guillaume Abrioux [Wed, 5 Feb 2020 01:15:17 +0000 (02:15 +0100)]
ceph-volume: remove stderr in has_bluestore_label()

We don't want to generate this log when a call to
`has_bluestore_label()` fails.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 7f8371c32b3f613b0d094f6f26ffbeb19ea0d25b)

5 years agoceph-volume: fix has_bluestore_label() function
Guillaume Abrioux [Tue, 4 Feb 2020 21:02:26 +0000 (22:02 +0100)]
ceph-volume: fix has_bluestore_label() function

When using vg/lv, this function throws an error like the following:

```
 stderr: unable to read label for test_group/data-lv2: (2) No such file or directory
 stderr: 2020-02-04T21:03:32.153+0000 7fe091af4200 -1 bluestore(test_group/data-lv2) _read_bdev_label failed to open test_group/data-lv2: (2) No such file or directory
```

Using `self.abspath` fixes this error.

Fixes: https://tracker.ceph.com/issues/43970
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 148069a20fef79ef8fe510f06879a0de02987eac)

5 years agoMerge pull request #33238 from jan--f/wip-31700-notracker-nautilus
Jan Fajerski [Wed, 12 Feb 2020 12:13:54 +0000 (13:13 +0100)]
Merge pull request #33238 from jan--f/wip-31700-notracker-nautilus

nautilus: ceph-volume: refactor listing.py + fixes

5 years agoceph-volume: fix various lvm list issues 33238/head
Jan Fajerski [Thu, 6 Feb 2020 15:49:12 +0000 (16:49 +0100)]
ceph-volume: fix various lvm list issues

A single report on a non-lvm device now works.
The format was cleaned up; lvm journal, wal and db are reported only once.

Fixes: https://tracker.ceph.com/issues/44009
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 000bf2ffff57701952e2aa1a67a04e519c4d07a6)

5 years agoceph-volume: add get_device_lvs to easily retrieve all lvs per device
Jan Fajerski [Thu, 6 Feb 2020 15:47:08 +0000 (16:47 +0100)]
ceph-volume: add get_device_lvs to easily retrieve all lvs per device

Also drop the sep argument from get_lvs and siblings, unused.
Introduce LV_CMD_OPTIONS to unify options to lvs.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit ffe5b5732a46bd5ff888696adbfe53a38c982448)

5 years agoceph-volume: fix lvm list
Guillaume Abrioux [Wed, 5 Feb 2020 01:29:14 +0000 (02:29 +0100)]
ceph-volume: fix lvm list

17957d9beb42a04b8f180ccb7ba07d43179a41d3 introduced a regression in `lvm
list`.

When passing a vg/lv path for generating a single report, it fails
because the filter used in the `lvs` command isn't right. It uses the lv
name instead of the vg name because `os.path.basename(device)` is used
while it should be `os.path.dirname(device)`
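
The difference between the two calls can be seen on a sample vg/lv path. The
path below is only an illustrative value:

```python
import os.path

device = "test_group/data-lv2"  # a vg/lv path (illustrative value)

lv_name = os.path.basename(device)  # "data-lv2": the LV name
vg_name = os.path.dirname(device)   # "test_group": the VG name
```

A filter that expects the vg name therefore has to be built from
`os.path.dirname(device)`.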

Fixes: https://tracker.ceph.com/issues/43969
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0179fed3ab624830ba77349531763c3e116c82e5)

5 years agoceph-volume: delete test_lvs_list_is_created_just_once
Rishabh Dave [Fri, 6 Dec 2019 07:40:35 +0000 (13:10 +0530)]
ceph-volume: delete test_lvs_list_is_created_just_once

listing.py doesn't call api.Volumes anymore. Therefore, this test is
redundant.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 665ed2424b7bef4160289f0135acc015f8ea9980)

5 years agoceph-volume: update tests since listing.py got heavily modified
Rishabh Dave [Wed, 4 Dec 2019 07:28:19 +0000 (12:58 +0530)]
ceph-volume: update tests since listing.py got heavily modified

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d1ae6d1a8b495adfc0c512f08359e0db1590272d)

5 years agoceph-volume: refactor devices/lvm/listing.py
Rishabh Dave [Thu, 21 Nov 2019 12:34:25 +0000 (18:04 +0530)]
ceph-volume: refactor devices/lvm/listing.py

Get rid of duplicate and redundant code and use get_lvs, get_vgs and
get_pvs to simplify the module as much as possible.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit d02bd7dd581a4bd4041eb397fae540a18f16a88b)

5 years agoceph-volume: add new method in api/lvm.py
Rishabh Dave [Thu, 23 Jan 2020 14:17:21 +0000 (19:47 +0530)]
ceph-volume: add new method in api/lvm.py

The method determines whether given LV is managed by Ceph or not.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 876244b6ab2cf1fbd724fd33966501a3366c6d3f)

5 years agoceph-volume: add helper methods to get only first LVM devs
Rishabh Dave [Fri, 3 Jan 2020 10:14:04 +0000 (15:44 +0530)]
ceph-volume: add helper methods to get only first LVM devs

These convenience methods shorten the following snippet to
"lv = get_first_lv()":

    lvs = get_lvs()
    if len(lvs) >= 1:
        lv = lvs[0]

These methods do the same thing as the snippet above internally. Rewrite
listing.py to use these new helper methods.
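
A helper of this kind could look like the sketch below. `get_lvs` here is a
hypothetical stub returning a plain list, and the `lvs_getter` parameter exists
only to make the example testable; the real functions query LVM and their
signatures may differ.

```python
def get_lvs():
    """Hypothetical stub standing in for the real LVM query."""
    return ["lv-a", "lv-b"]


def get_first_lv(lvs_getter=get_lvs):
    """Return the first LV reported by lvs_getter, or None if there is none."""
    lvs = lvs_getter()
    return lvs[0] if len(lvs) >= 1 else None
```

Returning `None` for an empty list is an assumption of this sketch; it lets
callers write `lv = get_first_lv()` followed by a single `if lv:` check.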

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 17957d9beb42a04b8f180ccb7ba07d43179a41d3)

5 years agoceph-volume: filter based on tags for api.lvm.get_* methods
Rishabh Dave [Mon, 30 Dec 2019 07:10:49 +0000 (12:40 +0530)]
ceph-volume: filter based on tags for api.lvm.get_* methods

get_pvs, get_vgs and get_lvs must accept tags and filter volumes based
on tags.
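
As an in-memory sketch of what tag-based filtering means (the real methods may
well push the filter down to the LVM tools instead), assume each volume is a
dict carrying a "tags" dict. The function name and data layout are hypothetical.

```python
def filter_by_tags(volumes, tags=None):
    """Keep only volumes whose tags contain every key/value pair in `tags`.

    Hypothetical sketch: each volume is a dict with a "tags" dict.
    """
    if not tags:
        # No filter requested: return every volume.
        return list(volumes)
    return [v for v in volumes
            if all(v["tags"].get(key) == value for key, value in tags.items())]
```

For example, `filter_by_tags(volumes, {"ceph.osd_id": "0"})` would keep only
the volumes tagged for that OSD id.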

Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit fb1390964fdfd10815ab4a4302ad454169bc0d5f)

5 years agoMerge pull request #33232 from jan--f/wip-43871-nautilus
Jan Fajerski [Wed, 12 Feb 2020 11:28:39 +0000 (12:28 +0100)]
Merge pull request #33232 from jan--f/wip-43871-nautilus

nautilus: ceph-volume: batch bluestore fix create_lvs call

5 years agoMerge pull request #33231 from jan--f/wip-43849-nautilus
Jan Fajerski [Wed, 12 Feb 2020 11:28:22 +0000 (12:28 +0100)]
Merge pull request #33231 from jan--f/wip-43849-nautilus

nautilus: ceph-volume: add sizing arguments to prepare

5 years agomgr/volumes: fix py2 compat issue 33122/head
Ramana Raja [Tue, 11 Feb 2020 10:49:09 +0000 (05:49 -0500)]
mgr/volumes: fix py2 compat issue

Fix the following issue seen during upstream teuthology testing:
 File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 98, in load_config
   self.metadata_mgr = MetadataManager(self.fs, self.legacy_config_path, 0o640)
 File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 73, in legacy_config_path
   meta_config = "{0}.meta".format(m.digest().hex())
 AttributeError: 'str' object has no attribute 'hex'

This issue is not observed in master/octopus, as they only support
py3.
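
One way to build such a name portably is sketched below; it is not necessarily
the change made in this commit. On py2, `digest()` returns a `str`, which has
no `.hex()` method, while `hexdigest()` exists on both py2 and py3. The
function name and the choice of MD5 are assumptions for the example.

```python
import hashlib


def legacy_config_name(path):
    """Build a "<hex digest>.meta" name on both Python 2 and Python 3.

    Hypothetical helper; `path` must be bytes. MD5 is an assumed hash choice.
    """
    m = hashlib.md5()
    m.update(path)
    # hexdigest() avoids calling .hex() on the raw digest, which fails on py2.
    return "{0}.meta".format(m.hexdigest())
```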

Signed-off-by: Ramana Raja <rraja@redhat.com>