git.apps.os.sepia.ceph.com Git - ceph.git/log
3 years ago mgr/dashboard: Contact Info should be visible only when Ident channel is checked 45110/head
Sarthak0702 [Wed, 16 Feb 2022 12:45:35 +0000 (18:15 +0530)]
mgr/dashboard: Contact Info should be visible only when Ident channel is checked

Fixes: https://tracker.ceph.com/issues/54133
Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit 15211a6378a6fee9316f79ba0b27821891527c38)

 Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/telemetry/telemetry.component.ts
- `this.loading` used in Octopus instead of `this.loadingReady()`

3 years ago mgr/dashboard: telemetry activate: show ident fields when checked
Aaryan Porwal [Sun, 6 Jun 2021 22:08:37 +0000 (03:38 +0530)]
mgr/dashboard: telemetry activate: show ident fields when checked

Signed-off-by: Aaryan Porwal <aaryanporwal2233@gmail.com>
(cherry picked from commit ad5b3f200529fc0bc511ce99eed338afcaef6a62)

3 years ago mgr/dashboard: dashboard turns telemetry off when configuring report
Sarthak0702 [Thu, 10 Feb 2022 19:50:42 +0000 (01:20 +0530)]
mgr/dashboard: dashboard turns telemetry off when configuring report

Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit 97c57adf8565756dbf24f3c46ed3916303903fb7)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/telemetry/telemetry.component.ts
- `this.i18n()` was used in Octopus instead of `$localize`

3 years ago Merge pull request #44986 from badone/wip-octopus-ceph-ansible-move-to-stream
Brad Hubbard [Wed, 16 Feb 2022 03:26:57 +0000 (13:26 +1000)]
Merge pull request #44986 from badone/wip-octopus-ceph-ansible-move-to-stream

octopus: qa/ceph-ansible: Move to Centos Stream

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years ago Merge pull request #44791 from guits/wip-54023-octopus
Yuri Weinstein [Mon, 14 Feb 2022 20:08:50 +0000 (12:08 -0800)]
Merge pull request #44791 from guits/wip-54023-octopus

octopus: ceph-volume: improve mpath devices support

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years ago ceph-volume: fix typo in tests 44791/head
Guillaume Abrioux [Tue, 14 Dec 2021 10:08:48 +0000 (11:08 +0100)]
ceph-volume: fix typo in tests

This fixes 2 typos in ceph-volume tests.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit b07bd3e0e17021e0cf9773f916fad954f12254ed)

3 years ago doc/ceph-volume: fix a typo
Guillaume Abrioux [Tue, 14 Dec 2021 09:42:09 +0000 (10:42 +0100)]
doc/ceph-volume: fix a typo

This fixes a typo in ceph-volume documentation.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5d0a3cee5d7021dafd1e166e17946689b4bb90b7)

3 years ago ceph-volume: add a test `test_mpath_device_is_device`
Guillaume Abrioux [Tue, 14 Dec 2021 09:40:35 +0000 (10:40 +0100)]
ceph-volume: add a test `test_mpath_device_is_device`

This test checks that Device.is_device() returns True for a mpath device.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 0280ff6df09bc26107bc97446e9d5c18fbc582e9)

3 years ago ceph-volume: improve mpath devices support
Guillaume Abrioux [Tue, 14 Dec 2021 08:57:10 +0000 (09:57 +0100)]
ceph-volume: improve mpath devices support

ee8887f4c0ff4f91117f31b621b95c8d08019130 was intended to add
mpath device support to ceph-volume, but it missed the lvm batch scenario.
This also fixes the zapping of mpath devices prepared with `ceph-volume raw`.

Fixes: https://tracker.ceph.com/issues/52908
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 601ff7ed0a3ba5172b6bd886ca8ba2bd4d9e655a)

3 years ago Merge pull request #44974 from guits/wip-54245-octopus
Guillaume Abrioux [Mon, 14 Feb 2022 15:42:47 +0000 (16:42 +0100)]
Merge pull request #44974 from guits/wip-54245-octopus

octopus: ceph-volume: honour osd_dmcrypt_key_size option

3 years ago Merge pull request #44978 from ifed01/wip-ifed-clist-pend-bug-oct
Yuri Weinstein [Fri, 11 Feb 2022 22:41:42 +0000 (14:41 -0800)]
Merge pull request #44978 from ifed01/wip-ifed-clist-pend-bug-oct

octopus:  os/bluestore: list obj which equals to pend

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago os/bluestore: list obj which equals to pend 44978/head
Kefu Chai [Fri, 24 Sep 2021 15:33:03 +0000 (23:33 +0800)]
os/bluestore: list obj which equals to pend

otherwise we could have failures like

scrub : stat mismatch, got 3/4 objects, 1/2 clones, 3/4 dirty, 3/4 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 49/56 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes."

where the numbers of scrubbed objects, clones, dirty and omap objects are
always less than the corresponding totals, if the PG contains
object(s) whose hash happens to be 0xffffffff.

in this change, if the calculated hash of the upper bound is greater
than the maximum value representable by uint32_t, in addition to
setting the hash of the upper bound hobj to 0xffffffff, we also set the
nspace of the upper bound hobj to "\xff", so that the upper bound
is greater than an hobj whose hash happens to be 0xffffffff. please note,
the nspace of "\xff" is not an ASCII string, so it's not likely to be
less than a real-world nspace of an hobj.

with this new *greater* upper bound, we are able to include the previously
missing hobj when listing the objects in a PG, so the scrub no longer
complains that the number of objects does not match.
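
The boundary effect described above can be illustrated with a small sketch. It is a simplified model, not BlueStore code: objects are assumed to sort by a `(hash, nspace, name)` tuple, and `list_range` is a hypothetical helper standing in for a half-open range listing. Real hobject ordering involves more fields.

```python
# Simplified model (assumption): an object sorts by (hash, nspace, name).
MAX_HASH = 0xFFFFFFFF


def list_range(objects, lower, upper):
    """Return the objects in the half-open range [lower, upper), sorted."""
    return sorted(o for o in objects if lower <= o < upper)


objects = [
    (0x00000010, b"", b"a"),
    (0x7FFFFFFF, b"ns1", b"b"),
    (MAX_HASH, b"", b"c"),      # hash happens to be 0xffffffff
    (MAX_HASH, b"ns1", b"d"),   # same hash, non-empty namespace
]

lower = (0, b"", b"")
old_upper = (MAX_HASH, b"", b"")       # only the hash is clamped
new_upper = (MAX_HASH, b"\xff", b"")   # clamped hash plus nspace "\xff"

# With the old bound, objects whose hash is 0xffffffff fall outside the range.
missed = set(objects) - set(list_range(objects, lower, old_upper))

# With the new bound, every object is listed.
found = list_range(objects, lower, new_upper)
```

In this model the old bound misses both objects with hash 0xffffffff, while the bound with nspace `"\xff"` sorts after them and so includes them.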

Fixes: https://tracker.ceph.com/issues/52705
Signed-off-by: Mykola Golub <mykola.golub@clyso.com>
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit ffab13bcd9006c1f961a24b8016df9d1fe06ba1d)

 Conflicts:
src/os/bluestore/BlueStore.cc
 - get_coll_range function signature alignment

3 years ago Merge pull request #44614 from ifed01/wip-ifed-fix-ram-gridy-fsck-oct
Yuri Weinstein [Thu, 10 Feb 2022 14:37:21 +0000 (06:37 -0800)]
Merge pull request #44614 from ifed01/wip-ifed-fix-ram-gridy-fsck-oct

octopus: os/bluestore: make shared blob fsck much less RAM-greedy.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years ago os/bluestore: use scope_guard to log latency
Kefu Chai [Wed, 22 Sep 2021 16:42:33 +0000 (00:42 +0800)]
os/bluestore: use scope_guard to log latency

simpler this way, and avoid using `goto`.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 715a83822ebc1a3d102d1ec13323b69db0600719)

3 years ago ceph-volume/activate: load the config from lv tag 44974/head
Guillaume Abrioux [Thu, 10 Feb 2022 01:23:51 +0000 (02:23 +0100)]
ceph-volume/activate: load the config from lv tag

When `ceph-volume lvm trigger` is called with an OSD where the tag
`ceph.cluster_name` is not 'ceph', it fails.
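
A minimal sketch of the idea, assuming LV tags arrive as a comma-separated `key=value` string and that the config file lives at `/etc/ceph/<cluster>.conf`. Both helper names below are hypothetical and are not taken from ceph-volume.

```python
def parse_lv_tags(lv_tags):
    """Parse 'k1=v1,k2=v2' into a dict; entries without '=' are ignored."""
    tags = {}
    for item in lv_tags.split(","):
        key, sep, value = item.partition("=")
        if sep:
            tags[key.strip()] = value
    return tags


def conf_path_for(lv_tags, default_cluster="ceph"):
    """Pick the config path from the ceph.cluster_name tag (hypothetical helper)."""
    cluster = parse_lv_tags(lv_tags).get("ceph.cluster_name", default_cluster)
    return "/etc/ceph/{}.conf".format(cluster)
```

With this sketch, an OSD tagged `ceph.cluster_name=test` resolves to `/etc/ceph/test.conf` instead of the default `/etc/ceph/ceph.conf`.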

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 5ac1ec65cb2a582b2ae550202cc9911f993943f2)

3 years ago ceph-volume/tests: use centos/stream8 images
Guillaume Abrioux [Wed, 9 Feb 2022 17:33:27 +0000 (18:33 +0100)]
ceph-volume/tests: use centos/stream8 images

Following the recent move from CentOS 8 to CentOS Stream 8, let's do the same here.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 2b793952bbac7973b97d245c282165daadeabb51)

3 years ago ceph-volume/tests: add tests in util/encryption.py
Guillaume Abrioux [Wed, 9 Feb 2022 16:04:19 +0000 (17:04 +0100)]
ceph-volume/tests: add tests in util/encryption.py

this adds some unit tests in order to cover `luks_format()` and `luks_open()`
in `util/encryption.py`.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit db48850745f218e08cf53ae2d8edf3428f2b4010)

3 years ago ceph-volume: honour osd_dmcrypt_key_size option
Guillaume Abrioux [Tue, 25 Jan 2022 09:25:53 +0000 (10:25 +0100)]
ceph-volume: honour osd_dmcrypt_key_size option

ceph-volume doesn't honour osd_dmcrypt_key_size.
It means the default size is always applied.

It also changes the default value in `get_key_size_from_conf()`

From cryptsetup manpage:

> For XTS mode you can optionally set a key size of 512 bits with the -s option.

Using more than 512 bits results in the following error message:

```
Key size in XTS mode must be 256 or 512 bits.
```
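
The constraint quoted above can be sketched as a small validation step. This is an illustration only: `pick_key_size` is a hypothetical helper, and the default of 512 is an assumption, since the message does not state the new default used by `get_key_size_from_conf()`.

```python
VALID_XTS_KEY_SIZES = (256, 512)   # sizes accepted in XTS mode, per the error message
ASSUMED_DEFAULT_KEY_SIZE = 512     # assumption, not taken from the commit


def pick_key_size(conf, default=ASSUMED_DEFAULT_KEY_SIZE):
    """Return the configured osd_dmcrypt_key_size, or the default if unset."""
    raw = conf.get("osd_dmcrypt_key_size")
    if raw is None:
        return default
    size = int(raw)
    if size not in VALID_XTS_KEY_SIZES:
        raise ValueError(
            "Key size in XTS mode must be 256 or 512 bits, got {}".format(size)
        )
    return size
```

Rejecting an unsupported size up front is one possible design; falling back to the default with a warning would be another.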

Fixes: https://tracker.ceph.com/issues/54006
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 47c33179f9a15ae95cc1579a421be89378602656)

3 years ago octopus: qa/ceph-ansible: Move to Centos Stream 44986/head
Brad Hubbard [Thu, 10 Feb 2022 03:16:23 +0000 (13:16 +1000)]
octopus: qa/ceph-ansible: Move to Centos Stream

CentOS 8 is EOL and its package repos no longer exist.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
3 years ago Merge pull request #44860 from aclamk/wip-53392-octopus
Yuri Weinstein [Thu, 10 Feb 2022 01:10:58 +0000 (17:10 -0800)]
Merge pull request #44860 from aclamk/wip-53392-octopus

octopus: Fix data corruption in bluefs truncate()

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44929 from adk3798/octopus-cephadm-qa-centos-8-stream
Yuri Weinstein [Tue, 8 Feb 2022 17:26:29 +0000 (09:26 -0800)]
Merge pull request #44929 from adk3798/octopus-cephadm-qa-centos-8-stream

octopus: qa/suites/rados/cephadm: use centos 8.stream

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years ago Merge pull request #44864 from cbodley/wip-54089
Yuri Weinstein [Tue, 8 Feb 2022 15:21:49 +0000 (07:21 -0800)]
Merge pull request #44864 from cbodley/wip-54089

octopus: qa: remove centos8 from supported distros

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago qa/suites/rados/cephadm: remove centos 8.2, 8.3 44929/head
Adam King [Mon, 7 Feb 2022 18:18:17 +0000 (13:18 -0500)]
qa/suites/rados/cephadm: remove centos 8.2, 8.3

Signed-off-by: Adam King <adking@redhat.com>
3 years ago qa/suites/orch/cephadm: add 8.stream + container_tools
Sage Weil [Mon, 8 Nov 2021 17:01:45 +0000 (11:01 -0600)]
qa/suites/orch/cephadm: add 8.stream + container_tools

Signed-off-by: Sage Weil <sage@newdream.net>
Conflicts:
qa/suites/rados/cephadm/upgrade/1-start-distro/1-start-centos_8.stream_container-tools.yaml

3 years ago qa/rgw: rgw/verify no longer pins centos 8.0 44864/head
Casey Bodley [Mon, 31 Jan 2022 22:23:25 +0000 (17:23 -0500)]
qa/rgw: rgw/verify no longer pins centos 8.0

the symlink rgw/verify/centos_latest.yaml already selects centos

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 0fad609d4dca01335abda6c48ae2663a8fd15494)

3 years ago qa/distros: remove duplicate centos_8.stream.yaml from supported
Casey Bodley [Mon, 31 Jan 2022 19:52:04 +0000 (14:52 -0500)]
qa/distros: remove duplicate centos_8.stream.yaml from supported

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3b71b41190bbb0af5026babc82266541b6398e92)

3 years ago qa/distros: centos_8.yaml is now a symlink to centos_8.stream.yaml
Casey Bodley [Mon, 31 Jan 2022 19:51:00 +0000 (14:51 -0500)]
qa/distros: centos_8.yaml is now a symlink to centos_8.stream.yaml

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 0f4e51f05f9b340fe6128b46ea4601ecf01625d2)

3 years ago qa/distro/supported: add centos 8.stream
Sage Weil [Fri, 18 Jun 2021 23:07:30 +0000 (18:07 -0500)]
qa/distro/supported: add centos 8.stream

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 577cbd162ed63bcee9f027776d72d569d9adf93b)

3 years ago os/bluestore/bluefs: Fix data corruption in truncate() 44860/head
Adam Kupczyk [Tue, 2 Nov 2021 15:57:32 +0000 (16:57 +0100)]
os/bluestore/bluefs: Fix data corruption in truncate()

It is possible to create a condition in which BlueFS contains a corrupted file.
It can happen when the BlueFS replay log is on device A and we have just written to device B and truncated the file.

Scenario:
1) write to file h1 on SLOW device
2) flush h1 (initiate transfer, but no fdatasync yet)
3) truncate h1
4) write to file h2 on DB
5) fsync h2 (forces replay log to be written, after fdatasync to DB)
6) poweroff

Fixes: https://tracker.ceph.com/issues/53129
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 49b7b44b3b5c94ee401562e603999e2b3bd8f9a2)

3 years ago os/objectstore/test: Add test for data corruption in file truncation
Adam Kupczyk [Tue, 2 Nov 2021 15:56:10 +0000 (16:56 +0100)]
os/objectstore/test: Add test for data corruption in file truncation

Test for https://tracker.ceph.com/issues/53129

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 1f7771d4a77ebb271b939b3893d7607d964796f0)

3 years ago Merge pull request #44726 from cfsnyder/wip-53704-octopus
Yuri Weinstein [Thu, 27 Jan 2022 23:15:27 +0000 (15:15 -0800)]
Merge pull request #44726 from cfsnyder/wip-53704-octopus

octopus: osdc: add set_error in BufferHead, when split set_error to right

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar vshankar@redhat.com
3 years ago Merge pull request #44728 from cfsnyder/wip-51826-octopus
Ernesto Puerta [Thu, 27 Jan 2022 10:26:10 +0000 (11:26 +0100)]
Merge pull request #44728 from cfsnyder/wip-51826-octopus

octopus: qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password…

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: kevinzs2048 <NOT@FOUND>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
3 years ago Merge pull request #44544 from cfsnyder/wip-53660-octopus
Yuri Weinstein [Thu, 27 Jan 2022 00:05:09 +0000 (16:05 -0800)]
Merge pull request #44544 from cfsnyder/wip-53660-octopus

octopus: mon: prevent new sessions during shutdown

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44720 from cfsnyder/wip-53495-octopus
Yuri Weinstein [Thu, 27 Jan 2022 00:04:38 +0000 (16:04 -0800)]
Merge pull request #44720 from cfsnyder/wip-53495-octopus

octopus: mgr: fix locking for MetadataUpdate::finish

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44700 from cfsnyder/wip-53943-octopus
Yuri Weinstein [Thu, 27 Jan 2022 00:03:59 +0000 (16:03 -0800)]
Merge pull request #44700 from cfsnyder/wip-53943-octopus

octopus: mon/OSDMonitor: avoid null dereference if stats are not available

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44753 from idryomov/wip-rbd-mirror-delprop-races-octopus
Yuri Weinstein [Wed, 26 Jan 2022 23:57:47 +0000 (15:57 -0800)]
Merge pull request #44753 from idryomov/wip-rbd-mirror-delprop-races-octopus

octopus: rbd-mirror: fix races in snapshot-based mirroring deletion propagation

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
3 years ago Merge pull request #44741 from idryomov/wip-rbd-switch-arguments-fix-octopus
Yuri Weinstein [Wed, 26 Jan 2022 23:56:02 +0000 (15:56 -0800)]
Merge pull request #44741 from idryomov/wip-rbd-switch-arguments-fix-octopus

octopus: rbd: add missing switch arguments for recognition by get_command_spec()

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years ago Merge pull request #44722 from cfsnyder/wip-53534-octopus
Yuri Weinstein [Wed, 26 Jan 2022 23:54:06 +0000 (15:54 -0800)]
Merge pull request #44722 from cfsnyder/wip-53534-octopus

octopus: mon/MgrStatMonitor: do not spam subscribers (mgr) with service_map

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #44724 from cfsnyder/wip-53609-octopus
Yuri Weinstein [Wed, 26 Jan 2022 23:42:49 +0000 (15:42 -0800)]
Merge pull request #44724 from cfsnyder/wip-53609-octopus

octopus: os/bluestore: avoid premature onode release.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years ago Merge pull request #44730 from cfsnyder/wip-53850-octopus
Yuri Weinstein [Wed, 26 Jan 2022 19:24:20 +0000 (11:24 -0800)]
Merge pull request #44730 from cfsnyder/wip-53850-octopus

octopus: rgwlc:  remove lc entry on bucket delete

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #44167 from cfsnyder/wip-53290-octopus
Yuri Weinstein [Wed, 26 Jan 2022 16:33:25 +0000 (08:33 -0800)]
Merge pull request #44167 from cfsnyder/wip-53290-octopus

octopus: rgw: fix `bi put` not using right bucket index shard

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #44585 from vumrao/wip-vumrao-53877
Yuri Weinstein [Wed, 26 Jan 2022 16:15:58 +0000 (08:15 -0800)]
Merge pull request #44585 from vumrao/wip-vumrao-53877

octopus: osd/PeeringState: separate history's pruub from pg's

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #43446 from k0ste/wip-52849-octopus
Yuri Weinstein [Wed, 26 Jan 2022 16:15:20 +0000 (08:15 -0800)]
Merge pull request #43446 from k0ste/wip-52849-octopus

octopus: mgr: Add check to prevent mgr from crashing

Reviewed-by: Venky Shankar vshankar@redhat.com
3 years ago Merge pull request #43438 from trociny/wip-52833-octopus
Yuri Weinstein [Wed, 26 Jan 2022 16:14:10 +0000 (08:14 -0800)]
Merge pull request #43438 from trociny/wip-52833-octopus

octopus: osd: re-cache peer_bytes on every peering state activate

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #42677 from callithea/wip-51423-octopus
Yuri Weinstein [Wed, 26 Jan 2022 16:13:08 +0000 (08:13 -0800)]
Merge pull request #42677 from callithea/wip-51423-octopus

octopus: mgr: set debug_mgr=2/5 (so INFO goes to mgr log by default)

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years ago Merge pull request #44766 from guits/wip-53954-octopus
Guillaume Abrioux [Tue, 25 Jan 2022 15:01:39 +0000 (16:01 +0100)]
Merge pull request #44766 from guits/wip-53954-octopus

octopus: ceph-volume: don't use MultiLogger in find_executable_on_host()

3 years ago ceph-volume: don't use MultiLogger in find_executable_on_host() 44766/head
Guillaume Abrioux [Wed, 19 Jan 2022 14:04:20 +0000 (15:04 +0100)]
ceph-volume: don't use MultiLogger in find_executable_on_host()

This generates a lot of unnecessary messages on the terminal.

Fixes: https://tracker.ceph.com/issues/53934
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3be55621600be3ebc9c70295a3a351dab426b3a3)

3 years ago Merge pull request #44757 from guits/wip-53917-octopus
Guillaume Abrioux [Tue, 25 Jan 2022 09:49:14 +0000 (10:49 +0100)]
Merge pull request #44757 from guits/wip-53917-octopus

octopus: ceph-volume: fix regression introcuded via #43536

3 years ago Merge pull request #44709 from guits/wip-53961-octopus
Guillaume Abrioux [Mon, 24 Jan 2022 12:39:20 +0000 (13:39 +0100)]
Merge pull request #44709 from guits/wip-53961-octopus

octopus: ceph-volume: show RBD devices as not available

3 years ago test/rbd_mirror: drop redundant MockJournaler instances 44753/head
Ilya Dryomov [Fri, 21 Jan 2022 13:26:31 +0000 (14:26 +0100)]
test/rbd_mirror: drop redundant MockJournaler instances

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 303d3ede48a088b947cb99f6fe1b400a6b0871be)

3 years ago rbd-mirror: fix races in snapshot-based mirroring deletion propagation
Ilya Dryomov [Fri, 21 Jan 2022 12:41:46 +0000 (13:41 +0100)]
rbd-mirror: fix races in snapshot-based mirroring deletion propagation

When remote image is deleted, rbd-mirror can encounter three cases:

  1) no remote image id
  2) no remote mirror metadata
  3) MIRROR_IMAGE_STATE_DISABLING in remote mirror metadata

Commit d4c66ac5c615 ("rbd-mirror: fix issue with snapshot-based
mirroring deletion propagation") fixed case 1.  Cases 2 and 3 remained
broken because for both of them finalize_snapshot_state_builder() would
populate not only remote_mirror_peer_uuid but also remote_image_id,
thus disabling ENOLINK logic in handle_prepare_remote_image() and
handle_bootstrap().  Commit ff60aec2d9ef ("rbd-mirror: fix bootstrap
sequence while the image is removed") touched on case 3, but it made
a difference only for journal-based mirroring.

Stop calling finalize_snapshot_state_builder() on errors.  Instead,
align with journal-based mirroring by filling remote_mirror_peer_uuid
together with remote_mirror_uuid.

Fixes: https://tracker.ceph.com/issues/53963
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d634a1df5b19d61955f2f94c7cc29bd4f3b678c8)

3 years ago rbd-mirror: don't default replay_requires_remote_image() implementation
Ilya Dryomov [Fri, 21 Jan 2022 12:41:46 +0000 (13:41 +0100)]
rbd-mirror: don't default replay_requires_remote_image() implementation

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ccfbf3e97ed1f50df0adcbec812f1b11fe22cace)

3 years ago rbd-mirror: untangle StateBuilder::is_linked() overloads
Ilya Dryomov [Fri, 21 Jan 2022 12:41:46 +0000 (13:41 +0100)]
rbd-mirror: untangle StateBuilder::is_linked() overloads

Make it clear that the local image non-primariness is asserted
independent of the mode; avoid the default implementation being
overridden but still relied on by both modes.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit f49fa483ec6cdc19b4d60debefbb21bf65b7a385)

3 years ago rbd-mirror: drop redundant initialization of StateBuilder members
Ilya Dryomov [Thu, 20 Jan 2022 15:25:46 +0000 (16:25 +0100)]
rbd-mirror: drop redundant initialization of StateBuilder members

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit baf57925abdee287cfa0aefc5ba2f602dac8c25e)

3 years ago rbd: add missing switch arguments for recognition by get_command_spec() 44741/head
Ilya Dryomov [Wed, 19 Jan 2022 11:54:23 +0000 (12:54 +0100)]
rbd: add missing switch arguments for recognition by get_command_spec()

Currently this

  $ rbd --all children img

doesn't work, while this

  $ rbd children --all img

or this

  $ rbd children img --all

does.  The issue is that -a/--all isn't on the list of known switch
arguments.  The "rbd children" example may seem contrived but for more
complicated commands such as "rbd device map" mixing switches and
positional arguments occurs naturally:

  $ rbd device --device-type nbd --options try-netlink --show-cookie map img
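
The behaviour described above can be modelled with a short sketch. It is a simplified illustration, not the actual `get_command_spec()` code: `get_command_words` is a hypothetical function that assumes any option missing from the known-switch list consumes the following word as its value.

```python
def get_command_words(args, switches):
    """Collect the positional words that identify the command.

    Assumption for this sketch: an option that is not a known switch
    and has no '=value' part takes the next word as its value.
    """
    words = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith("-"):
            if "=" not in arg and arg not in switches:
                i += 1  # skip the word assumed to be this option's value
        else:
            words.append(arg)
        i += 1
    return words


# "--all" is unknown: it swallows "children", so the command is lost.
before = get_command_words(["--all", "children", "img"], switches=set())

# "--all" is a known switch: "children" is recognised as the command.
after = get_command_words(["--all", "children", "img"], switches={"--all"})
```

In this model `rbd children --all img` still yields `children` as the first word even when `--all` is unknown, which is consistent with that form working before the fix.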

Fixes: https://tracker.ceph.com/issues/53935
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e1b4811bc324236892e43a6bb841d6278fe1584e)

Conflicts:
src/tools/rbd/ArgumentTypes.h [ snapshot quiesce support
  not in octopus ]
src/tools/rbd/action/Device.cc [ nbd cookie support not in
  octopus ]
src/tools/rbd/action/Migration.cc [ import-only migration
  not supported in octopus ]
src/tools/rbd/action/Wnbd.cc [ wnbd support not in octopus ]

3 years ago Merge pull request #44689 from MrFreezeex/wip-53937-octopus
Yuri Weinstein [Fri, 21 Jan 2022 22:35:12 +0000 (14:35 -0800)]
Merge pull request #44689 from MrFreezeex/wip-53937-octopus

octopus: cls/journal: skip disconnected clients when calculating min_commit_position

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years ago Merge pull request #43806 from sunnyku/wip-53027-octopus
Yuri Weinstein [Fri, 21 Jan 2022 22:33:58 +0000 (14:33 -0800)]
Merge pull request #43806 from sunnyku/wip-53027-octopus

octopus: librbd/object_map: rbd diff between two snapshots lists entire image content

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
3 years ago rgwlc: remove lc entry on bucket delete 44730/head
Matt Benjamin [Tue, 4 Jan 2022 16:22:00 +0000 (11:22 -0500)]
rgwlc:  remove lc entry on bucket delete

Buckets with lifecycle policies installed have a state entry that
must also be deleted when the bucket is removed.

Fixes: https://tracker.ceph.com/issues/46728
N.b., should really be generic, not specific to the RADOS store, but
there doesn't seem to be a clean model for implementing generic side
effects in Zipper, currently.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit cc1e812a003e2af74fe0c69ccae08dd7aa68bbe0)

Conflicts:
src/rgw/rgw_sal_rados.cc

Cherry-pick notes:
- Code from rgw_sal_rados.cc existed in rgw_sal.cc in Octopus

3 years ago qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password.txt file 44728/head
Kevin Zhao [Thu, 22 Jul 2021 06:58:20 +0000 (07:58 +0100)]
qa/run-tox-mgr-dashboard: Do not write to /tmp/test_sanitize_password.txt file

To allow running multiple instances of the same tests.

Fixes: https://tracker.ceph.com/issues/51792
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
(cherry picked from commit d04ef800abd671a564795eba198ca976619b4cc7)

3 years ago osdc: add set_error in BufferHead, when split set_error to right 44726/head
jiawd [Thu, 11 Nov 2021 03:49:29 +0000 (03:49 +0000)]
osdc: add set_error in BufferHead, when split set_error to right

Fixes: https://tracker.ceph.com/issues/53227
Signed-off-by: jiawd <jiawendong@xtaotech.com>
(cherry picked from commit dba751ac0c0e9c8276a59ea3337b31fc71e26bf0)

3 years ago os/bluestore: avoid premature onode release. 44724/head
Igor Fedotov [Tue, 2 Nov 2021 12:03:39 +0000 (15:03 +0300)]
os/bluestore: avoid premature onode release.

This was observed when onode's removal is followed by reading
and the latter causes object release before the removal is finalized.
The root cause is an improper 'pinned' state assessment in Onode::get

More detailed overview is:
At some point Onode::get() might face the case when nref == 2 and pinned = true
which means parallel incomplete put is running on the onode - ref count is
decremented but pinned state is still unmodified (and even lock hasn't been
acquired yet).
This might finally result in two puts racing over the same onode with nref == 2
which finally results in a premature onode release:
  // nref =3, pinned = 1
  // Thread 1                   Thread 2
  //   o->put()                   o->get()
  //   --nref(n = 2, pinned=1)
  //                              nref++ (n=3, pinned = 1)
  //                              return
  //                              ...
  //                              o->put()
  //                              --nref(n = 2)
  //                              pinned = 0,
  //                              --nref(n = 1)
  //                              ocs->_unpin_and_rm(o) -> o->put()
  //                                ...
  //                                --nref(n = 0)
  //                                release o
  //  o->c->get_onode_cache()
  //  FAULT!
  //
The suggested fix is to introduce an additional atomic counter tracking
running put() functions, and to permit onode release when both the regular
nref and put_nref are equal to zero.

Fixes: https://tracker.ceph.com/issues/53002
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 96f0efe6d5307a55bea32f7216ef9511da0c5a47)

3 years ago mon/MgrStatMonitor: do not spam subscribers (mgr) with service_map 44722/head
Sage Weil [Thu, 2 Dec 2021 22:46:26 +0000 (17:46 -0500)]
mon/MgrStatMonitor: do not spam subscribers (mgr) with service_map

We are comparing the monmap epoch to the service_map epoch!

Fixes: https://tracker.ceph.com/issues/53479
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit d0f5ed13567f54b1bdedc388568d2afc5922ab70)

Conflicts:
src/mon/MgrStatMonitor.cc

Cherry-pick notes:
- MgrStatMonitor mon member was a pointer in Octopus

3 years ago mgr: fix locking for MetadataUpdate::finish 44720/head
Sage Weil [Wed, 24 Nov 2021 18:22:26 +0000 (13:22 -0500)]
mgr: fix locking for MetadataUpdate::finish

We need to hold the DaemonState lock here since we are both reading and
writing its content.

Fixes: https://tracker.ceph.com/issues/53393
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 5096685cd623de71a7c45a667e1dd776357fd997)

Conflicts:
src/mgr/Mgr.cc

Cherry-pick notes:
- state variable was declared outside of if condition in Octopus

3 years ago Merge pull request #44320 from guits/wip-53618-octopus
Guillaume Abrioux [Fri, 21 Jan 2022 12:51:09 +0000 (13:51 +0100)]
Merge pull request #44320 from guits/wip-53618-octopus

octopus: ceph-volume: make it possible to skip needs_root()

3 years ago ceph-volume: filter RBD devices from the device inventory 44709/head
Michael Fritch [Tue, 18 Jan 2022 22:15:45 +0000 (15:15 -0700)]
ceph-volume: filter RBD devices from the device inventory

Avoid running `blkid` or deploying OSDs on RBD devices by ensuring they
do not appear in the `ceph-volume inventory`

Fixes: https://tracker.ceph.com/issues/53846
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 47325ec3ec5ce1d53c5eae2952f631e95b7135fe)

3 years ago Merge pull request #44673 from idryomov/wip-diff-iterate-parent-tests-octopus
Yuri Weinstein [Thu, 20 Jan 2022 23:34:17 +0000 (15:34 -0800)]
Merge pull request #44673 from idryomov/wip-diff-iterate-parent-tests-octopus

octopus: backport diff-iterate include_parent tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago Merge pull request #44595 from idryomov/wip-xfstests-qemu-cert-octopus
Yuri Weinstein [Thu, 20 Jan 2022 21:01:33 +0000 (13:01 -0800)]
Merge pull request #44595 from idryomov/wip-xfstests-qemu-cert-octopus

octopus: qa/run_xfstests_qemu.sh: stop reporting success without actually running any tests

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago Merge pull request #44548 from cfsnyder/wip-53840-octopus
Yuri Weinstein [Thu, 20 Jan 2022 21:01:04 +0000 (13:01 -0800)]
Merge pull request #44548 from cfsnyder/wip-53840-octopus

octopus: librbd: diff-iterate reports incorrect offsets in fast-diff mode

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
3 years ago Merge pull request #43663 from MrFreezeex/wip-53031-octopus
Yuri Weinstein [Thu, 20 Jan 2022 20:59:32 +0000 (12:59 -0800)]
Merge pull request #43663 from MrFreezeex/wip-53031-octopus

octopus: rbd-mirror: fix mirror image removal

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years ago ceph-volume: fix regression introcuded via #43536 44757/head
Guillaume Abrioux [Mon, 10 Jan 2022 09:21:53 +0000 (10:21 +0100)]
ceph-volume: fix regression introcuded via #43536

The recent changes from PR #43536 introduced a regression that prevents
ceph-volume from running in a containerized context on Ubuntu 18.04.

The path of the `lvs` binary differs between CentOS 8 and Ubuntu 18.04
(`/usr/sbin/lvs` and `/sbin/lvs` respectively). This means that ceph-volume running
in a CentOS 8 container sees the `lvs` binary at `/usr/sbin/lvs` and tries to
run it with `nsenter` on the host, which is running Ubuntu 18.04.

Fixes: https://tracker.ceph.com/issues/53812
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 95e88cda3df76b59b548ae808df0ef7f19db1f63)
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 3c93ffdc92d4d03b9ae7415b548192a572cfc5ea)

3 years ago mon/OSDMonitor: avoid null dereference if stats are not available 44700/head
Josh Durgin [Fri, 7 Jan 2022 18:37:13 +0000 (13:37 -0500)]
mon/OSDMonitor: avoid null dereference if stats are not available

Not confirmed yet whether this was the issue in the bug referenced
below, however it's a necessary defensive check for the
'osd pool get-quota' command.

All other uses of get_pool_stats() already handle this case.

Related-to: https://tracker.ceph.com/issues/53740
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit 9c8392be33a574f15bc35b8f49e319af50d99e90)

Conflicts:
src/mon/OSDMonitor.cc

Cherry-pick notes:
- mon variable was a pointer in Octopus

3 years ago cls/journal: skip disconnected clients when finding min_commit_position 44689/head
Mykola Golub [Fri, 14 Jan 2022 18:21:29 +0000 (18:21 +0000)]
cls/journal: skip disconnected clients when finding min_commit_position

When a new journal client is registered, all already registered
clients are checked, and a client with min position is selected
as a position for the new client. Thus we may expect that
starting from the registered position all journal entries will be
available (not trimmed) for the new client.

But when looking for a min commit position, the client_register
function did not take into account that a registered client might
be in disconnected state, and in that case the journal entries
might be trimmed for this client.
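
The selection rule described above can be sketched in Python as follows. This is a simplified model, not the cls/journal code: the `Client` class is hypothetical and a commit position is reduced to a single integer.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Client:
    client_id: str
    commit_position: int  # simplified: real positions are richer structures
    connected: bool


def min_commit_position(clients: Iterable[Client]) -> Optional[int]:
    """Return the smallest commit position among connected clients.

    Disconnected clients are skipped because entries at their position
    may already have been trimmed.
    """
    positions = [c.commit_position for c in clients if c.connected]
    return min(positions) if positions else None
```

In this model a disconnected client at position 5 and connected clients at 20 and 30 yield 20, so a newly registered client would not start at a position whose entries may be gone.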

Fixes: https://tracker.ceph.com/issues/53888
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 078d72e5e6cfa41f809045ff03971ac8acf0d31e)

3 years agotest/librbd: make diff-iterate clone tests exercise fast-diff mode 44673/head
Ilya Dryomov [Fri, 7 Jan 2022 12:31:08 +0000 (13:31 +0100)]
test/librbd: make diff-iterate clone tests exercise fast-diff mode

The fast-diff feature wasn't propagated to the clone so these tests
were exercising the slow list_snaps path no matter what RBD_FEATURES
value was supplied to ceph_test_librbd.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ceb13d76f2b3aba7209e85f3354970c072997742)

Conflicts:
src/test/librbd/test_librbd.cc [ commit d1c82d55827e ("librbd:
  enable image cache after getting exclusive lock") not in
  octopus ]

3 years agolibrbd: restore diff-iterate include_parent functionality in fast-diff mode
Ilya Dryomov [Wed, 5 Jan 2022 19:24:40 +0000 (20:24 +0100)]
librbd: restore diff-iterate include_parent functionality in fast-diff mode

Commit 4429ed4f3f4c ("librbd: switch diff iterate API to use new snaps
list dispatch methods") removed the recursive execute() call.  The new
list_snaps method does indeed handle parent diffs internally but it is
not used in fast-diff mode.  Nothing changed there -- we still need to
load the parent object map, calculate parent object_diff_state, etc.

Fixes: https://tracker.ceph.com/issues/53787
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 04293bef6ccd2b9ca3db53906b63c952e235cdb4)

Conflicts:
src/librbd/api/DiffIterate.cc [ drop all changes, bring in
  just the new test ]

3 years agolibrbd: diff-iterate reports incorrect offsets if whole_object=true 44548/head
Ilya Dryomov [Wed, 19 Jan 2022 20:08:01 +0000 (21:08 +0100)]
librbd: diff-iterate reports incorrect offsets if whole_object=true

It turns out that in octopus both fast-diff and list-snaps (slow)
modes were broken.  As long as whole_object=true, the same incorrect
offset was reported in both modes.  The fast-diff mode is fixed in
the previous commit.

This is an octopus-only patch for list-snaps mode.  In pacific this
issue was addressed with 4429ed4f3f4c ("librbd: switch diff iterate
API to use new snaps list dispatch methods").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #44373 from rhcs-dashboard/cypress-octopus
Ernesto Puerta [Tue, 18 Jan 2022 19:59:00 +0000 (20:59 +0100)]
Merge pull request #44373 from rhcs-dashboard/cypress-octopus

octopus: mgr/dashboard: upgrade Cypress to the latest stable version

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
3 years agoos/bluestore: make shared blob fsck much less RAM-greedy. 44614/head
Igor Fedotov [Tue, 26 Oct 2021 10:35:00 +0000 (13:35 +0300)]
os/bluestore: make shared blob fsck much less RAM-greedy.

Fixes: https://tracker.ceph.com/issues/44924
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 8fddc8464ee4dbb0ee22a10de21e8c16f38bf1ed)
(cherry picked from commit a902d22b6c785099c704a229db1dc1e6fefee3e2)

Conflicts:
src/common/options/global.yaml.in
src/os/bluestore/BlueStore.cc
src/os/bluestore/BlueStore.h
src/os/bluestore/bluestore_types.h
src/test/objectstore/store_test.cc
caused by lack of ZNS stuff and using options.cc for config parameter
definitions rather than yaml file(s)

3 years agomgr/dashboard: upgrade Cypress to the latest stable version 44373/head
Alfonso Martínez [Tue, 23 Nov 2021 14:17:54 +0000 (15:17 +0100)]
mgr/dashboard: upgrade Cypress to the latest stable version

- Remove unneeded dependency that was causing UI performance issues: zone.js
- Ignore 'ResizeObserver loop limit exceeded' error.
- run-frontend-e2e-tests.sh refactoring: create rgw dashboard user through
  'ceph dashboard set-rgw-credentials' and use it on rgw buckets' tests.

Fixes: https://tracker.ceph.com/issues/53357
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 3e4e29590aa1742fc3b44d21389325a13cca8199)

Conflicts:
src/pybind/mgr/dashboard/frontend/cypress/integration/rgw/buckets.e2e-spec.ts
   Reject the current changes
src/pybind/mgr/dashboard/frontend/cypress/integration/rgw/buckets.po.ts
   Reject the current changes
src/pybind/mgr/dashboard/frontend/cypress/integration/ui/navigation.po.ts
   Deleted this file since it's not in octopus
src/pybind/mgr/dashboard/frontend/package-lock.json
   Generated new file
src/pybind/mgr/dashboard/frontend/package.json
   Kept zone.js and changed the cypress version to 9.0.0
src/pybind/mgr/dashboard/run-frontend-e2e-tests.sh
   Accept the current change

3 years agoqa/tasks/qemu: get the new Let's Encrypt root certificate 44595/head
Ilya Dryomov [Tue, 11 Jan 2022 20:26:12 +0000 (21:26 +0100)]
qa/tasks/qemu: get the new Let's Encrypt root certificate

Fixes: https://tracker.ceph.com/issues/53841
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b47965b5773d086eb64e7f91bdc05f483f562b00)

3 years agoqa/run_xfstests_qemu.sh: harden against wget failures
Ilya Dryomov [Tue, 11 Jan 2022 12:13:01 +0000 (13:13 +0100)]
qa/run_xfstests_qemu.sh: harden against wget failures

If wget fails (e.g. due to a certificate issue), it still creates
an empty file.  Then this file is marked executable, ./"${SCRIPT}"
immediately returns 0 and run_xfstests_qemu.sh exits successfully
without running a single xfstest.
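
One possible guard against this failure mode is sketched below in Python. It is only an illustration and not the change made to the shell script: `script_is_usable` is a hypothetical helper that refuses to treat an empty download as a runnable script.

```python
import os


def script_is_usable(path: str) -> bool:
    """Return True only if `path` is a regular, non-empty file.

    A failed download that leaves an empty file behind is rejected
    here instead of being executed and trivially returning 0.
    """
    return os.path.isfile(path) and os.path.getsize(path) > 0
```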

This started on Sep 30, 2021 with the expiration of Let's Encrypt
root certificate -- all qemu jobs with "test: qa/run_xfstests_qemu.sh"
just booted the VM for a couple of seconds and reported success.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 387be947948ff1dd40e88ae5288b9a52c7cde403)

3 years agoosd/PeeringState: separate history's pruub from pg's 44585/head
Sage Weil [Thu, 18 Nov 2021 20:46:06 +0000 (15:46 -0500)]
osd/PeeringState: separate history's pruub from pg's

(pruub = prior_readable_until_ub [upper bound])

During peering, a primary may conclude that it does not need to wait for
the prior interval(s)' read lease because it will query all such osds.
However, it is dangerous to reflect that local inference about future
peering effects in the info.history, which is freely shared with other
OSDs.  For example, if the primary cleared the history pruub, shared it,
and then failed, the next primary may conclude that it does not need to
wait for the lease to expire.

Instead, track the pruub in the conventional way.  Only at the end of
peering do we clear it (if there are no prior_interval_down_osds) before
sending our final activate infos out, when we have already talked to the
peer osds and they know the prior interval has finished.

Fixes: https://tracker.ceph.com/issues/53326
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 96d6bf22a5dbf14da1de0ee6128585b9dd9d60d8)

3 years agoMerge pull request #43822 from trociny/wip-48925-octopus
Yuri Weinstein [Wed, 12 Jan 2022 21:02:49 +0000 (13:02 -0800)]
Merge pull request #43822 from trociny/wip-48925-octopus

octopus: cephadm: Fix iscsi client caps (allow mgr <service status> calls)

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
3 years agoMerge pull request #43788 from sebastian-philipp/backport-43039
Yuri Weinstein [Wed, 12 Jan 2022 21:02:09 +0000 (13:02 -0800)]
Merge pull request #43788 from sebastian-philipp/backport-43039

octopus: qa/distros: Remove stale kubic distros

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
3 years agorbd-mirror: make RemoveImmediateUpdate test synchronous 43663/head
Arthur Outhenin-Chalandre [Tue, 23 Nov 2021 14:25:46 +0000 (15:25 +0100)]
rbd-mirror: make RemoveImmediateUpdate test synchronous

Try fixing sporadic failure linked in the tracker in
TestMockMirrorStatusUpdater.RemoveImmediateUpdate by making it
synchronous.

Fixes: https://tracker.ceph.com/issues/53375
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 9385acfc25a2bd0e214b4191109b7ed84f5826b4)

3 years agorbd-mirror: remove image_map next_state if sets to the same state
Arthur Outhenin-Chalandre [Fri, 6 Aug 2021 13:54:38 +0000 (15:54 +0200)]
rbd-mirror: remove image_map next_state if sets to the same state

In some cases, set_state is called with DISSOCIATING, then ASSOCIATING
and DISSOCIATING again. In this case the state DISSOCIATING is
processed to remove the image and then schedule the next action which is
associating.

To fix this case, this commit removes the next_state if the state is
set to the same state.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit b664a95889b35d0d3afbd8428e3af4976d7f81eb)

3 years agorbd-mirror: handle disabling/creating image in PrepareLocalImageRequest
Arthur Outhenin-Chalandre [Thu, 29 Jul 2021 09:54:45 +0000 (11:54 +0200)]
rbd-mirror: handle disabling/creating image in PrepareLocalImageRequest

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 965bc4150eafc8e3bbe69f63beea9c7fbb20ceb6)
Conflicts:
src/tools/rbd_mirror/image_replayer/PrepareLocalImageRequest.cc
- Trivial conflict resolution; s/librbd::asio:://

3 years agorbd-mirror: fix bootstrap sequence while the image is removed
Arthur Outhenin-Chalandre [Wed, 28 Jul 2021 12:14:47 +0000 (14:14 +0200)]
rbd-mirror: fix bootstrap sequence while the image is removed

If the image is being removed the PrepareRemoteImageRequest was
returning the same error if the image was disabled or non primary which
doesn't allow the BootstrapRequest to have the correct error handling.

This commit fixes this behavior by considering that the remote image is
already deleted if the image is in disabling state.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit ff60aec2d9efa1842383ba0a5c3bd6b5a29389c6)

3 years agorbd-mirror: remove image_mapped condition to remove image_map
Arthur Outhenin-Chalandre [Thu, 22 Jul 2021 16:53:16 +0000 (18:53 +0200)]
rbd-mirror: remove image_mapped condition to remove image_map

In some split-brain scenarios the image is removed while image_mapped
is false. This prevents the removal of image_map in OMAP and thus the
entry will not be removed until the daemon is restarted.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 35398a5e17dc5a536ccd63417c937f2efe742654)

3 years agocls/rbd: prevent image_status when mirror image is not created
Arthur Outhenin-Chalandre [Thu, 22 Jul 2021 14:37:47 +0000 (16:37 +0200)]
cls/rbd: prevent image_status when mirror image is not created

This prevents image_status_set from succeeding when there is no mirror
image yet. This resolves some stale entries that were not removed in
rbd-mirror and prevents adding entries that would not be visible from
the rbd cli.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 416e25794af0252ec45b35c897c8cf2e91aca383)

3 years agorbd-mirror: add image_map cleanup in LoadRequest
Arthur Outhenin-Chalandre [Tue, 13 Jul 2021 12:19:49 +0000 (14:19 +0200)]
rbd-mirror: add image_map cleanup in LoadRequest

In the LoadRequest in the ImageMap class, add initial cleanup to remove
stale entries. To clean up, the LoadRequest will query the mirror image
list and remove all the image_map entries that are not in the list.
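
The cleanup rule can be sketched in Python as a set difference. This is a simplified model and not the rbd-mirror code: `find_stale_image_map_entries` is a hypothetical helper, and the image_map is reduced to a dictionary keyed by image id.

```python
from typing import Dict, Iterable, List


def find_stale_image_map_entries(
    image_map: Dict[str, str], mirror_image_ids: Iterable[str]
) -> List[str]:
    """Return image_map keys that have no matching mirror image.

    `image_map` maps an image id to whatever the map stores for it;
    only the keys matter for deciding which entries are stale.
    """
    known = set(mirror_image_ids)
    return sorted(image_id for image_id in image_map if image_id not in known)
```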

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit e135403c736295b63fe1c8a861af40de302b8b57)

3 years agoqa/rbd-mirror: add OMAP cleanup checks
Arthur Outhenin-Chalandre [Fri, 11 Jun 2021 07:29:59 +0000 (09:29 +0200)]
qa/rbd-mirror: add OMAP cleanup checks

This makes sure that all images are deleted in the existing qa scripts
and checks that all rbd-mirror metadata in OMAP is correctly deleted.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 4db66da51211504ba0a2353180ae084ba1ab3fcf)

3 years agorbd-mirror: remove mirror image at shut_down when there is no images
Arthur Outhenin-Chalandre [Fri, 25 Jun 2021 08:15:23 +0000 (10:15 +0200)]
rbd-mirror: remove mirror image at shut_down when there is no images

In some cases the ImageReplayer is eternally restarted if there are
no local and remote images.

If both images are absent and the local image id exists, the
ImageReplayer shutdown will request a mirror image removal.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 0c1c7fb886fcaaff5f00937cf62cf69feb8d4deb)

Conflicts:
src/tools/rbd_mirror/image_deleter/TrashMoveRequest.cc

Added a condition to handle the case where m_image_ctx is null on
close_image and handle_close_image in the TrashMoveRequest. This fix is
not needed in newer versions of Ceph as ImageCtx no longer needs to be
destroyed explicitly with a destroy method after Octopus.

3 years agorbd-mirror: add mirror status removal on ImageReplayer shutdown
Arthur Outhenin-Chalandre [Mon, 7 Jun 2021 12:58:03 +0000 (14:58 +0200)]
rbd-mirror: add mirror status removal on ImageReplayer shutdown

In a scenario where you have rbd-mirror daemons on both clusters, the
rbd-mirror daemon on the primary site will not properly clean up its
status on image removal.

This commit adds a path for direct removal at the shut_down of the
ImageReplayer to properly clean up the metadata.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit a538c5d279c90397d375668baddd65776d2462b0)

Conflicts:
src/test/rbd_mirror/test_mock_MirrorStatusUpdater.cc
- Trivial conflict resolution; io_ctx exec has 1 less argument

3 years agocls/rbd: add mirror_image_status_remove on client
Arthur Outhenin-Chalandre [Mon, 7 Jun 2021 10:53:48 +0000 (12:53 +0200)]
cls/rbd: add mirror_image_status_remove on client

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 0e147f779d73d0688c2b89428db6012ed1560f20)

Conflicts:
src/cls/rbd/cls_rbd_client.h
- Trivial conflict resolution

3 years agorbd-mirror: fix mirror image removal
Arthur Outhenin-Chalandre [Fri, 4 Jun 2021 16:29:37 +0000 (18:29 +0200)]
rbd-mirror: fix mirror image removal

Invoke ImageRemoveRequest instead of calling mirror_image_remove
directly so that the MirroringWatcher can pick up local
image deletion.

Fixes: https://tracker.ceph.com/issues/51031
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 34082b7ee48a33e566348395395858e1e0db3013)

Conflicts:
src/test/rbd_mirror/image_deleter/test_mock_TrashMoveRequest.cc
- Trivial conflict resolution

3 years agolibrbd: diff-iterate reports incorrect offsets in fast-diff mode
Ilya Dryomov [Tue, 4 Jan 2022 19:38:35 +0000 (20:38 +0100)]
librbd: diff-iterate reports incorrect offsets in fast-diff mode

If rbd_diff_iterate2() is called on an image offset that doesn't
correspond to an object boundary, the callback is invoked with an
incorrect image offset.  For example, assuming a fully allocated
image, a diff request for 806354944~57344 results in offs=807403520,
len=57344, exists=true invocation, which is ahead by 1048576 bytes.
This occurs only in fast-diff mode; for a diff request on an image
with the fast-diff feature disabled, or if the whole_object parameter
is set to false, the invocation is correct.
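
The numbers in the example above can be checked with a short Python sketch. The 4 MiB object size is an assumption (the message does not state it), and `split_offset` is a hypothetical helper.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size of 4 MiB


def split_offset(image_offset: int, object_size: int = OBJECT_SIZE):
    """Split an image offset into (object number, offset within object)."""
    return divmod(image_offset, object_size)


requested = 806354944  # offset from the diff request in the message
reported = 807403520   # offset passed to the callback in the message

object_no, offset_in_object = split_offset(requested)
discrepancy = reported - requested

# Under the assumed object size, the request starts 1048576 bytes into
# object 192, and the callback offset is ahead by that same amount.
assert (object_no, offset_in_object) == (192, 1048576)
assert discrepancy == 1048576
```

Under that assumption the request does not start on an object boundary, and the reported discrepancy equals the request's offset within its object.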

This bug goes back to the introduction of fast-diff mode in commit
6d5b969d4206 ("librbd: add diff_iterate2 to API").

Fixes: https://tracker.ceph.com/issues/53784
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ea07d1e834018c693fc03637d338806f3c2f494f)

Conflicts:
src/librbd/api/DiffIterate.cc

Cherry-pick notes:
- Octopus still has explicit iterator syntax in for loop for extents

3 years agomon: prevent new sessions during shutdown 44544/head
Sage Weil [Thu, 16 Dec 2021 15:24:46 +0000 (10:24 -0500)]
mon: prevent new sessions during shutdown

From shutdown() we set STATE_SHUTDOWN and then call remove_all_sessions().
ms_handle_accept() is the only caller of add_session, so verifying that
we aren't shutting down (while under the session_map_lock) is sufficient
to prevent any new sessions from being added.
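
The locking argument can be modeled with a minimal Python sketch. It is not the monitor's C++ code: `SessionRegistry` and its members are hypothetical names, with one lock standing in for session_map_lock and a flag standing in for STATE_SHUTDOWN.

```python
import threading


class SessionRegistry:
    """Toy model: one lock guards both the shutdown flag and the sessions."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._shutting_down = False
        self._sessions = set()

    def add_session(self, session_id: str) -> bool:
        # The flag is checked under the same lock that shutdown() holds,
        # so no session can be added once shutdown has started.
        with self._lock:
            if self._shutting_down:
                return False
            self._sessions.add(session_id)
            return True

    def shutdown(self) -> None:
        with self._lock:
            self._shutting_down = True
            self._sessions.clear()

    def session_count(self) -> int:
        with self._lock:
            return len(self._sessions)
```

Because there is a single entry point for adding sessions in this model, checking the flag there is enough to keep the registry empty after `shutdown()` returns.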

Fixes: https://tracker.ceph.com/issues/39150
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit c98b268847a1b79dbd1693f1c5ba120f6fc05855)

3 years agoMerge pull request #43785 from lxbsz/wip-51415
Yuri Weinstein [Fri, 7 Jan 2022 16:48:50 +0000 (08:48 -0800)]
Merge pull request #43785 from lxbsz/wip-51415

Octopus: mds: just respawn mds daemon when osd op requests timeout

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44271 from nmshelke/wip-53331-octopus
Yuri Weinstein [Fri, 7 Jan 2022 16:22:23 +0000 (08:22 -0800)]
Merge pull request #44271 from nmshelke/wip-53331-octopus

octopus: doc: prerequisites fix for cephFS mount

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #44270 from vshankar/tr-53444
Yuri Weinstein [Fri, 7 Jan 2022 16:21:48 +0000 (08:21 -0800)]
Merge pull request #44270 from vshankar/tr-53444

octopus: qa: account for split of the kclient "metrics" debugfs file

Reviewed-by: Xiubo Li <xiubli@redhat.com>