git-server-git.apps.pok.os.sepia.ceph.com Git

qa: add "failover / failback loop" test for rbd-mirror

For snapshot-based mirroring, check that demote (or other mirror)
snapshots don't pile up.  There is nothing in particular to assert on
for journal-based mirroring, but the test is still useful.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 153df2d64b8cb51b5ce7559576613788795a4967)

librbd: make CreatePrimaryRequest remove any unlinked mirror snapshots

After commit ac552c9b4d65 ("librbd: localize snap_remove op for mirror
snapshots"), rbd-mirror daemon no longer removes mirror snapshots when
it's done syncing them -- instead it only unlinks from them.  However,
CreatePrimaryRequest state machine was not adjusted to compensate and
hence two cases were missed:

- primary demotion snapshot (rbd-mirror daemon unlinks from primary
  demotion snapshots just like it does from regular primary snapshots);
  this comes up when an image is demoted but then promoted on the same
  cluster

- non-primary demotion snapshot (unlike regular non-primary snapshots,
  non-primary demotion snapshots store peer uuids and rbd-mirror daemon
  does unlinking just like in the case of primary snapshots); this
  comes up when an image is demoted and promoted on the other cluster

Related is the case of orphan snapshots.  Since they are dummies to
begin with, CreatePrimaryRequest now cleans up the orphan snapshot
after creating the force promote snapshot.

Fixes: https://tracker.ceph.com/issues/61707
Co-authored-by: Christopher Hoffman <choffman@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9c05d3d81f4b06af2cfd47376e9ad86369bdf8cf)

Conflicts:
src/librbd/mirror/snapshot/CreatePrimaryRequest.cc [ commit
  3a93b40721a1 ("librbd: s/boost::variant/std::variant/") not
  in pacific ]

librbd: don't attempt to remove image state on orphan snapshots

Despite being mirror snapshots, orphan snapshots don't have image
state: see CreateNonPrimaryRequest::write_image_state() for a similar
is_orphan() check.  Attempting to remove image state generates bogus
"failed to read image state object" and "failed to remove image state"
errors.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit cfae3f79bd0513e2753b0deb8c2624ab07cf2d1b)

Conflicts:
src/librbd/operation/SnapshotRemoveRequest.cc [ commit
  3a93b40721a1 ("librbd: s/boost::variant/std::variant/") not
  in pacific ]

Merge PR #53189 into pacific

* refs/pull/53189/head:
mgr: register OSDs in ms_handle_accept
msg: indicate ms_handle_authentication is fast

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #53210 from ceph/pacific-release

v16.2.14

Merge pull request #53215 from rhcs-dashboard/wip-62632-pacific

pacific: mgr/dashboard: fix rgw page issues when hostname not resolvable

Reviewed-by: Avan Thakkar <athakkar@redhat.com>

mgr/dashboard: fix rgw page issues when hostname not resolvable

Part of the fix is copied from https://github.com/ceph/ceph/pull/47495/files#diff-ba538532bb5450d415ab80916916bd68fe51d195e57c9aa34effaa8789d7a392R705

Fixes: https://tracker.ceph.com/issues/62396
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 78cfeb6372707ec3f997d28ad617367feb3a983e)

Conflicts:
src/pybind/mgr/dashboard/services/rgw_client.py
- only keep the CephService import

16.2.14

Signed-off-by: Ceph Release Team <ceph-maintainers@ceph.io>

Merge pull request #53202 from rhcs-dashboard/wip-62618-pacific

pacific: mgr/dashboard: set CORS header for unauthorized access

Reviewed-by: Pedro Gonzalez Gomez <pegonzal@redhat.com>

mgr/dashboard: allow CORS for unauthorized access

Fixes: https://tracker.ceph.com/issues/62612
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 8158bdab7134714dc2a9f155e599cc2838c3358d)

Merge pull request #53197 from ljflores/wip-liburing-pacific

pacific: make-dist: download liburing from kernel.io instead of github

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

make-dist: download liburing from kernel.io instead of github

Due to a bug with github.com, wget does not reliably download
packages. See https://github.com/orgs/community/discussions/65227

This change is motivated by this error that occurs when trying
to download liburing from github.com on an ubuntu jammy machine:
```
$ wget https://github.com/axboe/liburing/archive/liburing-0.7.tar.gz
--2023-08-28 21:26:02-- https://github.com/axboe/liburing/archive/liburing-0.7.tar.gz
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/axboe/liburing/tar.gz/refs/tags/liburing-0.7 [following]
--2023-08-28 21:26:02-- https://codeload.github.com/axboe/liburing/tar.gz/refs/tags/liburing-0.7
Resolving codeload.github.com (codeload.github.com)... 140.82.112.10
Connecting to codeload.github.com (codeload.github.com)|140.82.112.10|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-08-28 21:26:02 ERROR 403: Forbidden.
```

The same does not happen on centos 8 or fedora.

Downloading from kernel.io works on every distro.

Signed-off-by: Laura Flores <lflores@ibm.com>
(cherry picked from commit 7f109dc612d9590f84a8fe7a047a78c7d621fcab)

Merge pull request #53157 from ljflores/wip-62591-pacific

pacific: python-common: drive_selection: fix KeyError when osdspec_affinity is not set

mgr: register OSDs in ms_handle_accept

It's a no-no to acquire locks in these "fast" messenger methods. This
can lead to messenger slowdowns in the best case as it's blocking reads
on the wire. In the worst case, the messenger may deadlock with other
threads, preventing any further message reads off the wire.

It's not obvious this method is "fast" so I've added a comment regarding
this.

Fixes: https://tracker.ceph.com/issues/61874
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 69980823e62f67d502c4045e15c41c5c44cd5127)

msg: indicate ms_handle_authentication is fast

Like other fast Dispatcher methods, it must not acquire locks or do
anything that might take a long time.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3e2075103a0ab6b7ced5800db1d44d13b1c8b7e6)

python-common: drive_selection: fix KeyError when osdspec_affinity is not set

When osdspec_affinity is not set, the drive selection code will fail.
This can happen when a device has multiple LVs where some are used
by Ceph and at least one LV isn't used by Ceph.
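
The failure mode can be illustrated with a minimal sketch. The LV
records and the `affinity_of` helper below are hypothetical and are not
the actual drive selection code; the point is only that indexing a
missing key raises KeyError while `dict.get` does not.

```python
from typing import Dict, List, Optional

# Hypothetical LV records for one device: two LVs used by Ceph and one
# unrelated LV that carries no osdspec_affinity entry at all.
lvs: List[Dict[str, str]] = [
    {"name": "osd-block-0", "osdspec_affinity": "my_spec"},
    {"name": "osd-block-1", "osdspec_affinity": "my_spec"},
    {"name": "scratch"},
]


def affinity_of(lv: Dict[str, str]) -> Optional[str]:
    """Return the LV's osdspec_affinity, or None when it is not set."""
    # lv["osdspec_affinity"] would raise KeyError for the "scratch" LV.
    return lv.get("osdspec_affinity")


affinities = [affinity_of(lv) for lv in lvs]
print(affinities)
```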

Fixes: https://tracker.ceph.com/issues/58946
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit 908f1d17a15a9d4c9bf603aa45e5f246bb0263e7)

Merge pull request #51509 from mchangir/wip-59202-pacific

pacific: qa: add subvolume option flavors

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52979 from k0ste/wip-50915-pacific

pacific: mds: fix cpu_profiler asok crash

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52974 from batrick/wip-62421-pacific

pacific: mds: adjust cap acquisition throttles

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52270 from dparmar18/wip-61841-pacific

pacific: do not evict clients if OSDs are laggy

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>

Merge pull request #52753 from mchangir/wip-61793-pacific

pacific: mgr/snap_schedule: catch all exceptions for cli

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52552 from cfsnyder/wip-62064-pacific

pacific: rgw: fix consistency bug with OLH objects

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #52878 from idryomov/wip-52913-pacific

pacific: rbd-mirror: fix image replayer shut down description on force promote

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #52625 from nbalacha/wip-62111-pacific

pacific: rbd-mirror: fix race preventing local image deletion

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>

Merge pull request #52883 from ajarr/wip-59737-pacific

pacific: mgr: store names of modules that register RADOS clients in the MgrMap

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #51464 from ajarr/wip-59712-pacific

pacific: mgr/rbd_support: fixes related to recover from rados client blocklisting

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

rgw: fix/improve test_rgw_versioning.py tests

Signed-off-by: Cory Snyder <csnyder@1111systems.com>
(cherry picked from commit aa1f40e80d78ce08a2f51dad6fadf620866d17a3)

Merge pull request #52943 from YiteGu/fix-duplicate-onode-miss-statistics

pacific: os/bluestore: don't need separate variable to mark hits when lookup oid.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #51487 from vshankar/tr-59720

pacific: client: use deep-copy when setting permission during make_request

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52953 from k0ste/wip-62116-pacific

pacific: qa: use parallel gzip for compressing logs

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52900 from leonid-s-usov/backport/bulk-data-pool/pacific

pacific: Consider setting "bulk" autoscale pool flag when automatically creating a data pool for CephFS

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52848 from lxbsz/wip-62193

pacific: mds: do not send split_realms for CEPH_SNAP_OP_UPDATE msg

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52844 from lxbsz/wip-62202

pacific: mds: skip forwarding request if the session were removed

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52726 from kotreshhr/wip-62242-pacific

pacific: mds: Fix the linkmerge assert check

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #52682 from batrick/wip-62190-pacific

pacific: mds: update mdlog perf counters during replay

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #53002 from idryomov/wip-62437-pacific

pacific: qa/suites/upgrade/octopus-x: skip TestClsRbd.mirror_snapshot test

Reviewed-by: Mykola Golub <mgolub@suse.com>

Merge pull request #51039 from vshankar/tr-59003

pacific: mgr/volumes: avoid returning -ESHUTDOWN back to cli

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

qa/tasks: set defer_client_eviction_on_laggy_osds=false in api tests

We expect laggy OSDs in this testing environment,
so it makes sense to disable this warning.

Fixes: https://tracker.ceph.com/issues/61907
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 2322d2c8e0f4902aba49f7441d8dd00bdb675b85)
(cherry picked from commit 2032e8b41efb665db46ad0584058b08bd1aaf561)

Merge pull request #52873 from ifed01/wip-ifed-encrypted-ceph-volume-pac

pacific: ceph_volume: support encrypted volumes for lvm new-db/new-wal/migrate commands

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

Merge pull request #52212 from ifed01/wip-ifed-bluefs-cumulative-backports-pac

pacific: os/bluestore: cumulative bluefs backport

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #51773 from ifed01/wip-ifed-fix-bluefs-prealloc-pac

pacific: os/bluestore: proper override rocksdb::WritableFile::Allocate

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #51418 from ifed01/wip-ifed-fix-fit-to-fast-pac

pacific: os/bluestore: allow 'fit_to_fast' selector for single-volume osd

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

qa/suites/upgrade/octopus-x: skip TestClsRbd.mirror_snapshot test

The behavior of the class method changed in reef; the change was
backported to pacific and quincy.  An octopus test binary used against
pacific OSDs produces an expected failure:

    [ RUN      ] TestClsRbd.mirror_snapshot
    .../ceph-15.2.17/src/test/cls_rbd/test_cls_rbd.cc:2279: Failure
    Expected equality of these values:
      -85
      mirror_image_snapshot_unlink_peer(&ioctx, oid, 1, "peer2")
        Which is: 0
    [  FAILED  ] TestClsRbd.mirror_snapshot (6 ms)

Fixes: https://tracker.ceph.com/issues/62437
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

mon/MDSMonitor: print a note when adding data pool to a file system that doesn't have 'bulk' flag set

The note will only be printed if the pool has pg_autoscale_mode set to ON and the bulk flag is missing
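
As a rough sketch of that condition (the dictionary layout and the
`needs_bulk_note` name are assumptions made for illustration, not the
monitor's actual data structures):

```python
from typing import Any, Dict


def needs_bulk_note(pool: Dict[str, Any]) -> bool:
    """True when the pool is autoscaled but lacks the bulk flag."""
    autoscaled = pool.get("pg_autoscale_mode") == "on"
    has_bulk = bool(pool.get("bulk", False))
    return autoscaled and not has_bulk


# Only the first pool would trigger the note.
print(needs_bulk_note({"pg_autoscale_mode": "on", "bulk": False}))
print(needs_bulk_note({"pg_autoscale_mode": "on", "bulk": True}))
print(needs_bulk_note({"pg_autoscale_mode": "warn", "bulk": False}))
```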

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
(cherry picked from commit 7d09154ce87d24993f605c8bbf829d6415b89562)

mgr/volumes: set the 'bulk' flag for data pools created automatically for a new volume

Signed-off-by: Leonid Usov <leonid.usov@ibm.com>
Fixes: https://tracker.ceph.com/issues/61595
(cherry picked from commit 9a8219cc2bab2b5d4c0cc26783eac0ae87a8b24b)

mds: reset code after cpu_profiler

Signed-off-by: liu shi <liu.shi@navercorp.com>
(cherry picked from commit f1afb7b1b8d1b4873730e1b88a552213e4c51977)

cpu_profiler: fix asok command crash

fixes: https://tracker.ceph.com/issues/50814
Signed-off-by: liu shi <liu.shi@navercorp.com>
(cherry picked from commit be7303aafe34ae470d2fd74440c3a8d51fcfa3ff)

mds: adjust cap acquisition throttles

For production workloads, these defaults rarely help. Adjust
accordingly. For a steady state "find" workload, these new throttles
will prevent acquiring more than ~2300 caps/second which is quite
manageable with typical recall rates.

-ln(0.5) / 30 * 100k = 2310
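
The arithmetic can be reproduced as follows. The interpretation of the
two constants (30 as a decay half-life in seconds, 100k as the cap
acquisition threshold) is an assumption; the message above gives only
the numbers.

```python
import math

# Assumed reading of the formula: 30 is the decay half-life in seconds
# and 100k is the cap acquisition threshold.
half_life_seconds = 30
cap_threshold = 100_000

# An exponentially decaying counter loses ln(2) / half_life of its value
# per second, so it holds steady at the threshold when caps arrive at
# threshold * ln(2) / half_life per second.
decay_constant = -math.log(0.5) / half_life_seconds
steady_state_caps_per_second = decay_constant * cap_threshold

print(f"{steady_state_caps_per_second:.0f} caps/second")
```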

Fixes: https://tracker.ceph.com/issues/62114
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit f290ef9d0d2d09fb978d56c46be704c6efd45c43)

Conflicts:
src/common/options/mds.yaml.in: trivial

Merge pull request #52468 from k0ste/wip-62031-pacific

pacific: mon/ConfigMonitor: update crush_location from osd entity

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #51812 from NitzanMordhai/wip-61488-pacific

pacific: pybind/argparse: blocklist ip validation

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

qa: time log compression

For debugging and ad-hoc analytics.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 6739e1156350be7032f8832c04cf28da79e9c4d9)

qa/tasks: give verbose gzip output

For future analysis.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 0a03a47103e465dbd546e0495f8e7720402fbe6f)

qa/tasks: use medium compression

To speed up compression.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3c76cc3c5172591cdfd88b3d0f29298aba6fde9c)

qa/ceph: parallelize gzip

Our machines have lots of cores, use them!
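
The idea, sketched in Python for illustration only. This is not the
mechanism the qa task uses, which is not shown in this log; the function
names, the `.gz` naming, and the compression level of 5 are assumptions.

```python
import gzip
import os
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, List


def gzip_file(path: Path, level: int = 5) -> Path:
    """Compress one file to '<name>.gz' and remove the original."""
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as src:
        with gzip.open(target, "wb", compresslevel=level) as dst:
            shutil.copyfileobj(src, dst)
    path.unlink()
    return target


def gzip_logs(paths: Iterable[Path],
              workers: int = os.cpu_count() or 1) -> List[Path]:
    """Compress many files concurrently, one task per file."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(gzip_file, paths))
```

Calling `gzip_logs(sorted(Path("/some/log/dir").glob("*.log")))` would
compress each log in that (hypothetical) directory with one worker per
core by default.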

Fixes: https://tracker.ceph.com/issues/59120
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 23a29d4abe54d987a99434ad0d733e0b44a213f9)

os/bluestore: don't need separate variable to mark hits when lookup oid.

Signed-off-by: locallocal <locallocal@163.com>
(cherry picked from commit 1428544ec66b498830bc884b4824cd90106053d5)

Merge pull request #51382 from RaminNietzsche/wip-ifed-fix-require-osd-release-to-pacific

pacific: mon: avoid exception when setting require-osd-release more than 2 versions up.

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>

mds: do not send split_realms for CEPH_SNAP_OP_UPDATE msg

The clients won't care about the split_realms and the kclient will
treat this as a corrupted snaptrace.

Fixes commit 93e7267757508520dfc22cff1ab20558bd4a44d4 ("mds: send snap
related messages centrally during mds recovery")
Fixes: https://tracker.ceph.com/issues/61217
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 3be9c9796246eab96f672218d9e45f118f6cdf12)

Conflicts:
- Misses a dependent commit 7a4c509f7289ff4

Merge pull request #52500 from lxbsz/wip-62040

pacific: client: do not send metrics until the MDS rank is ready

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52654 from joscollin/wip-62177-pacific

pacific: qa: fix cephfs-mirror unwinding and 'fs volume create/rm' order

Reviewed-by: Venky Shankar <vshankar@redhat.com>

qa: avoid explicit set to client mountpoint as "/"

This causes self.cephfs_mntpt to be set to "/" by default, which
overrides the config in ceph.conf. `test_client_cache_size`
updates ceph.conf with:

        client mountpoint = /subdir

However, the ceph-fuse mount command has --client_mountpoint explicitly
set as "/", thereby causing the root of the file system to get mounted which
confuses the test.

Fixes: http://tracker.ceph.com/issues/56446
Introduced-by: bf83eaa4e75516a6937e4097b8708c48856a9473
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 4322dcc2e94bab80042eaf1e236174f6e6772cec)

Conflicts:
qa/tasks/cephfs/fuse_mount.py
- merge conflicts due to updated upstream code
- removed offending line; host_mntpt was appended to the mount command
  later in the code; this issue was created due to manual conflict
  resolution during backporting process;
qa/tasks/cephfs/kernel_mount.py
qa/tasks/cephfs/mount.py
- fixed conflicts between 'main' and 'pacific' branches

Merge pull request #52505 from lxbsz/wip-62012

pacific: client: wait rename to finish

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52506 from lxbsz/wip-61983

pacific: client: force sending cap revoke ack always

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52513 from joscollin/wip-62055-pacific

pacific: mds: MDLog::_recovery_thread: handle the errors gracefully

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52499 from lxbsz/wip-62043

pacific: client: trigger to flush the buffer when making snapshot

Reviewed-by: Venky Shankar <vshankar@redhat.com>

mgr/rbd_support: log number of images

... that have one snapshot request pending when the
mirror_snapshot_schedule handler is shutting down.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit edc3b0e80653c30d0fc318a24673e8a0568912ad)

Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
- Above conflict was due to commit e4a16e2
("mgr/rbd_support: add type annotation") not in pacific

mgr/rbd_support: add user-friendly stderr message

... when the rbd_support module is not ready.

Fixes: https://tracker.ceph.com/issues/61688
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 6351ef5c8e691e359b1bf913dde4dbc8a441be1d)

Conflicts:
src/pybind/mgr/rbd_support/module.py
- Above conflict was due to commit dcb51b0
("mgr/rbd_support: define commands using CLICommand") not in pacific

rbd_support: recover from "double blocklisting"

Recover from being blocklisted while recovering from blocklisting.
When the rbd_support module is being set up to recover from client
blocklisting, the module's new rados client connection can also get
blocklisted. Currently, this will cause the recovery to fail and
the module will remain inoperable. Instead, retry module recovery
when the new client gets blocklisted during the module setup in the
recovery thread.
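
A minimal sketch of the retry idea. The `ClientBlocklisted` exception,
the `recover_with_retry` helper, and the bounded attempt count are all
hypothetical; the module's real recovery code is not reproduced here.

```python
import time
from typing import Callable


class ClientBlocklisted(Exception):
    """Hypothetical error raised when the RADOS client is blocklisted."""


def recover_with_retry(setup_module: Callable[[], None],
                       max_attempts: int = 5,
                       delay_seconds: float = 0.0) -> int:
    """Run setup_module until it succeeds; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Stands in for "create a new client and start the handlers".
            setup_module()
            return attempt
        except ClientBlocklisted:
            # The fresh client was blocklisted during setup: try again
            # with another new client instead of giving up.
            time.sleep(delay_seconds)
    raise RuntimeError("module recovery did not succeed")
```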

Fixes: https://tracker.ceph.com/issues/59713
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 4523d9b68ee84f69e8665a728d4037b53cdf3d6f)

Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/module.py
src/pybind/mgr/rbd_support/perf.py
src/pybind/mgr/rbd_support/task.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py
- Above conflicts were due to commit e4a16e2
   ("mgr/rbd_support: add type annotation") not in pacific
- Above conflicts were due to commit dcb51b0
   ("mgr/rbd_support: define commands using CLICommand") not in pacific

qa/workunits/rbd: Add tests for rbd_support module recovery

... after the module's RADOS client is blocklisted.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit a2f15d4b2f876c79ee1de59fb79851b0eb505951)

mgr/rbd_support: recover from rados client blocklisting

In certain scenarios the OSDs were slow to process RBD requests.
This led to the rbd_support module's RBD client not being able to
gracefully hand over an RBD exclusive lock to another RBD client.
After the condition persisted for some time, the other RBD client
forcefully acquired the lock by blocklisting the rbd_support module's
RBD client, and consequently blocklisted the module's RADOS client. The
rbd_support module stopped working. To recover the module, the entire
mgr service had to be restarted which reloaded other mgr modules.

Instead of recovering the rbd_support module from client blocklisting
by being disruptive to other mgr modules, recover the module
automatically without restarting the mgr service. On client getting
blocklisted, shut down the module's handlers and blocklisted client,
create a new rados client for the module, and start the new handlers.

Fixes: https://tracker.ceph.com/issues/56724
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit cc0468738e5ddb98f7ac10b50e54446197b9c9a0)

Conflicts:
src/pybind/mgr/rbd_support/mirror_snapshot_schedule.py
src/pybind/mgr/rbd_support/module.py
src/pybind/mgr/rbd_support/perf.py
src/pybind/mgr/rbd_support/task.py
src/pybind/mgr/rbd_support/trash_purge_schedule.py
- Above conflicts were due to commit e4a16e2
("mgr/rbd_support: add type annotation") not in pacific
- Above conflicts were due to commit dcb51b0
("mgr/rbd_support: define commands using CLICommand") not in pacific

pybind/rados: add ConnectionShutdown exception class

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit e452899013323def87a8b9e6edbdae66067a827c)

mgr/rbd_support: notify the thread waiting on pending snapshot

... requests to be completed.

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit 38a7e3715f0cee225aa49f3331d85ad37e2b7422)

mgr: take a lock within PyModuleRegistry's mutators and accessor

... that modify and access the data member 'clients' respectively.
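
The pattern, sketched in Python rather than the C++ of the actual
change. `ClientRegistry` and its method names are hypothetical
stand-ins; only the idea of guarding the `clients` member with one lock
in every mutator and accessor comes from the message above.

```python
import threading
from typing import Dict, List


class ClientRegistry:
    """Hypothetical registry that tracks client addresses per module."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._clients: Dict[str, List[str]] = {}

    def register_client(self, name: str, addr: str) -> None:
        with self._lock:  # mutator
            self._clients.setdefault(name, []).append(addr)

    def unregister_client(self, name: str, addr: str) -> None:
        with self._lock:  # mutator
            addrs = self._clients.get(name, [])
            if addr in addrs:
                addrs.remove(addr)
            if not addrs:
                self._clients.pop(name, None)

    def get_clients(self) -> Dict[str, List[str]]:
        with self._lock:  # accessor: return a copy, not the live dict
            return {n: list(a) for n, a in self._clients.items()}
```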

Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit a586dcc57ab35a269f0c271756951d49f422662d)

Merge pull request #52397 from mchangir/wip-61961-pacific

pacific: mon: block osd pool mksnap for fs pools

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52304 from lxbsz/wip-61798

pacific: client: only wait for write MDS OPs when unmounting

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52244 from batrick/wip-61426-pacific

pacific: mon/MDSMonitor: ignore extraneous up:boot messages

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52240 from batrick/wip-61414-pacific

pacific: mon/MDSMonitor: do not propose on error in prepare_update

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52237 from batrick/wip-59372-pacific

pacific: qa: wait for MDSMonitor tick to replace daemons

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52233 from batrick/wip-61411-pacific

pacific: mon/MDSMonitor: check fscid in pending exists in current

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52230 from batrick/wip-61692-pacific

pacific: mon/MDSMonitor: batch last_metadata update with pending

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52125 from joscollin/wip-61734-pacific

pacific: mds: display sane hex value (0x0) for empty feature bit

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #52075 from joscollin/wip-61696-pacific

pacific: debian: install cephfs-mirror systemd unit files and man page

Reviewed-by: Venky Shankar <vshankar@redhat.com>

mgr: store names of modules that register RADOS clients in the MgrMap

The MgrMap stores a list of RADOS clients' addresses registered by the
mgr modules. During failover of ceph-mgr, the list is used to blocklist
clients belonging to the failed ceph-mgr.

Store the names of the mgr modules that registered the RADOS clients
along with the clients' addresses in the MgrMap. During debugging, this
allows easy identification of the mgr module that registered a
particular RADOS client by just dumping the MgrMap (`ceph mgr dump`).

Following is the MgrMap output with a module's client name displayed
along with its client addrvec:
$ ceph mgr dump | jq '.active_clients[0]'
{
  "name": "devicehealth",
  "addrvec": [
    {
      "type": "v2",
      "addr": "10.0.0.148:0",
      "nonce": 612376578
    }
  ]
}
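
For debugging, the same lookup can be scripted. This sketch assumes the
`active_clients` array has been saved as JSON (here it holds only the
entry shown above); `module_for_client` is a hypothetical helper, not
part of Ceph.

```python
import json
from typing import Any, Dict, List, Optional

# The active_clients array from `ceph mgr dump`, reduced to the single
# entry shown above.
active_clients_json = """
[
  {
    "name": "devicehealth",
    "addrvec": [
      {"type": "v2", "addr": "10.0.0.148:0", "nonce": 612376578}
    ]
  }
]
"""


def module_for_client(active_clients: List[Dict[str, Any]],
                      addr: str, nonce: int) -> Optional[str]:
    """Return the name of the module that registered addr/nonce."""
    for client in active_clients:
        for entry in client["addrvec"]:
            if entry["addr"] == addr and entry["nonce"] == nonce:
                return client["name"]
    return None


active_clients = json.loads(active_clients_json)
print(module_for_client(active_clients, "10.0.0.148:0", 612376578))
```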

Fixes: https://tracker.ceph.com/issues/58691
Signed-off-by: Ramana Raja <rraja@redhat.com>
(cherry picked from commit b545fb9f5660dca3af4dea195ea4555f09b3a6e8)

Conflicts:
PendingReleaseNotes [ moved to >=16.2.14 section ]

Merge pull request #51508 from lxbsz/wip-59706

pacific: mds: do not take the ino which has been used

Reviewed-by: Venky Shankar <vshankar@redhat.com>

rbd-mirror: fix image replayer shut down description on force promote

On force promote, if the opposite site is down, we currently show the
image status description as "local image linked to unknown peer".

Previously:
----------
$ rbd --cluster=site-b mirror image status pool1/img1
img1:
  global_id:   a73341a6-8302-4c97-ac6e-278083fd347e
  state:       up+stopping_replay
  description: local image linked to unknown peer
  service:     admin on localhost.localdomain
  last_update: 2023-06-15 19:47:45
  peer_sites:
    name: site-a
    state: up+stopped
    description: local image is primary
    last_update: 2023-06-15 19:47:32
  snapshots:
    9 .mirror.primary.a73341a6-8302-4c97-ac6e-278083fd347e.1f101367-277f-42f0-8308-e51201d0529a (peer_uuids:[c46c6d97-f59b-4591-9d35-d7ff9d0d72f7])

Currently:
---------
$ rbd --cluster=site-b mirror image status pool1/img1
img1:
  global_id:   2a6d61e1-8e76-42c4-af76-8f61ce65c7e2
  state:       up+stopped
  description: orphan (force promoting)
  service:     admin on localhost.localdomain
  last_update: 2023-06-15 19:29:22
  peer_sites:
    name: site-a
    state: down+stopped
    description: local image is primary
    last_update: 2023-06-15 19:29:05
  snapshots:
    9 .mirror.primary.2a6d61e1-8e76-42c4-af76-8f61ce65c7e2.99f82a30-0241-4e51-8428-7a2376d137f6 (peer_uuids:[3150c6ef-aeee-45dc-8d0e-5dc5a53d88eb])

Fixes: https://tracker.ceph.com/issues/52913
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
(cherry picked from commit 947a53677d40fd125f041abab9b5e3fea3a8371a)

ceph-volume: make raw prepare use encryption_units.prepare_dmcrypt

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 2c3477a69e2e01e999ff23ecf4a6508c87c340de)

test/ceph_volume: add UTs for encrypted volumes migration

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit f69316f627014b720f6b89dcb96c29cd7480cf88)

ceph-volume: close encrypted device using name not path when zapping

Just a tiny cleanup.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit ebc335def0b32d1ecfe93fcfa286ddf460f136a1)

ceph-volume: close encrypted volumes on deactivate

Fixes: https://tracker.ceph.com/issues/58943
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit d8f163d1f55d8234824e57fc6f851db9292fa972)

test/ceph-volume: add UT for adding encrypted DB/WAL lvm.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 9930225391eccd27d028592a90e677c86f01b59c)

ceph-volume: support creating/migrating encrypted lvm volumes

Fixes: https://tracker.ceph.com/issues/55167
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit b45d9b79aa079f3cfe227c74c1df3a3bfad4c7c9)

Merge pull request #51249 from k0ste/wip-52791-pacific

pacific: common/TrackedOp: fix osd reboot optracker coredump

Reviewed-by: Laura Flores <lflores@redhat.com>

os/bluestore: introduce a cooldown period for failed BlueFS allocations.

When using bluefs_shared_alloc_size, one might end up in a long-lasting
state in which chunks that large are no longer available and allocation
falls back to the shared device's min alloc size. The introduced
cooldown is intended to prevent repeated allocation attempts with
bluefs_shared_alloc_size for a while. The rationale is to eliminate the
performance penalty these failing attempts might cause.
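
The cooldown logic can be sketched as below. This is a Python
illustration, not the BlueFS implementation: `CooldownAllocator`, the
injectable clock, and the "large"/"small" return values are assumptions
made to keep the example self-contained.

```python
import time
from typing import Callable, Optional


class CooldownAllocator:
    """Try the large allocation unit, but back off after a failure."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.retry_large_after: Optional[float] = None

    def allocate(self, try_large: Callable[[], bool]) -> str:
        now = self.clock()
        if self.retry_large_after is None or now >= self.retry_large_after:
            if try_large():
                self.retry_large_after = None
                return "large"
            # Large chunks are unavailable: skip further large attempts
            # until the cooldown period has passed.
            self.retry_large_after = now + self.cooldown_seconds
        # Fall back to the small allocation unit.
        return "small"
```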

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit e52bcc852ce51ab99138420f9069e2f59e1cb706)

Conflicts:
src/common/options/global.yaml.in
(legacy options declarations, no yamls in pacific)

Merge pull request #52704 from rhcs-dashboard/wip-62237-pacific

pacific: mgr/dashboard: allow PUT in CORS

Reviewed-by: Avan Thakkar <athakkar@redhat.com>

mds: skip forwarding request if the session were removed

When forwarding the requests, the corresponding session could
already be closed.

https://tracker.ceph.com/issues/60625
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit d0bfbbea44e22fc545363cd6af47d28e18e353b0)

Merge pull request #52790 from adamemerson/wip-62097-pacific

pacific: build: Remove ceph-libboost* packages in install-deps

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

mgr: fix some flake8 complaints

Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit d199782fb5f3c22f1cd5fda6ddf1d64f40a92726)

Conflicts:
src/pybind/mgr/crash/module.py imports were not present

build: install-deps.sh installs system boost on Jammy

Since on Jammy system boost is new enough for Pacific and we don't have
Jammy packages for older boost (we only have those for Bionic), just
install the system packages rather than fetching ceph-libboost.

No analogous commit exists in main: main's Jammy case installs
ceph-libboost, whereas here a system package is all we need.

Fixes: https://tracker.ceph.com/issues/62103
Signed-off-by: Adam Emerson <aemerson@redhat.com>

build: Remove old ceph-libboost* packages in install-deps

Here, we extract `clean_boost_on_ubuntu()` and call it before other
installs on Debian distributions so that if we install a system boost,
a potentially newer `ceph-libboost` won't get in the way.

As the sources.list.d being removed in the original cleanup code isn't
the one we're currently installing in the install code, add a removal
for the currently used source, then do apt-update so packages from the
removed source are no longer included as available.

Two subsidiary dev packages from conflicting boost libraries can be
installed, but it leaves apt in an inconsistent state. To clean this
up, add `--fix-missing` to the removal line and call
`clean_boost_on_ubuntu()` before other uses of apt.

Fixes: https://tracker.ceph.com/issues/62097
Signed-off-by: Adam Emerson <aemerson@redhat.com>
(cherry picked from commit 0c3f511e14af639b6509e69b889258b2f718f8fd)

Conflicts:
install-deps.sh
- Different boost version for Pacific than Squid.
- ci_debug does not exist in Pacific
- whitespace
- No INSTALL_EXTRA

Fixes: https://tracker.ceph.com/issues/62103
Signed-off-by: Adam Emerson <aemerson@redhat.com>

test/store_test: adjust spillover cases to reflect updated WAL usage.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>

mgr/snap_schedule: catch all exceptions for cli

Any unknown exception causes the module to be unloaded and become
unresponsive. So it is better to catch all exceptions during
command-line interaction and report them instead of crashing with a
traceback.
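
One way to express this is a decorator around the command handlers. The
sketch below is an assumption-laden illustration, not the module's code:
`catch_all`, the example handler, and the choice of -EIO as the return
code are all hypothetical.

```python
import errno
import functools
from typing import Callable, Tuple

CommandResult = Tuple[int, str, str]  # (return code, stdout, stderr)


def catch_all(handler: Callable[..., CommandResult]
              ) -> Callable[..., CommandResult]:
    """Turn any exception from a CLI handler into an error result."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs) -> CommandResult:
        try:
            return handler(*args, **kwargs)
        except Exception as exc:
            # Report the failure to the caller instead of letting the
            # exception propagate out of the command handler.
            return -errno.EIO, "", f"{type(exc).__name__}: {exc}"
    return wrapper


@catch_all
def snap_schedule_status(path: str) -> CommandResult:
    """Hypothetical handler that fails on a malformed path."""
    if not path.startswith("/"):
        raise ValueError(f"invalid path {path!r}")
    return 0, f"schedule for {path}", ""
```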

Fixes: https://tracker.ceph.com/issues/58195
Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit 651fb2e3b515c80e9dc4a24638c9d1a0d487c729)