doc/rados: pool and namespace are independent osdcap restrictions

For the "profile {name}" syntax, pool and namespace restrictions are
independent of each other (i.e. specifying namespace doesn't also
require specifying pool, as is currently suggested). A cap can look
like "profile rbd namespace=myns", signifying that the RBD profile is
to be allowed in the myns namespace of any pool.

For the "allow {access-spec}" syntax, pool restriction is optional.
A cap can look like "allow r namespace=myns", "allow w object_prefix
myprefix" or "allow rw namespace=myns object_prefix myprefix", for
example.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 67f5769ce6e110b89362763cfb41a0e00e595cdf)

Merge pull request #61102 from ivancich/wip-69255-reef

reef: qa/rgw: pull Apache artifacts from mirror instead of archive.apache.org

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #61054 from smanjara/wip-69211-reef

reef: qa/rgw: fix s3 java tests by forcing gradle to run on Java 8

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #61121 from ivancich/wip-69270-reef

reef: qa/rgw: force Hadoop to run under Java 1.8

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #61500 from zdover23/wip-doc-2025-01-24-backport-61493-to-reef

reef: doc/cephfs: edit disaster-recovery-experts (5 of x)

doc/cephfs: edit disaster-recovery-experts (5 of x)

Put the procedure in the section called "Using an alternate metadata
pool for recovery" into an ordered list, so that it is in a proper
procedure format.

This commit is meant only to break the procedure into steps. The English
language in each of these steps could be improved, but that improvement
will be done after this formatting has been merged and backported.

Follows https://github.com/ceph/ceph/pull/61462.

https://tracker.ceph.com/issues/69557

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9af7d9786a11e2b07a095c95ae911dfb13e4a61b)

Merge pull request #60237 from aclamk/wip-aclamk-bluefs-truncate-allocations-reef

reef: os/bluestore: Make truncate() drop unused allocations

Merge pull request #61485 from ivancich/wip-69564-reef

reef: rgw/lc: make lc worker thread name shorter

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #61318 from aclamk/wip-aclamk-ifed-allocation-info-reef

[reef] os/bluestore: introduce allocator state histogram

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

rgw/lc: make lc worker thread name shorter

Fixes: https://tracker.ceph.com/issues/69459
Signed-off-by: lightmelodies <lightmelodies@outlook.com>
(cherry picked from commit 05e241245744d105e285373bb9aa7861c62dcc18)

Merge pull request #61480 from zdover23/wip-doc-2025-01-22-backport-61462-to-reef

reef: doc/cephfs: edit disaster-recovery-experts (4 of x)

Merge pull request #61360 from zdover23/wip-doc-2025-01-14-backport-61352-to-reef

reef: doc/releases: add actual_eol for quincy

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #61059 from zdover23/wip-doc-2024-12-12-backport-61049-to-reef

reef: doc/cephfs: edit 2nd 3rd of mount-using-kernel-driver

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/cephfs: edit disaster-recovery-experts (4 of x)

Edit the seventh and final section of
doc/cephfs/disaster-recovery-experts.rst in preparation for adding
deeper explanations of the contexts in which one should use the various
commands listed on that page.

The section edited in this commit is

* Using an alternate metadata pool for recovery

A future commit might beneficially put this section into the format of
an ordered list. If so, such a commit should only reformat the
content and should not make any changes to the English. It's enough to
verify content or format. Let's not overload our editorial faculties by
forcing ourselves to walk and chew gum at the same time.

Follows https://github.com/ceph/ceph/pull/61442

https://tracker.ceph.com/issues/69557

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit f2529d11745d99e5b5c404193d3dd5e47e48afda)

Merge pull request #61460 from zdover23/wip-doc-2025-01-21-backport-61249-to-reef

reef: doc/cephfs: edit grammar in snapshots.rst

Merge pull request #59786 from zdover23/wip-doc-2024-09-14-backport-59732-to-reef

reef: doc/README.md: improve formatting

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/cephfs: edit grammar in snapshots.rst

This commit improves the grammar in doc/cephfs/snapshots.rst. The PR
associated with this commit follows from
https://github.com/ceph/ceph/pull/61240, the PR raised by Neeraj Pratap
Singh to introduce information about snapshots into the CephFS
documentation.

See also https://tracker.ceph.com/issues/68974.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 9a34ae55ede6dea5bb784a3e6cd45b9e60d4475d)

Merge pull request #61247 from zdover23/wip-doc-2025-01-07-backport-61240-to-reef

reef: doc: add snapshots in docs under Cephfs concepts

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #61309 from zdover23/wip-doc-2025-01-10-backport-61243-to-reef

reef: doc/radosgw/s3: correct eTag op match tables

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #61332 from zdover23/wip-doc-2025-01-11-backport-60081-to-reef

reef: src/exporter: improve usage message

Merge pull request #61454 from zdover23/wip-doc-2025-01-20-backport-61442-to-reef

reef: doc/cephfs: edit disaster-recovery-experts (3 of x)

doc/cephfs: edit disaster-recovery-experts (3 of x)

Edit the fifth and sixth sections of
doc/cephfs/disaster-recovery-experts.rst in preparation for adding
deeper explanations of the contexts in which one should use the various
commands listed on that page.

The sections edited in this commit are

- MDS Map Reset
- Recovery From Missing Metadata Objects

Follows https://github.com/ceph/ceph/pull/61427

https://tracker.ceph.com/issues/69557

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit a69c4142b7f99146af690933b3cfe897a64c7c59)

Merge pull request #61097 from afreen23/wip-69200-reef

reef: mgr/dashboard: handle infinite values for pools

Reviewed-by: Afreen Misbah <afreen@ibm.com>

src/exporter: improve usage message

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 725b4e184798dc38ec60ab81766577b39fd6e488)

Merge pull request #61447 from zdover23/wip-doc-2025-01-20-backport-61445-to-reef

doc/cephfs: disaster-recovery-experts cleanup

doc/cephfs: disaster-recovery-experts cleanup

Properly wrap a poorly-formatted paragraph that looks just awful in an
80-column viewport and change MDS to "MDS daemons" where the latter
makes the sentence a lot clearer.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit dceaab9a1a479ea3a10b6eff5bb4ae23d2ad6aa0)

Merge pull request #61444 from zdover23/wip-doc-2025-01-19-backport-61427-to-reef

reef: doc/cephfs: edit disaster-recovery-experts (2 of x)

doc/cephfs: edit disaster-recovery-experts (2 of x)

Edit the third and fourth sections of
doc/cephfs/disaster-recovery-experts.rst in preparation for adding
deeper explanations of the contexts in which one should use the various
commands listed on that page.

Follows https://github.com/ceph/ceph/pull/61426

https://tracker.ceph.com/issues/69557

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 4f3a69eb919fc0d99cdf943f095ca3a951c82897)

Merge pull request #61438 from zdover23/wip-doc-2025-01-18-backport-61272-to-reef

reef: doc/radosgw/config-ref: fix lc worker thread tuning

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #61424 from zdover23/wip-doc-2025-01-17-backport-61411-to-reef

reef: doc/cephfs: edit disaster-recovery-experts

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #61402 from zdover23/wip-doc-2025-01-16-backport-61373-to-reef

reef: AsyncMessenger.cc : improve error messages

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/radosgw/config-ref: fix lc worker thread tuning

This commit updates the Lifecycle Settings section of the RGW Config
Reference. In particular, it corrects a suggestion to decrease the
number of parallel threads in the worker pool for more aggressive
(accelerated) per-bucket lifecycle processing. More aggressive lifecycle
processing for a bucket containing a higher number of objects is
achieved by increasing, not decreasing, the number of parallel threads.
The current suggestion is misleading.

Fixes: https://tracker.ceph.com/issues/63659
Signed-off-by: Laimis Juzeliunas <laimis.juzeliunas@oxylabs.io>
(cherry picked from commit b7ae18a292c7d1d5139dfb74c575f1af0de29a3e)

Merge pull request #58920 from mohit84/wip-67235-reef

reef: test: Create ParallelPGMapper object before start threadpool

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>

doc/cephfs: edit disaster-recovery-experts

Edit the first two sections of doc/cephfs/disaster-recovery-experts.rst
in preparation for adding deeper explanations of the contexts in which
one should use the various commands listed on that page.

https://tracker.ceph.com/issues/69557

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit cc8cfeddbd290ef9b5e4e0c68ae94cefb34e1be9)

Merge pull request #59285 from k0ste/wip-64168-reef

reef: osd: Add memstore to unsupported objstores for QoS

Merge pull request #61259 from idryomov/wip-fsx-try-netlink-reef

reef: test/librbd/fsx: switch to netlink interface for rbd-nbd

Merge pull request #61171 from idryomov/wip-69324-reef

reef: rbd: handle --{group,image}-namespace in "rbd group image {add,rm}"

Merge pull request #61167 from idryomov/wip-68998-reef

reef: librbd: avoid data corruption on flatten when object map is inconsistent

Merge pull request #61094 from idryomov/wip-69178-reef

reef: librbd/migration/HttpClient: avoid reusing ssl_stream after shut down

Merge pull request #61182 from rhcs-dashboard/reef-configuration-not-updatable

reef: mgr/dashboard: Administration > Configuration > Some of the config options are not updatable at runtime

Reviewed-by: Afreen Misbah <afreen@ibm.com>

os/bluestore: Fix BlueFS::truncate()

In `struct bluefs_fnode_t` there is a vector `extents` and
the vector `extents_index` that is a log2 seek cache.

Until truncate() was modified, extents were never removed from files.
The modified truncate() did not update extents_index.

For example, a file with 10 extents, when truncated to 0, will have:
0 extents, 10 extents_index.
After writing some data to the file:
1 extents, 11 extents_index.

Now `bluefs_fnode_t::seek` will binary search extents_index;
let's say it locates the seek position at item #3.
It will then jump from extent #0 (which exists) to extent #3, which
does not exist.
The worst part is that the code is now broken, as #3 != extents.end().
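
The stale seek cache described above can be reproduced in a toy model.
This is only a sketch in Python, not the BlueFS code: ToyFnode, its
methods, the 0x1000 extent size and the guard inside seek() are all
assumptions made for illustration. It shows how a cache that is left
behind by truncate points past the single extent that exists, and how
keeping the cache in step with the extents avoids that.

```python
from bisect import bisect_right


class ToyFnode:
    """Toy stand-in for bluefs_fnode_t: extent lengths plus a seek cache."""

    def __init__(self):
        self.extents = []        # length of each extent
        self.extents_index = []  # file offset at which each extent starts
        self.allocated = 0

    def append_extent(self, length):
        self.extents_index.append(self.allocated)
        self.extents.append(length)
        self.allocated += length

    def truncate_to_zero(self, sync_index):
        self.extents.clear()
        self.allocated = 0
        if sync_index:
            # Keep extents_index in step with extents.
            self.extents_index.clear()

    def seek(self, offset):
        """Return (extent position, offset inside that extent)."""
        pos = 0
        if self.extents_index:
            pos = bisect_right(self.extents_index, offset) - 1
            # Guard: never jump outside the extents that really exist.
            assert pos < len(self.extents), "seek cache points past extents"
            offset -= self.extents_index[pos]
        while pos < len(self.extents) and offset >= self.extents[pos]:
            offset -= self.extents[pos]
            pos += 1
        return pos, offset


def build(sync_index):
    """10 extents, truncate to 0, then write one more extent."""
    f = ToyFnode()
    for _ in range(10):
        f.append_extent(0x1000)
    f.truncate_to_zero(sync_index)
    f.append_extent(0x1000)
    return f


stale = build(sync_index=False)
print(len(stale.extents), len(stale.extents_index))  # 1 11

fixed = build(sync_index=True)
print(fixed.seek(0x800))  # (0, 2048)
```

In this model, calling stale.seek(0x3800) lands on cache item #3 and
trips the guard, while the synced variant seeks normally.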

There are 3 parts of the fix:
1) assert in `bluefs_fnode_t::seek` to protect against
   jumping outside extents
2) code in BlueFS::truncate to sync up `extents_index` with `extents`
3) dampening down assert in _replay to give a way out of cases
   where incorrect "offset 12345" (12345 is file size) instead of
   "offset 20000" (allocations occupied) was written to log.

Fixes: https://tracker.ceph.com/issues/69481
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit 7f3601089d41bfc23f530c7bf3fb7efad2d055ec)

os/bluestore: bluefs unittest for truncate bug

Unittest showing 2 different flavours of problems:
1) bluefs log corruption
2) bluefs sigsegv

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit f2b5e2fa0a9274c1667fccafa597fff9be7a74b1)
+ fixes for add_block_device
+ fix for bad usage of std::string's fill constructor

os/bluestore: Add unittest for BlueFS::truncate()

Add unittest for some truncate scenarios.

Fixes: https://tracker.ceph.com/issues/68385 (addendum)
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit 612f24b41fb44e478d6c1cc867dfa0720197793f)

os/bluestore: Make truncate() drop unused allocations

Review fixes. Removed overcautious assert.
Improved if .. else style.
Skipped processing extent truncation when seek() goes to end.

Fixes: https://tracker.ceph.com/issues/68385 (addendum)
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit 85fac18a1cb0491e270261c71a506c4c1ba5e0bf)

test/store_test: get rid of explicit offset specifications in shared
blob repair test case.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 482e5b85f08a07200c1f18db509b6b0f2ebcf3e6)

os/bluestore: Make truncate() drop unused allocations

truncate() now drops unused allocations.
Modified Close() in BlueRocksEnv to unconditionally call truncate().

Fixes: https://tracker.ceph.com/issues/68488
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
(cherry picked from commit 9fc65f160cd3764a68fb3697d067c358761fc837)

Conflicts:
src/os/bluestore/BlueFS.cc, trivial

Merge pull request #59266 from k0ste/wip-65924-reef

reef: CephContext: acquire _fork_watchers_lock in notify_post_fork()

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>

Merge pull request #60981 from sseshasa/wip-69150-reef

reef: osd: adding 'reef' to pending_require_osd_release

Merge pull request #60780 from shraddhaag/wip-68948-reef

reef: qa/standalone/mon/mon_cluster_log.sh: retry check for log line

qa/standalone/mon/mon_cluster_log.sh: retry check for log line

Issue: The test was failing as we were checking for the osd boot
log before it was actually emitted in the log file.

Solution: We retry checking for the desired string in the log file
for a duration of 60s after OSD has come up successfully.
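
The retry described above can be sketched as a polling loop. The actual
test is a shell script; this Python version is only an illustration,
and the function name, its parameters and the one-second interval are
assumptions.

```python
import time
from pathlib import Path


def wait_for_line(path, needle, timeout=60.0, interval=1.0):
    """Poll a log file until `needle` shows up or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    log = Path(path)
    while True:
        # The file may not exist yet if nothing has been logged so far.
        if log.exists() and needle in log.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would start waiting only after the OSD is up, for example
wait_for_line(log_path, expected_boot_message), where both arguments
are placeholders for whatever the test is looking for.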

Fixes: https://tracker.ceph.com/issues/67282
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
(cherry picked from commit 67928a27357e2d600114db1891db5e7b30c8d1a9)

AsyncMessenger.cc : improve error messages

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 3d36a3b9bbeb8b21b99046aab0d0bdf8f1c30aa2)

doc/releases: add actual_eol for quincy

Add the actual EOL date for the Quincy release (it's 2025-01-13).

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 2c4ab9c571218bc09d14de0b101166fbab965f85)

Merge pull request #61292 from phlogistonjohn/jjm-fix-reef-mypy-69471

reef: mgr/diskprediction_local: avoid mypy error

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #61343 from zdover23/wip-doc-2025-01-13-backport-61313-to-reef

reef: doc: improve tests-integration-testing-teuthology-workflow.rst

Merge pull request #60445 from mohit84/wip-68663-reef

reef: AsyncMessenger: Don't decrease l_msgr_active_connections if it is negative

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>

doc: improve tests-integration-testing-teuthology-workflow.rst

This commit adds:
1. workflow summary in the first section along with an image.
2. sub-section "Pushing to ceph-ci repository" to second section.
3. file doc/dev/developer_guide/testing_integration_tests/workflow.png

Signed-off-by: Vallari Agrawal <vallari.agrawal@ibm.com>
(cherry picked from commit dc539b3ea8031d2b02da9d5a5b1f856d96d70362)

CephContext: acquire _fork_watchers_lock in notify_post_fork()

The ceph::spin_unlock() seems incorrect here.

Fixes: http://tracker.ceph.com/issues/63494
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit faef266331ae0e372a345ef1926c239aa6c3bf45)

osd: Remove usage of unsupported objstores for QoS

mClock is supported on Bluestore and a check is currently done to eliminate other unsupported object stores.
With Filestore no longer in the code base, this check can be removed.
In addition, make sure that osd bench will no longer run on setups with memstore.

Fixes: https://tracker.ceph.com/issues/59531
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
(cherry picked from commit 18043c5e88d241f43b8fd23a8cbc1b15a3854de9)

Conflicts:
  - file: src/osd/OSD.cc
    comment: `OSD::maybe_override_cost_for_qos()` was removed as part of the backport
      that included mClock changes for PG delete operation

Merge pull request #60612 from cbodley/wip-68824-reef

reef: os: remove unused btrfs_ioctl.h and tests

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #59280 from k0ste/wip-67458-reef

reef: common/TrackedOp: rename and raise prio of slow op perfcounter

Reviewed-by: YiteGu <yitegu0@gmail.com>

Merge pull request #57067 from batrick/wip-65377-reef

reef: osd/OSDMonitor: check svc is writeable before changing pending

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #59444 from kamoltat/wip-67721-reef

reef: src/pybind/mgr/pg_autoscaler/module.py: fix 'pg_autoscale_mode' output

Reviewed-by: Samuel Just <sjust@redhat.com>

Merge pull request #59404 from Matan-B/wip-67666-reef

reef: mon/OSDMonitor: Add force-remove-snap mon command

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>

Merge pull request #59371 from NitzanMordhai/wip-67644-reef

reef: mgr/rest: Trim requests array and limit size

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

os/bluestore: do not include single AU allocations to allocation
fragmentation stats.

The rationale is that such allocations always get a single fragment.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 5cdd29f78bdd19185d3d93479a57be25fa98d7f1)

test/allocator_replay: introduce allocator fragmentation histogram.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 600fc5b0f18aca5f129cb18ce4b0ea16dbc3b9f0)

os/bluestore: implement allocator fragmentation histogram

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit 625e3461fbed904e85dfeea7ab3e8f7b74c4f384)

doc/radosgw/s3: correct eTag op match tables

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit f108a3739700c49a46472fb7b936acb9c53f0c0d)

reef: mgr/diskprediction_local: avoid mypy error

Patch is for reef ONLY, not a traditional backport.
Disable mypy check for the given line. This check is causing all
reef CI jobs to fail.

Fixes: https://tracker.ceph.com/issues/69471
Signed-off-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #56139 from ifed01/wip-ifed-fix-global-repair-stats-reef

reef: os/store_test: Retune tests to current code

Reviewed-by: Md Mahamudur Rahaman Sajib <mahamudur.sajib@croit.io>

Merge pull request #59499 from pponnuvel/wip-67607-reef

reef: os/bluestore: allow use BtreeAllocator

Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>

Merge pull request #59270 from k0ste/wip-67246-reef

reef: mon: Remove any pg_upmap_primary mapping during remove a pool

test/librbd/fsx: switch to netlink interface for rbd-nbd

The default was flipped in commit fcbf7367d285 ("rbd-nbd: map using
netlink interface by default") in squid. This is a reef-only fixup for
fsx to counter failures like "Size error: expected 0xa5cac00 stat 0x0"
which seem to be quite persistent on CentOS Stream 9.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

doc: add snapshots in docs under Cephfs concepts

Fixes: https://tracker.ceph.com/issues/68974
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
(cherry picked from commit 885b1bf88ee28b05cc9ed91e8ea5146511b6cf34)

Merge pull request #60741 from mkogan1/wip-62255-reef

reef: rgw: fix the Content-Length in response header of static website

Merge pull request #60701 from cbodley/wip-68898-reef

reef: qa/rgw/crypt: disable failing kmip testing

mgr/dashboard: adding & exposing Param Class to support EndpointDoc creation

Fixes: https://tracker.ceph.com/issues/69272
Signed-off-by: Naman Munet <naman.munet@ibm.com>

Merge pull request #60849 from smanjara/wip-reef-60801

[reef] qa/rgw: the rgw/verify suite runs java tests last

Merge pull request #60165 from cbodley/wip-68426-reef

reef: cls/user: reset stats only returns marker when truncated

Merge pull request #60455 from cbodley/wip-64398-reef

reef: rgw/auth: ignoring signatures for HTTP OPTIONS calls

Merge pull request #61194 from zdover23/wip-doc-2024-12-30-backport-60794-to-reef

reef: doc/cephfs: document purge queue and its perf counters

doc/README.md: improve formatting

Improve the formatting in the section "Building Ceph" in the file
README.md.

Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit b9ca3957303989bbba9301cbbd18ba8faa0b8168)

doc/cephfs: document purge queue and its perf counters

Fixes: https://tracker.ceph.com/issues/68571
Signed-off-by: Dhairya Parmar <dparmar@redhat.com>
(cherry picked from commit ae9277398840bf8576ea5a8c4a2ba4e23f8b9613)

mgr/dashboard: Administration > Configuration > Some of the config options are not updatable at runtime

Fixes: https://tracker.ceph.com/issues/68976
The fix includes:
1) Bypassing the 'can_update_at_runtime' flag for 'rgw' related configurations, since these can be updated at runtime via the CLI.
Also implemented a warning popup prompting the user to confirm a forced edit of rgw related configurations.

Signed-off-by: Naman Munet <naman.munet@ibm.com>
(cherry picked from commit 3181acc223dafd04e3fc56d418389ad50c5868e4)

Merge pull request #61179 from zdover23/wip-doc-2024-12-26-backport-61177-to-reef

reef: doc: Fixes a typo in controllers section of hardware recommendations

doc: Fixes a typo in controllers section of hardware recommendations

Signed-off-by: Kevin Niederwanger <k.niederwanger@gmail.com>
(cherry picked from commit 089636224910e1cd6231cadd2c422a78c3d08fea)

rbd: drop --pool option from "rbd group image {add,rm}"

It stopped working with removal of get_special_pool_group_names() in
commit 3e8624f157a1 ("rbd: add support for namespaces") over six years
ago. Given how much time has passed, stop accepting this option.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1f71671dc65fa9e35d451e55d8963d60f3198a93)

rbd: handle --{group,image}-namespace in "rbd group image {add,rm}"

Currently only passing the namespace as part of the group or image spec
works. If --group-namespace or --image-namespace options are used, the
namespace isn't picked up.

Fixes: https://tracker.ceph.com/issues/69324
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit f35e3a6e9d93c2d2122c31d5eeb9fabaef89f2e1)

Conflicts:
src/tools/rbd/action/Group.cc [ "rbd group info" and "rbd group
snap info" commands not in reef ]

test/librbd: add TestInternal.FlattenInconsistentObjectMap

Inject an object map with all possible inconsistencies before
flattening to ensure that something similar to commit 40af4f87b64f
("librbd: flatten operation should use object map") doesn't reappear
in a different form.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ffcd90313b9dd6e5aab8df0f9a5335a69785133c)

librbd: avoid data corruption on flatten when object map is inconsistent

By making flatten skip copyup in case the object is marked
OBJECT_EXISTS or OBJECT_EXISTS_CLEAN, commit 40af4f87b64f ("librbd:
flatten operation should use object map") introduced a critical
regression.  If the object map becomes inconsistent (e.g. because
flatten gets interrupted by killing "rbd flatten" process or a client
running on the clone crashes after updating the object map but before
writing to the image), the following attempt to flatten would corrupt
the clone if the copyup is actually still needed.

By design, it's impossible to tell whether the object is "known to
exist" based on the object map -- only telling whether the object is
"known to NOT exist" is possible (i.e. only OBJECT_NONEXISTENT state
is reliable).  Negating OBJECT_NONEXISTENT tells that the object "may
exist", not that the object is "known to exist".  This is reflected in
the name of object_may_exist() helper that was introduced together with
the object map implementation.  Something like object_may_not_exist()
simply can't be constructed given the rest of librbd.

This effectively reverts commits 4c86bccf07b8 ("librbd: add
object_may_not_exist helper") and 40af4f87b64f ("librbd: flatten
operation should use object map").
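
The reasoning above can be illustrated with a toy model. This is a
sketch in Python under stated assumptions, not librbd code: the enum,
the dict-based "images" and the flatten() function with its
trust_object_map switch are hypothetical. Only the state names and the
object_may_exist() helper name come from the description above.

```python
from enum import Enum, auto


class ObjectState(Enum):
    OBJECT_NONEXISTENT = auto()
    OBJECT_EXISTS = auto()
    OBJECT_EXISTS_CLEAN = auto()


def object_may_exist(state):
    """Only OBJECT_NONEXISTENT is reliable; anything else means 'may exist'."""
    return state is not ObjectState.OBJECT_NONEXISTENT


def flatten(parent, child, object_map, trust_object_map):
    """Copy parent data into a toy clone; dicts are keyed by object number."""
    for objno, data in parent.items():
        state = object_map.get(objno, ObjectState.OBJECT_NONEXISTENT)
        if trust_object_map and object_may_exist(state):
            # Regression: "may exist" is treated as "known to exist".
            continue
        # Copyup only fills in objects that the clone does not have yet.
        child.setdefault(objno, data)
    return child


parent = {0: "parent-0", 1: "parent-1"}
# Object 1 is marked as existing but was never written: inconsistent map.
object_map = {0: ObjectState.OBJECT_EXISTS, 1: ObjectState.OBJECT_EXISTS}

broken = flatten(parent, {0: "clone-0"}, object_map, trust_object_map=True)
safe = flatten(parent, {0: "clone-0"}, object_map, trust_object_map=False)
print(broken)  # {0: 'clone-0'}
print(safe)    # {0: 'clone-0', 1: 'parent-1'}
```

In the model, trusting the map leaves object 1 missing from the clone,
which corresponds to the corruption once the parent is detached.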

Fixes: https://tracker.ceph.com/issues/68998
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 97ed3fced13dd48520ec9c165537ff0bbc7cbb64)

qa/rgw: force Hadoop to run under Java 1.8

The Hadoop test installs Java 1.8 but then just runs the default
version. This makes sure it will run the version it installed.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit c5503187af96dc0179265dc84b2716df851e4cdf)

qa/rgw: pull Apache artifacts from mirror instead of archive.apache.org

Currently maven and kafka are pulled from archive.apache.org. This
uses Apache's "closer" calculator to find a mirror to use instead.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 3aae66611dd7f05612056a757cb7a87dfcf95de0)

mgr/dashboard: handle infinite values for pools

Fixes https://tracker.ceph.com/issues/64724

Issue:
======
JSON parsing is failing because of Infinity values present in pool
metadata: "read_balance": {"score_acting": Infinity, "score_stable":
Infinity,}
Because of this, the entire pool list is not rendered.

Fix:
====
Added a handler for checking "inf" values and replacing them with a
string "Infinity" so that json parsing does not fail on frontend.
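
A minimal sketch of this kind of handler, assuming a recursive walk
over the pool data. The function name replace_inf is hypothetical and
is not the dashboard's actual helper, and the handling of negative
infinity is an assumption that goes beyond what is described above.

```python
import json
import math


def replace_inf(value):
    """Return a copy of `value` with infinite floats replaced by strings."""
    if isinstance(value, float) and math.isinf(value):
        # Assumption: negative infinity gets the same treatment.
        return "Infinity" if value > 0 else "-Infinity"
    if isinstance(value, dict):
        return {key: replace_inf(item) for key, item in value.items()}
    if isinstance(value, list):
        return [replace_inf(item) for item in value]
    return value


pool = {"read_balance": {"score_acting": float("inf"),
                         "score_stable": float("inf")}}

# Python's default output contains a bare Infinity token, which is not
# valid JSON and which strict parsers such as JSON.parse reject.
print(json.dumps(pool))

# After the replacement the document is valid even with allow_nan=False.
print(json.dumps(replace_inf(pool), allow_nan=False))
```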

Signed-off-by: Afreen <afreen23.git@gmail.com>
(cherry picked from commit 82d100ad264c35d79262c1983a8005d8d4791855)

librbd/migration/HttpClient: socket isn't shut down on some state transitions

If shut_down() gets delayed until a) the state transition from
STATE_RESET_CONNECTING completes and the reconnect is unsuccessful or
b) the state transition from STATE_RESET_DISCONNECTING completes (i.e.
next_state is STATE_UNINITIALIZED or STATE_RESET_CONNECTING), the
socket needs to be shut down before m_on_shutdown is invoked. The line
of thought here is the same as for the corresponding state transitions
that don't involve STATE_SHUTTING_DOWN.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 88557dff2fe14c7df96992fbb0a5208979c20bf1)

librbd/migration/HttpClient: avoid hitting an assert in advance_state()

If the shutdown gets delayed until the state transition from
STATE_RESET_CONNECTING completes and the reconnect is successful
(i.e. next_state is STATE_READY), we eventually hit "unexpected
state transition" assert in advance_state(). The reason is that
advance_state() would update m_state and call disconnect() under
STATE_READY instead of STATE_SHUTTING_DOWN. After the disconnect
maybe_finalize_shutdown() would enter advance_state() again with
STATE_SHUTDOWN as next_state, but the transition to that from
STATE_READY is invalid.

Plug this by not transitioning to next_state if current_state is
STATE_SHUTTING_DOWN.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1046d610e3d6852258e6c4bf0355d0d13fb197b4)

librbd/migration/HttpClient: ignore stream_truncated when shutting down SSL

Propagate ec to handle_disconnect() and use it to suppress
stream_truncated errors.  Here is a quote from Beast documentation [1]:

  // Gracefully shutdown the SSL/TLS connection
  error_code ec;
  stream.shutdown(ec);
  // Non-compliant servers don't participate in the SSL/TLS shutdown process and
  // close the underlying transport layer. This causes the shutdown operation to
  // complete with a `stream_truncated` error. One might decide not to log such
  // errors as there are many non-compliant servers in the wild.
  if(ec != net::ssl::error::stream_truncated)
      log(ec);

... and a commit that made ignoring stream_truncated safe [2]:

  // ssl::error::stream_truncated, also known as an SSL "short read",
  // indicates the peer closed the connection without performing the
  // required closing handshake
  // [...]
  // When a short read would cut off the end of an HTTP message,
  // Beast returns the error beast::http::error::partial_message.
  // Therefore, if we see a short read here, it has occurred
  // after the message has been completed, so it is safe to ignore it.

[1] https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/using_io/ssl_tls_shutdown.html
[2] https://github.com/boostorg/beast/commit/094f5ec5cb3be1c3ce2d985564f1f39e9bed74ff

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9fa0bcc67d79d90996cd4ec2b5af56d051ef6be7)

librbd/migration/HttpClient: propagate ec to handle_handshake()

Get rid of get_callback_adapter() which only obfuscates the error:

handle_handshake: failed to complete SSL handshake: (337047686) Unknown error 337047686

vs

handle_handshake: failed to complete SSL handshake: certificate verify failed (SSL routines, tls_process_server_certificate)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e305a5908bd7bd3f2fa906af8521aea989f0c0ca)

librbd/migration/HttpClient: drop SslHttpSession::m_ssl_enabled

The remaining callers of disconnect() call it only when m_ssl_enabled
is set to true (i.e. after the handshake is completed):

- shut_down(), in STATE_READY
- maybe_finalize_reset(), very shortly after transitioning out of
STATE_READY as part of performing a reset
- advance_state(), on a transition to STATE_READY that is intercepted
by a previously delayed shut down

m_ssl_enabled isn't used outside of disconnect() and on top of that
is never cleared.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8566224e9406abca42925f8045077141c2724bed)

librbd/migration/HttpClient: don't call disconnect() in handle_handshake()

With m_ssl_enabled set to false, disconnect() is a no-op. Since
m_ssl_enabled is flipped to true only when the handshake succeeds,
calling disconnect() on "failed to complete handshake" error is bogus
(as would be attempting to shut down SSL there).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 331b5ea322537d380996ac6b250898ba474500be)

librbd/migration/HttpClient: avoid reusing ssl_stream after shut down

ssl_stream objects can't be reused after shut down: despite
a successful reconnect and handshake, any attempt to read or write
fails with "end of stream" (beast.http:1) or "protocol is shutdown"
(asio.ssl:337690831) error respectively. This doesn't appear to be
documented, but Beast and ASIO authors both mention that the stream
must be destroyed and recreated [1][2].

This was missed because the only integration test with a big enough
image used http instead of https.

[1] https://github.com/boostorg/beast/issues/821#issuecomment-338354949
[2] https://github.com/chriskohlhoff/asio/issues/804#issuecomment-872746894

Fixes: https://tracker.ceph.com/issues/69178
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 20885b11794ba80d5cddd178994865a83da7240f)