Merge pull request #45896 from idryomov/wip-persistent-cache-status-quincy

quincy: rbd persistent cache UX improvements (status report, metrics, flush command)

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #45766 from cbodley/wip-55175

quincy: cmake: WITH_SYSTEM_UTF8PROC defaults to OFF

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #45665 from s0nea/wip-55042-quincy

quincy: mgr/cephadm: try to get FQDN for configuration files

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #45625 from ljflores/wip-quincy-test-cli-timeout

quincy: qa/tasks/cephadm_cases: increase timeouts in test_cli.py

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #45595 from ronen-fr/wip-rf-44050-quincy

quincy: osd/scrub: ignoring unsolicited DigestUpdate events

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #45568 from mgfritch/backport-45420-quincy

quincy: cephadm: infer the default container image during pull

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #45359 from mgfritch/backport-45347-quincy

quincy: cephadm: preserve `authorized_keys` file during upgrade

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #45988 from zdover23/wip-doc-os-recommendations-backport-quincy-3

quincy: doc/start: add testing support information

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #45905 from idryomov/wip-rbd-mirror-test-timer-lock-quincy

quincy: test/rbd_mirror: grab timer lock before calling add_event_after()

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

doc/start: add testing support information

This PR adds information about support for testing,
and information about which distros the Ceph project
builds packages for.

This is one in a series of PRs including the following:

https://github.com/ceph/ceph/pull/45385
https://github.com/ceph/ceph/pull/45764

This PR specifically includes the information that Ernesto
Puerta collected here:
https://github.com/ceph/ceph/pull/45385#pullrequestreview-911766656

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 0364f3afcccc85d190237b0a74b4deeefa4738f3)

Merge pull request #45932 from adk3798/revert-pids-limit-quincy

quincy: cephadm: Revert pids limit

Revert "cephadm: remove containers pids-limit"

This reverts commit 7c1214f38091dde0ba2c5e0557dcd98f97f91302.

Signed-off-by: Adam King <adking@redhat.com>

Revert "qa/suites/orch/cephadm: restrict test_iscsi_pids_limit to CentOS"

This reverts commit 355a819d3a65ef05ccc078fcb58eca4c84dac573.

Signed-off-by: Adam King <adking@redhat.com>

Merge pull request #45885 from markhpc/quincy-bs-avl-cursor-fix

quincy: os/bluestore: Always update the cursor position in AVL near-fit search.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Igor Fedotov <igor.fedotov@croit.io>

test/rbd_mirror: grab timer lock before calling add_event_after()

add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").

Fixes: https://tracker.ceph.com/issues/55317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 60e16106837e0d23366709f70f39c4f1ae7a2a45)

os/bluestore: Always update the cursor position in AVL near-fit search.

Signed-off-by: Mark Nelson <mnelson@redhat.com>
(cherry picked from commit 3bed53debfa2f9ec9d31021ce7eaf8b78f78f9e0)

Merge pull request #45780 from vshankar/wip-55110-quincy

quincy: mount.ceph: remove `ms_mode' mount option when switching to old-syntax

Reviewed-by: Xiubo Li <xiubli@redhat.com>

librbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest

"m_image_ctx.features && RBD_FEATURE_DIRTY_CACHE" is obviously wrong
because it would pretty much always be true.  However, even if bitwise
AND was used, this check would still be dead because DiscardRequest is
only invoked if RBD_FEATURE_DIRTY_CACHE is enabled:

  int invalidate_cache(ImageCtx *ictx)
  {
    ...
    // Delete writeback cache if it is not initialized
    if ((!ictx->exclusive_lock ||
         !ictx->exclusive_lock->is_lock_owner()) &&
        ictx->test_features(RBD_FEATURE_DIRTY_CACHE)) {
      C_SaferCond ctx3;
      ictx->plugin_registry->discard(&ctx3);
      r = ctx3.wait();
    }

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit aee78bbb9d7edd606a8a235c57b2b704d7b94e4c)
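
The difference between the logical and the bitwise test can be reproduced in a few lines. This is a minimal sketch in Python rather than C++: `and` stands in for `&&`, and the flag values are assumptions chosen for illustration, not copied from the librbd headers.

```python
# Illustrative feature bits; the values are assumptions for this example.
RBD_FEATURE_LAYERING = 1 << 0
RBD_FEATURE_DIRTY_CACHE = 1 << 11


def logical_check(features: int) -> bool:
    """Mimics `features && FLAG`: true whenever both operands are non-zero."""
    return bool(features and RBD_FEATURE_DIRTY_CACHE)


def bitwise_check(features: int) -> bool:
    """Mimics `features & FLAG`: true only if that specific bit is set."""
    return bool(features & RBD_FEATURE_DIRTY_CACHE)


layering_only = RBD_FEATURE_LAYERING
with_dirty_cache = RBD_FEATURE_LAYERING | RBD_FEATURE_DIRTY_CACHE
```

With any non-zero feature mask the logical form is true even when the dirty-cache bit is clear, which is why the original check was "pretty much always" true.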

librbd/cache/pwl: don't crash if cache file removal fails

The non-ec overload will throw fs::filesystem_error on any error
(e.g. EPERM due to unprivileged "rbd persistent-cache invalidate"
being brought up against a privileged workload).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 63197ff7003fa9e595527a7431f9f3f6790f7d57)

rbd: add persistent-cache flush command

Add a flush command so that users can manually flush cache.

[ idryomov: error messages, incorporate doc and help.t hunks, drop
do_persistent_cache_flush() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 644fbc9fcc8f12eb93d5cc20054cd8598ab001b7)

rbd: rename image-cache invalidate command

Rename the image-cache command to persistent-cache and refactor the
code of the invalidate command.

[ idryomov: error message, incorporate doc and help.t hunks, drop
do_persistent_cache_invalidate() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 05bfe10ad9fde533aa728f9aa0cc8a8f155c03c5)

librbd/cache/pwl: rename persistent cache key

librbd "internal" metadata keys were changed to use the ".rbd" prefix.
Change the persistent cache key to ".rbd" too.
The persistent cache key is currently named IMAGE_CACHE_STATE. Since
this key is planned to be used outside the pwl directory, it seems
more appropriate to give it a clearer name, PERSISTENT_CACHE_STATE.

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit bd66fdda910f02ffe91bb026f82a85f28a6ff225)

rbd: include persistent cache metrics in "rbd status" report

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e996fd80601ec8c309c1517f33171e88a2f31cad)

rbd: factor out get_percentage() helper

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9324ab94711dbe9a1265643adcc79ae0a3cba812)

librbd/cache/pwl: no need to set clean and empty in remove_pool_file()

It is redundant -- the only caller sets both since commit 6593e31fff18
("librbd/cache/pwl: correct cache state").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d64a3ae265897806809d9fa08ac72c549b4bca4f)

librbd/cache/pwl: avoid inconsistencies in ImageCacheState

When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconsistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats(). Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.

update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway. The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.

Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 016882925a63f4f03a9c445d008b2325d479bc30)
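
A minimal sketch of the idea, in Python rather than the C++ of the actual change: the flag and the counter are refreshed in one step under the same lock, so a reader never sees clean == true with a positive dirty_bytes. The class and attribute names are hypothetical stand-ins, not the real ImageCacheState interface.

```python
import threading


class CacheStateSketch:
    """Hypothetical stand-in for ImageCacheState; names are illustrative."""

    def __init__(self) -> None:
        self.lock = threading.Lock()  # plays the role of m_lock
        self.dirty_bytes_live = 0     # live counter maintained by the I/O path
        self.clean = True             # reported flag
        self.dirty_bytes = 0          # reported counter

    def update_state(self) -> None:
        """Refresh flag and counter together; the caller must hold self.lock."""
        assert self.lock.locked(), "caller must hold the lock"
        self.dirty_bytes = self.dirty_bytes_live
        self.clean = self.dirty_bytes_live == 0

    def write(self, nbytes: int) -> None:
        with self.lock:
            self.dirty_bytes_live += nbytes
            self.update_state()

    def flush(self) -> None:
        with self.lock:
            self.dirty_bytes_live = 0
            self.update_state()
```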

librbd/cache/pwl: handle invalid ImageCacheState json

get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception. Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized. Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.

While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 7678ee2490965a8a73c02a47283adaa5036dbcab)

librbd/cache/pwl: add basic metrics to ImageCacheState

Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.

Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.

Switch to json_spirit for encoding because encode_json encodes bool as
"true"/"false" string.

Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.

[ idryomov: add cached_bytes and hits_partial; report misses and
miss_bytes instead of respective totals; naming ]

Fixes: https://tracker.ceph.com/issues/50614
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 769f3a06ecf85249c1473cbb6bab7503beb1ba78)
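
The bool-versus-string distinction matters to any consumer that parses the state. The following sketch uses Python's standard json module (not json_spirit or encode_json) purely to illustrate the difference; the "clean" key is used as an example field.

```python
import json

# A boolean encoded as a string, versus a real JSON boolean.
as_string = json.dumps({"clean": "true"})
as_bool = json.dumps({"clean": True})

parsed_string = json.loads(as_string)["clean"]
parsed_bool = json.loads(as_bool)["clean"]

# Pitfall: a non-empty string is truthy, so "false" tests as true.
false_string_is_truthy = bool(json.loads('{"clean": "false"}')["clean"])
```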

librbd/cache/pwl: correct cache state

Update cache state after the dirty_entries or log_entries list is updated.

Fixes: https://tracker.ceph.com/issues/50614
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit 6593e31fff180ec4123e37107c88eb39f7d10fdf)

Merge pull request #45857 from ljflores/wip-quincy-55269

quincy: mgr/telemetry: anonymize daemons in telemetry `perf_counters`

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>

Merge pull request #45627 from ljflores/wip-55051-quincy

quincy: admin/doc-requirements: bump sphinx to 4.4.0

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: David Galloway <dgallowa@redhat.com>

mgr/telemetry: fix daemon anonymization in perf_counters

Anonymized daemons now appear with a SHA1 digest instead of their
original identifier, e.g.:

    "perf_counters": {
        "mon.1b1b829ba9298527f4934053a4742a1710937007": {
            "mon": {
                "election_call": {
                    "value": 1
                },
                ...
                "session_trim": {
                    "value": 0
                }
            },
        ...
        }
    ...
    }

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 2f4cc770e7ac9767d6d3be51c1de03f6014a6f98)

mgr/telemetry: add anonymize_entity_name function

The ability to anonymize entity names should have its own function
to prevent duplicate code.
Will clean up in a separate commit.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit e89d821ee6256b18f94d520baeb07012de80b731)

mgr/telemetry: anonymize daemons in telemetry perf_counters

In the telemetry perf channel we collect 'perf_counters' of individual daemons.
The monitors appear with their full name, which includes the host name.
The host name part must be anonymized.

To err on the safe side, I have anonymized all daemons except for osds,
since they are not attached to host names.

Fixes: https://tracker.ceph.com/issues/55229
Signed-off-by: Laura Flores <lflores@redhat.com>
(cherry picked from commit 0fe47b974ccc591c6108eb7a1b26087e62932bce)
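
A minimal sketch of the rule described above, assuming entity names of the form "<type>.<id>" and a per-cluster salt. The function body and the salt handling are assumptions for illustration and may differ from the telemetry module's actual implementation.

```python
import hashlib


def anonymize_entity_name(entity_name: str, salt: str) -> str:
    """Hypothetical sketch: hash the id part of '<type>.<id>' unless it is an OSD."""
    daemon_type, _, daemon_id = entity_name.partition(".")
    if daemon_type == "osd":
        # OSD ids are not attached to host names, so they are kept as-is.
        return entity_name
    digest = hashlib.sha1((salt + daemon_id).encode("utf-8")).hexdigest()
    return f"{daemon_type}.{digest}"
```

The daemon type stays readable so counters can still be grouped by type, while the host-derived part is replaced by a 40-character SHA1 hex digest.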

mount.ceph: remove `ms_mode' mount option when switching to old-syntax

... and switch to using v1 addresses (if users haven't specified those
explicitly). Kernel versions older than 5.11 do not understand the
`ms_mode' mount option, which would result in a mount failure.

Fixes: http://tracker.ceph.com/issues/55110
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 6e28d3406df06435bea26b465baf97c259942920)
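
The option filtering can be pictured as follows. This is a sketch in Python under the assumption that the mount options are a comma-separated string; mount.ceph itself is written in C and strip_ms_mode() is a hypothetical helper, not a function from that tool.

```python
def strip_ms_mode(options: str) -> str:
    """Drop any ms_mode=... entry from a comma-separated mount option string."""
    kept = [
        opt
        for opt in options.split(",")
        if opt and not opt.startswith("ms_mode=")
    ]
    return ",".join(kept)
```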

Merge pull request #45799 from rhcs-dashboard/fix-grafana-quincy

quincy: monitoring: several Grafana fixes

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #45804 from adk3798/nfs-export-quincy

quincy: mgr/nfs: nfs export management backport

Reviewed-by: John Mulligan <jmulligan@redhat.com>

mgr/nfs: remove redundant check

Remove the extra check of the cluster id from the _apply method, as
_apply is a "private" method that should only be called from other
private methods that have already validated the cluster_id. This also
removes a dependency on the orch-requiring func available_clusters.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit dd5a47f83349c8f2b539ba3881f58f5270024cbb)

mgr/nfs: fix unintentional recursion

The `exports` property of the ExportMgr exists to cache the exports
configuration found in the .nfs namespace. Using that property
within the property method is probably not intentional and is probably
only working due to the lucky construction of the _exports dict
immediately after the None check so that the _exports dict is returned
(and is a mutable type).

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit daa455cd168d62cd8fbcaba4d7aa79b56e68ef0d)
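
A reduced sketch of the pattern, assuming a loader callable that returns the exports per cluster. ExportMgrSketch and its loader argument are hypothetical and only mirror the shape of the problem: the property fills the private dict directly instead of going back through itself.

```python
class ExportMgrSketch:
    """Hypothetical reduction of the caching property described above."""

    def __init__(self, loader):
        self._exports = None
        self._loader = loader  # callable returning {cluster_id: [exports]}

    @property
    def exports(self):
        if self._exports is None:
            self._exports = {}
            # Fill the private dict directly. Using self.exports here would
            # re-enter this property while the cache is still being built.
            for cluster_id, items in self._loader().items():
                self._exports[cluster_id] = list(items)
        return self._exports
```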

mgr/nfs: add known_cluster_ids to generalize nfs cluster id fetching

The changes to the nfs module in 8c711afc are working but when I began
writing more test automation I found a few more places in the
export-configuration code path relying on the orchestration module
only. This change generalizes the logic to source nfs clusters from
orchestration when it's enabled but from the .nfs pool when
orchestration is disabled. It then uses that call when loading
the exports cache on the ExportMgr object.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 4d09660dea5696e5085a75694968aafe9253f47a)
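
The fallback can be sketched as below. Both callables are hypothetical stand-ins (one for the orchestration query, one for listing namespaces in the .nfs pool), and skipping the empty default namespace is an assumption of this sketch.

```python
from typing import Callable, Iterable, Optional, Set


def known_cluster_ids(
    orch_clusters: Optional[Callable[[], Iterable[str]]],
    pool_namespaces: Callable[[], Iterable[str]],
) -> Set[str]:
    """Prefer orchestration when available, else use the .nfs pool namespaces."""
    if orch_clusters is not None:
        return set(orch_clusters())
    # Assumption: the empty (default) namespace is not a cluster id.
    return {ns for ns in pool_namespaces() if ns}
```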

doc/mgr/nfs: document that nfs exports related mgr call requirements

A recent change in the mgr/nfs module should enable the functioning
of export management commands/API calls as long as the rados namespaces
and objects have been already established. Document this fact, noting
that now only the `ceph nfs cluster ...` calls *require* an
orchestration module.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit b5b3e0bcb5e2a27375f50f7717786a3928cba711)

mgr/nfs: support managing exports without orchestration enabled

This change allows the `ceph nfs export ...` commands to function
without the entire mgr/nfs subsystem requiring orchestration to be
enabled. When there's no orchestration available, the code falls back
to examining the namespaces in the ".nfs" rados pool to determine what
cluster_id values are valid.

This change does not add support for creating the rados objects and
namespace needed to manage a nfs cluster. As discussed with the
orchestration group on 2022-01-22, rook does not need the mgr module to
establish the namespace. So, for now, we'll defer the work needed to
create the namespace/objects when orchestration is disabled.

Fixes: https://tracker.ceph.com/issues/54043
Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 8c711afc4ab898942a2569b619eb8379ee02ffba)

mgr/nfs: fix typo in error message

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 56323a2625133d5a53bf1ee1662346daa1b4f09b)

mgr/nfs: add unit test for normalize_path

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit ffa95fbc796aa5c00eaa32138291c0ef2a48949a)

mgr/nfs: change method format_path to function normalize_path

This function was not using self and thus has no need to be a method.
While we're at it, rename it to normalize_path because that's what
it is doing.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit f91dd1bf7bfab251d671a30d622bb544a4ce37d0)
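
As a module-level function, a path normalizer could look like the sketch below. The exact normalization rules are an assumption here (strip whitespace, collapse redundant separators via posixpath.normpath, reduce a leading double slash to one) and are not taken from the commit.

```python
import posixpath


def normalize_path(path: str) -> str:
    """Sketch of a path normalizer; the exact rules are an assumption."""
    if path:
        path = posixpath.normpath(path.strip())
        # POSIX normpath preserves exactly two leading slashes; collapse them.
        if path.startswith("//"):
            path = path[1:]
    return path
```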

mgr/nfs: clean up rados object naming code

The naming of rados objects used to store the nfs config was spread
all over the code, including inline f-strings, non-static methods,
etc.
This change unifies the naming by putting constant string prefixes
and name generating functions into the utils.py file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit 88266144423e6876dc392bc6ea59e32393024323)
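
The shape of such a helper module might be as follows. The prefix values and function names are assumptions made for illustration and should not be read as the actual contents of utils.py.

```python
# Hypothetical prefixes; the real constants may use different values.
EXPORT_PREFIX = "export-"
CONF_PREFIX = "conf-nfs."


def export_obj_name(export_id: int) -> str:
    """Name of the rados object holding a single export."""
    return f"{EXPORT_PREFIX}{export_id}"


def conf_obj_name(cluster_id: str) -> str:
    """Name of the rados object holding a cluster's common config."""
    return f"{CONF_PREFIX}{cluster_id}"
```

Keeping every name behind one function means a later rename touches a single place instead of scattered f-strings.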

mgr/nfs: make _check_rados_notify a function

This was previously a staticmethod. This static method was only used by
NFSRados object. Staticmethods are nearly always better implemented as
functions, which is done so here.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit c51a6755b52954910512f804cde9e5255c1db9e7)

mgr/nfs: limit dependency of NFSRados object

Previously, the NFSRados object accepted the "Module" as the
first argument but only used the rados attribute (type rados.Rados).
It's better to limit the scope of types when reasonably possible
so we can see what the true dependencies are. So we restrict
NFSRados to accepting a rados.Rados as the argument.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
(cherry picked from commit d94b63830d94f21ba276452844a46d21e084fb3f)

mgr/dashboard: fix api test issue with pip

Fix
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
apache-libcloud 3.5.0 requires requests>=2.26.0, but you have requests 2.25.1 which is incompatible.
Successfully installed CherryPy-13.1.0 PyJWT-2.0.1 Routes-2.4.1 bcrypt-3.1.4 ceph-1.0.0 chardet-4.0.0 cheroot-8.6.0 idna-2.10 jaraco.functools-3.5.0 more-itertools-4.1.0 natsort-8.1.0 portend-3.1.0 pyopenssl-22.0.0 pytz-2022.1 repoze.lru-0.7 requests-2.25.1 tempora-5.0.1
```

Fixes: https://tracker.ceph.com/issues/55060
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2289ad2bc327b0d86916a1c96f4af2967a80c1b9)

mgr/cephadm: update monitoring stack versions

Fixes: https://tracker.ceph.com/issues/54311
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 6a328ec30cd2c652c27e3bf070d5de7c2d4367b3)

Conflicts:
src/cephadm/cephadm
src/pybind/mgr/cephadm/module.py:
- Accept quincy changes and bring only updates in the Grafana,
Prometheus, Alertmanager and Node Exporter versions

mgr/dashboard: upgrade grafana pie-chart and vonage-status-panel versions

Fixes: https://tracker.ceph.com/issues/55195
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 2877920f58728eab20abe32fed24618449d76c09)

monitoring/grafana: fix version

Fixes: https://tracker.ceph.com/issues/55172
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 8721bd6c5ddd3c09d04a07e5a2564a5772324c82)

grafana/Makefile: don't push to docker

Fixes: https://tracker.ceph.com/issues/55155
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 7e6309fac3c4728b3527ab6c709becfb4dcdb126)

prometheus: spell check the alert descriptions

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
(cherry picked from commit 9cca95b16abd4af3eb3a5630acb3fb7e0cc73a4e)

mgr/dashboard: Pool overall performance shows multiple entries of same pool in pool overview

Fix the pool overview so that the overall performance panel shows each
pool only once.

Fixes: https://tracker.ceph.com/issues/54513
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 9719cc795e1d6a38ab8a7e8f3eeb56c13f11c25d)

mgr/dashboard: fix promtool test for mtu alert

Fixes: https://tracker.ceph.com/issues/55004
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 49d6068463ae9238b6fffcca690dbb5d74b2448a)

mgr/dashboard: Compare values of MTU alert by device

Fixes: https://tracker.ceph.com/issues/55004
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 3821548a37373f87109ab0dac7f3ee2d8f3ead99)

mgr/dashboard: fix transition-through-oci image workaround in grafana build

Fixes: https://tracker.ceph.com/issues/54311
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 64b0e5ce8a204908e769e7da01a5ee7d075c0481)

mgr/dashboard/monitoring: update grafana version

Fixes: https://tracker.ceph.com/issues/54311
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit c306778889c1c65fa7a5d8fd525c5cd3da7f2b78)

Merge pull request #45738 from benhanokh/wip-45733-quincy

quincy: os/BlueStore: NCB fix for SimpleBitmap boundary check

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

cmake: WITH_SYSTEM_UTF8PROC defaults to OFF

change the default value of WITH_SYSTEM_UTF8PROC from ON to OFF, so that
centos/rhel users can build with the default cmake configuration. no other
WITH_SYSTEM_* variable in ceph defaults to ON, so this is consistent
with other bundled libraries like boost and rocksdb

unfortunately, this also means that users that do have system packages
must opt-in to using them with -DWITH_SYSTEM_UTF8PROC=ON

both deb and rpm builds depended on the previous default value, so
their logic was negated to match the new default

Fixes: https://tracker.ceph.com/issues/55114
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 90662cad562fffbdeac33f3b79eac6a02eff8c2a)

Merge pull request #45736 from jtlayton/wip-54614

quincy: osd: support truncation sequences in sparse reads

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45377 from mchangir/wip-54533-quincy

quincy: mds,client: add new getvxattr op

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #45541 from ajarr/wip-54221-quincy

quincy: mgr/volumes: Add `fs volume rename` command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #45672 from mchangir/wip-55055-quincy

quincy: mgr/snap_schedule: restart old schedules

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

os/BlueStore: NCB fix for SimpleBitmap boundary check

The boundary check in SimpleBitmap is off by one, causing an assert to trigger.
Also fix a bug when asking for the next clear_extent on an unaligned map when
the last bits in the map were set.
Add unit tests.

Fixes: https://tracker.ceph.com/issues/55145
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
(cherry picked from commit 7dfa20863090d5eb58c798b6903386dcce6a52f8)

ceph_test_rados_io_pp: verify sparse_read behavior with non-zero truncate_seq

Fixes: http://tracker.ceph.com/issues/54280
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 387c7f33e232a4e982aeba3b185923efe42137aa)

librados: add ability to pass a truncate_size/seq to sparse_read

Fixes: http://tracker.ceph.com/issues/54280
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit b9bf65ac62f50ddf5616e0544e3c7b8c9030ced6)

osd: allow sparse reads with a non-zero truncate-seq

do_read() just uses the truncate_seq to tell how to cap the length of
the read. I see no reason that sparse reads should do anything
differently.

Change do_sparse_read() to cap the requested length at the truncate_size
if the truncate_seq in the request is newer than the one in the object.

Fixes: https://tracker.ceph.com/issues/54280
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 58f3e8bb98b966935898ef1c3eed61be7768d513)
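
The capping rule can be sketched as a pure function. This is a simplified Python illustration, not the OSD code: the parameter names are hypothetical, and details such as zero-length reads are left out.

```python
def capped_read_length(
    offset: int,
    length: int,
    object_size: int,
    object_truncate_seq: int,
    req_truncate_seq: int,
    req_truncate_size: int,
) -> int:
    """Cap a read at truncate_size when the request's truncate_seq is newer."""
    size = object_size
    if req_truncate_seq > object_truncate_seq and size > req_truncate_size:
        # The client knows of a truncation the object has not seen yet.
        size = req_truncate_size
    if offset >= size:
        return 0
    return min(length, size - offset)
```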

Merge pull request #45558 from vshankar/wip-53911-quincy

quincy: Revert "mds: kill session when mds do ms_handle_remote_reset"

Reviewed-by: Xiubo Li <xiubli@redhat.com>

Merge pull request #45405 from nmshelke/wip-54574-quincy

quincy: mgr/volumes: the 'mode' should honor idempotent subvolume creation

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #45711 from jdurgin/wip-deb-cherrypy-quincy

quincy: debian/control: fix python3-cherrypy*3* dependency

Reviewed-by: Adam King <adking@redhat.com>

debian/control: fix python3-cherrypy*3* dependency

The trailing '3' was missed in one instance, ceph-mgr-cephadm, leading to:

Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
ceph-mgr-cephadm : Depends: python3-cherrypy but it is not installable

Which makes the installation fail.

Fixes: 78983ad0d0cce422da32dc4876ac186f6d32c3f5
Signed-off-by: Koen Kooi <koen@softiron.com>
(cherry picked from commit b7b381fe91c0711249a7185b31f3dd60064f3b5a)

Merge pull request #45695 from amathuria/amathuri-53923-fix-quincy

quincy: osd/osd_types: Increasing decode version of scrub_duration in pg stats

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45673 from dsavineau/cephadm_container_image_stable

cephadm: set quincy as stable release

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #45604 from cbodley/wip-quincy-arrow-submodule

quincy: cmake: add submodule for Apache Arrow at v6.0.1

Reviewed-by: galsalomon66 <gal.salomon@gmail.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

osd/osd_types: Increasing decode version of scrub_duration in pg stats

All new fields added to pg stats after quincy RC need to have the decode field bumped up to avoid decoding errors during an upgrade from quincy RC to the quincy stable version

Fixes: https://tracker.ceph.com/issues/53923
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
(cherry picked from commit 3532b78901cc43ceb375da34a681e5a0f8eb53ac)

cephadm: set quincy as stable release

Quincy isn't master anymore so we don't need the DEFAULT_IMAGE_IS_MASTER
variable set to true (which produces a warning message).
This also sets the LATEST_STABLE_RELEASE variable to quincy to match the
DEFAULT_IMAGE_RELEASE variable.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>

qa: test snap_schedule with mgr restart

Scheduled snaps should follow the created schedule even across mgr
restart.

Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit ac6c7240d3b69de128ae2c5f4c172f12e313fd27)

mgr/snap_schedule: restart old schedules

Old schedules were not picked up from database when mgr was restarted.
Restart old schedules on mgr restart.

Fixes: https://tracker.ceph.com/issues/54052
Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit dca7fdb600932d712280dd91a4eb63a17a8800e3)

mgr/util: add function to list all fs names

Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit 24915c8ee926c27e335f6e94341770ee8088e721)

mgr/cephadm: try to get FQDN for inventory address

Fixes: https://tracker.ceph.com/issues/54502
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit 4f14993b1667fff309cd9cd6f9dad638a5a7e502)

Conflicts:
src/pybind/mgr/cephadm/services/monitoring.py
Fixed conflict because https://github.com/ceph/ceph/pull/44751 has not been
backported to quincy (yet).

mgr/cephadm: unify way to get the host address

There are two different ways to get the host address. From the
inventory of the mgr object directly or via the `_inventory_get_addr`
method of `CephadmService`. Update the code in order to use the
`_inventory_get_addr` method only.

Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit 30385068eed7a4f93179e8d2748dd6e01bba6ffd)

Merge pull request #45641 from ronen-fr/wip-rf-45640-quincy

Quincy: osd/scrub: restart snap trimming only after scrubbing is done

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45629 from neha-ojha/wip-quincy-stable

quincy: src/ceph_release: mark quincy stable

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #45653 from ljflores/wip-quincy-fast-shutdown-backports

Quincy: fast shutdown backports

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>

Merge pull request #45196 from adk3798/quincy-release-default-image

quincy: cephadm: change default image to ceph/ceph:v17

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #45616 from NitzanMordhai/wip-55021-quincy

quincy: tests: ceph_test_rados_api_watch_notify: watch2Delete reconnect

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45615 from benhanokh/wip-55032-quincy

quincy: os/bluestore: Disable NCB functionality on rotational drives

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #45652 from sseshasa/wip-55069-quincy

quincy: Doc: Improve mclock config reference documentation & update PendingReleaseNotes.

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45637 from idryomov/wip-diff-iterate-striping-fix-quincy

quincy: librbd: make diff-iterate in fast-diff mode sort and merge reported extents

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Merge pull request #45499 from cfsnyder/wip-54146-quincy

quincy: rgw/admin: fix radosgw-admin datalog list max-entries issue

Merge pull request #45504 from cfsnyder/wip-54154-quincy

quincy: rgw: in bucket reshard list, clarify new num shards is tentative

Merge pull request #45501 from cfsnyder/wip-54150-quincy

quincy: rgw: RGWPostObj::execute() may lose data.

Merge pull request #45498 from cfsnyder/wip-54093-quincy

quincy: rgwlc: warn on missing RGW_ATTR_LC

Merge pull request #45490 from cfsnyder/wip-54076-quincy

quincy: rgw: bucket chown bad memory usage

qa/standalone: Fix test_activate_osd() test in ceph-helpers.sh

Modify test_activate_osd() to get the type of scheduler in use and then
verify the value of osd_max_backfills. This is because mclock scheduler
overrides this option to 1000 upon OSD initialization.

The test earlier used to pass because the OSD daemon was killed but not
marked down and upon being brought up, the wait for OSD up check was
passing quickly. But the OSD still didn't have the latest config values.

But now upon killing the OSD, the osd_fast_shutdown sequence notifies the
mon (see PR: https://github.com/ceph/ceph/pull/44807) and is marked down
and dead. Upon bringing it up, the wait for OSD up check takes a longer
time and this is sufficient for the config values to be updated. This
results in the correct values being read from the config 'Values' map.

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 3aa2df2e0f6f5bafadc96fd72935e5cf8b2fcf17)

osd/OSD: osd_fast_shutdown_notify_mon not quite right

When osd_fast_shutdown and osd_fast_shutdown_notify_mon are both set to
true, the OSD is marked as Down, but it should be marked as Dead.

Fixes: https://tracker.ceph.com/issues/53327
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
(cherry picked from commit 07302d5e41c49c885c9398c1c478638023e3f264)

osd: make osd_fast_shutdown_notify_mon option true by default

The osd_fast_shutdown_notify_mon option is false by default, so users
suffer from error log floods, slow ops, and long I/O timeouts on
voluntary OS shutdown before they become aware that this option exists.
Let's make this option true by default.

Fixes: https://tracker.ceph.com/issues/53328
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
(cherry picked from commit 729a5b85a6586b47d16acbba2cf8e765e498cd65)