Adam King [Wed, 13 Mar 2024 19:30:25 +0000 (15:30 -0400)]
mgr/cephadm: refresh public_network for config checks before checking
The place it was being run before meant it would only grab the
public_network setting once at startup of the module. This meant
if a user changed the setting, which they are likely to do if they
get the warning, cephadm would ignore the change and continue
reporting that the hosts don't match up with the old setting
for the public_network. This moves the call to refresh the
setting to right before we actually run the checks. It does
mean we'll do the `ceph config dump --format json` call
each serve loop iteration, but I've found that only tends
to take a few milliseconds, which is nothing compared to
the time to refresh other things we check during the serve
loop.
I additionally modified the use of this option to use
the attribute on the mgr, rather than calling
`get_module_option`. This was just to get it more in
line with how we tend to handle other config options.
Fixes: https://tracker.ceph.com/issues/64902
Signed-off-by: Adam King <adking@redhat.com>
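A minimal sketch of the pattern this commit describes, not the actual cephadm code. The class, its methods, and the `fetch_public_network` callable are hypothetical; they only illustrate re-reading a setting right before each check instead of once at module startup.

```python
from typing import Callable, Dict, List


class ConfigChecker:
    """Hypothetical stand-in for a module that checks hosts against a setting."""

    def __init__(self, fetch_public_network: Callable[[], str]) -> None:
        # Assumed callable that returns the current value of the setting
        # every time it is called (for example by querying the cluster config).
        self._fetch = fetch_public_network
        self.public_network = ""

    def refresh(self) -> None:
        # Re-read the setting so that a change made by the user is picked up.
        self.public_network = self._fetch()

    def run_checks(self, host_networks: Dict[str, str]) -> List[str]:
        # Refresh immediately before checking, on every iteration.
        self.refresh()
        return [host for host, net in sorted(host_networks.items())
                if net != self.public_network]
```

With the refresh inside `run_checks`, a changed setting takes effect on the next iteration; the cost is one extra lookup per iteration.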
Matt Benjamin [Mon, 23 Oct 2023 18:57:33 +0000 (14:57 -0400)]
rgwlc: implement NewerNoncurrentVersions
Per AWS doc, this value controls "how many noncurrent versions
Amazon S3 will retain." [1] We understand this to mean, retain
NewerNoncurrentVersions of any object, regardless of expiration.
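A rough sketch of the retention rule as read above, not the rgw implementation. The function name and the `(version_id, mtime)` tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple


def split_noncurrent(versions: List[Tuple[str, float]],
                     newer_noncurrent_versions: int) -> Tuple[List[str], List[str]]:
    """Split one object's noncurrent versions into (retained, eligible).

    versions: hypothetical (version_id, mtime) pairs, in any order.
    """
    # Newest noncurrent versions come first.
    newest_first = sorted(versions, key=lambda v: v[1], reverse=True)
    keep = max(newer_noncurrent_versions, 0)
    # The newest `keep` versions are retained regardless of expiration;
    # only the remainder is considered for expiration.
    retained = [vid for vid, _ in newest_first[:keep]]
    eligible = [vid for vid, _ in newest_first[keep:]]
    return retained, eligible
```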
The endpoint passed down to util.query() is wrong:
it passes the full URL (scheme://addr:port/path) where it should only
pass the path. The cause is that RedFishClient.login() basically stores
the value of the Location header in `self.location`.
As a consequence, the client is unable to log out properly.
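One way to reduce a Location header value to its path, sketched with the standard library. The helper name is hypothetical and this is not the code from the actual fix.

```python
from urllib.parse import urlsplit


def endpoint_path(location: str) -> str:
    """Return only the path (plus query, if any) of a URL or bare path."""
    parts = urlsplit(location)
    # Drop scheme, host, and port; keep what a request target needs.
    return parts.path + ("?" + parts.query if parts.query else "")
```

A value that is already a bare path passes through unchanged, so the helper is safe to apply in either case.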
Patrick Donnelly [Wed, 13 Mar 2024 13:04:40 +0000 (09:04 -0400)]
Merge PR #54485 into main
* refs/pull/54485/head:
mds/quiesce-db: keep the db thread alive until shutdown
mds/quiesce-db: incorporate review comments
mds/quiesce: declare QuiesceDbPeerListing and QuiesceDbPeerAck
mds/quiesce: resolve the quiesce cluster at the mds monitor
include/types: add an I/O helper for std::unordered_map
messages: avoid using mutable members in MMDSQuiesce*
mds/quiesce-db: incorporate review comments
doc/cephfs/fs-volumes: doc fixes and updates
pybind/mgr: correct type hints for `get_quiesce_leader_info`
mds/quiesce: only use ACTIVE daemons for the quiesce cluster
mds,messages: quiesce db inter-rank messaging
mds/quiesce: MDSRankQuiesce - integration of the quiesce db manager
doc/cephfs/fs-volumes: Add info about the quiesce command
doc: fixes for local dev builds
mgr/volumes: support for `fs subvolume quiesce`
mgr/volumes: use `volume_exception_to_retval` as a decorator
pybind/mgr: add a `one-shot` parameter to send_command
mds/quiesce: QuiesceAgent implementation and unit tests
mds/quiesce: QuiesceDb.h and QuiesceDbManager with tests
common/Timer.cc: improve debug messages from the timer_thread
mds: MDSRank.cc: return status from `send_message_mds`
encoding: add emplace variants for map dencoders
common/Cond: make C_SaferCond private members protected to facilitate inheritance
qa/tasks/cephfs: give the tests more time to run heavy fs workloads
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
J. Eric Ivancich [Mon, 11 Mar 2024 21:19:40 +0000 (17:19 -0400)]
rgw: rgw-restore-bucket-index -- sort uses specified temp dir
The sort command sometimes makes use of temporary files. When the user
specifies a directory to be used for temp files, have the sort command
use that same directory.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
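The tool itself is a shell script; the following Python sketch only illustrates the idea of passing the user's temp directory to sort through its `-T` option. The function and its parameters are hypothetical.

```python
from typing import List, Optional


def sort_command(input_path: str, output_path: str,
                 temp_dir: Optional[str] = None) -> List[str]:
    """Build an argv list for sort, honoring a user-specified temp dir."""
    cmd = ["sort"]
    if temp_dir:
        # -T tells sort where to place its temporary files.
        cmd += ["-T", temp_dir]
    cmd += ["-o", output_path, input_path]
    return cmd
```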
Ronen Friedman [Mon, 11 Mar 2024 17:54:01 +0000 (12:54 -0500)]
osd/scrub: handle 'release' events sent during 'scrub abort'
Scenario:
- the replica is reserved;
- the Primary initiates a chunk operation;
- the replica is in ReplicaActive/ReplicaActiveOp/ReplicaBuildingMap
- 'no-scrub' is set, and the Primary sends a 'release' event to the
replica.
Desired behavior:
- the replica aborts the chunk operation and transitions to
ReplicaReserved;
- the 'release' event is delivered in the new state.
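A toy model of the desired behavior, not the OSD scrub state machine itself. The class and the "unreserved" end state are hypothetical; only the ReplicaReserved and ReplicaActive/ReplicaActiveOp/ReplicaBuildingMap names come from the scenario above.

```python
class ReplicaScrubSketch:
    """Toy model of the replica-side states named in the scenario."""

    IN_OPERATION = "ReplicaActive/ReplicaActiveOp/ReplicaBuildingMap"
    RESERVED = "ReplicaReserved"
    UNRESERVED = "unreserved (hypothetical name)"

    def __init__(self) -> None:
        self.state = self.RESERVED
        self.trace = []

    def start_chunk(self) -> None:
        # The Primary initiated a chunk operation on a reserved replica.
        self.state = self.IN_OPERATION
        self.trace.append("chunk operation started")

    def on_release(self) -> None:
        if self.state == self.IN_OPERATION:
            # Abort the in-flight chunk operation first.
            self.trace.append("chunk operation aborted")
            self.state = self.RESERVED
        # The 'release' event is then handled in the new state.
        self.trace.append(f"release handled in {self.state}")
        self.state = self.UNRESERVED
```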
Matt Benjamin [Mon, 19 Feb 2024 14:01:48 +0000 (09:01 -0500)]
rgw_lc: replace strftime w/fmt and chrono:calendar
It's reliably claimed that std::strftime is not
mt-safe, and this would be a likely root cause of
intermittent scrambled expiration header output cases
that have been reported.
Fixes: https://tracker.ceph.com/issues/63973
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
luo rixin [Tue, 27 Feb 2024 07:07:17 +0000 (15:07 +0800)]
CMakeLists: disable Seastar_IO_URING to fix seastar unittest timeout
PR https://github.com/ceph/ceph/pull/55787 bumped liburing from 0.7 to 2.5.
With liburing-dev (2.1) installed on Ubuntu Jammy, Seastar_IO_URING is set to ON and
seastar is built with liburing-dev. Ceph bluestore uses build_uring in the file Builduring.cmake
to build liburing version 2.5 and sets URING_INCLUDE_DIR to
/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/liburing/src/include/liburing/.
Seastar uses URING_INCLUDE_DIR (version 2.5) to build, but links against the liburing.so from the
system liburing-dev package (version 2.1). The liburing headers seastar is built with and the
liburing binary it links against are therefore mismatched.
With the liburing version in the file 'cmake/modules/Builduring.cmake' downgraded to liburing-2.1,
the seastar unittests work fine, with no timeout.
Fixes: https://tracker.ceph.com/issues/64789
Signed-off-by: luo rixin <luorixin@huawei.com>
Patrick Donnelly [Sun, 10 Mar 2024 15:22:09 +0000 (11:22 -0400)]
qa/crontab: correct script paths and environment
At some point the links to the shell scripts in ceph.git were broken in the
$HOME for [1]. Unless a run was done manually with `teuthology-suite` in the
crontab, the job was basically skipped. This is probably one of the reasons
nightlies fell out of use.
I've updated the home directory according to the documentation (comments) in this
change. The teuthology user now has persistent clones of ceph.git and
teuthology.git. The clones are updated daily by this same crontab.
Instead of using a link in teuthology's $HOME/bin to the scripts used in this
crontab, we just have a cron variable referencing where the script should be in
the ceph.git/teuthology.git clone. Adding to this, the .bash_environment file
sources the virtualenv activate script instead of adding the teuthology binary
directory to its $PATH.
I've updated the hour for these jobs to actually be done "nightly". The first
set of jobs will be scheduled around 4pm EST. Additionally, it was necessary to
include --force-priority as some jobs are below the priority thresholds.
[1] teuthology@teuthology.front.sepia.ceph.com.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Jos Collin [Thu, 29 Feb 2024 10:50:03 +0000 (16:20 +0530)]
qa: enhance labeled perf counters test for cephfs-mirror
Implements checks for labeled perf counters in the appropriate tests.
This patch verifies that the snaps_synced, snaps_renamed, snaps_deleted and sync_failures metrics are
updated correctly based on the tests.
Fixes: https://tracker.ceph.com/issues/64486
Signed-off-by: Jos Collin <jcollin@redhat.com>
Zac Dover [Sun, 10 Mar 2024 10:43:52 +0000 (20:43 +1000)]
doc/glossary: improve "Crimson" entry
Improve the glossary entry for "Crimson" in accordance with Anthony
D'Atri's suggestions here:
https://github.com/ceph/ceph/pull/56068#discussion_r1518580402
Ilya Dryomov [Wed, 28 Feb 2024 13:20:16 +0000 (14:20 +0100)]
librbd: don't clip expanded diff on truncate in ObjectListSnapsRequest
If the diff was expanded due to LIST_SNAPS_FLAG_WHOLE_OBJECT, clipping
it when handling a truncate is wrong -- when subtracting that interval,
we either split the expanded extent into two or chop off a piece of it.
However the point of LIST_SNAPS_FLAG_WHOLE_OBJECT is to report a single
extent covering the entire object.
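To illustrate why clipping defeats the whole-object flag, here is a small sketch using half-open (start, end) byte ranges. The helper names and the boolean `whole_object` parameter are hypothetical; librbd's real code works on interval sets in C++.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # half-open byte range [start, end)


def subtract(extent: Extent, hole: Extent) -> List[Extent]:
    """Remove `hole` from `extent`; return the remaining pieces."""
    start, end = extent
    h_start, h_end = hole
    pieces = []
    if h_start > start:
        pieces.append((start, min(end, h_start)))
    if h_end < end:
        pieces.append((max(start, h_end), end))
    return pieces


def clip_on_truncate(extent: Extent, truncated: Extent,
                     whole_object: bool) -> List[Extent]:
    # With the whole-object behavior requested, keep the single extent
    # covering the entire object instead of clipping it.
    if whole_object:
        return [extent]
    return subtract(extent, truncated)
```

Subtracting from the middle of an expanded extent splits it in two, and subtracting from the end chops a piece off; either way the result is no longer a single whole-object extent.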
Ilya Dryomov [Sun, 18 Feb 2024 10:46:15 +0000 (11:46 +0100)]
librados/snap_set_diff: ignore truncates above size at start
Because currently calc_snap_set_diff() only ever appends to the running
diff, an excessive (either too large or completely bogus) zero extent
is reported in cases where an object is first expanded (with a snapshot
taken at that point) and then truncated but still above the size of the
object as of the starting snapshot.
Afreen [Wed, 6 Mar 2024 20:22:16 +0000 (01:52 +0530)]
mgr/dashboard: handle infinite values for pools
Fixes https://tracker.ceph.com/issues/64724
Issue:
======
JSON parsing fails because of Infinity values present in the pools'
metadata, e.g. "read_balance": {"score_acting": Infinity, "score_stable":
Infinity,}
Because of this, the entire pool list is not rendered.
Fix:
====
Added a handler that checks for "inf" values and replaces them with the
string "Infinity" so that JSON parsing does not fail on the frontend.
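A minimal sketch of this kind of handler, not the dashboard's actual code. The function name is hypothetical, and mapping negative infinity to "-Infinity" is an assumption beyond what the commit states. Python's `json.dumps` emits the bare token `Infinity` for `float("inf")`, which strict JSON parsers such as the browser's `JSON.parse` reject.

```python
import json
import math
from typing import Any


def replace_infinity(value: Any) -> Any:
    """Recursively replace infinite floats with strings."""
    if isinstance(value, float) and math.isinf(value):
        # "-Infinity" for negative values is an assumption for this sketch.
        return "Infinity" if value > 0 else "-Infinity"
    if isinstance(value, dict):
        return {k: replace_infinity(v) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_infinity(v) for v in value]
    return value


pool = {"read_balance": {"score_acting": float("inf"),
                         "score_stable": float("inf")}}
# allow_nan=False makes json.dumps raise if an infinite value slipped through.
encoded = json.dumps(replace_infinity(pool), allow_nan=False)
```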
Afreen [Fri, 1 Mar 2024 07:26:25 +0000 (12:56 +0530)]
mgr/dashboard: Locking improvements in bucket create form
Fixes https://tracker.ceph.com/issues/64658
- Addition of help texts
- Addition of info/warnings related to modes and versioning
- change of Locking section layout
- renaming locking to 'Object Locking'
- changes default retention period to 10
- edit bucket only shows lock when it's enabled