ethanwu [Sun, 24 Mar 2024 09:33:42 +0000 (17:33 +0800)]
mds: fix rank root doesn't insert root ino into its subtree map when starting
The root ino belongs to the subtree of the root rank and should be inserted when
the subtree map log is created. However, this is missing when the mds runs in
STATE_STARTING. During replay, all inodes under this subtree are then trimmed by
trim_non_auth_subtree, which causes replay to fail.
Quick way to reproduce this:
After creating the filesystem, mount it and create some directories.
mkdir -p ${cephfs_root}/dir1/dir11/foo
mkdir -p ${cephfs_root}/dir1/dir11/bar
unmount cephfs
./bin/ceph fs set a down true
./bin/ceph fs set a down false
./bin/cephfs-journal-tool --rank=a:0 event get json --path output # Can see that ESubtreeMap only contains 0x100 but no 0x1
mount cephfs
rmdir ${cephfs_root}/dir1/dir11/foo
rmdir ${cephfs_root}/dir1/dir11/bar
unmount cephfs
Trigger an mds rank 0 failover; rank 0 fails during replay and is marked damaged.
Checking the mds log shows the following related message:
-49> 2024-03-24T18:06:19.461+0800 7f1542cbf700 10 mds.0.cache trim_non_auth_subtree(0x560372b2df80) [dir 0x1 / [2,head] auth v=12 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2024-03-24T18:03:30.350260+0800 1=0+1) n(v3 rc2024-03-24T18:03:30.401819+0800 4=0+4) hs=1+0,ss=0+0 | child=1 subtree=1 0x560372b2df80]
ethanwu [Sun, 24 Mar 2024 09:11:16 +0000 (17:11 +0800)]
mds: flush mds log before finishing STATE_STARTING
If we don't flush the mds log before requesting STATE_ACTIVE, and the
mds happens to stop before the log reaches the journal, the take-over
mds will have no SubtreeMap to replay and will fail later at the
non-empty subtree check.
ethanwu [Sun, 24 Mar 2024 08:17:49 +0000 (16:17 +0800)]
mds/FSMap: go back to STARTING state when rank doesn't make it pass STARTING
Just like in STATE_CREATING, the mds could fail or be stopped anywhere in
STATE_STARTING, so make sure the subsequent take-over mds starts from
STATE_STARTING. Otherwise, we end up with an empty journal (no ESubtreeMap).
The subsequent take-over mds will then fail with no subtrees found and the rank
will be marked damaged.
Quick way to reproduce this:
./bin/ceph fs set a down true # take down all ranks in filesystem a
# wait for the fs to stop all ranks
./bin/ceph fs set a down false; pidof ceph-mds | xargs kill
# quickly kill all mds daemons soon after they enter the starting state
./bin/ceph-mds -i a -c ./ceph.conf
# start all mds daemons; the mds rank is then reported damaged with the following log
-1 log_channel(cluster) log [ERR] : No subtrees found for root MDS rank!
5 mds.beacon.a set_want_state: up:rejoin -> down:damaged
Repair the link to cephfs-shell.rst in doc/cephfs/cephfs-shell.rst that
was broken in https://github.com/ceph/ceph/pull/41165/ when
doc/cephfs/cephfs-shell.rst was moved to doc/man/8/cephfs-shell.rst.
This commit is made in response to a request by Lander Duncan that was
made on the [ceph-users] mailing list, and can be seen here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F7V4CWLIYCAJ4JXI2JLNY6QPCFPR4SLA/
Pere Diaz Bou [Wed, 26 Jun 2024 13:57:47 +0000 (15:57 +0200)]
doc/rados: update how to install c++ header files
In this example, librados2-devel only installs C header files on Fedora 40,
so I added libradospp-devel to the command to include the C++ header files.
Zac Dover [Mon, 24 Jun 2024 10:32:30 +0000 (20:32 +1000)]
doc/rados: edit troubleshooting-osd.rst
Make minor changes to the "Debugging Slow Requests" section of
doc/rados/troubleshooting/troubleshooting-osd.rst in preparation
for an expansion of this section in response to a request from Joel
Davidow.
Bill Scales [Wed, 19 Jun 2024 08:36:06 +0000 (08:36 +0000)]
osd/ECBackend.cc: Fix double increment of num_shards_repaired stat
Commit https://github.com/ceph/ceph/commit/deffa8209f9c0bd300cfdb54d358402bfc6e41c6 refactored
ECBackend::handle_recovery_push for Crimson but accidentally duplicated the code that increments
the num_shards_repaired OSD statistic.
This caused one of the QA tests to fail because the stat reported twice as much repair work
completed as expected:
qa/standalone/scrub/osd-scrub-repair.sh: TEST_repair_stats_ec: test 26 = 13
Fixes: https://tracker.ceph.com/issues/64437
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
(cherry picked from commit e618dc01a7a1bdfaa3e1a6fa2a9a9ac13eee11b8)
Nizamudeen A [Fri, 24 May 2024 15:16:17 +0000 (20:46 +0530)]
mgr/dashboard: add dueTime to rgw bucket validator
The unique async validator, which checks whether the bucket name typed into
the bucket creation form already exists, sends a request to the backend on
each keystroke. Each keystroke raises an exception if the bucket is
not found.
commit [1] introduced a behavior change.
`ceph-volume lvm prepare` used to create VGs/LVs when it was passed partitions
for db and/or wal devices. Since commit [1], ceph-volume consumes the
partition directly and no longer creates an LV. Although this doesn't
prevent OSDs from being created, it is a behavior change.
Laura Flores [Wed, 19 Jun 2024 21:57:45 +0000 (16:57 -0500)]
qa/suites/upgrade/telemetry-upgrade/reef-x: update how cephadm is pulled and change image reference
Update how cephadm is pulled:
`cephadm_git_url` and `cephadm_branch` are used in releases older than reef
to install cephadm. Both of these keys are needed to install it from the github
repo.
However, in reef and on, the compiled zipapp cephadm needs to be pulled differently
than the old single python script `cephadm` from earlier releases.
Laura Flores [Wed, 19 Jun 2024 21:07:31 +0000 (16:07 -0500)]
qa/suites/upgrade/telemetry-upgrade: add more ignorelist items and require_osd_release=squid
The warnings added to the ignorelist show up in the cluster log, but they are
expected during upgrades and should thus be ignored.
We also need to set require_osd_release=squid to avoid this warning:
```
cluster [WRN] Health check failed: all OSDs are running squid or later but require_osd_release < squid (OSD_UPGRADE_FINISHED)
```
Laura Flores [Tue, 11 Jun 2024 20:10:01 +0000 (15:10 -0500)]
qa/suites/upgrade/telemetry-upgrade: upgrade from reef instead of pacific
With cephadm upgrades, we are only allowed to upgrade from as far back as N-2
releases. On the main branch, that means we can only upgrade from quincy and reef, and
we can no longer upgrade from pacific.
This test was trying to upgrade from pacific, which isn't allowed, which led to an
`UPGRADE_BAD_TARGET_VERSION` cluster error.
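The N-2 rule described above can be sketched in a few lines of Python. This is only an illustration: the release list and the helper name `allowed_upgrade_sources` are assumptions made for the example, not cephadm code, and the target is taken to be squid because the text lists quincy and reef as the allowed sources.

```python
# Ordered release names, oldest first (illustrative list, not read from cephadm).
RELEASES = ["pacific", "quincy", "reef", "squid"]


def allowed_upgrade_sources(target, max_distance=2):
    """Return the releases from which an upgrade to `target` is allowed.

    Hypothetical helper: only releases at most `max_distance` steps
    older than the target qualify.
    """
    idx = RELEASES.index(target)
    return RELEASES[max(0, idx - max_distance):idx]


sources = allowed_upgrade_sources("squid")
print(sources)
print("pacific" in sources)
```

Under these assumptions pacific falls outside the window, which matches the `UPGRADE_BAD_TARGET_VERSION` failure the test ran into.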
Zack Cerza [Fri, 14 Jun 2024 19:37:16 +0000 (13:37 -0600)]
qa/tasks/qemu: Fix OS version comparison
See: https://sentry.ceph.com/share/issue/21ed88d705854238bdafbf6711e795ee/
They're strings, not floats.
This surfaced as a result of https://github.com/ceph/teuthology/pull/1953
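A short Python sketch of why the type of the version value matters. It is not the code from qa/tasks/qemu; the `version_tuple` helper and the sample version values are assumptions chosen for illustration.

```python
def version_tuple(version):
    """Turn a dotted version string such as '20.04' into a tuple of ints.

    Hypothetical helper, not part of teuthology or the qemu task.
    """
    return tuple(int(part) for part in str(version).split("."))


os_version = "9.10"  # hypothetical value; the version arrives as a string

# Comparing the string against a float raises TypeError in Python 3.
try:
    os_version >= 8.0
except TypeError as exc:
    print("str vs float:", exc)

# Converting to float compares the wrong thing: "9.10" becomes 9.1.
print(float("9.10") < float("9.9"))

# Comparing integer tuples orders each component numerically.
print(version_tuple("9.10") > version_tuple("9.9"))
```

Whichever representation is chosen, both sides of the comparison need to use the same one.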
Ilya Dryomov [Wed, 5 Jun 2024 06:36:12 +0000 (08:36 +0200)]
pybind/rbd: parse access and modify timestamps in UTC
It appears that commits 08cee16d0a4b ("pybind/rbd: always parse
timestamps in UTC") and 809c5430c292 ("librbd: add image access/last
modified timestamps") raced with each other and we ended up with two
more timezone-dependent timestamps.
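To illustrate what "timezone-dependent" means here, the following sketch uses only the Python standard library. It is not the binding's code, and the sample timestamp is an arbitrary example value.

```python
from datetime import datetime, timezone

ts = 1717569372  # seconds since the Unix epoch (arbitrary example value)

# Naive local time: the result depends on the TZ setting of the machine.
local_naive = datetime.fromtimestamp(ts)

# Explicit UTC: the result is the same on every machine.
utc_aware = datetime.fromtimestamp(ts, tz=timezone.utc)

print(local_naive)
print(utc_aware)
```

Two hosts in different timezones print different values for `local_naive` but the same value for `utc_aware`.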
Ilya Dryomov [Tue, 4 Jun 2024 19:37:49 +0000 (21:37 +0200)]
test/pybind/rbd: make timestamp tests meaningful
The existing asserts don't really test anything, with some of them
being for inequality against a literal of a mismatching type. As
a result, a bug in access_timestamp() and modify_timestamp() went
unnoticed for years.
Ilya Dryomov [Tue, 4 Jun 2024 19:19:40 +0000 (21:19 +0200)]
test/pybind/rbd: fix tests that compare strings with b''
assert_not_equal(b'', self.image.id()) is bogus because Image::id()
returns a string (str), not bytes. If the types don't match, values
are guaranteed to not match.
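The problem can be reproduced without librbd at all. In this sketch `image_id` is a stand-in for the value returned by the binding, deliberately set to an empty string to show that the bytes-based check cannot catch it.

```python
image_id = ""  # stand-in: pretend the call wrongly returned an empty str

# str and bytes never compare equal, so this check passes even for an empty id.
print(b"" != image_id)

# Comparing against a literal of the matching type catches the empty value.
print("" != image_id)
```

The first comparison is true for any string, so an assertion built on it can never fail.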
`set_dmcrypt_no_workqueue()` from `ceph_volume.util.encryption`
The function `set_dmcrypt_no_workqueue` in `encryption.py` now
dynamically retrieves the installed cryptsetup version using `cryptsetup
--version` command. It then parses the version string using a regular
expression to accommodate varying digit counts. If the retrieved version
is greater than or equal to the specified target version,
`conf.dmcrypt_no_workqueue` is set to True, allowing for flexible version
handling.
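A minimal sketch of the parse-and-compare step described above. It is not the code in `encryption.py`: the helper names, the sample version strings, and the default target version are assumptions for illustration, and the real function obtains its input by running `cryptsetup --version`.

```python
import re


def parse_version(output):
    """Extract a (major, minor, patch) tuple from text like 'cryptsetup 2.3.4'.

    Hypothetical helper; a missing patch component is treated as 0.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def meets_target(output, target="2.3.4"):
    """Return True if the version in `output` is >= `target` (assumed default)."""
    return parse_version(output) >= parse_version(target)


print(meets_target("cryptsetup 2.10.0"))
print(meets_target("cryptsetup 2.0.6"))
```

Comparing integer tuples rather than strings keeps multi-digit components such as `2.10.0` ordered correctly against `2.3.4`.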
Adam King [Tue, 30 Apr 2024 18:17:58 +0000 (14:17 -0400)]
python-common/service_spec: fix some mypy complaints
The python/mypy combination on the jenkins nodes the CI
is running on doesn't seem to care, but locally I get
mypy: commands[0]> mypy --config-file=../mypy.ini -p ceph
ceph/deployment/service_spec.py: note: In member "validate" of class "NvmeofServiceSpec":
ceph/deployment/service_spec.py:1497: error: Unsupported operand types for > ("float" and "None") [operator]
ceph/deployment/service_spec.py:1497: note: Left operand is of type "Optional[float]"
ceph/deployment/service_spec.py:1500: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1500: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1503: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1503: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1506: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1506: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1509: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1509: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1512: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1512: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1515: error: Unsupported operand types for > ("float" and "None") [operator]
ceph/deployment/service_spec.py:1515: note: Left operand is of type "Optional[float]"
ceph/deployment/service_spec.py:1518: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1518: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1521: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1521: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1524: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1524: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1527: error: Unsupported operand types for > ("int" and "None") [operator]
ceph/deployment/service_spec.py:1527: note: Left operand is of type "Optional[int]"
ceph/deployment/service_spec.py:1530: error: Unsupported operand types for > ("float" and "None") [operator]
ceph/deployment/service_spec.py:1530: note: Left operand is of type "Optional[float]"
Found 12 errors in 1 file (checked 27 source files)
The errors make sense to me, so I think we should fix them
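The usual way to satisfy mypy for this class of error is to rule out None before comparing. The sketch below shows the pattern on a made-up function; `validate_limit` and its parameters are hypothetical and do not reproduce the fields of `NvmeofServiceSpec`.

```python
from typing import Optional


def validate_limit(value: Optional[int], maximum: int) -> None:
    """Reject values above `maximum`; None means 'not set' and is accepted.

    Hypothetical example, not the actual service_spec.py validation.
    """
    # Checking for None first narrows Optional[int] to int for mypy.
    if value is not None and value > maximum:
        raise ValueError(f"{value} exceeds maximum {maximum}")


validate_limit(None, 10)  # accepted: nothing to compare
validate_limit(5, 10)     # accepted: within the limit
try:
    validate_limit(11, 10)
except ValueError as exc:
    print(exc)
```

Without the `is not None` guard, the comparison would also raise TypeError at runtime whenever the value is unset.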