Ilya Dryomov [Fri, 23 Jan 2026 13:48:53 +0000 (14:48 +0100)]
qa: don't assume that /dev/sda or /dev/vda is present in unmap.t
Instead of hard-coding the block device name, use the block device that
is backing the filesystem that the test is running on. We can be quite
sure it won't be an RBD device ;)
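The idea of picking the device that backs the current filesystem can be sketched as follows. This is an illustration only, not how unmap.t does it: the function name `backing_device` and the `SAMPLE` table are hypothetical, and the input is assumed to be in /proc/mounts format.

```python
import os


def backing_device(path, mounts_text):
    """Return the source device of the longest mount point containing path.

    mounts_text is assumed to be in /proc/mounts format:
    "<device> <mount point> <fstype> ...", one mount per line.
    """
    path = os.path.normpath(path)
    best_dev, best_len = None, -1
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        dev, mnt = fields[0], fields[1]
        mnt = mnt.rstrip("/") or "/"
        # The path is inside the mount if it equals the mount point or
        # continues past it at a path separator.
        inside = mnt == "/" or path == mnt or path.startswith(mnt + "/")
        if inside and len(mnt) > best_len:
            best_dev, best_len = dev, len(mnt)
    return best_dev


# Hypothetical mount table used only for illustration.
SAMPLE = """\
/dev/nvme0n1p2 / ext4 rw 0 0
/dev/nvme0n1p1 /boot ext4 rw 0 0
tmpfs /tmp tmpfs rw 0 0
"""
```

On a real host the second argument would be the contents of /proc/mounts and the first the test's working directory.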
Ville Ojamo [Mon, 19 Jan 2026 13:06:46 +0000 (20:06 +0700)]
doc/cephadm: remove sections that do not apply to Squid in rgw.rst
4949311 backported changes that do not apply to Squid.
PR #63073 body and the commit referenced therein as cherry-pick do not
correspond to the diff. Remove the additions that do not apply to Squid:
- Wildcard SAN feature in 3c24753 only since Tentacle.
- Shutdown delay feature in b84bb72 only since Tentacle.
The third feature doc addition is valid: d620ba6 was backported to Squid
in PR #61350 for disable multisite sync traffic, commit 59b3f28. That
backport cherry-picked only the feature addition and missed the docs
commit 8878619. Leave this section in.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Kefu Chai [Wed, 24 Dec 2025 05:55:26 +0000 (13:55 +0800)]
debian/control: add iproute2 to build dependencies
Test scripts like qa/tasks/cephfs/mount.py expect the ip command to be
available in the container environment. Without it, tests fail with:
```
/bin/bash: line 1: ip: command not found
File "/ceph/qa/tasks/cephfs/mount.py", line 96, in cleanup_stale_netnses_and_bridge
p = remote.run(args=['ip', 'netns', 'list'],
...
teuthology.exceptions.CommandFailedError: Command failed with status 127: 'ip netns list'
```
Add iproute2 to the debian package build dependencies when the
<pkg.ceph.check> build profile is enabled. This ensures the package is
available during container-based builds, since buildcontainer-setup.sh
→ script/run-make.sh → install-deps.sh → debian/control → generated
dependency package chain respects build profiles configured via
`FOR_MAKE_CHECK` and `WITH_CRIMSON` environment variables set in
Dockerfile.build.
David Galloway [Tue, 16 Dec 2025 22:08:00 +0000 (17:08 -0500)]
install-deps: Replace apt-mirror
apt-mirror.front.sepia.ceph.com has happened to always work because we set up CNAMEs to gitbuilder.ceph.com.
That host is making its way to a new home upstate (literally and figuratively) so we'll get rid of the front subdomain since it's publicly accessible anyway and add TLS while we're at it.
Adam Kupczyk [Tue, 15 Apr 2025 08:37:25 +0000 (08:37 +0000)]
os/bluestore: Fix dirty_range in BlueStore::_do_remove
dirty_range used to be called with length = 1 byte.
That is sufficient only if the whole extent is inside a single shard,
but this has proven not to always be the case.
dirty_range(offset, length) is slower only when the range crosses a shard boundary.
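Why a 1-byte range can miss a shard is easy to see with a toy model. The sketch below assumes shards are described only by their sorted start offsets; `shards_touched` and `shard_starts` are hypothetical names, and BlueStore's real extent map sharding is more involved.

```python
import bisect


def shards_touched(shard_starts, offset, length):
    """Return indices of shards overlapping [offset, offset + length).

    shard_starts is a sorted list of shard start offsets, starting at 0.
    Assumes offset >= 0 and length >= 1.
    """
    first = bisect.bisect_right(shard_starts, offset) - 1
    last = bisect.bisect_right(shard_starts, offset + length - 1) - 1
    return list(range(first, last + 1))


# Hypothetical layout: three shards of 64 KiB each.
shard_starts = [0x0, 0x10000, 0x20000]
```

With this layout, an extent at 0xF000 of length 0x2000 touches shards 0 and 1, while a 1-byte range at the same offset reports only shard 0.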
Rishabh Dave [Tue, 18 Feb 2025 12:30:03 +0000 (18:00 +0530)]
qa/cephfs: ignore warning that pg is stuck peering for upgrade jobs
Health warning "pg .* is stuck peering" is seen while Ceph cluster is
under the upgrade process during fs/upgrade QA job. Being an expected
warning, it should be added to the ignorelist.
Besides this one, we already ignore more severe warnings ("pg is
stuck inactive" and "pg is degraded") for fs/upgrade jobs.
Fixes: https://tracker.ceph.com/issues/70023
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9748de76e02254c6dc284dcc20ec5d5761760dcb)
Conflicts:
qa/cephfs/overrides/pg_health.yaml
- The line before the point where the patch was to be applied is different
compared to the main branch.
In statfs, when the quota root for a dir is discovered,
it uses that dir to base values for max_files and max_bytes.
This can be an issue when a dir is found with only one of the two potential
quota fields. Take, for instance, a dir with only max_files set whose parent
dir has only max_bytes set. During a statfs call, the max_files value of the
provided dir is used, but there is no value for max_bytes. In this case,
this behavior will cause the size of the filesystem to be displayed.
Instead, find the quota root for max_files and max_bytes separately. This will
allow for mixed quotas to inherit missing values from its parent. In the above
example, max_files from current dir and max_bytes from parent dir will be
displayed.
Fixes: https://tracker.ceph.com/issues/73487
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit dd02ea9b18502b87ce815eba4286ae3516e334b3)
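The separate quota-root lookup described in this commit can be sketched as below. The `Dir` class, `quota_root`, and `statfs_limits` are hypothetical names, and treating 0 as "not set" is an assumption of the sketch rather than a statement about the client code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dir:
    name: str
    parent: Optional["Dir"] = None
    max_files: int = 0  # 0 means "not set" in this sketch
    max_bytes: int = 0  # 0 means "not set" in this sketch


def quota_root(d, field):
    """Walk up from d to the nearest dir (d included) that has `field` set."""
    while d is not None:
        if getattr(d, field) > 0:
            return d
        d = d.parent
    return None


def statfs_limits(d):
    """Resolve max_files and max_bytes independently of each other."""
    files_root = quota_root(d, "max_files")
    bytes_root = quota_root(d, "max_bytes")
    return (
        files_root.max_files if files_root else None,
        bytes_root.max_bytes if bytes_root else None,
    )


# The mixed-quota example from the commit message.
parent = Dir("parent", max_bytes=10 * 2**30)
child = Dir("child", parent=parent, max_files=1000)
```

Here `statfs_limits(child)` takes max_files from the child and max_bytes from the parent, which is the mixed-quota inheritance the message describes.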
In cases where there is a single element in a batch_op_map, new_batch_head
is a nullptr. When this is retried at the Finisher, we'd hit one of the asserts
when dereferencing it.
In SingletonClient::init(), objecter->start() is called before
monc->authenticate(). If the mons reply quickly, the monc's connections
become authenticated before monc->authenticate() is called, and in that
case the monc will not subscribe to monmap/config.
Naveen Naidu [Thu, 5 Dec 2024 04:21:21 +0000 (09:51 +0530)]
qa/workunit: update telemetry quincy/reef workunits with "basic_stretch_cluster" collection
Note, this is not a clean cherry pick. Commit 4dac20e updated the
`test_telemetry_reef_x.sh` and `test_telemetry_squid_x.sh` upgrade
workunits. These upgrade workunits test the upgrade of a cluster from
the reef and squid (X-2) releases to the X version of the cluster.
Since we are cherry picking the commit to squid (X release), we
instead have to update the workunit files of quincy and reef, i.e. the
(X-2) releases.
mds: client is evicted when an export subtree task is interrupted
The importer will force open some sessions provided by the exporter, but the client does not know about
the new sessions until the exporter notifies it, and the notifications cannot be sent if the exporter
is interrupted. The client does not regularly renew sessions that it does not know about, so the client
will be evicted by the importer after `session_autoclose` seconds (300 seconds by default).
The sessions that are force-opened in the importer need to be closed when the import process is reversed.
Zhansong Gao [Fri, 26 May 2023 04:20:17 +0000 (12:20 +0800)]
mds: session in the importing state cannot be cleared if an export subtree task is interrupted while the state of importer is acking
The related sessions in the importer are in the importing state (`Session::is_importing` returns true) when the state of the importer is `acking`.
If the exporter restarts at this time, `Migrator::import_reverse`, called by `MDCache::handle_resolve`, should reverse the process and clear the importing state,
but because of a bug it does not. As a result, these sessions are not cleared when the client is
unmounted (evicted or timed out) until the MDS is restarted.
The bug in `import_reverse` is that it contains the code to handle state `IMPORT_ACKING` but it will never be executed because
the state is modified to `IMPORT_ABORTING` at the beginning. Move `stat.state = IMPORT_ABORTING` to the end of import_reverse
so that it can handle the state `IMPORT_ACKING`.
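The ordering problem described above can be reproduced in a few lines. This is a simplified model, not the Migrator code: `ImportState` and both functions are hypothetical stand-ins, and only the position of the state assignment matters.

```python
IMPORT_ACKING = "acking"
IMPORT_ABORTING = "aborting"


class ImportState:
    """Hypothetical stand-in for the importer's per-import state."""

    def __init__(self, state, importing_sessions):
        self.state = state
        self.importing_sessions = set(importing_sessions)


def import_reverse_buggy(stat):
    # The state is overwritten first, so the branch below never runs.
    stat.state = IMPORT_ABORTING
    if stat.state == IMPORT_ACKING:
        stat.importing_sessions.clear()


def import_reverse_fixed(stat):
    # Handle the acking case while the original state is still visible.
    if stat.state == IMPORT_ACKING:
        stat.importing_sessions.clear()
    # Only then mark the import as aborting.
    stat.state = IMPORT_ABORTING
```

Both variants end in `IMPORT_ABORTING`, but only the second one clears the sessions that were left in the importing state.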
Casey Bodley [Fri, 3 Oct 2025 16:24:18 +0000 (12:24 -0400)]
rgw: fix 'bucket rm --bypass-gc' for copied objects
the `--bypass-gc` argument to `radosgw-admin bucket rm` causes us to
call `RadosBucket::remove_bypass_gc()`, which loops over the tail
objects and removes each with `RGWRados::delete_raw_obj_aio()`
however, this was removing the objects with `cls_rgw_remove_obj()`,
which is for head objects, not tails. tail objects must be removed with
`cls_refcount_put()`, which preserves them until the last copy is
removed
rename `delete_raw_obj_aio()` to `delete_tail_obj_aio()` to clarify its
purpose
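The difference between unconditional removal and a refcount put can be illustrated with a toy store. `TailStore` and its methods are hypothetical and only mimic the semantics the message attributes to `cls_rgw_remove_obj()` and `cls_refcount_put()`; they are not the cls APIs.

```python
class TailStore:
    """Toy model of tail objects shared between copies via refcounts."""

    def __init__(self):
        self.refs = {}  # tail object name -> number of references

    def ref_get(self, name):
        """Take a reference, as a copy of the object would."""
        self.refs[name] = self.refs.get(name, 0) + 1

    def ref_put(self, name):
        """Drop one reference; delete the tail only when none remain."""
        remaining = self.refs.get(name, 0) - 1
        if remaining <= 0:
            self.refs.pop(name, None)
        else:
            self.refs[name] = remaining

    def remove(self, name):
        """Unconditional removal, ignoring other references."""
        self.refs.pop(name, None)
```

With two copies referencing the same tail, one `ref_put` leaves the tail in place for the surviving copy, whereas one `remove` deletes it out from under that copy.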
Nitzan Mordechai [Wed, 22 Oct 2025 05:41:56 +0000 (05:41 +0000)]
tasks/cbt_performance: Tolerate exceptions during performance data updates
If an exception occurs during the POST request to update CBT performance,
log the error instead of failing the entire job. This ensures that
intermittent update failures do not block the main workflow.
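A minimal sketch of this log-and-continue pattern, assuming the HTTP call is passed in as a callable. `update_performance` and its `post` parameter are hypothetical names and do not reflect the actual cbt_performance task code.

```python
import logging

log = logging.getLogger(__name__)


def update_performance(post, url, payload):
    """Try to push performance data; log failures instead of raising.

    `post` is any callable taking (url, payload), injected here so the
    sketch stays independent of a particular HTTP library.
    Returns True on success and False if the update failed.
    """
    try:
        post(url, payload)
        return True
    except Exception as e:
        # The update is best effort, so the job carries on.
        log.error("CBT performance update failed: %s", e)
        return False
```

The caller can ignore the return value entirely, which is the point: a failed update is recorded in the log but does not fail the job.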
The unlink subcommand did not handle unsharded bucket indices
appropriately. A bucket index is unsharded when the number of shards
listed in the bucket instance object is 0. In that case there will
actually be 1 shard.
When 0 is passed as the number of shards into the function that maps
object names to shards, it returns -1, and that was not handled
properly. That is now fixed.
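The unsharded case can be sketched as follows. Both functions are hypothetical: `shard_for` only imitates the "returns -1 for 0 shards" behavior described above using a CRC32 hash chosen for illustration, and `effective_shard` shows one plausible way to handle it, which may differ from the actual fix.

```python
import zlib


def shard_for(obj_name, num_shards):
    """Hypothetical name-to-shard mapping; returns -1 when num_shards is 0."""
    if num_shards <= 0:
        return -1
    return zlib.crc32(obj_name.encode()) % num_shards


def effective_shard(obj_name, num_shards):
    """Treat the -1 result as the single shard of an unsharded index."""
    shard = shard_for(obj_name, num_shards)
    return 0 if shard < 0 else shard
```

For an unsharded bucket (`num_shards == 0`) every object resolves to shard 0, matching the statement that there is actually 1 shard.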
Henry Richter [Wed, 8 Oct 2025 23:00:34 +0000 (01:00 +0200)]
rgw: asio/beast add ssl hot-reload
Adds the `ssl_reload` config option to the beast frontend.
This sets an interval in seconds to periodically reload the ssl context to pick up changes without restarting. It can be disabled (default) by setting it to `0`.
Nitzan Mordechai [Tue, 10 Dec 2024 09:04:34 +0000 (09:04 +0000)]
msg/async: race condition between reset_recv_state and shutdown_connections
when shutting down monitors with valgrind involved, we can
sometimes hit a race condition and locks that cause the shutdown
process to hang for a long time.
reset_recv_state - issues a message without proper locks, which
causes the shutdown to hang during shutdown_connections (drain network)
1. Fixes the promql expr used to calculate "In" OSDs in
ceph-cluster-advanced.json.
2. Fixes the color coding for the single state panels used in the OSDs
grafana panel like "In", "Out" etc
according to `dpkg-buildflags`, ubuntu 24 raised this value to
`-D_FORTIFY_SOURCE=3` which causes `error: "_FORTIFY_SOURCE" redefined`
compilation failures because Ceph itself adds `-D_FORTIFY_SOURCE=2`
`_FORTIFY_SOURCE` is a hardening option. both our rpm and debian builds
already specify that via environment variables, so Ceph's cmake should
leave it alone
Anoop C S [Mon, 23 Sep 2024 07:06:55 +0000 (12:36 +0530)]
client: Gracefully handle empty pathname for statxat()
man statx(2)[1] says the following:
. . .
AT_EMPTY_PATH
If pathname is an empty string, operate on the file referred to by
dirfd (which may have been obtained using the open(2) O_PATH flag).
In this case, dirfd can refer to any type of file, not just a
directory.
If dirfd is AT_FDCWD, the call operates on the current working
directory.
. . .
Look out for an empty pathname and use the relative fd's inode in the
presence of AT_EMPTY_PATH flag before calling internal _getattr().
Fixes: https://tracker.ceph.com/issues/68189
Review with: git show -w
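The empty-pathname check can be sketched as below. `resolve_target` and its `lookup` parameter are hypothetical, not the client's internals; 0x1000 is the Linux value of `AT_EMPTY_PATH`, and raising `FileNotFoundError` mirrors the ENOENT that Linux returns for an empty pathname without the flag.

```python
AT_EMPTY_PATH = 0x1000  # Linux value of the flag


def resolve_target(dirfd_inode, pathname, flags, lookup):
    """Pick the inode an *at() call should operate on.

    dirfd_inode is the inode the directory fd refers to, and lookup is a
    callable (dirfd_inode, pathname) -> inode used for non-empty paths.
    """
    if pathname == "":
        if flags & AT_EMPTY_PATH:
            # Operate on the file referred to by dirfd itself.
            return dirfd_inode
        raise FileNotFoundError("empty pathname without AT_EMPTY_PATH")
    return lookup(dirfd_inode, pathname)
```

The same check applies to any call that accepts `AT_EMPTY_PATH`: the relative fd's inode is used directly and no path walk takes place.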
Anoop C S [Thu, 17 Oct 2024 16:15:17 +0000 (21:45 +0530)]
libcephfs.h: Fix API documentation for ceph_statxat
The flags parameter of the ceph_statxat() API is supposed to accept only
AT_STATX_DONT_SYNC and AT_SYMLINK_NOFOLLOW. Modify the corresponding
documentation to reflect that only these two flags are accepted.
Anoop C S [Fri, 20 Sep 2024 08:49:01 +0000 (14:19 +0530)]
client: Gracefully handle empty pathname for chownat()
man fchownat(2)[1] says the following:
. . .
AT_EMPTY_PATH (since Linux 2.6.39)
If pathname is an empty string, operate on the file referred to by
dirfd (which may have been obtained using the open(2) O_PATH flag).
In this case, dirfd can refer to any type of file, not just a
directory. If dirfd is AT_FDCWD, the call operates on the current
working directory.
. . .
Look out for an empty pathname and use the relative fd's inode in the
presence of AT_EMPTY_PATH flag before calling internal _setattr().
Fixes: https://tracker.ceph.com/issues/68189
Review with: git show -w