Alex Ainscow [Fri, 19 Dec 2025 09:04:55 +0000 (09:04 +0000)]
osd: Do not remove objects with divergent logs if there are only partial writes.
Fixes https://tracker.ceph.com/issues/74221
Note: An AI was used to assist generating unit tests for this commit.
The production code was written by the author.
In the scenario we are fixing here, there is a divergent log which needs to
be rolled back. The non-primary does not participate in the transaction on
the object, but a log entry describing the transaction exists. The primary
has a different transaction and has correctly detected the divergence.
The primary correctly concludes that no recovery is needed for the object, since
only partial writes exist on the non-primary.
The non-primary observes its divergent log and incorrectly concludes that
recovery IS needed for the divergent write and prepares by removing that
object.
The consequence of this depends on the next operation:
1. A read will fail with -EIO
2. A RMW involving a read from the removed object will detect the failure
and reconstruct the necessary data.
3. A RMW not involving a read, or an append, will recreate the object, but
with zeros, and so will cause data corruption.
It is unusual for such a log entry to exist on the non-primary, because
normally these are omitted from the non-primary log. The scenario that causes
this is when a partial write triggers a clone due to copy-on-write. We then
have a clone operation which affects ALL shards, and so the log entry is sent
to all shards.
This is unusual to see in the field. We must have all of the following:
1. A clone operation (these are infrequent)
2. A partial write.
3. A peering cycle must happen before this write is complete.
The combination of 1 and 3 makes this a very unusual operation in teuthology,
and it will be even rarer in the field.
The fix ensures we skip divergent log entries for partial writes that the shard
did not participate in.
```
ceph osd erasure-code-profile set alex k=2 m=2
ceph osd pool create mypool --pg_num=1 --pool_type=erasure alex
ceph osd pool set mypool allow_ec_overwrites true
ceph osd pool set mypool allow_ec_optimizations true
ceph osd pool set mypool min_size 2
```
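The essence of the skip logic can be modeled in a few lines. This is a hypothetical Python sketch, not the OSD's actual C++ code: the names `LogEntry`, `is_partial_write`, and `written_shards` are illustrative stand-ins for the PG log structures.

```python
from dataclasses import dataclass, field

@dataclass
class LogEntry:
    """Illustrative stand-in for a PG log entry (names are hypothetical)."""
    is_partial_write: bool
    written_shards: set = field(default_factory=set)

def should_skip_divergent(entry: LogEntry, shard_id: int) -> bool:
    # A divergent entry for a partial write that this shard did not
    # participate in describes a transaction that never touched the local
    # object; rolling it back (by removing the object) would throw away
    # valid data, so such entries must be skipped.
    return entry.is_partial_write and shard_id not in entry.written_shards

# The clone case from this commit: the log entry was sent to all shards,
# but only shards 0 and 1 actually wrote the object.
entry = LogEntry(is_partial_write=True, written_shards={0, 1})
print(should_skip_divergent(entry, shard_id=3))  # shard 3 did not write
print(should_skip_divergent(entry, shard_id=0))  # shard 0 did write
```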
David Galloway [Tue, 16 Dec 2025 22:08:00 +0000 (17:08 -0500)]
install-deps: Replace apt-mirror
apt-mirror.front.sepia.ceph.com has always happened to work because we set up CNAMEs to gitbuilder.ceph.com.
That host is making its way to a new home upstate (literally and figuratively) so we'll get rid of the front subdomain since it's publicly accessible anyway and add TLS while we're at it.
Signed-off-by: David Galloway <david.galloway@ibm.com>
script/gen-corpus: cleanup and improve readability and performance
- gen-corpus cleanup missed removing the temporary directory.
- Improve it a bit for readability.
- The import.sh script was slow; improve performance by using fewer forks
  and batch processing.
Ville Ojamo [Mon, 15 Dec 2025 08:24:22 +0000 (15:24 +0700)]
doc: Fix minor formatting, typo, etc. issues
Remove formatting syntax from inside literal text in
cephadm/services/rgw.rst.
Use quotation marks similarly to other placement examples with only
parameter value quoted and not the whole parameter in
cephadm/services/rgw.rst.
Capitalize "YAML" in cephadm/services/rgw.rst.
Remove double space in the middle of a sentence in
rados/operations/erasure-code.rst.
Use double backticks consistently for default values in
radosgw/frontends.rst.
Capitalize "I/O", stylize as "OpenSSL" in radosgw/frontends.rst.
Fix typo "and object" to "an object" in radosgw/s3/bucketops.rst.
Stylize as "CentOS" in start/os-recommendations.rst.
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
Nizamudeen A [Mon, 15 Dec 2025 07:29:45 +0000 (12:59 +0530)]
mgr/dashboard: emit success and error on copy2clipboard
This is needed because the notification service we have right now is
tightly coupled with the dashboard, so toasts won't show up in the
applications where this component is consumed. So we emit an output which
the consuming application can use to show relevant toasts.
Fixes: https://tracker.ceph.com/issues/74213
Signed-off-by: Nizamudeen A <nia@redhat.com>
Casey Bodley [Thu, 20 Nov 2025 16:57:35 +0000 (11:57 -0500)]
qa/rgw/upgrade: exclude ceph-osd-classic/crimson on squid and tentacle
Split packages for ceph-osd-classic and ceph-osd-crimson were added on
main, but don't exist on squid and tentacle. Exclude these packages from
their install tasks.
Imran Imtiaz [Fri, 12 Dec 2025 10:02:59 +0000 (10:02 +0000)]
mgr/dashboard: add API endpoint to delete consistency group
Add a dashboard API endpoint to delete a consistency group.
Fixes: https://tracker.ceph.com/issues/74201
Signed-off-by: Imran Imtiaz <imran.imtiaz@uk.ibm.com>
Devika Babrekar [Thu, 4 Dec 2025 09:58:39 +0000 (15:28 +0530)]
mgr/dashboard: Adding QAT Compression dropdown on RGW Service form
Fixes: https://tracker.ceph.com/issues/74046
Signed-off-by: Devika Babrekar <devika.babrekar@ibm.com>
Laura Flores [Mon, 24 Nov 2025 17:31:05 +0000 (11:31 -0600)]
qa/suites/upgrade: add "OBJECT_UNFOUND" to ignorelists
The thrashing in the upgrade tests has been configured to be very aggressive;
the tests are permitted to stop up to 4 of the 8 OSDs, so it is expected
that these kinds of health warnings are generated.
This commit also cleans up some expected filesystem and pg peering warnings
in the upgrade tests.
Fixes: https://tracker.ceph.com/issues/72424
Signed-off-by: Laura Flores <lflores@ibm.com>
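As a sketch, the suite-level change amounts to adding an entry of this shape to the affected yaml fragments (the exact file paths and surrounding entries are assumptions, not shown in this commit message):

```
log-ignorelist:
  - \(OBJECT_UNFOUND\)
```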
Matan Breizman [Tue, 9 Dec 2025 09:52:08 +0000 (09:52 +0000)]
debian,ceph.spec: fix ceph-osd upgrade conflicts
With https://github.com/ceph/ceph/pull/65782 merged, upgrading ceph-osd
would need to replace the previous ceph-osd existing on the machine.
Otherwise, we won't be able to symlink the newly installed package:
```
2025-12-05T21:09:20.472 INFO:teuthology.orchestra.run.smithi077.stdout:
Installing : ceph-osd-classic-2:20.3.0-4434.g8611241d.el9.x86_6
24/87
2025-12-05T21:09:20.478 INFO:teuthology.orchestra.run.smithi077.stdout:
Running scriptlet: ceph-osd-classic-2:20.3.0-4434.g8611241d.el9.x86_6
24/87
2025-12-05T21:09:20.479
INFO:teuthology.orchestra.run.smithi077.stdout:failed to link
/usr/bin/ceph-osd -> /etc/alternatives/ceph-osd: /usr/bin/ceph-osd
exists and it is not a symlink
```
Note: debian/control for ceph-osd-classic already had Replaces and Breaks:
- Breaks is replaced with Conflicts to not allow coexistence.
- The release version is bumped up to be relevant for latest main.
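The debian/control side of the change then amounts to a stanza of roughly this shape (only the relevant fields are shown, and the version boundary is a hypothetical placeholder, not the actual value from the commit):

```
Package: ceph-osd-classic
Conflicts: ceph-osd (<< ${first-split-version})
Replaces: ceph-osd (<< ${first-split-version})
```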
Elliot Courant [Mon, 24 Nov 2025 17:50:56 +0000 (11:50 -0600)]
deb/cephadm: Don't assume a home directory is configured
cephadm.postinst can fail if cephadm was originally installed using a
version that didn't configure a home directory for the user at all.
Newer versions do configure a home directory (as either `/home/cephadm`
or `/var/lib/cephadm`) so if that is configured then nothing needs to be
done. But if the user was created with no home directory then one needs
to be added for the configure step to succeed.
Fixes: https://tracker.ceph.com/issues/72083
Commit 90bc0369243077c2aaf67f0de2bab5810b217f4e added home directories
for newly created cephadm users, but didn't add home directories for
cephadm users that already existed.
Alex Ainscow [Fri, 7 Nov 2025 10:44:56 +0000 (10:44 +0000)]
rados: Add API to disable version querying with reads in librados
librados will always request a "user version". Until EC direct reads are implemented
this is a cheap operation and so librados always requests the user version, even if
the client does not need it.
With EC direct reads, requesting the user version requires an extra op to the primary
in some scenarios. The non-primary OSDs do not contain an up to date user
version.
NEORADOS already allows for such optimisations, due to how the API is
organised.
librados is not heavily used by ceph-maintained clients, but this API will
still be useful for testing of EC direct reads, since the test clients will
use librados, due to its simpler nature and performance not being critical
in the tests.
Ilya Dryomov [Tue, 9 Dec 2025 14:22:02 +0000 (15:22 +0100)]
librbd: fix ExclusiveLock::accept_request() when !is_state_locked()
To accept an async request, two conditions must be met: a) the exclusive
lock must be in a firm STATE_LOCKED state and b) async requests shouldn't
be blocked, or if they are blocked there should be an exception in place
for the given request_type. If a) is met but b) isn't, ret_val is set
to m_request_blocked_ret_val, as expected -- the reason for denying
the request is that async requests are blocked. However, if a) isn't
met, ret_val also gets set to m_request_blocked_ret_val. This is wrong
because the reason for denying the request in this case isn't that
async requests are blocked (they may or may not be) but a much heavier
circumstance of exclusive lock being in a transient state or not held
at all.
In such scenarios, whether async requests are blocked or not isn't
relevant and ExclusiveLock::accept_request() behaving otherwise can
lead to bogus "duplicate lock owners detected" errors getting raised
during an attempt to handle any maintenance operation notification in
ImageWatcher::handle_operation_request(). This error isn't considered
retryable so the entire operation that needed the exclusive lock would
be spuriously failed with EINVAL.
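The corrected decision order can be modeled as follows. This is a hypothetical Python sketch of the branching described above, not librbd's actual C++ code; the function signature and the stand-in return code for the "lock not held" case are illustrative assumptions.

```python
def accept_request(state_locked: bool,
                   requests_blocked: bool,
                   request_allowed_while_blocked: bool,
                   blocked_ret_val: int) -> tuple[bool, int]:
    """Model of the fixed ExclusiveLock::accept_request() decision order."""
    if not state_locked:
        # Condition a) failed: the lock is in a transient state or not
        # held at all. Deny the request, but do NOT report the
        # "async requests blocked" reason -- it may not even apply.
        return False, -108  # illustrative stand-in error code
    if requests_blocked and not request_allowed_while_blocked:
        # Only when the lock IS firmly held is the "requests blocked"
        # return value (m_request_blocked_ret_val) the right answer.
        return False, blocked_ret_val
    return True, 0
```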