git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 days agorgw/logging: add error message when log_record fails 65494/head
Yuval Lifshitz [Thu, 11 Sep 2025 15:22:57 +0000 (15:22 +0000)]
rgw/logging: add error message when log_record fails

When log_record fails in journal mode due to issues in the target
bucket, the result code that the client gets will be confusing, since
there is no indication that the issue is with the target bucket and not
the source bucket on which the client was operating.
The HTTP error message will be used to convey this information.

Fixes: https://tracker.ceph.com/issues/72543
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
5 days agoMerge pull request #65428 from rkachach/fix_rgw_docs_certmgr
Redouane Kachach [Wed, 10 Sep 2025 14:26:57 +0000 (16:26 +0200)]
Merge pull request #65428 from rkachach/fix_rgw_docs_certmgr

doc: update RGW HTTPS configuration to use certmgr and new fields

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Adam King <adking@redhat.com>
5 days agoMerge pull request #65406 from rkachach/fix_grafana_docs
Redouane Kachach [Wed, 10 Sep 2025 14:23:48 +0000 (16:23 +0200)]
Merge pull request #65406 from rkachach/fix_grafana_docs

doc: update Grafana certificate configuration to use certmgr

Reviewed-by: Adam King <adking@redhat.com>
5 days agoMerge PR #65320 into main
Venky Shankar [Wed, 10 Sep 2025 06:44:11 +0000 (12:14 +0530)]
Merge PR #65320 into main

* refs/pull/65320/head:

Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
6 days agoMerge pull request #64463 from ljflores/wip-qa-summary-script
Laura Flores [Tue, 9 Sep 2025 22:11:41 +0000 (17:11 -0500)]
Merge pull request #64463 from ljflores/wip-qa-summary-script

script: add script to help format QA review summaries

6 days agoMerge pull request #58926 from TRYTOBE8TME/wip-shard-id-option
Daniel Gryniewicz [Tue, 9 Sep 2025 16:08:01 +0000 (12:08 -0400)]
Merge pull request #58926 from TRYTOBE8TME/wip-shard-id-option

src/rgw: Adding "sync error trim" option

6 days agolibcephfs_proxy: fix userperm pointer decoding for older protocols 65320/head
Xavi Hernandez [Mon, 1 Sep 2025 12:43:26 +0000 (14:43 +0200)]
libcephfs_proxy: fix userperm pointer decoding for older protocols

The random data used to decode pointers coming from the old protocol was
taken from the client instead of using the global_random data, which is
the correct one.

Fixes: https://tracker.ceph.com/issues/72800
Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
6 days agolibcephfs_proxy: remove unnecessary protocol references in daemon
Xavi Hernandez [Mon, 1 Sep 2025 09:58:30 +0000 (11:58 +0200)]
libcephfs_proxy: remove unnecessary protocol references in daemon

With the new protocol structure definitions, it's not necessary to
explicitly access each field inside its version substructure (v0, for
example). Now all fields of the latest version are declared inside an
anonymous substructure that can be accessed without a prefix.

Fixes: https://tracker.ceph.com/issues/72800
Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
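
As a rough illustration of the anonymous-substructure idea described in this commit, the sketch below uses Python's `ctypes`, whose `_anonymous_` attribute mirrors C anonymous members. The type and field names (`RequestV0`, `Request`, `cmount`, `flags`) are hypothetical and are not taken from the libcephfs_proxy sources.

```python
import ctypes

class RequestV0(ctypes.Structure):
    # Hypothetical version-0 fields; the names are illustrative only.
    _fields_ = [("cmount", ctypes.c_uint64), ("flags", ctypes.c_uint32)]

class Request(ctypes.Structure):
    # Listing "v0" in _anonymous_ exposes its fields directly on Request.
    _anonymous_ = ("v0",)
    _fields_ = [("v0", RequestV0)]

req = Request()
req.flags = 7              # no "v0." prefix needed
print(req.flags, req.v0.flags)  # both names refer to the same storage
```

Both spellings read the same memory, so callers can drop the version prefix without changing the layout.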
6 days agolibcephfs_proxy: remove unnecessary protocol references in client
Xavi Hernandez [Mon, 1 Sep 2025 09:41:10 +0000 (11:41 +0200)]
libcephfs_proxy: remove unnecessary protocol references in client

With the new protocol structure definitions, it's not necessary to
explicitly access each field inside its version substructure (v0, for
example). Now all fields of the latest version are declared inside an
anonymous substructure that can be accessed without a prefix.

Fixes: https://tracker.ceph.com/issues/72800
Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
6 days agolibcephfs_proxy: fix protocol structures for backward compatibility
Xavi Hernandez [Mon, 1 Sep 2025 09:22:05 +0000 (11:22 +0200)]
libcephfs_proxy: fix protocol structures for backward compatibility

The structures used for transferring data between the proxy client and
the proxy daemon had been reworked in a recent change to be able to
expand the protocol. This caused an inconsistency in the size of the
data transferred when communicating with a peer using the older version.
The result was that the peer receiving the data with an unexpected size
was closing the connection, causing unexpected errors.

The discrepancy in size is the result of how compilers pad structures
combined with the change in the structure layout introduced when
extending the protocol. With these changes, the computation of the size
of each version of the structures was not done correctly.

This change makes the layout equal to the older version, so that
computing the size of the structures becomes easier and doesn't depend
on unexpected paddings.

Fixes: https://tracker.ceph.com/issues/72800
Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
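
The padding effect described in this commit can be observed with a small sketch. It uses Python's `ctypes`, which follows the platform's native C layout rules. The structures (`Flat`, `V0`, `Nested`) and their fields are hypothetical examples, not the real protocol structures.

```python
import ctypes

class Flat(ctypes.Structure):
    # Hypothetical "old" layout: all fields declared side by side.
    _fields_ = [("a", ctypes.c_uint64),
                ("b", ctypes.c_uint32),
                ("c", ctypes.c_uint32)]

class V0(ctypes.Structure):
    # The first two fields moved into a version substructure.
    _fields_ = [("a", ctypes.c_uint64), ("b", ctypes.c_uint32)]

class Nested(ctypes.Structure):
    # "c" now follows V0, including any tail padding V0 carries.
    _fields_ = [("v0", V0), ("c", ctypes.c_uint32)]

# Bytes of real payload, ignoring any padding the compiler would add.
payload_bytes = sum(ctypes.sizeof(t) for _, t in Flat._fields_)

print("payload bytes: ", payload_bytes)
print("sizeof(Flat):  ", ctypes.sizeof(Flat))
print("sizeof(Nested):", ctypes.sizeof(Nested))
```

On ABIs where a 64-bit integer is 8-byte aligned, `Nested` is typically larger than `Flat` even though both carry the same fields, because `V0` is padded to a multiple of its alignment before `c` is placed. A peer that expects the flat size would then see a message of unexpected length.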
6 days agoMerge pull request #63895 from Kushal-deb/rgw-qat-compression
Adam King [Tue, 9 Sep 2025 13:07:35 +0000 (09:07 -0400)]
Merge pull request #63895 from Kushal-deb/rgw-qat-compression

cephadm: improve hw qat experience with cephadm

Reviewed-by: Adam King <adking@redhat.com>
6 days agoMerge pull request #65387 from yuvalif/wip-yuval-72542
Yuval Lifshitz [Tue, 9 Sep 2025 12:45:05 +0000 (15:45 +0300)]
Merge pull request #65387 from yuvalif/wip-yuval-72542

rgw/logging: allow committing empty objects

6 days agoMerge pull request #64844 from ljflores/wip-tracker-72312
Radoslaw Zarzynski [Tue, 9 Sep 2025 12:28:21 +0000 (14:28 +0200)]
Merge pull request #64844 from ljflores/wip-tracker-72312

qa/tasks/thrashosds-health: fine tune ignorelist for degraded and undersized pgs

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
6 days agoMerge pull request #64375 from SundownRises/header-component
afreen23 [Tue, 9 Sep 2025 10:58:38 +0000 (16:28 +0530)]
Merge pull request #64375 from SundownRises/header-component

mgr/dashboard: Carbonised Notification Header

Reviewed-by: Afreen Misbah <afreen@ibm.com>
6 days agodoc: update RGW HTTPS configuration to use certmgr and new fields 65428/head
Redouane Kachach [Mon, 8 Sep 2025 13:20:27 +0000 (15:20 +0200)]
doc: update RGW HTTPS configuration to use certmgr and new fields

With the introduction of certmgr, RGW services now support three
certificate sources: cephadm-signed (default), inline, and reference.
Docs have been updated to:

- Show how to provide inline certificates using the new ssl_cert/ssl_key
  fields instead of the deprecated rgw_frontend_ssl_certificate.
- Explain how to register and reference user-provided certs/keys
- Clarify that cephadm-signed certificates remain the default, with
  optional wildcard SANs support.

The usage of rgw_frontend_ssl_certificate is still supported for
backward compatibility, but is now documented as deprecated.

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
6 days agoMerge pull request #59515 from kamoltat/wip-ksirivad-fix-67801
SrinivasaBharathKanta [Tue, 9 Sep 2025 10:38:10 +0000 (16:08 +0530)]
Merge pull request #59515 from kamoltat/wip-ksirivad-fix-67801

mon [stretch mode]: restrict changing mon election strategy post stretch mode

6 days agoMerge PR #63636 into main
Venky Shankar [Tue, 9 Sep 2025 10:24:01 +0000 (15:54 +0530)]
Merge PR #63636 into main

* refs/pull/63636/head:

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
6 days agoMerge pull request #65015 from connorfawcett/pg-autoscale-threshold-cmd
Connor Fawcett [Tue, 9 Sep 2025 09:36:19 +0000 (10:36 +0100)]
Merge pull request #65015 from connorfawcett/pg-autoscale-threshold-cmd

mgr/pg_autoscaler: Add 'osd pool get threshold' command which returns the current threshold value

6 days agoMerge pull request #64788 from rhcs-dashboard/acl-mapping
afreen23 [Tue, 9 Sep 2025 09:16:52 +0000 (14:46 +0530)]
Merge pull request #64788 from rhcs-dashboard/acl-mapping

mgr/dashboard:RGW- Storage Class ACL Mapping

Reviewed-by: Afreen Misbah <afreen@ibm.com>
6 days agoMerge PR #64958 into main
Venky Shankar [Tue, 9 Sep 2025 06:41:58 +0000 (12:11 +0530)]
Merge PR #64958 into main

* refs/pull/64958/head:

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 days agoMerge PR #64967 into main
Venky Shankar [Tue, 9 Sep 2025 06:39:40 +0000 (12:09 +0530)]
Merge PR #64967 into main

* refs/pull/64967/head:

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
6 days agoMerge pull request #65436 from gbregman/main
Gil Bregman [Tue, 9 Sep 2025 05:38:27 +0000 (08:38 +0300)]
Merge pull request #65436 from gbregman/main

mgr/cephadm/nvmeof: Add fields for prometheus frequency to NVMEOF spec file

6 days agotest/libcephfs: validate asynchronous write and fsync executing concurrently 63636/head
Venky Shankar [Mon, 2 Jun 2025 05:08:01 +0000 (05:08 +0000)]
test/libcephfs: validate asynchronous write and fsync executing concurrently

This synthetic reproducer does three things:

- sets up a client mount with a configuration to delay write operations and
  initiates a write operation via a thread.
- a thread that invokes asynchronous fsync
- a thread that invokes setxattr for the client to track early replies

Without the fix[0], the test reproduces the following crash:

```
/home/vshankar/ceph/src/client/Client.cc: In function 'void Client::put_request(MetaRequest*)' thread 7f7210ff9640 time 2025-06-03T09:34:45.634974+0000
/home/vshankar/ceph/src/client/Client.cc: 2290: FAILED ceph_assert(request->ref >= 1)
 ceph version 20.3.0-673-gdd152807f7e (dd152807f7e7f7a82df6cfc0159f5fc65f60ecd5) tentacle (dev - Debug)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x119) [0x7f72222ebb98]
 2: (ceph::__ceph_assert_fail(ceph::assert_data const&)+0x17) [0x7f72222ebedc]
 3: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0x6a075) [0x7f7222e6a075]
 4: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0xb8289) [0x7f7222eb8289]
 5: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0xee951) [0x7f7222eee951]
 6: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0xf167c) [0x7f7222ef167c]
 7: (Context::complete(int)+0x9) [0x7f7222e5949d]
 8: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0x16a853) [0x7f7222f6a853]
 9: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0xa7cc5) [0x7f7222ea7cc5]
 10: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0xf128d) [0x7f7222ef128d]
 11: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0x16e09d) [0x7f7222f6e09d]
 12: (Context::complete(int)+0x9) [0x7f7222e5949d]
 13: /home/vshankar/ceph/build/lib/libcephfs.so.2(+0x6d108) [0x7f7222e6d108]
 14: (Context::complete(int)+0x9) [0x7f7222e5949d]
 15: (Finisher::finisher_thread_entry()+0x665) [0x7f722226fdc1]
 16: (Finisher::FinisherThread::entry()+0xd) [0x7f7222270ddf]
 17: (Thread::entry_wrapper()+0x2f) [0x7f72222b88f5]
 18: (Thread::_entry_func(void*)+0x9) [0x7f72222b8907]
 19: /lib64/libc.so.6(+0x89e92) [0x7f7221089e92]
 20: /lib64/libc.so.6(+0x10ef20) [0x7f722110ef20]
[1]    2162689 IOT instruction (core dumped)  ./bin/ceph_test_libcephfs --gtest_filter=LibCephFS.ConcurrentWriteAndFsync
```

[0]: https://github.com/ceph/ceph/pull/63619

Fixes: http://tracker.ceph.com/issues/71515
Signed-off-by: Venky Shankar <vshankar@redhat.com>
6 days agoclient: catch buggy reference count drop for MetaRequest
Venky Shankar [Tue, 3 Jun 2025 10:04:44 +0000 (10:04 +0000)]
client: catch buggy reference count drop for MetaRequest

The prior commit introduces a synthetic delay in the write
operation so that a test reproducer can interleave an
asynchronous fsync with an operation that makes the MDS send an early
reply to the client (therefore, having the client track the early
replied response for an inode in Inode::unsafe_ops). This is
enough to trick the client into the code path that causes a buggy
reference drop for the request (MetaRequest), but hitting the
_exact_ crash backtrace requires the request to be in various
[x]list's.

This last bit is tricky to synthetically massage in the test. So,
in order to catch the buggy reference drop, it would suffice to
assert on the reference count dropping to less than zero (0).

Signed-off-by: Venky Shankar <vshankar@redhat.com>
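
A minimal sketch of the idea in this commit, catching an extra reference drop with an assertion at the point where it happens. The class `MetaRequestSketch` is a toy stand-in written for illustration; it is not the client's `MetaRequest`, and the real check lives in C++.

```python
class MetaRequestSketch:
    """Toy reference-counted object (hypothetical, for illustration)."""

    def __init__(self):
        self.ref = 1  # the creator holds the first reference

    def get(self):
        self.ref += 1
        return self

    def put(self):
        # Catch a buggy extra drop right where it happens.
        if self.ref < 1:
            raise AssertionError("reference count dropped below zero")
        self.ref -= 1

req = MetaRequestSketch()
req.put()              # legitimate drop: ref goes from 1 to 0

caught = False
try:
    req.put()          # buggy second drop
except AssertionError as err:
    caught = True
    print("caught:", err)
```

Asserting in `put()` flags the bug without needing the object to sit on any particular list, which is the part the commit message describes as hard to arrange in a test.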
6 days agoclient: synthetically delay write operation
Venky Shankar [Mon, 2 Jun 2025 05:05:44 +0000 (05:05 +0000)]
client: synthetically delay write operation

Allow the client to hold Fb caps for an extended period of
time so that an asynchronous fsync can intervene and block,
which helps to hunt [0].

[0]: https://tracker.ceph.com/issues/71510

Signed-off-by: Venky Shankar <vshankar@redhat.com>
6 days agoclient: log unsafe operation count (for debugging)
Venky Shankar [Mon, 2 Jun 2025 05:04:46 +0000 (05:04 +0000)]
client: log unsafe operation count (for debugging)

Signed-off-by: Venky Shankar <vshankar@redhat.com>
6 days agolibcephfs/client: asynchronous fsync interface
Venky Shankar [Mon, 2 Jun 2025 05:03:50 +0000 (05:03 +0000)]
libcephfs/client: asynchronous fsync interface

Mostly for writing a test for hunting [0].

[0]: https://tracker.ceph.com/issues/71510

Signed-off-by: Venky Shankar <vshankar@redhat.com>
6 days agoMerge pull request #64999 from rishabh-d-dave/fs-pyx-chown
Rishabh Dave [Tue, 9 Sep 2025 03:54:03 +0000 (09:24 +0530)]
Merge pull request #64999 from rishabh-d-dave/fs-pyx-chown

cephfs.pyx: handle when UID/GID passed to chown() is -1

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
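
For context on the `-1` convention mentioned in this PR title: in POSIX `chown()`, passing `-1` for the UID or GID leaves that id unchanged. The sketch below demonstrates the convention with Python's standard-library `os.chown` on a temporary local file (Unix only). It does not use the `cephfs` binding, and it makes no claim about how `cephfs.pyx` implements the fix.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    before = os.stat(tmp.name)
    # -1 means "leave this id unchanged" for both arguments.
    os.chown(tmp.name, -1, -1)
    after = os.stat(tmp.name)
    unchanged = (before.st_uid, before.st_gid) == (after.st_uid, after.st_gid)
    print("owner unchanged:", unchanged)
```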
6 days agoMerge pull request #64927 from NitzanMordhai/wip-nitzan-suites-ignore-still-running...
SrinivasaBharathKanta [Tue, 9 Sep 2025 03:42:39 +0000 (09:12 +0530)]
Merge pull request #64927 from NitzanMordhai/wip-nitzan-suites-ignore-still-running-cephadm-osds-suites

suites/rados/cephadm: typo in ignotr list for still running message

7 days agoMerge pull request #65431 from afreen23/doc-release-notes
afreen23 [Mon, 8 Sep 2025 19:49:36 +0000 (01:19 +0530)]
Merge pull request #65431 from afreen23/doc-release-notes

Update dashboard Pending release notes

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
7 days agoMerge pull request #65419 from ljflores/wip-tracker-72897
Ilya Dryomov [Mon, 8 Sep 2025 19:15:10 +0000 (21:15 +0200)]
Merge pull request #65419 from ljflores/wip-tracker-72897

doc/rados/operations: add kernel client procedure to read balancer documentation

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
7 days agoMerge pull request #64837 from rzarzynski/wip-bug-72412
Laura Flores [Mon, 8 Sep 2025 18:50:40 +0000 (13:50 -0500)]
Merge pull request #64837 from rzarzynski/wip-bug-72412

osd: stop scrub_purged_snaps() from ignoring osd_beacon_report_interval

7 days agomgr/cephadm/nvmeof: Add fields for prometheus frequency to NVMEOF spec file. 65436/head
Gil Bregman [Mon, 8 Sep 2025 16:29:46 +0000 (19:29 +0300)]
mgr/cephadm/nvmeof: Add fields for prometheus frequency to NVMEOF spec file.

Fixes: https://tracker.ceph.com/issues/72805
Signed-off-by: Gil Bregman <gbregman@il.ibm.com>
7 days agoMerge pull request #65384 from rhcs-dashboard/fix-72868-main
afreen23 [Mon, 8 Sep 2025 18:30:22 +0000 (00:00 +0530)]
Merge pull request #65384 from rhcs-dashboard/fix-72868-main

mgr/dashboard: fix RGW Bucket Notification Dashboard units

Reviewed-by: Afreen Misbah <afreen@ibm.com>
Reviewed-by: Ankush Behl <cloudbehl@gmail.com>
7 days agodoc: wq!Update dashboard Pending release notes 65431/head
Afreen Misbah [Mon, 8 Sep 2025 11:18:55 +0000 (16:48 +0530)]
doc: wq!Update dashboard Pending release notes

- added for tentacle
- moved the incorrect notes added in tentacle to umbrella

Signed-off-by: Afreen Misbah <afreen@ibm.com>
7 days agodoc/rados/operations: add kernel client procedure to read balancer documentation 65419/head
Laura Flores [Fri, 5 Sep 2025 21:46:20 +0000 (16:46 -0500)]
doc/rados/operations: add kernel client procedure to read balancer documentation

As of now, the kernel client does not support `pg-upmap-primary`. I have
added some troubleshooting steps to help users who are unable to
mount images and filesystems with the kernel client while using `pg-upmap-primary`.

Once the feature is supported by the kernel client, users will be able
to perform mounts along with `pg-upmap-primary`.

Fixes: https://tracker.ceph.com/issues/72897
Signed-off-by: Laura Flores <lflores@ibm.com>
7 days agorgw/logging: allow committing empty objects 65387/head
Yuval Lifshitz [Thu, 4 Sep 2025 10:53:07 +0000 (10:53 +0000)]
rgw/logging: allow committing empty objects

Fixes: https://tracker.ceph.com/issues/72542
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
7 days agoMerge pull request #62747 from bill-scales/issue70844
Jon Bailey [Mon, 8 Sep 2025 11:01:39 +0000 (12:01 +0100)]
Merge pull request #62747 from bill-scales/issue70844

test: add replica pool support to ceph_test_rados_io_sequence

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Connor Fawcett <connorfa@uk.ibm.com>
7 days agoMerge pull request #65281 from nbalacha/wip-nbalacha-72740
Yuval Lifshitz [Mon, 8 Sep 2025 08:06:10 +0000 (11:06 +0300)]
Merge pull request #65281 from nbalacha/wip-nbalacha-72740

rgw/logging: fixes data loss during rollover

7 days agoMerge pull request #65231 from anthonyeleven/improve-osd-dot-cc
Anthony D'Atri [Mon, 8 Sep 2025 06:19:02 +0000 (01:19 -0500)]
Merge pull request #65231 from anthonyeleven/improve-osd-dot-cc

src/osd: Improve message in OSD.cc

7 days agorgw/logging: fixes data loss during rollover 65281/head
N Balachandran [Thu, 28 Aug 2025 06:22:23 +0000 (11:52 +0530)]
rgw/logging: fixes data loss during rollover

Multiple threads attempting to roll over the same log object can result
in the creation of numerous orphan tail objects, each with a single record.
This occurs when a NULL RGWObjVersionTracker is used during the creation of
a new logging object. These records are inaccessible, leading to data loss,
which is particularly critical in Journal mode.
Furthermore, valid log tail objects may be added to the Garbage Collection (GC)
list, exacerbating data loss.

Fixes: https://tracker.ceph.com/issues/72740
Signed-off-by: N Balachandran <nithya.balachandran@ibm.com>
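
The race described in this commit can be sketched with a toy versioned store. Everything here (`TinyStore`, `roll_over`, the object names) is hypothetical and only illustrates why a version check on the write lets exactly one writer win. It is not the RGW code and does not model `RGWObjVersionTracker` itself.

```python
class VersionConflict(Exception):
    pass

class TinyStore:
    """Toy in-memory key/value store with a per-key version counter."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version=None):
        current_version, _ = self.read(key)
        # With expected_version=None the check is skipped (the unsafe case).
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(key)
        self._data[key] = (current_version + 1, value)

def roll_over(store, seen_version, new_name, created):
    """Try to point 'current_log' at new_name; reuse the winner on conflict."""
    try:
        store.write("current_log", new_name, expected_version=seen_version)
        created.append(new_name)
        return new_name
    except VersionConflict:
        return store.read("current_log")[1]

store = TinyStore()
store.write("current_log", "log.0")
version, _ = store.read("current_log")  # both writers observe the same version

created = []
a = roll_over(store, version, "log.1a", created)
b = roll_over(store, version, "log.1b", created)
print(a, b, created)
```

With the version check in place the second writer detects the conflict and adopts the object the first writer created. Passing `expected_version=None` instead would let both writes succeed, leaving the first object orphaned.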
8 days agoMerge pull request #65421 from ronen-fr/wip-rf-ec72898
Ronen Friedman [Sun, 7 Sep 2025 14:44:10 +0000 (17:44 +0300)]
Merge pull request #65421 from ronen-fr/wip-rf-ec72898

osd/scrub: clear m_ec_digest_map between objects

Reviewed-by: Jon Bailey <jonathan.bailey1@ibm.com>
8 days agoMerge pull request #62106 from rkachach/fix_certmgr_v2
Redouane Kachach [Sun, 7 Sep 2025 10:49:23 +0000 (12:49 +0200)]
Merge pull request #62106 from rkachach/fix_certmgr_v2

Add cephadm-signed certificate support for all services

Reviewed-by: John Mulligan <jmulligan@redhat.com>
8 days agotest: add replica pool support to ceph_test_rados_io_sequence 62747/head
Bill Scales [Wed, 9 Apr 2025 09:58:15 +0000 (10:58 +0100)]
test: add replica pool support to ceph_test_rados_io_sequence

Make 'ceph_test_rados_io_sequence --pool rbd' work. Replica
pools don't have an erasure code profile and do not have the
ec_allow_overwrites or ec_allow_optimizations flags.

Fixes: https://tracker.ceph.com/issues/70844
Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
8 days agocommon: Added values to json::OSDPoolGetReply
Jon Bailey [Mon, 14 Jul 2025 12:52:28 +0000 (13:52 +0100)]
common: Added values to json::OSDPoolGetReply

OSDPoolGetReply actually returns a lot more values than what is currently supplied. These have been added as optionals (as they may also be absent), so it's possible to query them to find out if they exist and use them if they do.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
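
The optional-field pattern described in this commit can be sketched as follows. The class `PoolGetReply` and its field names are hypothetical and chosen for illustration; the actual change is in the C++ `json::OSDPoolGetReply` type, whose exact fields are not shown here.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class PoolGetReply:
    pool: str
    # Fields that a reply may or may not contain default to None.
    size: Optional[int] = None
    erasure_code_profile: Optional[str] = None

    @classmethod
    def from_json(cls, text):
        raw = json.loads(text)
        return cls(
            pool=raw["pool"],
            size=raw.get("size"),
            erasure_code_profile=raw.get("erasure_code_profile"),
        )

# A replica pool reply (assumed shape) carries no erasure code profile.
reply = PoolGetReply.from_json('{"pool": "rbd", "size": 3}')
if reply.erasure_code_profile is None:
    print("no erasure code profile; treating pool as replicated")
```

Callers check for `None` before using a value, which matches the "query them to find out if they exist" wording above.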
8 days agoosd/scrub: clear m_ec_digest_map between objects 65421/head
Ronen Friedman [Sun, 7 Sep 2025 07:19:52 +0000 (02:19 -0500)]
osd/scrub: clear m_ec_digest_map between objects

Fixing a bug introduced by commit 4c61079e931
("caluculate EC digest map size only once").

Fixes: https://tracker.ceph.com/issues/72897
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
8 days agoMerge pull request #65150 from rkachach/fix_issue_nvmeof_prometheus
Redouane Kachach [Sun, 7 Sep 2025 07:18:06 +0000 (09:18 +0200)]
Merge pull request #65150 from rkachach/fix_issue_nvmeof_prometheus

mgr/cepahdm: fixing nvmeof scraping Prometheus config generation

Reviewed-by: Kushal Deb <Kushal.Deb@ibm.com>
8 days agoMerge pull request #65338 from ronen-fr/wip-rf-be-st3
Ronen Friedman [Sun, 7 Sep 2025 05:44:04 +0000 (08:44 +0300)]
Merge pull request #65338 from ronen-fr/wip-rf-be-st3

osd/scrub: modify OMAP stats collection

Reviewed-by: Jon Bailey <jonathan.bailey1@ibm.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
8 days agoMerge pull request #65193 from tchaikov/wip-osd-scrub-fix-buffer-overflow
Kefu Chai [Sun, 7 Sep 2025 03:05:39 +0000 (11:05 +0800)]
Merge pull request #65193 from tchaikov/wip-osd-scrub-fix-buffer-overflow

osd/scrub: fix heap-buffer-overflow when checking digest emptiness

Reviewed-by: Jon Bailey <jonathan.bailey1@ibm.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
9 days agomgr/cepahdm: fixing nvmeof scraping Prometheus config generation 65150/head
Redouane Kachach [Wed, 20 Aug 2025 12:01:28 +0000 (14:01 +0200)]
mgr/cepahdm: fixing nvmeof scraping Prometheus config generation

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agodoc/cephadm: updating certmgr docs to reflects new changes 62106/head
Redouane Kachach [Tue, 15 Jul 2025 13:50:45 +0000 (15:50 +0200)]
doc/cephadm: updating certmgr docs to reflects new changes

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: Adding RGW migration for the new certmgr certs format
Redouane Kachach [Wed, 20 Aug 2025 13:54:51 +0000 (15:54 +0200)]
mgr/cephadm: Adding RGW migration for the new certmgr certs format

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: removing call to previous method to storing all certs
Redouane Kachach [Tue, 29 Apr 2025 11:02:47 +0000 (13:02 +0200)]
mgr/cephadm: removing call to previous method to storing all certs

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: fixing nvmeof section in cert_mgr UT + new UT
Redouane Kachach [Thu, 7 Aug 2025 13:57:41 +0000 (15:57 +0200)]
mgr/cephadm: fixing nvmeof section in cert_mgr UT + new UT

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting UT after all the changes
Redouane Kachach [Tue, 12 Aug 2025 15:26:01 +0000 (17:26 +0200)]
mgr/cepahdm: adapting UT after all the changes

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: using 5 years for service-discovery internal certs
Redouane Kachach [Wed, 20 Aug 2025 13:55:24 +0000 (15:55 +0200)]
mgr/cephadm: using 5 years for service-discovery internal certs

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adding self-signed certifiactes support for nvmeof svc
Redouane Kachach [Tue, 12 Aug 2025 13:53:38 +0000 (15:53 +0200)]
mgr/cephadm: adding self-signed certifiactes support for nvmeof svc

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting node-proxy service to use the new cert mgmt
Redouane Kachach [Wed, 11 Jun 2025 13:46:51 +0000 (15:46 +0200)]
mgr/cepahdm: adapting node-proxy service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting agent service to use the new cert mgmt
Redouane Kachach [Wed, 11 Jun 2025 10:30:46 +0000 (12:30 +0200)]
mgr/cepahdm: adapting agent service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting node-exporter service to use the new cert mgmt
Redouane Kachach [Mon, 31 Mar 2025 13:02:36 +0000 (15:02 +0200)]
mgr/cepahdm: adapting node-exporter service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting Prometheus service to use the new cert mgmt
Redouane Kachach [Wed, 13 Aug 2025 17:14:06 +0000 (19:14 +0200)]
mgr/cepahdm: adapting Prometheus service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting alertmanager service to use the new cert mgmt
Redouane Kachach [Wed, 13 Aug 2025 17:13:11 +0000 (19:13 +0200)]
mgr/cepahdm: adapting alertmanager service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting oauth2-proxy service to use the new cert mgmt
Redouane Kachach [Fri, 25 Apr 2025 07:38:46 +0000 (09:38 +0200)]
mgr/cepahdm: adapting oauth2-proxy service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adapting Grafana service to use the new cert mgmt
Redouane Kachach [Fri, 7 Mar 2025 08:56:12 +0000 (09:56 +0100)]
mgr/cepahdm: adapting Grafana service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adapting mgmt-gateway service to use the new cert mgmt
Redouane Kachach [Fri, 25 Apr 2025 07:38:22 +0000 (09:38 +0200)]
mgr/cephadm: adapting mgmt-gateway service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adapting iscsi service to use the new cert mgmt
Redouane Kachach [Fri, 7 Mar 2025 08:55:08 +0000 (09:55 +0100)]
mgr/cephadm: adapting iscsi service to use the new cert mgmt

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adapting ingress svc to use the new cert mgmt approach
Redouane Kachach [Fri, 5 Sep 2025 08:42:10 +0000 (10:42 +0200)]
mgr/cephadm: adapting ingress svc to use the new cert mgmt approach

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adding automated certificates mgmt for cephadmservice
Redouane Kachach [Wed, 13 Aug 2025 17:11:44 +0000 (19:11 +0200)]
mgr/cephadm: adding automated certificates mgmt for cephadmservice

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adding new filtring options and certificate ref checks
Redouane Kachach [Tue, 12 Aug 2025 14:12:27 +0000 (16:12 +0200)]
mgr/cepahdm: adding new filtring options and certificate ref checks

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adding support to prevent editing readonly certs
Redouane Kachach [Tue, 12 Aug 2025 14:37:47 +0000 (16:37 +0200)]
mgr/cephadm: adding support to prevent editing readonly certs

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adding enahanced support for self-signed certs
Redouane Kachach [Tue, 12 Aug 2025 14:37:32 +0000 (16:37 +0200)]
mgr/cephadm: adding enahanced support for self-signed certs

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: add support for custom duration when generating certs
Redouane Kachach [Tue, 12 Aug 2025 12:34:10 +0000 (14:34 +0200)]
mgr/cepahdm: add support for custom duration when generating certs

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cepahdm: adding support to extract ips and fqdns from cert
Redouane Kachach [Wed, 19 Mar 2025 11:09:17 +0000 (12:09 +0100)]
mgr/cepahdm: adding support to extract ips and fqdns from cert

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adapting python-common UT to the spec changes
Redouane Kachach [Tue, 12 Aug 2025 15:29:53 +0000 (17:29 +0200)]
mgr/cephadm: adapting python-common UT to the spec changes

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agomgr/cephadm: adding new spec generic fields fo ssl/certifiactes
Redouane Kachach [Fri, 25 Apr 2025 07:37:38 +0000 (09:37 +0200)]
mgr/cephadm: adding new spec generic fields fo ssl/certifiactes

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
9 days agoosd/scrub: reinstate one-warning-per-chunk behaviour 65338/head
Ronen Friedman [Tue, 2 Sep 2025 17:52:53 +0000 (12:52 -0500)]
osd/scrub: reinstate one-warning-per-chunk behaviour

Modify collect_omap_stats() to guarantee that only
one 'large omap entry' warning message is logged
per chunk, thus maintaining the existing behaviour.

Unlike the existing behaviour, all 'large omap'
entries are now counted.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
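
A minimal sketch of the behaviour described in this commit: every large-omap object in a chunk is counted, but at most one warning is emitted per chunk. The function name, the tuple shape of `chunk`, and the threshold value are assumptions made for illustration and do not come from `collect_omap_stats()`.

```python
def collect_large_omap(chunk, key_threshold):
    """Count every large-omap object; emit at most one warning per chunk."""
    warnings = []
    large_count = 0
    for name, num_keys in chunk:
        if num_keys > key_threshold:
            large_count += 1
            if not warnings:  # only the first offender is logged
                warnings.append(f"large omap object found: {name} ({num_keys} keys)")
    return large_count, warnings

# Hypothetical chunk of (object name, omap key count) pairs.
count, warnings = collect_large_omap(
    [("a", 10), ("b", 500), ("c", 900)], key_threshold=100
)
print(count, warnings)
```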
9 days agoosd/scrub: modify OMAP stats collection
Ronen Friedman [Tue, 2 Sep 2025 15:52:48 +0000 (10:52 -0500)]
osd/scrub: modify OMAP stats collection

Replace the separate all-chunk loop with object-by-object
handling. Use the selected authoritative version of the object
as the info source.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
9 days agoMerge pull request #64859 from ronen-fr/wip-rf-72426movedref
Ronen Friedman [Sat, 6 Sep 2025 07:21:59 +0000 (10:21 +0300)]
Merge pull request #64859 from ronen-fr/wip-rf-72426movedref

osd/scrub: avoid using moved-from auth_n_errs

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
9 days agoMerge pull request #64963 from edwinzrodriguez/ceph-wip-72452
anrao19 [Sat, 6 Sep 2025 04:59:14 +0000 (10:29 +0530)]
Merge pull request #64963 from edwinzrodriguez/ceph-wip-72452

test/rgw: Refactor CORS configuration handling for S3

9 days agoMerge pull request #64801 from JoshuaGabriel/wip-jblanch/rgw-reindex
anrao19 [Sat, 6 Sep 2025 04:58:52 +0000 (10:28 +0530)]
Merge pull request #64801 from JoshuaGabriel/wip-jblanch/rgw-reindex

rgw: auto-create missing bucket index shards during reindex operations

10 days agoMerge pull request #65414 from ceph/fix-api-tests
Dan Mick [Sat, 6 Sep 2025 00:41:21 +0000 (17:41 -0700)]
Merge pull request #65414 from ceph/fix-api-tests

pybind/mgr/dashboard: Use teuthology's actual requirements

10 days agopybind/mgr/dashboard: Use teuthology's actual requirements 65414/head
David Galloway [Fri, 5 Sep 2025 17:58:43 +0000 (13:58 -0400)]
pybind/mgr/dashboard: Use teuthology's actual requirements

Signed-off-by: David Galloway <david.galloway@ibm.com>
10 days agoMerge pull request #65011 from irq0/pr/fix-barbican-deprecated-call
Marcel Lauhoff [Fri, 5 Sep 2025 15:39:28 +0000 (17:39 +0200)]
Merge pull request #65011 from irq0/pr/fix-barbican-deprecated-call

rgw: Replace deprecated Barbican payload endpoint with alternative

10 days agodoc: update Grafana certificate configuration to use certmgr 65406/head
Redouane Kachach [Fri, 5 Sep 2025 09:11:41 +0000 (11:11 +0200)]
doc: update Grafana certificate configuration to use certmgr

With the introduction of certmgr, users must register their certificates
via `ceph orch certmgr cert set --hostname ...` instead of the old
config-key method. The updated docs clarify that Grafana certificates
are host-scoped and can only be provided by reference (or default to
cephadm-signed).

Signed-off-by: Redouane Kachach <rkachach@ibm.com>
11 days agoscript: add script to help format QA review summaries 64463/head
Laura Flores [Fri, 11 Jul 2025 17:24:44 +0000 (13:24 -0400)]
script: add script to help format QA review summaries

Run the following command for help:
```
   qa-summary.sh --help
```

This is how the script can be used:
```
   qa-summary.sh < test_failure_tickets.txt
```

Before running the script, prep a 'test_failure_tickets.txt' file
(name is subjective) containing links to all the tracker tickets
you want to format in your test failures summary.

For example:
```
$ cat test_failure_tickets.txt
  https://tracker.ceph.com/issues/68586
  https://tracker.ceph.com/issues/69827
  https://tracker.ceph.com/issues/67869
  https://tracker.ceph.com/issues/71344
  https://tracker.ceph.com/issues/70669
  https://tracker.ceph.com/issues/71506
  https://tracker.ceph.com/issues/71182
```

Signed-off-by: Laura Flores <lflores@ibm.com>
11 days agoqa/tasks/thrashosds-health: fine tune ignorelist for degraded and undersized pgs 64844/head
Laura Flores [Thu, 4 Sep 2025 21:02:05 +0000 (16:02 -0500)]
qa/tasks/thrashosds-health: fine tune ignorelist for degraded and undersized pgs

These warnings, part of the Ceph health detail, are expected in osd thrashing
tests. In the original test failure, the cluster eventually got back to
HEALTH_OK after the thrashing task completed.

Fixes: https://tracker.ceph.com/issues/72312
Signed-off-by: Laura Flores <lflores@ibm.com>
Conflicts:
qa/tasks/thrashosds-health.yaml - OSD_DOWN warnings were not present when the commit was first made

11 days agoMerge pull request #65376 from phlogistonjohn/jjm-bwc-npmdir-fix 65382/head
David Galloway [Thu, 4 Sep 2025 17:56:10 +0000 (13:56 -0400)]
Merge pull request #65376 from phlogistonjohn/jjm-bwc-npmdir-fix

build-with-container: ensure npm dir is set up before configure

11 days agoMerge pull request #65062 from adamemerson/wip-redirect-error
Adam Emerson [Thu, 4 Sep 2025 17:52:24 +0000 (13:52 -0400)]
Merge pull request #65062 from adamemerson/wip-redirect-error

common/async: improved `redirect_error`, `run_coro`, and tests

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Jesse F. Williamson <jfw@ibm.com>
11 days agoMerge pull request #64997 from AliMasarweh/wip-alimasa-72398
Ali Masarwa [Thu, 4 Sep 2025 13:17:24 +0000 (16:17 +0300)]
Merge pull request #64997 from AliMasarweh/wip-alimasa-72398

RGW | fixed enqueueing the overwritten object for gc

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
11 days agoMerge pull request #64977 from MaxKellermann/includes3
Max Kellermann [Thu, 4 Sep 2025 12:53:26 +0000 (14:53 +0200)]
Merge pull request #64977 from MaxKellermann/includes3

Add missing includes to many subsystems

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
11 days agomgr/dashboard: fix RGW Bucket Notification Dashboard units 65384/head
Aashish Sharma [Thu, 4 Sep 2025 08:10:00 +0000 (13:40 +0530)]
mgr/dashboard: fix RGW Bucket Notification Dashboard units

Fixes: https://tracker.ceph.com/issues/72868
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
11 days agoMerge pull request #65120 from leiwen2025/wip-add-rv64-high-precision-counter-support
Kefu Chai [Thu, 4 Sep 2025 05:44:30 +0000 (13:44 +0800)]
Merge pull request #65120 from leiwen2025/wip-add-rv64-high-precision-counter-support

common/Cycles: Add high-precision counter support for riscv64

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
11 days agoMerge pull request #65325 from zdover23/wip-doc-2025-09-02-cephfs-troubleshooting...
Zac Dover [Thu, 4 Sep 2025 02:31:55 +0000 (12:31 +1000)]
Merge pull request #65325 from zdover23/wip-doc-2025-09-02-cephfs-troubleshooting-consistent-cache

doc/cephfs: edit troubleshooting.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
12 days agoMerge pull request #65215 from athanatos/sjust/wip-store-bench-3
Samuel Just [Thu, 4 Sep 2025 01:28:09 +0000 (18:28 -0700)]
Merge pull request #65215 from athanatos/sjust/wip-store-bench-3

crimson: further improvements to store-bench

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>
12 days agoMerge pull request #65016 from jamiepryde/isal-auto-switch-to-cauchy
SrinivasaBharathKanta [Wed, 3 Sep 2025 23:23:51 +0000 (04:53 +0530)]
Merge pull request #65016 from jamiepryde/isal-auto-switch-to-cauchy

erasure-code: Use ISA-L cauchy if reed_sol_van does not support the specified K and M values

12 days agoMerge pull request #64845 from ljflores/wip-tracker-70468
SrinivasaBharathKanta [Wed, 3 Sep 2025 23:22:31 +0000 (04:52 +0530)]
Merge pull request #64845 from ljflores/wip-tracker-70468

qa/suites/orch/cephadm: ignore intentional OSD_DOWN warning from test_host_drain task

12 days agoMerge pull request #65371 from cbodley/wip-72361
David Galloway [Wed, 3 Sep 2025 21:57:03 +0000 (17:57 -0400)]
Merge pull request #65371 from cbodley/wip-72361

cmake: remove _FORTIFY_SOURCE define

12 days agobuild-with-container: ensure npm dir is set up before configure 65376/head
John Mulligan [Thu, 28 Aug 2025 23:39:06 +0000 (19:39 -0400)]
build-with-container: ensure npm dir is set up before configure

When the npm cache path option is passed, the npm cache dir is passed
to all container `run` commands. Ensure the dir has been created
before the first container command (configure) is used.
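
The ordering described above can be sketched roughly as follows. This is
an illustration only: the variable `npm_cache_dir`, the `NPM_CACHE_DIR`
environment variable, and the default path are hypothetical and are not
taken from the build-with-container script.

```shell
# Hypothetical cache location; the real script takes it from a
# command-line option whose name is not given in this message.
npm_cache_dir="${NPM_CACHE_DIR:-${TMPDIR:-/tmp}/bwc-npm-cache}"

# Create the directory up front. mkdir -p succeeds even if the directory
# already exists, so it is safe to run before every build step.
mkdir -p "$npm_cache_dir"

# Only after this point would a container `run` command (starting with
# configure) mount "$npm_cache_dir" into the container.
echo "npm cache dir ready: $npm_cache_dir"
```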

Signed-off-by: John Mulligan <jmulligan@redhat.com>
12 days agoMerge pull request #64863 from leonidc/cleanup_pending_map
Yuri Weinstein [Wed, 3 Sep 2025 18:36:58 +0000 (11:36 -0700)]
Merge pull request #64863 from leonidc/cleanup_pending_map

nvmeofgw: cleanup pending map upon monitor restart

Reviewed-by: Samuel Just <sjust@redhat.com>
12 days agoMerge pull request #64764 from badone/wip-tracker-72337-mgr-drop-daemonstateindex...
Yuri Weinstein [Wed, 3 Sep 2025 18:34:32 +0000 (11:34 -0700)]
Merge pull request #64764 from badone/wip-tracker-72337-mgr-drop-daemonstateindex-lock

mgr/DaemonState: Minimise time we hold the DaemonStateIndex lock

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>