Redouane Kachach [Fri, 28 Nov 2025 08:38:45 +0000 (09:38 +0100)]
mgr/cephadm: Fix mgmt-gateway default port in get_port_start()
The mgmt-gateway port was already defaulted to 443 in most places, but
get_port_start() did not apply this default. Since the output of
get_port_start() is used both to configure the daemon ports and, later,
to open them in firewalld, this inconsistency meant the HTTPS port was
not opened when the firewalld service was active.
This change makes get_port_start() also default to port 443, ensuring
the daemon is configured correctly and the corresponding firewalld port
is opened as expected.
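A minimal sketch of the defaulting behavior described above. This is not the cephadm implementation: the helper name `port_start_for` and its signature are hypothetical, and only the default of 443 comes from the commit message.

```python
from typing import List, Optional

# Default HTTPS port named in the commit message.
MGMT_GATEWAY_DEFAULT_PORT = 443


def port_start_for(configured_port: Optional[int]) -> List[int]:
    """Return the ports the daemon should bind (hypothetical helper).

    Applying the default here keeps the daemon configuration and the
    firewalld rules in agreement, since both are derived from this list.
    """
    if configured_port is None:
        return [MGMT_GATEWAY_DEFAULT_PORT]
    return [configured_port]
```

With no port set in the spec, the sketch returns the default, so the same value is used for the daemon and for the firewall rule.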
Kefu Chai [Mon, 9 Feb 2026 02:09:14 +0000 (10:09 +0800)]
doc: update mgr module command documentation for per-module registries
Update documentation to reflect the new per-module command registry
pattern introduced in PR #66467. The old global CLICommand decorators
have been replaced with module-specific registries.
Changes:
- doc/mgr/modules.rst: Rewrite CLICommand section with setup guide,
update all examples to use AntigravityCLICommand pattern
- src/pybind/mgr/object_format.py: Add note explaining per-module
registries and update all decorator examples
- doc/dev/developer_guide/dash-devel.rst: Update dashboard plugin
examples to use DBCLICommand
All examples now correctly show:
- Creating registry with CLICommandBase.make_registry_subtype()
- Using module-specific decorator names (e.g., @StatusCLICommand.Read)
- Setting CLICommand class attribute for framework registration
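The idea of a per-module registry can be illustrated with a toy stand-in. The sketch below does not use or reproduce Ceph's CLICommandBase: `ToyCommandBase`, `StatusCommand` and `OtherCommand` are hypothetical names, and the `.Read` style decorators mentioned above are not modeled. It only shows how a factory method can give each module its own command table instead of one global one.

```python
from typing import Callable, Dict


class ToyCommandBase:
    """Toy decorator base class (an assumption, not Ceph's CLICommandBase)."""

    registry: Dict[str, Callable]

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def __call__(self, func: Callable) -> Callable:
        # Register the handler in the registry owned by this subclass only.
        type(self).registry[self.prefix] = func
        return func

    @classmethod
    def make_registry_subtype(cls, name: str) -> type:
        # Each call creates a new subclass with its own, empty registry.
        return type(name, (cls,), {"registry": {}})


StatusCommand = ToyCommandBase.make_registry_subtype("StatusCommand")
OtherCommand = ToyCommandBase.make_registry_subtype("OtherCommand")


@StatusCommand("status show")
def show_status() -> str:
    return "ok"
```

Commands decorated with `StatusCommand` land only in `StatusCommand.registry`; `OtherCommand.registry` stays empty, which is the isolation a per-module registry is meant to provide.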
rgw/dedup: split-head mechanism
Split head object into 2 objects - one with attributes and no data and
a new tail-object with only data.
The new tail object will be deduped (unlike head objects, which can't
be deduped).
We will split the head for objects with a size of 16MB or less.
A few extra improvements:
Skip objects created by server-side-copy
Use reftag for comp-swap instead of manifest
Skip shared-manifest objects after reading attributes
Made max_obj_size_for_split and min_obj_size_for_dedup config values in
rgw.yaml.in
refined test: validate size after dedup
TBD: add rados ls -l to report object sizes in bulk to speed up the process
improved tests - verify refcount are working, validate objects, remove
duplicates and then verify the last remaining object making sure it was
not deleted
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
Kotresh HR [Fri, 6 Mar 2026 07:28:38 +0000 (12:58 +0530)]
tools/cephfs_mirror: Remove additional wait in pop_dataq_entry
An additional wait sneaked in while popping a job from
syncm's data_q. When the conditional wait was converted to a
timed wait as part of f6a6e781b887b01a640d6321a2c085577d9ba07e,
this should have been removed. The extra wait causes no
harm in most workflows but might cause issues when
the mirror daemon is stopped. So it should be removed.
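A sketch of the single timed-wait pattern, written in Python rather than the daemon's C++. The class and method names are illustrative and not taken from cephfs-mirror, and the remark about shutdown is an assumption about the failure mode.

```python
import threading
from collections import deque
from typing import Deque, Optional


class DataQueue:
    """Toy job queue guarded by a condition variable (illustrative only)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._jobs: Deque[str] = deque()
        self._stopping = False

    def push(self, job: str) -> None:
        with self._cond:
            self._jobs.append(job)
            self._cond.notify()

    def stop(self) -> None:
        with self._cond:
            self._stopping = True
            self._cond.notify_all()

    def pop(self, timeout: float = 1.0) -> Optional[str]:
        with self._cond:
            # One timed wait: returns on a new job, on stop(), or on timeout.
            # A second wait after this one (assumed failure mode) could
            # delay or block shutdown.
            self._cond.wait_for(lambda: self._jobs or self._stopping, timeout)
            if self._stopping or not self._jobs:
                return None
            return self._jobs.popleft()
```

Because `pop()` waits exactly once and rechecks the stop flag afterwards, a caller looping on it notices a stop request within one timeout period.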
Ville Ojamo [Thu, 5 Mar 2026 06:02:55 +0000 (13:02 +0700)]
doc: Fix link and improve Crimson doc
Fix Seastar external link that was not working.
Capitalize consistently as Crimson, SeaStore in text.
Fix typos including in a label and in a ref using it.
Wrap text at column 80.
Remove unused highlight directive.
Fix article and hyphenation.
Reduce the number of commas in text and improve language.
Use already existing label and ref instead of section title for link.
Use confval role for configuration keys in text.
Use an autoclass reference instead of hardcoding URL.
Trim spaces at end of lines and convert tabs to spaces.
Use a colon instead of a hyphen pretending to be an em dash.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
ShreeJejurikar [Thu, 26 Feb 2026 07:57:55 +0000 (13:27 +0530)]
rgw: add bucket logging pytest suite
Add a pytest-based test suite for RGW bucket logging that exercises the
radosgw-admin bucket logging CLI commands (list, info, flush) and
verifies the associated S3-level cleanup behavior.
John Mulligan [Thu, 5 Mar 2026 13:30:37 +0000 (08:30 -0500)]
Merge pull request #67571 from phlogistonjohn/jjm-smb-remotectl-local
smb: add remote-control local mode feature
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
Ville Ojamo [Thu, 5 Mar 2026 09:02:42 +0000 (16:02 +0700)]
doc: Improve start/quick-rbd.rst
Remove mention of FAQ with a broken link.
Use ref for intra-docs links and add labels in destination documents.
Promptify all CLI example commands.
Use standard angle brackets for mandatory arguments in commands.
Remove an unused external link definition.
Trim spaces at end of lines and convert tabs to spaces.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
John Mulligan [Mon, 23 Feb 2026 17:23:06 +0000 (12:23 -0500)]
cephadm: add support for a remote control local socket
It's not an oxymoron, it's Remote Control Local Socket (tm)!
This allows processes on the ceph host to use a unix domain socket
without mTLS to communicate with the remote control sidecar server
in the samba service.
At the higher level we treat the 2nd listener as a "feature" even
though it really configures the same sidecar as "remote-control".
This way it's easy to have one of "remote-control",
"remote-control-local" or both in the service spec configuring the
smb service.
NOTE: This service does have the ability to verify that the client has
admin-ish access to ceph services by needing the client to pass
the ceph user name and key over the grpc headers.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 4 Mar 2026 17:44:32 +0000 (12:44 -0500)]
Merge pull request #67534 from phlogistonjohn/jjm-smb-debug-opts
smb: add debug level options to smb cluster resource
Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Adam King <adking@redhat.com>
Ramana Raja [Mon, 29 Dec 2025 22:17:28 +0000 (17:17 -0500)]
mgr/rbd_support: Stagger mirror snapshot and trash purge schedules
Previously, multiple images or namespaces scheduled with the same
interval ran mirror snapshots or trash purges at around the same time,
creating spikes in cluster activity.
This change staggers scheduled jobs by:
- Adding a deterministic phase offset per image or namespace when no
start-time is set.
- Picking a random element from the queue at each scheduled time, rather
than always the first.
Together, these changes spread snapshot and trash purge operations more
evenly over time and improve cluster stability.
Fixes: https://tracker.ceph.com/issues/74288
Signed-off-by: Ramana Raja <rraja@redhat.com>
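The two staggering techniques above can be sketched as follows. This is an illustration under assumptions, not the rbd_support code: the function names, the use of SHA-256 to derive the offset, and the list-based queue are all hypothetical choices.

```python
import hashlib
import random
from typing import List


def phase_offset(entity_id: str, interval_seconds: int) -> int:
    """Deterministic offset in [0, interval_seconds) derived from the ID.

    Hashing the image or namespace ID gives the same offset on every run,
    while different IDs tend to spread across the interval.
    """
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_seconds


def pick_due(queue: List[str], rng: random.Random) -> str:
    """Remove and return a random due entry instead of always the first."""
    return queue.pop(rng.randrange(len(queue)))
```

A stable hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would change the offsets on every restart.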
John Mulligan [Mon, 2 Mar 2026 21:09:16 +0000 (16:09 -0500)]
mgr/smb: reimplement part of the _search_resources function
Reimplement part of the _search_resources function to avoid using yet
another static mapping between the SMBResource type and its partner
entry type, which is one more place to forget to update when adding
a new type. Now, the type mapping is based on the matcher class
and the type mapping function provided by the internal.py module.
Fixes: 5712016c2133870da3f704d8457358ad06efc87f
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Mon, 2 Mar 2026 21:07:54 +0000 (16:07 -0500)]
mgr/smb: rename func to map_resource_entry to make it public
Rename _map_resource_entry to map_resource_entry to make it a public
function and enable easier dynamic mapping between smb resource types
and their partner entry types.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Kotresh HR [Wed, 18 Feb 2026 10:48:51 +0000 (16:18 +0530)]
mgr/mirroring: json pretty formatting
The output of the 'daemon status' and 'peer_list' commands
does not support the json-pretty format and is not reader
friendly. This patch adds 'json-pretty' output
when format='json-pretty' is passed.
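A minimal sketch of format-dependent serialization using the standard json module. The `render` helper, its parameter name and the indent width are assumptions for illustration, not the mirroring module's code.

```python
import json
from typing import Any


def render(data: Any, output_format: str = "json") -> str:
    """Serialize command output; 'json-pretty' adds indentation."""
    if output_format == "json-pretty":
        # indent and sort_keys make the output stable and easy to read.
        return json.dumps(data, indent=4, sort_keys=True)
    return json.dumps(data)
```

Both branches emit the same JSON document; only whitespace and key order differ, so machine consumers are unaffected by the pretty variant.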
sajibreadd [Thu, 9 Oct 2025 11:48:35 +0000 (13:48 +0200)]
mds: scrub pins more inodes than the mds_cache_memory_limit
When scrubbing a dirfrag, we push children to the back of the scrub stack. Instead, we can follow the same
strategy as for scrubbing a directory and push children to the front of the scrub stack, and in kick_off_scrubs always
start scrubbing from the front of the stack. This prevents ScrubStack from pinning a whole level of the file-system
tree.
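The effect of pushing children to the front rather than the back can be seen in a toy model. The sketch below is not MDS code: it walks a full tree with a fixed branching factor, treats each queued entry as one pinned inode (an assumption), and reports the largest queue size reached.

```python
from collections import deque


def max_stack_size(branching: int, depth: int, push_front: bool) -> int:
    """Walk a full tree and return the largest number of queued entries."""
    stack = deque([0])  # each entry is the depth of a pending node
    peak = 1
    while stack:
        level = stack.popleft()  # always start from the front of the stack
        if level < depth:
            children = [level + 1] * branching
            if push_front:
                stack.extendleft(children)  # depth-first order
            else:
                stack.extend(children)  # breadth-first order
        peak = max(peak, len(stack))
    return peak
```

For a full tree with branching factor 10 and depth 3, this model peaks at 1000 entries when children go to the back (a whole level is queued) and at 28 when they go to the front.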
Ilya Dryomov [Sun, 1 Mar 2026 21:55:52 +0000 (22:55 +0100)]
qa/workunits/rbd: short-circuit status() if "ceph -s" fails
In mirror-thrash tests, status() can be invoked after one of the
clusters is effectively stopped due to a watchdog bark:
2026-03-01T22:27:38.633 INFO:tasks.daemonwatchdog.daemon_watchdog:thrasher.rbd_mirror.[cluster2] failed
2026-03-01T22:27:38.633 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons
...
2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ status
2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ local cluster daemon image_pool image_ns image
2026-03-01T22:32:46.964 INFO:tasks.workunit.cluster1.client.mirror.trial199.stderr:+ for cluster in ${CLUSTER1} ${CLUSTER2}
In this scenario all commands that are invoked from the loop body
are going to time out anyway.
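The short-circuit idea, sketched in Python instead of the workunit's shell. The function names, the 30-second timeout, and the choice to stop at the first unreachable cluster are assumptions; only the `ceph -s` probe comes from the commit title.

```python
import subprocess
from typing import Callable, Iterable, List


def cluster_reachable(cluster: str, timeout: float = 30.0) -> bool:
    """Return True if `ceph -s` succeeds for the cluster within the timeout."""
    try:
        result = subprocess.run(
            ["ceph", "--cluster", cluster, "-s"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def status(clusters: Iterable[str],
           probe: Callable[[str], bool] = cluster_reachable) -> List[str]:
    """Return the clusters whose detailed status would be collected."""
    reported: List[str] = []
    for cluster in clusters:
        if not probe(cluster):
            break  # skip the remaining commands, which would only time out
        reported.append(cluster)
    return reported
```

Passing the probe as a parameter lets the loop be exercised without a running cluster, for example `status(["cluster1", "cluster2"], probe=lambda c: c == "cluster1")`.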
Ilya Dryomov [Sun, 1 Mar 2026 16:45:51 +0000 (17:45 +0100)]
qa: rbd_mirror_fsx_compare.sh doesn't error out as expected
In mirror-thrash tests, one of the clusters can be effectively stopped
due to a watchdog bark while rbd_mirror_fsx_compare.sh is running and is
in the middle of the "wait for all images" loop.
In this scenario "rbd ls" is going to time out repeatedly, turning the
loop into up to a ~60-hour sleep (up to 720 iterations with a 5-minute
timeout + 10-second sleep per iteration).
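The worst-case figure can be checked with a few lines of arithmetic, using only the numbers quoted above:

```python
ITERATIONS = 720
TIMEOUT_SECONDS = 5 * 60  # 5-minute "rbd ls" timeout
SLEEP_SECONDS = 10        # sleep between iterations

worst_case_seconds = ITERATIONS * (TIMEOUT_SECONDS + SLEEP_SECONDS)
worst_case_hours = worst_case_seconds / 3600
print(worst_case_hours)  # 62.0
```

720 iterations of 310 seconds each come to 223200 seconds, or 62 hours, consistent with the "~60-hour" estimate.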
Victoria Mackie [Fri, 13 Feb 2026 21:40:01 +0000 (21:40 +0000)]
dashboard: add location field to NVMeoF namespace and gateway group APIs
Namespace location:
- Add location field to Namespace model in nvmeof.py
- Add location parameter to PATCH /api/nvmeof/subsystem/{nqn}/namespace/{nsid}
- Location can now be retrieved via GET and set via PATCH
Gateway group locations:
- Add locations array to gateway group endpoint response
- Extract locations from all gateways in a service group
- Add _get_gateway_locations() helper method using nvme-gw show command
- Locations appear in placement.locations for each service
Signed-off-by: Victoria Mackie <victoriam@uk.ibm.com>
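A sketch of how a client might build the PATCH request for a namespace location. Only the endpoint path comes from the commit message. The base URL, the JSON body shape with a `location` key, and the helper name are assumptions, and authentication and API version headers are omitted.

```python
import json
import urllib.parse
import urllib.request


def build_location_patch(base_url: str, nqn: str, nsid: int,
                         location: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets a location."""
    path = "/api/nvmeof/subsystem/{}/namespace/{}".format(
        urllib.parse.quote(nqn, safe=""), nsid)
    # Assumed payload: the location is sent as a JSON field.
    body = json.dumps({"location": location}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

The NQN is percent-encoded because it typically contains a colon; the request object can then be passed to `urllib.request.urlopen` once credentials are added.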
```
283/322 Test #301: run-tox-qa ................................***Failed 92.31 sec
...
flake8: install_deps /ceph/qa> python -I -m pip install flake8
flake8: commands[0] /ceph/qa> flake8 --select=F,E9 --exclude=venv,.tox
./tasks/keycloak.py:51:5: F841 local variable 'os_version' is assigned to but never used
```
Remove the unused os_version assignment to fix flake8 F841 in run-tox-qa.
Ville Ojamo [Mon, 2 Mar 2026 08:26:47 +0000 (15:26 +0700)]
doc/start: Update and fix get-involved.rst
Remove the nonexistent Planet Ceph, Wiki, and Commit List rows.
Update Kernel Client, QA, Community mailing list links to working ones.
Use https instead of http.
Fix Ceph calendar link and split the old contribute guide link to a
separate table row.
Remove the non-working lists.ceph.com external link definition now that it
is unused.
Sort external link definitions in order of use.
Fix invalid space after a hyphen by rewrapping text.
Update Slack invite link.
Signed-off-by: Ville Ojamo <git2233+ceph@ojamo.eu>
Shraddha Agrawal [Wed, 25 Feb 2026 11:31:26 +0000 (17:01 +0530)]
doc: reformat crimson docs
This commit rearranges the crimson docs so that the deployment steps appear in
the order in which they are meant to be executed. It also removes `crimson-osd`
references, as that is an internal detail that users don't need to be
aware of.