git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log


commit | commitdiff | tree

Kefu Chai [Mon, 11 May 2026 05:07:25 +0000 (13:07 +0800)]

crimson: use uint32_t when calling ioctl(BLKGETNRZONES)

before this change, we pass a pointer to a `size_t` to
ioctl(BLKGETNRZONES), but in the Linux kernel,
include/uapi/linux/blkzoned.h:

```c
#define BLKGETNRZONES _IOR(0x12, 133, __u32)
```
this API reads 32 bits of data into the pointer. on 64-bit
architectures, size_t is 64 bits. fortunately, we initialize
nr_zones with 0, so the upper 32 bits remain zero. this works
on little-endian systems, but not on big-endian systems. it is
also semantically wrong. we should pass a pointer to a 32-bit
value when calling ioctl(BLKGETNRZONES).

in this change, we change the type of nr_zones from size_t to
uint32_t to match what the Linux kernel expects.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
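The width mismatch described in this commit can be reproduced without a zoned block device. The following is a minimal C++ sketch, not code from the commit: `fake_get_nr_zones` is a hypothetical stand-in for the kernel side of the ioctl that copies exactly 32 bits to the destination, and `std::uint64_t` stands in for a 64-bit `size_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the kernel side of ioctl(BLKGETNRZONES): it
// stores exactly 32 bits at the address it is given. No real ioctl is issued.
inline void fake_get_nr_zones(void* arg, std::uint32_t zones) {
  std::memcpy(arg, &zones, sizeof(zones));
}

// Returns true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
  const std::uint16_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Old pattern: a zero-initialized 64-bit variable receives a 32-bit write.
inline std::uint64_t read_into_64bit(std::uint32_t zones) {
  std::uint64_t nr_zones = 0;
  fake_get_nr_zones(&nr_zones, zones);
  return nr_zones;  // zones on little-endian, zones << 32 on big-endian
}

// New pattern: the variable has the width that is actually written.
inline std::uint32_t read_into_32bit(std::uint32_t zones) {
  std::uint32_t nr_zones = 0;
  fake_get_nr_zones(&nr_zones, zones);
  return nr_zones;  // zones on every architecture
}
```

On a little-endian host `read_into_64bit(7)` yields 7, while on a big-endian host the same call yields `7 << 32`. `read_into_32bit(7)` yields 7 on both, which is why matching the kernel's `__u32` is the safer choice.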

commit | commitdiff | tree

Kefu Chai [Mon, 11 May 2026 04:43:47 +0000 (12:43 +0800)]

crimson: coroutinize SegmentManager::get_segment_manager()

this change was inspired by the following warning:

```
[1/3] Building CXX object src/crimson/os/seastore/CMakeFiles/crimson-seastore.dir/segment_manager.cc.o
/home/kefu/dev/ceph/src/crimson/os/seastore/segment_manager.cc:45:15: warning: lambda capture 'FNAME' is not used [-Wunused-lambda-capture]
45 | ).then([FNAME,
| ^
```

but we went further by coroutinizing the whole method. because the return
value of ioctl() was not checked before this change, and clang correctly
flagged this with a warning, we mark it with `[[maybe_unused]]` for now.
we will fix it in a separate change.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

SrinivasaBharathKanta [Mon, 11 May 2026 23:16:29 +0000 (04:46 +0530)]

Merge pull request #68131 from rzarzynski/wip-ec-asserted-isa-prepare

ec: validate tcache retrievals in ErasureCodeIsaDefault::prepare()

commit | commitdiff | tree

SrinivasaBharathKanta [Mon, 11 May 2026 23:11:23 +0000 (04:41 +0530)]

Merge pull request #68609 from aainscow/attr_rollback_fix

osd: Fix incorrect rollback logic for partial write OI

commit | commitdiff | tree

SrinivasaBharathKanta [Mon, 11 May 2026 23:09:57 +0000 (04:39 +0530)]

Merge pull request #67292 from JonBailey1993/stats_fix_part_2

osd: Reduce pg_stats invalidations occurring in fast ec

commit | commitdiff | tree

Redouane Kachach [Mon, 11 May 2026 19:29:37 +0000 (21:29 +0200)]

Merge pull request #66559 from timqn22/crash-dir-permission-setting

src/cephadm: updated crash dir creation

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@ibm.com>

commit | commitdiff | tree

Redouane Kachach [Mon, 11 May 2026 19:27:53 +0000 (21:27 +0200)]

Merge pull request #68848 from rkachach/fix_issue_76511

qa/cephadm: start upgrade tests from tentacle instead of reef on main

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>

commit | commitdiff | tree

Laura Flores [Mon, 11 May 2026 18:41:49 +0000 (13:41 -0500)]

Merge pull request #68365 from kamoltat/wip-ksirivad-fix-75418

qa/suites/upgrade: ignore PG_DAMAGED

Reviewed-by: Laura Flores <lflores@ibm.com>

commit | commitdiff | tree

Laura Flores [Mon, 11 May 2026 18:35:47 +0000 (13:35 -0500)]

Merge pull request #67915 from falconlee236/fix-osd-df-sorting-main

mon/PGMap: sort 'osd df' and 'osd perf' outputs by OSD ID

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Laura Flores [Mon, 11 May 2026 18:33:33 +0000 (13:33 -0500)]

Merge pull request #68326 from ljflores/wip-tracker-75763

qa/suites/rados/encoder: remove rocky from supported distros

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Laura Flores [Mon, 11 May 2026 18:15:14 +0000 (13:15 -0500)]

Merge pull request #67715 from NitzanMordhai/wip-nitzan-is_pg_clean-hang-after-teardown

test/ceph-helpers: add timeout to ceph pg query

Reviewed-by: Radosław Zarzyński <rzarzyns@redhat.com>

commit | commitdiff | tree

Sage McTaggart [Mon, 11 May 2026 18:10:57 +0000 (14:10 -0400)]

Merge pull request #68847 from ceph/wip-doc-SageMcTSecurityCSC

docs/security: added workinggroup.rst and securitylead.rst

commit | commitdiff | tree

John Mulligan [Mon, 11 May 2026 17:10:50 +0000 (13:10 -0400)]

Merge pull request #68401 from phlogistonjohn/jjm-pypkg

build: Update python packaging for src/python-common

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Redouane Kachach [Mon, 11 May 2026 16:32:31 +0000 (18:32 +0200)]

Merge pull request #68747 from Kushal-deb/fix-nvmeof-apply-path

mgr/cephadm: allow nvmeof group assignment for NVMe-oF services

Reviewed-by: Shweta Bhosale <Shweta.Bhosale1@ibm.com>

commit | commitdiff | tree

Sage McTaggart [Mon, 11 May 2026 14:58:57 +0000 (10:58 -0400)]

docs/security: added workinggroup.rst and securitylead.rst

Signed-off-by: Sage McTaggart <sagemct@ibm.com>

commit | commitdiff | tree

Redouane Kachach [Mon, 11 May 2026 16:27:57 +0000 (18:27 +0200)]

Merge pull request #67344 from dermalikmann/fix-mgmt-gateway-use-vip

mgr/cephadm: mgmt-gateway bind to virtual_ip

Reviewed-by: Redouane Kachach <rkachach@ibm.com>

commit | commitdiff | tree

Pedro Gonzalez Gomez [Thu, 7 May 2026 19:36:32 +0000 (21:36 +0200)]

mgr: add prerequisites check before enabling dashboard oauth2 sso

Assisted-by: Claude:claude-4.6-sonnet
Fixes: https://tracker.ceph.com/issues/76476
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>

commit | commitdiff | tree

Kamoltat (Junior) Sirivadhna [Wed, 22 Apr 2026 23:28:18 +0000 (23:28 +0000)]

src/test/mon: test_monmap_monitor.cc

Added the following test cases:
- Test success when explicitly supplied tiebreaker
- Test success when auto-selecting tiebreaker monitor
- Test success with minimal valid configuration (1 monitor per zone)
- Test success with auto-selection and minimal config (1 monitor per zone)
- Test success when strategy is automatically changed to CONNECTIVITY
- Test failure when auto-selecting and tiebreaker is in a data zone
- Test failure when explicitly specifying tiebreaker in a data zone
- Test failure when multiple potential tiebreakers exist
- Test failure when one data zone has 0 monitors
- Test failure when tiebreaker monitor doesn't exist

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Kamoltat (Junior) Sirivadhna [Wed, 22 Apr 2026 17:55:13 +0000 (17:55 +0000)]

doc: update stretch-mode.rst

1. enable_stretch_mode no longer requires a tiebreaker mon to be supplied
2. enable_stretch_mode will automatically set the monitor election strategy
to Connectivity if not already set.
3. Move away from "sites" and use "zones" instead throughout the doc

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Kamoltat (Junior) Sirivadhna [Wed, 15 Apr 2026 22:34:08 +0000 (22:34 +0000)]

mon: make tiebreaker mon optional in stretch-mode

Motivation:
To support the future EC stretch feature, we need to
simplify how we enable stretch-mode

Solution:
Make tiebreaker argument optional.

old:
```
ceph mon enable_stretch_mode <tiebreaker_mon> <new_crush_rule>
<dividing_bucket>
```
new:
```
ceph mon enable_stretch_mode <tiebreaker_mon (optional)> <new_crush_rule>
<dividing_bucket>
```

Ceph will try to select a tiebreaker mon that resides in
the crush <dividing_bucket> type but doesn't belong
to any of the data sites in which the OSDs reside.

Also created a helper function
`MonmapMonitor::validate_and_enable_stretch_mode`
inside `MonmapMonitor::try_enable_stretch_mode`,
making the logic unit-testable.

Moreover, ceph mon enable_stretch_mode will
automatically set monitor election strategy to Connectivity.

We now also enforce that at least 1 monitor exists for each data zone.

Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
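The selection and validation rules in this commit can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code in `MonmapMonitor`: `pick_tiebreaker`, `MonZones`, and the plain zone-name strings are hypothetical, and rejecting ambiguous, missing, or data-zone tiebreakers is taken from the test cases listed in the related `test_monmap_monitor.cc` commit.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical, simplified inputs: monitor name -> zone name, plus the set
// of zones that hold OSDs ("data zones"). The real monitor code works on the
// monmap and the CRUSH map instead.
using MonZones = std::map<std::string, std::string>;

// Returns the chosen tiebreaker, or std::nullopt if validation fails.
inline std::optional<std::string> pick_tiebreaker(
    const MonZones& mons,
    const std::set<std::string>& data_zones,
    const std::optional<std::string>& requested = std::nullopt) {
  // Every data zone needs at least one monitor.
  for (const auto& zone : data_zones) {
    bool found = false;
    for (const auto& entry : mons) {
      if (entry.second == zone) { found = true; break; }
    }
    if (!found) return std::nullopt;
  }
  if (requested) {
    // An explicit tiebreaker must exist and must sit outside the data zones.
    const auto it = mons.find(*requested);
    if (it == mons.end() || data_zones.count(it->second) > 0) {
      return std::nullopt;
    }
    return requested;
  }
  // Auto-selection: accept exactly one monitor outside the data zones.
  std::optional<std::string> candidate;
  for (const auto& entry : mons) {
    if (data_zones.count(entry.second) > 0) continue;
    if (candidate) return std::nullopt;  // ambiguous: several candidates
    candidate = entry.first;
  }
  return candidate;
}
```

With monitors `a`, `b`, `c` in zones `dc1`, `dc2`, `dc3` and data zones `dc1` and `dc2`, calling `pick_tiebreaker(mons, data_zones)` selects `c`, while explicitly requesting `a` is rejected because `a` sits in a data zone.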

commit | commitdiff | tree

Adam King [Thu, 19 Feb 2026 16:12:49 +0000 (11:12 -0500)]

qa/cephadm: start upgrade tests from tentacle instead of reef on main

Since main is what will become umbrella at this point

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit a137bccbf4330a88f17588d465218165ec03b009)

commit | commitdiff | tree

Ronen Friedman [Sun, 3 May 2026 07:11:40 +0000 (07:11 +0000)]

Crimson/QA: add crimson-fixed-2-mem32g reduced-memory cluster variants

Add QA thrash cluster variants (crimson-fixed-2-mem32g) that set
crimson_memory to 32G or 16GB.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>

commit | commitdiff | tree

Ronen Friedman [Sun, 3 May 2026 07:10:34 +0000 (07:10 +0000)]

crimson/osd: add crimson_memory config option

Add a crimson_memory configuration option that maps to seastar's
--memory flag, allowing control over the per-OSD seastar memory
allocation. Default is 0, meaning seastar uses its default behavior.

This reduces core dump sizes in test environments (from ~117GB to
~32GB with crimson_memory=32G) by limiting the pre-allocated seastar
memory pool.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>

commit | commitdiff | tree

Patrick Donnelly [Sat, 9 May 2026 02:28:12 +0000 (22:28 -0400)]

script/ptl-tool: continue adding conflicts to review when interactive

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Fri, 8 May 2026 01:55:34 +0000 (21:55 -0400)]

script/ptl-tool: improve wording for rationale requests

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Fri, 8 May 2026 01:35:26 +0000 (21:35 -0400)]

script/ptl-tool: refactor verify_commit_parity

Some simple changes to break it into pieces.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Fri, 8 May 2026 01:29:53 +0000 (21:29 -0400)]

script/ptl-tool: replace gitauth redirection

Just use the class.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Wed, 6 May 2026 20:50:42 +0000 (16:50 -0400)]

doc: document the releng-audit workflow and update release examples

Updates SubmittingPatches-backports.rst and development-workflow.rst to
capture the new automated backport auditing system.

Changes include:
- Documenting the CI checks for commit parity, conflict simulation, and
  Redmine linkage.
- Explaining how PR authors can trigger a retest (`/audit retest`) and
  how leads can bypass the audit (`/audit override`).
- Replacing outdated `master` branch references with `main`.
- Updating the release cycle timeline to reflect current releases
  (Quincy, Reef, Squid, Tentacle).
- Removing obsolete instructions for manual PR labeling.

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Mon, 4 May 2026 22:12:26 +0000 (15:12 -0700)]

script/ptl-tool, actions: introduce event-driven CI backport auditing

This commit entirely restructures the `ptl-tool.py` script and introduces a new GitHub Actions workflow to enable event-driven, asynchronous execution of backport audits.

New GitHub Actions Workflow (`releng-audit.yaml`):
* Implements a state-machine workflow using the `pull_request_target` event for secure execution.
* Introduces an "anti-spam" push shield that halts automated checks and blocks merging if the PR already has a `releng-audit-fail` label.
* Allows developers to easily re-trigger audits by removing the failure label or commenting `/audit retest`.
* Provides an `/audit override` mechanism exclusively for the `ceph-release-manager` team to bypass checks on valid conflict resolutions.

Key Architectural Changes to `ptl-tool.py`:
* Strategy Pattern Refactor: Decoupled the monolithic `verify_pr_readiness` block into modular, extensible classes (`MergeConflictCheck`, `CommitParityCheck`, `ConflictSimulationCheck`, `RedmineLinkageCheck`) conforming to `BaseAuditCheck`.
* AuditContext & Shared State: Replaced the cumbersome 9-argument function signatures with a unified `AuditContext` dataclass.
* Consolidated Error Reporting: Introduced `AuditReport` to collect failures across all checks. In `--ci-mode`, it bundles these failures into a single, consolidated GitHub `REQUEST_CHANGES` review to prevent shadowing and PR comment spam.
* Automated Label Management: Added `--audit-label` parsing to dynamically swap queue/pass/fail labels via the GitHub API during CI runs.

Miscellaneous workflow enhancements:
* Added `--integration` switch for the "Daily Driver" workflow. It auto-detects the target base branch, sets standard release flags, skips conflict simulation, and enforces `--always-fetch`.
* Updated QA Tracker creation/update logic to set Redmine Custom Field IDs directly rather than relying solely on description text.
* Replaced `--release-merge` with `--final-merge` for clarity.
* Introduced `--dry-run` to safely preview GitHub API calls, Redmine updates, and Git operations without altering remote state.
* Added `--examples` flag detailing advanced CLI usage.
* Implemented a local HTML tab-launcher to bypass Firefox race conditions when opening multiple browser tabs via the command line.

Assisted-by: Gemini
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Patrick Donnelly [Thu, 30 Apr 2026 18:36:30 +0000 (11:36 -0700)]

script/ptl-tool: introduce interactive backport parity and conflict verification

This patch significantly expands `ptl-tool.py` to automate and improve the
backport review process. It adds robust interactive checks to verify that all
commits from an original PR are correctly represented in a backport PR.

Key additions:

* Commit Parity Verification: Analyzes the local Git DAG to ensure all
  cherry-picked commits map properly to the original main PRs, generating a
  visual map of the commits.

* Conflict Simulation: Creates temporary, detached worktrees to dry-run the
  cherry-pick sequence, verifying conflict resolutions dynamically.

* Automated GitHub Reviews: Enables maintainers to open an editor, preview
  markdown drafts, and post `REQUEST_CHANGES` reviews detailing missing commits
  or backport deviations directly to the pull request.

* Interactive Diffing: Provides file-specific 3-pane patch comparisons
  (range-diffs, original patch, backport patch) within the terminal editor during
  conflict investigations.

* New CLI Flags: Introduces `--audit`, `--always-fetch`,
  `--skip-conflict-check`, and `--release-merge` for greater control over the
  script's behavior.

Assisted-by: Gemini
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Patrick Donnelly [Thu, 30 Apr 2026 18:35:25 +0000 (11:35 -0700)]

script/ptl-tool: use Authorization header

Replaces basic authentication with the Authorization: Bearer <token>
header, obviating the need for the PTL_TOOL_GITHUB_USER environment
variable and adhering to modern GitHub API authentication standards.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Matan Breizman [Mon, 11 May 2026 11:35:12 +0000 (14:35 +0300)]

qa/suites/crimson-rados/rgw/sts/tasks/1-keycloak: dont install java-17-openjdk-headless

qa/tasks/keycloak.py already installs it per-os. See: https://github.com/ceph/ceph/pull/67578

Fixes: https://tracker.ceph.com/issues/76353
Signed-off-by: Matan Breizman <mbreizma@redhat.com>

commit | commitdiff | tree

Pedro Gonzalez Gomez [Thu, 7 May 2026 19:55:15 +0000 (21:55 +0200)]

mgr/dashboard: fix missing claims on oauth2 sso

Fixes: https://tracker.ceph.com/issues/76479
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>

commit | commitdiff | tree

Pedro Gonzalez Gomez [Thu, 7 May 2026 19:44:30 +0000 (21:44 +0200)]

mgr/dashboard: raise exception on oauth2 sso expired token

Fixes: https://tracker.ceph.com/issues/76478
Signed-off-by: Pedro Gonzalez Gomez <pegonzal@ibm.com>

commit | commitdiff | tree

Neeraj Pratap Singh [Tue, 2 Dec 2025 12:20:52 +0000 (17:50 +0530)]

src/pybind/mgr: handle json-pretty for perf stats

Fixes: https://tracker.ceph.com/issues/74072
Signed-off-by: Neeraj Pratap Singh <Neeraj.Pratap.Singh1@ibm.com>

commit | commitdiff | tree

Matty Williams [Mon, 11 May 2026 10:01:59 +0000 (11:01 +0100)]

Merge pull request #67422 from MattyWilliams22/cls-fifo

cls/test: Stop the cls_fifo.get_info test from Segmentation Faulting

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>
Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>

commit | commitdiff | tree

Matan Breizman [Mon, 11 May 2026 09:57:56 +0000 (12:57 +0300)]

Merge pull request #68386 from tchaikov/wip-crimson-throttling

crimson/osd: acquire throttle when scanning replica/primary for backfill

Reviewed-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>

commit | commitdiff | tree

Kefu Chai [Mon, 11 May 2026 06:02:00 +0000 (14:02 +0800)]

Merge pull request #68834 from tchaikov/rgw-d4n-boost-1.91

rgw/d4n: fix deprecated async_run overload in RedisPool

Reviewed-by: Pritha Srivastava <prsrivas@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Sun, 10 May 2026 18:39:24 +0000 (11:39 -0700)]

Merge pull request #68676 from JoshuaGabriel/msgr-activecon-perfcounter

msg/async: make msgr_active_connections counter a gauge to prevent underflow

https://tracker.ceph.com/issues/76440

commit | commitdiff | tree

Yuri Weinstein [Sun, 10 May 2026 18:39:02 +0000 (11:39 -0700)]

Merge pull request #53891 from Hu-Yuxuan/fix-bug-#63137

osd: Improved pg-upmap computing speed

https://tracker.ceph.com/issues/76440

commit | commitdiff | tree

Ronen Friedman [Thu, 7 May 2026 10:06:28 +0000 (10:06 +0000)]

crimson/osd/snap-mapper: flush pending writes on pg interval change

When a PG interval changes, the snap-mapper's MapCacher backend is
reset to clear any stale state. This change adds a flush-and-reset
method that first flushes any pending writes in the MapCacher into the
objectstore, then resets the backend.

Fixes: https://tracker.ceph.com/issues/76458
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
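The difference between a plain reset and the flush-and-reset described in this commit can be shown with a toy write-back cache. This is only a sketch under assumptions: `WriteBackCache` and `Store` are hypothetical stand-ins for the MapCacher backend and the objectstore, and the real interfaces are asynchronous and considerably richer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins: Store plays the objectstore, WriteBackCache plays
// a MapCacher-like layer that buffers writes in memory.
using Store = std::map<std::string, std::string>;

class WriteBackCache {
 public:
  explicit WriteBackCache(Store& store) : store_(store) {}

  // Buffer a write; nothing reaches the store yet.
  void set(const std::string& key, const std::string& value) {
    pending_[key] = value;
  }

  // Plain reset: clears stale state, but buffered writes are lost too.
  void reset() { pending_.clear(); }

  // Flush-and-reset: persist buffered writes first, then clear.
  void flush_and_reset() {
    for (const auto& entry : pending_) {
      store_[entry.first] = entry.second;
    }
    reset();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  Store& store_;
  std::map<std::string, std::string> pending_;
};
```

In this sketch, calling `reset()` after `set("k", "v")` leaves the store empty, whereas `flush_and_reset()` leaves `"k"` in the store and the cache empty.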

commit | commitdiff | tree

Ronen Friedman [Sun, 10 May 2026 12:40:05 +0000 (15:40 +0300)]

Merge pull request #68489 from ronen-fr/wip-rf-pintest-crimson

crimson/tests: fix test_remap_pin_concurrent

Reviewed-by: Matan Breizman <mbreizma@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sun, 10 May 2026 11:21:51 +0000 (19:21 +0800)]

Merge pull request #68812 from tchaikov/wip-doc-lower-require-min-compat

doc/rados: warn against lowering require_min_compat_client in read-balancer

Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 01:34:12 +0000 (09:34 +0800)]

doc/rados: warn against lowering require_min_compat_client in read-balancer

Add a note that the value should not be lowered after being set, to
avoid accidentally breaking features that depend on a newer release.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Matan Breizman [Sun, 10 May 2026 08:03:49 +0000 (11:03 +0300)]

Merge pull request #68750 from tchaikov/wip-crimson-merge-coll

crimson/os: implement OP_MERGE_COLLECTION

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Xuehan Xu <xuxuehan@qianxin.com>

commit | commitdiff | tree

Nitzan Mordechai [Sun, 10 May 2026 06:51:08 +0000 (06:51 +0000)]

qa: ignore evicted client warnings for singleton bluestore

After adding an mds client to singleton bluestore, with increased debug
levels and bluestore_allocator=stupid, the client is slow enough that the
MDS evicts it after ~302 seconds.

Fixes: https://tracker.ceph.com/issues/76497
Signed-off-by: Nitzan Mordechai <nmordech@ibm.com>

commit | commitdiff | tree

dheart [Sun, 10 May 2026 01:53:35 +0000 (09:53 +0800)]

os/bluestore: check unshare blobs during snapshot deletion and fix unshare logic.

Signed-off-by: dheart <dheart_joe@163.com>

commit | commitdiff | tree

dheart [Sun, 10 May 2026 01:25:25 +0000 (09:25 +0800)]

os/bluestore: add fsck_with_stats to run fsck and return stats

Signed-off-by: dheart <dheart_joe@163.com>

commit | commitdiff | tree

Anthony D'Atri [Sat, 9 May 2026 19:02:24 +0000 (12:02 -0700)]

Merge pull request #67671 from bluikko/wip-doc-start-bg-improve

doc/start: Improve beginners-guide.rst

commit | commitdiff | tree

Adam C. Emerson [Fri, 1 May 2026 20:19:13 +0000 (16:19 -0400)]

rgw: Work around Boost.Containers bug in 1.91

Since we're inserting from another `flat_set`, we can use
`ordered_unique_range` and sidestep the issue entirely, and it's more
efficient.

https://github.com/boostorg/container/issues/334

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 9 May 2026 08:20:48 +0000 (16:20 +0800)]

Merge pull request #68833 from tchaikov/wip-rgw-lua-5.5

rgw: fix build with Lua 5.5

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 9 May 2026 06:39:17 +0000 (14:39 +0800)]

rgw/d4n: fix deprecated async_run overload in RedisPool

The async_run overload taking a logger argument is deprecated since
Boost 1.89. Use the 2-arg async_run(config, token) overload when
building with Boost >= 1.89, and fall back to the 3-arg overload
for Boost 1.87-1.88.

See https://www.boost.org/doc/libs/1_89_0/libs/redis/doc/html/redis/reference/boost/redis/basic_connection/async_run-04.html

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Kefu Chai [Sat, 9 May 2026 05:45:32 +0000 (13:45 +0800)]

Merge pull request #68628 from MaxKellermann/msg__includes2

msg: include cleanup

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Kefu Chai [Sat, 9 May 2026 03:50:10 +0000 (11:50 +0800)]

rgw: fix build with Lua 5.5

Lua 5.5 added a `seed` parameter to `lua_newstate()`. Use a
preprocessor conditional to pass the extra argument when building
against Lua >= 5.5.

Since we require Lua >= 5.3 in src/CMakeLists.txt, we need to be
compatible with Lua 5.5 as well.

See https://www.lua.org/manual/5.5/manual.html#lua_newstate

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Adam Emerson [Fri, 8 May 2026 15:46:48 +0000 (11:46 -0400)]

Merge pull request #68702 from nhoad/compiler-warning

rgw: Move declaration inline to solve compiler warning about an unused variable

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>

commit | commitdiff | tree

John Mulligan [Fri, 8 May 2026 14:02:04 +0000 (10:02 -0400)]

Merge pull request #68522 from phlogistonjohn/jjm-smb-rm-all

smb: wildcard deletion of smb shares and clusters

Reviewed-by: Xavi Hernandez <xhernandez@gmail.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>

commit | commitdiff | tree

Jamie Pryde [Fri, 8 May 2026 13:52:54 +0000 (14:52 +0100)]

Merge pull request #68711 from jamiepryde/isal-arm-build-typo

cmake: Fix ISA-L build on arm

commit | commitdiff | tree

Adam Kupczyk [Thu, 7 May 2026 11:52:50 +0000 (11:52 +0000)]

os/bluestore: No more stray spanning blobs creation

Prevent ExtentMap::reshard_action() from creating stray spanning blobs.

Fixes: https://tracker.ceph.com/issues/75698
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

commit | commitdiff | tree

Adam Kupczyk [Mon, 4 May 2026 15:45:12 +0000 (15:45 +0000)]

test/bluestore: Test for creation of empty spanning blob

Test that produces empty blob after Blob::split.
Results in a stray spanning blob.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

commit | commitdiff | tree

Adam Kupczyk [Thu, 7 May 2026 10:38:07 +0000 (10:38 +0000)]

os/bluestore: Cleanup around bluestore_blob_t::get_ondisk_length()

Split get_ondisk_length() into
- get_ondisk_capacity()
- get_ondisk_length()
The change is done to emphasize the difference between disk space used
and potential capacity for disk space use.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 13:38:28 +0000 (21:38 +0800)]

Merge pull request #68422 from tchaikov/wip-speedy-backport

script/ceph-backport: skip fetch if merge commit already exists locally

Reviewed-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Kefu Chai [Thu, 7 May 2026 02:59:11 +0000 (10:59 +0800)]

cmake,debian: enable ceph-mon-client-nvmeof on Debian derivatives

5843c6b04ba gated the build on /etc/redhat-release because gRPC devel
libs weren't packaged on Debian yet. They are now (libgrpc++-dev,
protobuf-compiler-grpc), so we can finally bring Debian derivatives on
par with Fedora/RHEL and ship the NVMe-oF gateway monitor client there
too.

This change:

- drops the /etc/redhat-release sniff and unconditionally enables
  WITH_NVMEOF_GATEWAY_MONITOR_CLIENT.
- adds libgrpc++-dev and protobuf-compiler-grpc to debian/control's
  Build-Depends, plus a ceph-mon-client-nvmeof / -dbg pair so the
  binary actually gets packaged.
- adds a pkg-config fallback for gRPC discovery. Jammy's libgrpc++-dev
  (1.30.2) ships no cmake config files [1], so find_package(gRPC CONFIG
  REQUIRED) fails at configure time. Noble's libgrpc++-dev (1.51.1)
  does ship them [2], as do RHEL/Rocky packages. We now try cmake config
  first (QUIET) and fall back to pkg_check_modules(IMPORTED_TARGET
  grpc++) when it isn't found.
- patches PkgConfig::GRPCPP to carry protobuf::libprotobuf as an
  interface dependency. grpc++.pc omits protobuf from its Requires, so
  gateway.pb.cc (which calls GOOGLE_PROTOBUF_VERIFY_VERSION) would fail
  to link on the pkg-config path. The cmake-config gRPC::grpc++ declares
  this dependency properly; we match that behaviour with
  target_link_libraries(PkgConfig::GRPCPP INTERFACE protobuf::libprotobuf).
- applies HAVE_ABSEIL only on the cmake-config path (Noble, RHEL/Rocky),
  where gRPC links system absl. Without it, opentelemetry-cpp's private
  absl (inline namespace otel_v1) collides with system absl (inline
  namespace debian7) in any TU that includes both tracer.h and
  grpcpp/grpcpp.h, giving "reference to base_internal is ambiguous".
  On the pkg-config path (Jammy's gRPC 1.30.2) libabsl-dev is not
  installed, so HAVE_ABSEIL must be skipped there.

[1] https://packages.ubuntu.com/jammy/amd64/libgrpc-dev/filelist
[2] https://packages.ubuntu.com/noble/amd64/libgrpc-dev/filelist

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 08:20:34 +0000 (16:20 +0800)]

common/admin_socket: use POSIX timer for delayed signal delivery

AdminSocketRaise.AsyncReschedule was flaking on arm64:

  [ RUN      ] AdminSocketRaise.AsyncReschedule
  /ceph/src/test/admin_socket.cc:497: Failure
  Expected equality of these values:
    0
    sig2.count.load()
      Which is: 1
  [  FAILED  ] AdminSocketRaise.AsyncReschedule (1045 ms)

The old approach forked a child that polled the clock and called
kill() at the deadline.  The problem is: the deadline was computed
before fork(), so any fork latency ate directly into the timing
budget.  On a busy arm64 host that 50 ms margin just wasn't enough.

The fix is to let the kernel own the timer via timer_create(). No
child process, no polling, no fork overhead, and SIGCONT still works.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>
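The kernel-owned timer approach can be sketched with the POSIX timer API. This is a minimal illustration, not the admin socket code: `raise_after` and `wait_for_timed_signal` are hypothetical helper names, the sketch assumes a single-threaded Linux process, and older glibc versions may need `-lrt` at link time.

```cpp
#include <cassert>
#include <signal.h>
#include <time.h>

// Arms a one-shot kernel timer that raises `signo` for this process after
// `delay_ms` milliseconds (must be > 0; a zero value disarms the timer).
// Returns true on success.
inline bool raise_after(int signo, long delay_ms, timer_t* out_timer) {
  struct sigevent sev {};
  sev.sigev_notify = SIGEV_SIGNAL;
  sev.sigev_signo = signo;
  if (timer_create(CLOCK_MONOTONIC, &sev, out_timer) != 0) {
    return false;
  }
  struct itimerspec spec {};
  spec.it_value.tv_sec = delay_ms / 1000;
  spec.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
  // it_interval stays zero, so the timer fires once.
  if (timer_settime(*out_timer, 0, &spec, nullptr) != 0) {
    timer_delete(*out_timer);
    return false;
  }
  return true;
}

// Blocks `signo`, arms the timer, and waits up to `timeout_ms` for delivery.
inline bool wait_for_timed_signal(int signo, long delay_ms, long timeout_ms) {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, signo);
  sigprocmask(SIG_BLOCK, &set, nullptr);  // lets sigtimedwait collect it
  timer_t timer;
  if (!raise_after(signo, delay_ms, &timer)) {
    return false;
  }
  struct timespec timeout {};
  timeout.tv_sec = timeout_ms / 1000;
  timeout.tv_nsec = (timeout_ms % 1000) * 1000000L;
  const int got = sigtimedwait(&set, nullptr, &timeout);
  timer_delete(timer);
  return got == signo;
}
```

A call such as `wait_for_timed_signal(SIGUSR1, 50, 2000)` arms a 50 ms timer and waits for the signal. Because the deadline is handed to the kernel, no child process or polling loop sits between computing the deadline and delivering the signal.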

commit | commitdiff | tree

Aishwarya Mathuria [Fri, 8 May 2026 07:32:01 +0000 (13:02 +0530)]

crimson/osd: skip PGAdvanceMap on a deleted PG

A PGAdvanceMap queued by broadcast_map_to_pgs can sit behind in-flight
DeleteSome events on the peering pipeline holding a Ref<PG>. When it
finally runs, the collection has already been removed in seastore and
PGAdvanceMap drives handle_advance_map / check_for_splits on a stale
PG thereby issuing ops on a collection that no longer exists, crashing the OSD.

Following Classic OSD, set peering_state.set_delete_complete() in PG::do_delete_work's
final batch and bail out of PGAdvanceMap::start when pg->is_deleted() is true.

Fixes: https://tracker.ceph.com/issues/76447
Signed-off-by: Aishwarya Mathuria <amathuri@redhat.com>
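The guard described in this commit amounts to an early return on a PG that is still referenced but already deleted. The following is a toy sketch under assumptions: `FakePG`, `finish_delete`, and `advance_map` are hypothetical names, and `std::shared_ptr` merely stands in for `Ref<PG>`. The real code is asynchronous and runs on the peering pipeline.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a PG whose backing collection can be removed
// while queued events still hold a reference to the object.
struct FakePG {
  bool collection_exists = true;
  bool deleted = false;
  int epoch = 1;

  // Final batch of deletion: remove the collection and record completion.
  void finish_delete() {
    collection_exists = false;
    deleted = true;
  }
  bool is_deleted() const { return deleted; }
};

// Returns true if the map advance ran, false if it was skipped.
inline bool advance_map(const std::shared_ptr<FakePG>& pg, int new_epoch) {
  if (pg->is_deleted()) {
    return false;  // bail out: the collection is gone
  }
  assert(pg->collection_exists);  // safe to touch the collection here
  pg->epoch = new_epoch;
  return true;
}
```

An advance queued before deletion but executed after `finish_delete()` returns `false` and leaves the epoch untouched, instead of operating on a collection that no longer exists.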

commit | commitdiff | tree

Afreen Misbah [Mon, 20 Apr 2026 14:51:47 +0000 (20:21 +0530)]

mgr/dashboard: Allow quick bootstrap script to use custom images

- if no image is provided, the default image will be used
- the -i flag was added but it's not working, so this commit fixes that
- exporting CEPHADM_IMAGE in ceph_cluster.yaml so that the bootstrap script can use it
- passing the image in cephadm shell

Signed-off-by: Afreen Misbah <afreen@ibm.com>

commit | commitdiff | tree

Max Kellermann [Mon, 7 Oct 2024 02:59:38 +0000 (04:59 +0200)]

msg/async/Protocol: include cleanup

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

commit | commitdiff | tree

Max Kellermann [Mon, 7 Oct 2024 02:51:31 +0000 (04:51 +0200)]

msg/async/AsyncConnection: include cleanup

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

commit | commitdiff | tree

Max Kellermann [Mon, 7 Oct 2024 02:53:12 +0000 (04:53 +0200)]

msg/Connection: include cleanup

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

commit | commitdiff | tree

Max Kellermann [Mon, 20 Apr 2026 10:07:09 +0000 (12:07 +0200)]

mon: add missing includes

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

commit | commitdiff | tree

Ashwin M. Joshi [Fri, 8 May 2026 07:05:53 +0000 (12:35 +0530)]

mgr: Accept only osd daemon type for bucket params to adhere to upgrade sequence

Fixes: https://tracker.ceph.com/issues/75603
Signed-off-by: Ashwin M. Joshi <ashjosh1@in.ibm.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 07:09:19 +0000 (15:09 +0800)]

Merge pull request #68790 from MaxKellermann/crimson__missing_includes

crimson: add missing includes

Reviewed-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 00:55:03 +0000 (08:55 +0800)]

include/cpp-btree: fix false -Warray-bounds in child accessors

After inlining, GCC's VRP sees mutable_child() reaching a leaf-root
node whose static type only bounds values[], not children[], and fires
even though the if(!leaf()) guard prevents it at runtime:

btree.h:522: warning: array subscript [33, 287] is outside array
bounds of 'struct M[32]' [-Warray-bounds]

Decay children[] to a raw pointer in child()/mutable_child() so GCC
has no array bounds to check.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Kefu Chai [Fri, 8 May 2026 05:46:26 +0000 (13:46 +0800)]

doc/rados: add kernel client notes to require_min_compat_client

Add a "Min kernel" column to the "Features gated by the flag" table:
pg-upmap requires kernel 4.13, pg-upmap-primary and CRUSH MSR are not
yet implemented in the kernel client.

Add a warning noting the distinct failure modes (pg-upmap-primary causes
I/O hangs from silently dropped misdirected ops; CRUSH MSR rules cause
I/O errors because crush_find_rule fails) and that CRUSH MSR rules are
added via osd setcrushmap without require_min_compat_client gating but
immediately raise the features-in-use floor to squid.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

bluikko [Fri, 8 May 2026 05:10:16 +0000 (12:10 +0700)]

Merge pull request #68151 from bluikko/wip-doc-man-ceph-improvements1

doc/man: improve ceph.rst

commit | commitdiff | tree

bluikko [Fri, 8 May 2026 05:09:43 +0000 (12:09 +0700)]

Merge pull request #68156 from bluikko/wip-doc-man-spelling1

doc/man: fix spelling etc errors (1 of 2)

commit | commitdiff | tree

Kamoltat (Junior) Sirivadhna [Tue, 17 Mar 2026 19:52:56 +0000 (19:52 +0000)]

qa/suites/upgrade: ignore PG_DAMAGED

we can simply ignore this warning, which pops up temporarily in the logs,
since we do check for active+clean in ceph.healthy

Fixes: https://tracker.ceph.com/issues/72424
Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>

commit | commitdiff | tree

Laura Flores [Thu, 7 May 2026 18:22:15 +0000 (13:22 -0500)]

Merge pull request #68637 from aainscow/do_read_assert

osd: Avoid assertion on empty object read when reading multiple objects

Reviewed-by: Bill Scales <bill_scales@uk.ibm.com>

commit | commitdiff | tree

Jon Bailey [Thu, 7 May 2026 13:49:42 +0000 (14:49 +0100)]

Merge pull request #68797 from JonBailey1993/fix-design-definition

doc: Clarification of text in ec stretch cluster design

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>

commit | commitdiff | tree

Kefu Chai [Thu, 7 May 2026 12:43:28 +0000 (20:43 +0800)]

debian: package manpage for cephfs-top

the build already produces usr/share/man/man8/cephfs-top.8, and the
fedora spec ships it via the cephfs-top package. on debian,
debian/cephfs-top.install was missing it, so the file was sitting in
debian/tmp unclaimed and dh_missing complained:

dh_missing: warning: usr/share/man/man8/cephfs-top.8 exists in
debian/tmp but is not installed to anywhere

claim the manpage in debian/cephfs-top.install so it lands in the
cephfs-top deb. silences the warning and gives debian users the same
`man cephfs-top` fedora users already have.

Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Casey Bodley [Thu, 7 May 2026 12:42:46 +0000 (08:42 -0400)]

Merge pull request #68364 from linuxbox2/wip-complete-mpu-etag

rgw: return an etag header for all successful complete-multipart

Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Afreen Misbah [Thu, 7 May 2026 12:39:40 +0000 (18:09 +0530)]

Merge pull request #68788 from cloudbehl/fix-Application-overview

monitoring: Fix application overview to show Raw used

Reviewed-by: Afreen Misbah <afreen@ibm.com>

commit | commitdiff | tree

Jon Bailey [Thu, 7 May 2026 12:28:01 +0000 (13:28 +0100)]

doc: Clarification of text in ec stretch cluster design

The description of min_size in the EC Cluster Design doc did not make clear what we intend to develop. This commit clarifies it for readers.

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>

commit | commitdiff | tree

Ronen Friedman [Thu, 7 May 2026 11:42:09 +0000 (14:42 +0300)]

Merge pull request #68735 from ronen-fr/wip-rf-scrubNtrim-crimson

crimson/osd: defer snap trimming while scrubbing

Reviewed-by: Aishwarya Mathuria <amathuri@redhat.com>
Reviewed-by: Matan Breizman <mbreizma@redhat.com>

commit | commitdiff | tree

Alex Ainscow [Thu, 7 May 2026 10:47:39 +0000 (11:47 +0100)]

Merge pull request #68614 from tchaikov/wip-test-osd-fix-leaks

test/osd: fix Message and Connection refcount leaks

Reviewed-by: Alex Ainscow <aainscow@uk.ibm.com>

commit | commitdiff | tree

Kefu Chai [Sun, 29 Mar 2026 05:47:47 +0000 (13:47 +0800)]

crimson/osd: acquire throttle when scanning replica/primary for backfill

The backfill state machine called budget_available() before deciding to
scan, but request_primary_scan() and request_replica_scan() never
actually acquired the throttle slot. This meant scans could proceed
without any resource reservation, defeating the QoS intent of the
throttler introduced in 791772f1c0.

In this change, we fix this by acquiring the throttle before initiating
each scan.
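
A rough sketch of the difference between checking the budget and
actually holding a slot. The `Throttle`, `ThrottleSlot`, and
`scan_with_throttle` names are hypothetical, and the sketch is
synchronous, whereas crimson's throttler is asynchronous:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counting throttle.
class Throttle {
  std::size_t in_use = 0;
  const std::size_t limit;
public:
  explicit Throttle(std::size_t limit) : limit(limit) {}
  // Only reports whether a slot is free; reserves nothing.
  bool budget_available() const { return in_use < limit; }
  bool try_acquire() {
    if (!budget_available()) return false;
    ++in_use;
    return true;
  }
  void release() { assert(in_use > 0); --in_use; }
  std::size_t used() const { return in_use; }
};

// RAII guard: holds one slot for as long as the guard is alive.
class ThrottleSlot {
  Throttle* t;
public:
  explicit ThrottleSlot(Throttle& th) : t(th.try_acquire() ? &th : nullptr) {}
  ~ThrottleSlot() { if (t) t->release(); }
  ThrottleSlot(const ThrottleSlot&) = delete;
  ThrottleSlot& operator=(const ThrottleSlot&) = delete;
  explicit operator bool() const { return t != nullptr; }
};

// Acquire before scanning, so the scan runs under a reservation.
template <typename Scan>
bool scan_with_throttle(Throttle& throttle, Scan&& scan) {
  ThrottleSlot slot(throttle);
  if (!slot) return false;  // no budget: the scan does not start
  scan();
  return true;              // slot is released when `slot` goes out of scope
}
```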

Fixes: https://tracker.ceph.com/issues/70808
Signed-off-by: Kefu Chai <k.chai@proxmox.com>

commit | commitdiff | tree

Max Kellermann [Wed, 15 Apr 2026 06:09:26 +0000 (08:09 +0200)]

crimson: add missing includes

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>

commit | commitdiff | tree

Ankush Behl [Thu, 7 May 2026 08:44:25 +0000 (14:14 +0530)]

monitoring: Fix application overview to show Raw used

- Updated capacity used to show Raw capacity
- Pool table shows Raw capacity
- Total used capacity graph shows raw capacity

fixes: https://tracker.ceph.com/issues/76456

Signed-off-by: Ankush Behl <cloudbehl@gmail.com>

commit | commitdiff | tree

NitzanMordhai [Thu, 7 May 2026 07:39:06 +0000 (10:39 +0300)]

Merge pull request #66430 from NitzanMordhai/wip-nitzan-deadlock-eio

aio_cxx: Fix mutual deadlock and resolve test unreliability

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Laura Flores <lflores@ibm.com>

commit | commitdiff | tree

Venky Shankar [Thu, 7 May 2026 05:26:03 +0000 (10:56 +0530)]

Merge PR #68634 into main

* refs/pull/68634/head:
mgr: change cleanup and scanning cephfs connection logs to be less noisy

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: anuragbandhu <anuragbandhu007@gmail.com>

commit | commitdiff | tree

Shilpa Jagannath [Wed, 6 May 2026 21:50:17 +0000 (17:50 -0400)]

rgw: fix frontend crash in abort_early() on client disconnect

wrap the send_body() call in abort_early() with try/catch for
rgw::io::Exception
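
A minimal sketch of the pattern, assuming stand-in types. `IoException`
stands in for rgw::io::Exception, and `Client` and `abort_early_sketch`
are hypothetical, not the RGW code:

```cpp
#include <stdexcept>
#include <string>

// Stand-in for rgw::io::Exception (hypothetical).
struct IoException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical client whose send_body() throws once the peer is gone.
struct Client {
  bool connected = true;
  std::string sent;
  void send_body(const std::string& body) {
    if (!connected) throw IoException("client disconnected");
    sent += body;
  }
};

// Returns true if the error body was delivered. A disconnect is reported
// to the caller instead of propagating as an uncaught exception.
bool abort_early_sketch(Client& c, const std::string& error_body) {
  try {
    c.send_body(error_body);
    return true;
  } catch (const IoException&) {
    return false;
  }
}
```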

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>

commit | commitdiff | tree

Patrick Donnelly [Wed, 6 May 2026 20:18:44 +0000 (16:18 -0400)]

doc/governance: remove Sam from CSC

Sam requested to be removed from the CSC in a mail to csc@lists.io.

(We hope to see him return one day!)

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Wed, 6 May 2026 20:16:29 +0000 (16:16 -0400)]

doc/governance: remove Ken and Jeff from CSC

The CSC has voted [1] these past weeks on the removal of members due to
inactivity.

[1] https://secure.electionbuddy.com/results/7GY2X48XN226

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

Patrick Donnelly [Wed, 6 May 2026 20:12:36 +0000 (16:12 -0400)]

doc/governance: update Ceph Executive Council List

The CSC has voted on the new term for the Ceph Executive Council [1].
I've been elected to take Josh's place.

Many thanks to Josh for stepping up to serve on the CEC these past years.

[1] https://secure.electionbuddy.com/results/7GY2X48XN226

Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

commit | commitdiff | tree

John Mulligan [Tue, 21 Apr 2026 20:36:18 +0000 (16:36 -0400)]

doc: document new options for smb share rm and cluster rm

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 20 Apr 2026 20:11:51 +0000 (16:11 -0400)]

mgr/smb: add test cases for cluster rm w/ wildcard/recursive

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 20 Apr 2026 20:07:19 +0000 (16:07 -0400)]

mgr/smb: add --wildcard and --recursive to smb cluster rm

Add new --wildcard and --recursive flags to the smb cluster rm
subcommands. These allow deleting clusters in bulk. The --wildcard
option works like the same option for share rm: it allows the use of
globbing for the cluster IDs, including '*' to delete all clusters.
The --recursive option tells the command to also delete all
child resources (shares) when deleting a cluster.

This was previously doable by streaming the output of `ceph smb show
...` through (sed or) jq, flipping the intent to removed, and piping
the result to `ceph smb apply`, but that is neither obvious nor as easy
to document as these new options.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 20 Apr 2026 19:17:47 +0000 (15:17 -0400)]

mgr/smb: add unit tests to verify deleting shares by wildcard

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 20 Apr 2026 19:16:34 +0000 (15:16 -0400)]

mgr/smb: add a --wildcard option to the smb share rm subcommand

The new wildcard option will enable matching multiple shares to delete
or even all shares in a cluster using '*'.

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Tue, 21 Apr 2026 18:30:13 +0000 (14:30 -0400)]

mgr/smb: add a new error type to reflect non-matching inputs

The input itself is not bad; the error just reflects that nothing
matching the input value was found.
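
A small sketch of keeping the two cases apart, assuming hypothetical
names. `InvalidInputError`, `NoMatchError`, and `find_share` are
illustrative only; the mgr/smb module is Python and its error types may
look different:

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// The input itself is malformed (hypothetical type).
struct InvalidInputError : std::invalid_argument {
  using std::invalid_argument::invalid_argument;
};

// The input is well-formed, but nothing matched it (hypothetical type).
struct NoMatchError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

std::string find_share(const std::vector<std::string>& shares,
                       const std::string& id) {
  if (id.empty()) {
    throw InvalidInputError("empty share id");
  }
  auto it = std::find(shares.begin(), shares.end(), id);
  if (it == shares.end()) {
    throw NoMatchError("no share matches '" + id + "'");
  }
  return *it;
}
```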

Signed-off-by: John Mulligan <jmulligan@redhat.com>

commit | commitdiff | tree

John Mulligan [Mon, 20 Apr 2026 19:14:56 +0000 (15:14 -0400)]

mgr/smb: add glob style wildcard support to matcher object

Add glob/wildcard support to the matcher type in the handler.py file.
This will be used in future changes to make matching shares and/or
clusters easier by supporting glob style wildcards on some commands.
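
A minimal sketch of glob-style matching over resource IDs, assuming
POSIX fnmatch() semantics. The `match_ids` helper is hypothetical; the
real matcher lives in handler.py and its exact glob rules may differ:

```cpp
#include <fnmatch.h>
#include <string>
#include <vector>

// Return every id that matches the glob pattern, in input order.
std::vector<std::string> match_ids(const std::string& pattern,
                                   const std::vector<std::string>& ids) {
  std::vector<std::string> out;
  for (const auto& id : ids) {
    // fnmatch() returns 0 when the string matches the pattern.
    if (fnmatch(pattern.c_str(), id.c_str(), 0) == 0) {
      out.push_back(id);
    }
  }
  return out;
}
```

With this helper, a pattern of '*' selects every ID, and a pattern with
no wildcard characters behaves like an exact match.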

Signed-off-by: John Mulligan <jmulligan@redhat.com>
