Casey Bodley [Mon, 1 Dec 2025 15:25:16 +0000 (10:25 -0500)]
cmake: fix for -DWITH_BREAKPAD=OFF
In 1ba55a20be1023c585ba96617dc6a9d2aa79a51b, I tried to avoid the NOT
condition by swapping the option's defaults. But when the condition is
false, the option is forced to ON even if the user manually set it OFF.
Fix this by inverting the condition and swapping the default values.
Reported-by: Joseph Mundackal <joseph.j.mundackal@gmail.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Kefu Chai [Thu, 27 Nov 2025 12:28:31 +0000 (20:28 +0800)]
osd: fix ODR violation in max_prio_map
The static std::map max_prio_map was defined in the osd_types.h header
file, causing every translation unit that included this header to get
its own copy of the variable. This led to One Definition Rule (ODR)
violations where multiple instances of the same variable existed at
runtime.
During program cleanup, destructors for these multiple instances would
attempt to free the same memory regions, resulting in segmentation
faults in the tcmalloc memory allocator, as seen with ceph-dencoder.
This issue surfaced after a not-yet-merged change which converts erasure_code
and json_spirit to OBJECT libraries. Before that change, these were
STATIC libraries that were linked via target_link_libraries. The
incorrect linkage meant their object files (and thus their copies of
max_prio_map) were kept separate and didn't conflict at runtime.
After converting to OBJECT libraries and properly incorporating them
into libceph-common.so (commit 8b0e3fb2c23), the multiple copies of
max_prio_map from different translation units all ended up in the same
shared library, exposing the ODR violation. During program exit, the
dynamic linker attempted to run destructors for all instances, leading
to double-free crashes.
Fix by moving the map into a static helper function in PeeringState.cc
(the only file that uses it). The map is now a function-local static
const variable, ensuring a single instance that is properly initialized
and destructed.
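A minimal sketch of the function-local static pattern described above. The
helper names and the table contents are hypothetical and only illustrate the
shape of the fix; they are not the actual code in PeeringState.cc.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the helper: the table lives in a single .cc
// file as a function-local static instead of being defined in a header.
// Keys and values are illustrative, not the real priority table.
static const std::map<int, int>& max_prio_map()
{
  // Initialized once on first call (thread-safe since C++11) and
  // destroyed exactly once at program exit.
  static const std::map<int, int> m = {
    {10, 20},
    {30, 40},
  };
  return m;
}

// Illustrative lookup: returns the mapped value, or fallback when the
// key is not present in the table.
int max_prio_for(int base, int fallback)
{
  const auto& m = max_prio_map();
  auto it = m.find(base);
  return it == m.end() ? fallback : it->second;
}
```

Every caller goes through the accessor, so there is only one map object and
one destructor run, regardless of how many translation units are linked in.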
Backtrace before fix:
```
#0 0x00007ffff7dbb1a0 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#1 0x00007ffff7dbb57f in tcmalloc::ThreadCache::Scavenge() () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#2 0x00007ffff6bc8aa2 in std::__new_allocator<std::_Rb_tree_node<std::pair<int const, int> > >::deallocate (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890, __n=1)
#3 0x00007ffff6bc89f9 in std::allocator<std::_Rb_tree_node<std::pair<int const, int> > >::deallocate (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890, __n=1)
#4 std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<int const, int> > > >::deallocate (__a=..., __p=0x555555f43890, __n=1)
#5 std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_put_node (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890)
#6 0x00007ffff6bc892e in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_drop_node (this=0x7ffff7d48f78 <max_prio_map>, __p=0x555555f43890)
#7 0x00007ffff6bc886e in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43890)
#8 0x00007ffff6bc8854 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43cb0)
#9 0x00007ffff6bc8854 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::_M_erase (this=0x7ffff7d48f78 <max_prio_map>, __x=0x555555f43ad0)
#10 0x00007ffff6bc8805 in std::_Rb_tree<int, std::pair<int const, int>, std::_Select1st<std::pair<int const, int> >, std::less<int>, std::allocator<std::pair<int const, int> > >::~_Rb_tree (this=0x7ffff7d48f78 <max_prio_map>)
#11 0x00007ffff6bc7345 in std::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > >::~map (this=0x7ffff7d48f78 <max_prio_map>)
#12 0x00007ffff484bd51 in __cxa_finalize (d=0x7ffff7d3f440) at ./stdlib/cxa_finalize.c:97
#13 0x00007ffff6af9487 in __do_global_dtors_aux () from /home/kefu/dev/ceph/build/lib/libceph-common.so.2
#14 0x00007ffff7fbfd20 in ?? ()
#15 0x00007ffff7fc8fc2 in _dl_call_fini (closure_map=0x7fffffffd0f0, closure_map@entry=0x7ffff7fbfd20) at ./elf/dl-call_fini.c:43
#16 0x00007ffff7fcbe72 in _dl_fini () at ./elf/dl-fini.c:120
#17 0x00007ffff484c291 in __run_exit_handlers (status=0, listp=0x7ffff49f1680 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:118
#18 0x00007ffff484c35a in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:148
#19 0x00007ffff4833caf in __libc_start_call_main (main=main@entry=0x55555556cd90 <main(int, char const**)>, argc=argc@entry=2, argv=argv@entry=0x7fffffffd488) at ../sysdeps/nptl/libc_start_call_main.h:74
#20 0x00007ffff4833d65 in __libc_start_main_impl (main=0x55555556cd90 <main(int, char const**)>, argc=2, argv=0x7fffffffd488, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffd478) at ../csu/libc-start.c:360
#21 0x00005555555695e1 in _start ()
```
test/ceph-helpers: Pass timeout and add timeout for commands in test_pg_scrub
In test_pg_scrub, after killing an OSD, subsequent pg_scrub checks and calls to flush_pg_stats
can hang or time out with the default timeout because the OSD is no longer running.
This was causing test failures.
This fix addresses two issues:
1. test_pg_scrub: Explicitly pass the WAIT_FOR_CLEAN_TIMEOUT and TIMEOUT variables (both set to 2)
to the pg_scrub call after the OSD is killed. This prevents a hang in the wait_for_clean
check within pg_scrub.
2. flush_pg_stats: Add an explicit timeout to the ceph tell osd.$osd flush_pg_stats command,
allowing it to fail quickly when an OSD is unresponsive.
Nizamudeen A [Tue, 25 Nov 2025 11:31:31 +0000 (17:01 +0530)]
mgr/dashboard: support custom prop for table item redirection
use an extra customTemplateConfig called `customRowProperty` where
you can provide the key of the property you wish to route, instead of
relying on the cell's prop itself
Fixes: https://tracker.ceph.com/issues/73989
Signed-off-by: Nizamudeen A <nia@redhat.com>
Ronen Friedman [Thu, 20 Nov 2025 13:54:20 +0000 (07:54 -0600)]
osd/scrub: do not attempt to read past the end of an object
When performing deep scrubs, the scrubber reads object data
in strides. Existing code uses a short read to detect the end
of the object (and if the object size is a multiple of the
stride - an extra read is performed, which returns 0 bytes).
The proposed change is to avoid such extra read attempts,
by using our knowledge of the object size.
Also - some minor code cleanups in the relevant function.
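As a rough illustration of the idea (not the scrubber's actual code), the
reads can be planned up front from the known object size. The function name
`plan_reads` and its signature are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (offset, length) pairs needed to cover
// [0, object_size) in steps of 'stride'. Because the size is known, no
// trailing zero-byte read is needed to detect the end of the object.
std::vector<std::pair<uint64_t, uint64_t>> plan_reads(uint64_t object_size,
                                                      uint64_t stride)
{
  assert(stride > 0);
  std::vector<std::pair<uint64_t, uint64_t>> reads;
  for (uint64_t pos = 0; pos < object_size; pos += stride) {
    // The last read is shortened so it never goes past the object end.
    reads.emplace_back(pos, std::min(stride, object_size - pos));
  }
  return reads;
}
```

With an 8-byte object and a 4-byte stride this plans exactly two reads,
whereas short-read detection would issue a third read that returns 0 bytes.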
Alex Ainscow [Fri, 7 Nov 2025 10:44:56 +0000 (10:44 +0000)]
rados: Add API to disable version querying with reads in librados
librados always requests a "user version" with reads, even if the client does
not need it. Until EC direct reads are implemented, this is a cheap operation.
With EC direct reads, requesting the user version requires an extra op to the primary
in some scenarios. The non-primary OSDs do not contain an up to date user
version.
NEORADOS already allows for such optimisations, due to how the API is organised.
librados is not heavily used by ceph-maintained clients, but this API will still be
useful for testing of EC direct reads, since the test clients will use librados, due
to its simpler nature and because performance is not critical in the tests.
Issue: Route was being force-reloaded using a two-step navigation hack causing unnecessary redirects and side effects.
Fix: Replaced the hack with Angular’s native same-URL reload using onSameUrlNavigation: 'reload' for a clean, stable route refresh.
Alex Ainscow [Tue, 14 Oct 2025 08:24:56 +0000 (09:24 +0100)]
osdc: Add SplitOp capability to Objecter
This will provide the ability for Objecter to split up
certain ops and distribute them to the OSDs directly if
that provides a performance advantage.
This is experimental code and is switched off unless the
magic pool flags are enabled. These magic pool flags were
pushed in an earlier commit in the same PR.
Alex Ainscow [Fri, 3 Oct 2025 14:11:00 +0000 (15:11 +0100)]
osdc: Remove unused con parameter from Objecter::_calc_target()
This parameter is not used by the _calc_target code. It is being
removed just to clean up the code, as we are making some changes
to _calc_target in later stages of the split io PR.
Alex Ainscow [Fri, 3 Oct 2025 13:55:56 +0000 (14:55 +0100)]
osdc: Interface to submit IO with ASIO Post.
For direct read failures, the locking is such that we cannot
immediately send a new IO without deadlocking. This new interface
allows an op to be sent as an asio post.
Alex Ainscow [Fri, 3 Oct 2025 13:39:03 +0000 (14:39 +0100)]
osd: Implement sync reads and sparse reads for EC for direct reads
Sparse reads for EC are simple to implement, as the code is essentially
identical to that of replica, with some address translation.
When doing a direct read in EC, only a single OSD is involved. As such
we can do the more performant sync read, rather than an async read.
Alex Ainscow [Fri, 3 Oct 2025 13:15:32 +0000 (14:15 +0100)]
osd: Generalise can_serve_replica_read for consumption by EC.
The can_serve_replica_read() function is called by replica to determine whether there are
any uncommitted writes. If such writes exist, then the system will reject the IO to avoid
the risk of reading data from a write which may yet be rolled back.
The same code is going to be useful for EC direct reads.
Alex Ainscow [Fri, 3 Oct 2025 12:53:33 +0000 (13:53 +0100)]
osd: Replace unused EC offset translation function with useful one.
The old chunk_aligned_shard_offset_to_ro_offset was not only unused, it
didn't actually have the correct logic. We replace it here with a similar
but more useful function that will be used in sparse reads for EC.
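For readers unfamiliar with this kind of translation, here is a sketch of
mapping a shard offset back to a rados object ("ro") offset. It assumes a
plain striped layout with data shards in order and no chunk remapping; the
struct, the function name, and the parameters are hypothetical and do not
reflect the function added by this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout description: k data shards, fixed chunk size.
struct StripeLayout {
  uint64_t chunk_size;   // bytes per chunk
  uint64_t data_shards;  // number of data shards (k)
};

// Translate an offset within one data shard to the offset in the logical
// object, assuming chunks are laid out round-robin across data shards.
uint64_t shard_offset_to_ro_offset(const StripeLayout& l,
                                   uint64_t shard,
                                   uint64_t shard_offset)
{
  assert(l.chunk_size > 0 && shard < l.data_shards);
  const uint64_t stripe = shard_offset / l.chunk_size;  // which stripe
  const uint64_t within = shard_offset % l.chunk_size;  // offset in chunk
  return stripe * l.chunk_size * l.data_shards  // full stripes before it
       + shard * l.chunk_size                   // earlier chunks in stripe
       + within;
}
```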
Alex Ainscow [Fri, 3 Oct 2025 12:49:58 +0000 (13:49 +0100)]
osd: Introduce pool flag for "split IO" and Plugin flag for "direct read"
These flags will currently behave as follows:
1. The pool flag is never set, unless by a user with the osd_pool_default_flags
config option.
2. The pool flag will be removed for EC pools where the plugin does not support
direct reads.
3. Replica pools will never remove the flag.
The intention is to eventually invert this logic and allow split IOs upon
upgrade to Umbrella in this same function.
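The three rules above can be read as a small decision function. The sketch
below is only an illustration; the function name and boolean parameters are
assumptions and do not correspond to the actual OSD code.

```cpp
#include <cassert>

// Hypothetical summary of the rules: returns whether the "split IO" pool
// flag remains set after the checks described above.
bool keep_split_io_flag(bool flag_set_by_user,
                        bool is_erasure_pool,
                        bool plugin_supports_direct_read)
{
  // Rule 1: the flag is never set implicitly, only by the user.
  if (!flag_set_by_user) {
    return false;
  }
  // Rule 2: EC pools lose the flag if the plugin cannot do direct reads.
  if (is_erasure_pool && !plugin_supports_direct_read) {
    return false;
  }
  // Rule 3: replica pools (and capable EC pools) keep the flag.
  return true;
}
```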
Nizamudeen A [Wed, 26 Nov 2025 06:20:40 +0000 (11:50 +0530)]
mgr/dashboard: fix server side table sort
Show a loading screen while a server-side sort is in progress, since the
sort can be a little slow.
The slowness is more visible in bigger environments. In a test environment,
sorting too many times in a short interval can lead to inconsistencies.
This only applies to tables like OSDs or hosts where server-side rendering
is enabled.
Fixes: https://tracker.ceph.com/issues/73994
Signed-off-by: Nizamudeen A <nia@redhat.com>
Seena Fallah [Tue, 25 Nov 2025 18:22:44 +0000 (19:22 +0100)]
rgw: fix offset calculation in copy_obj_data
Set ofs to total bytes read by adding 1 to end offset.
Since 'end' represents the last byte offset (zero-indexed),
we need to add 1 to get the actual number of bytes copied.
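The arithmetic is the usual inclusive-range length. A tiny sketch, with a
hypothetical helper name that is not part of copy_obj_data:

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes in the inclusive, zero-indexed range [start, end].
// With start == 0 this is simply end + 1.
uint64_t inclusive_length(uint64_t start, uint64_t end)
{
  assert(start <= end);
  return end - start + 1;
}
```

For example, a copy whose last byte offset is 4095 has moved 4096 bytes.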
Patrick Donnelly [Wed, 19 Nov 2025 23:16:21 +0000 (18:16 -0500)]
mon/HealthMonitor: avoid MON_DOWN for freshly added Monitor
In testing, we often have the scenario where cephadm has created a
cluster but doesn't add more monitors until well past
mon_down_mkfs_grace. This causes useless MON_DOWN warnings to be thrown
which fails QA jobs. Avoid this situation entirely by giving a
reasonable grace period for a monitor added to the MonMap to join
quorum.
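A minimal sketch of such a grace check, assuming the time at which the
monitor was added to the MonMap is known. The function name, the parameters,
and the use of a steady clock are assumptions for illustration and are not
taken from HealthMonitor.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical check: a monitor that is out of quorum only counts toward
// MON_DOWN once the grace period since it was added has elapsed.
bool counts_as_mon_down(bool in_quorum,
                        Clock::time_point added,
                        Clock::time_point now,
                        std::chrono::seconds grace)
{
  if (in_quorum) {
    return false;  // a monitor in quorum is never reported as down
  }
  return now - added >= grace;
}
```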
Fixes: https://tracker.ceph.com/issues/73934
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
Afreen Misbah [Tue, 21 Oct 2025 16:37:46 +0000 (22:07 +0530)]
mgr/dashboard: Carbonize the Change Password Form
Fixes https://tracker.ceph.com/issues/73193
- using carbon based stylings, typography and components
- used grid layout for form arrangement
- breadcrumb is slightly off, which needs to be fixed by applying grid layout to the app shell
Kefu Chai [Sat, 22 Nov 2025 00:24:36 +0000 (08:24 +0800)]
qa/suites/rados/encoder: exclude ceph-osd-* when installing LTS releases
In a37b5b5, the ceph-osd-classic and ceph-osd-crimson packages were
added to qa/packages/packages.yaml. The "install" task uses this file as
the default package list for all branches, including LTS releases like
Reef.
However, a37b5b5 only exists in the main branch and won't be backported
to LTS branches. This causes installation failures in the rados/encoder
test suite, which verifies forward compatibility by installing LTS
releases and testing whether they can decode the latest corpus.
Exclude ceph-osd-classic and ceph-osd-crimson from LTS installations to
ensure the test suite can successfully install ceph-dencoder, which is
required for the interoperability tests.