From aecabff391ad039e570643de74a9260202aa10b8 Mon Sep 17 00:00:00 2001
From: Adam Kupczyk
Date: Thu, 13 Mar 2025 11:38:31 +0000
Subject: [PATCH] osd/OSD.cc: make osd_superblock recovery more robust

We keep the superblock bitstream in 2 places:
- "osd_superblock" object's data stream
- "osd_superblock" object's omap key named "osd_superblock"

The recovery procedure works fine. But if the data stream is corrupted,
then on an ObjectStore::write() request BlueStore might need to read the
old data from the object's allocation unit. This will cause an assert like:
"BlueStore.cc: 15865: FAILED ceph_assert(r >= 0 && r <= (int)tail_read)"

The solution is to truncate the object to zero length before writing to it,
clearing any reason for the read-on-write.

Reproduction, using a vstart cluster:
> OSD=1 MON=1 FS=0 RGW=0 ../src/vstart.sh -l -n
> ../src/stop.sh
> OBJECT=$(./bin/ceph-objectstore-tool --no-mon-config --data-path dev/osd0/ --op meta-list |grep osd_superblock)
> ./bin/ceph-objectstore-tool --no-superblock --no-mon-config --data-path dev/osd0/ --pool meta $OBJECT get-bytes osd-superblock.data
Using no superblock
> head -c 500 /dev/random >> osd-superblock.data
> ./bin/ceph-objectstore-tool --no-superblock --no-mon-config --data-path dev/osd0/ --pool meta $OBJECT set-bytes osd-superblock.data
Using no superblock
> ./bin/ceph-objectstore-tool --no-superblock --no-mon-config --data-path dev/osd0/ --pool meta $OBJECT dump | grep '"offset"'
Error getting attr on : meta,#-1:7b3f43c4:::osd_superblock:0#, (61) No data available
    "offset": 8192,   <- use this offset in dd
> dd if=/dev/random of=dev/osd0/block bs=1 count=4000 seek=8192 conv=notrunc   <- seek from offset above
4000+0 records in
4000+0 records out
4000 bytes (4.0 kB, 3.9 KiB) copied, 0.0041956 s, 953 kB/s
> ../src/vstart.sh
> ./bin/ceph osd pool create xxxx

Without the fix, the OSD hits the assert at this point.

Signed-off-by: Adam Kupczyk
---
 src/osd/OSD.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 649aeb56feb..c93dab8f415 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -2133,6 +2133,7 @@ void OSD::write_superblock(CephContext* cct, OSDSuperblock& sb, ObjectStore::Tra
 
   bufferlist bl;
   encode(sb, bl);
+  t.truncate(coll_t::meta(), OSD_SUPERBLOCK_GOBJECT, 0);
   t.write(coll_t::meta(), OSD_SUPERBLOCK_GOBJECT, 0, bl.length(), bl);
   std::map attrs;
   attrs.emplace(OSD_SUPERBLOCK_OMAP_KEY, bl);
-- 
2.39.5
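
Note on the read-on-write described in the commit message: the effect can be
pictured with a small toy model. This is only a sketch under stated
assumptions and is not BlueStore code. ToyObject, write_at_zero,
truncate_to_zero and store_superblock are hypothetical names, and the
"unreadable old data" flag merely stands in for the on-disk corruption
produced by the dd step. The model assumes that overwriting the start of a
longer object forces the old tail to be read back, while a write to an empty
object reads nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Toy model for illustration only; this is NOT BlueStore code.
// All names below are hypothetical.
struct ToyObject {
  std::string data;                  // logical contents of the object
  bool old_data_unreadable = false;  // stands in for on-disk corruption

  // Overwrite the object starting at offset 0. If the object is longer than
  // the new buffer, the old tail must be read back so it can be preserved.
  void write_at_zero(const std::string& buf) {
    if (data.size() > buf.size()) {
      if (old_data_unreadable) {
        throw std::runtime_error("read-on-write of old data failed");
      }
      data.replace(0, buf.size(), buf);  // keeps the old tail
    } else {
      data = buf;  // nothing old survives, so nothing has to be read
    }
  }

  // Drop all old data; a later write then has nothing to read back.
  void truncate_to_zero() {
    data.clear();
    old_data_unreadable = false;
  }
};

// Returns true if the write completed, false if the read-on-write failed.
bool store_superblock(ToyObject& obj, const std::string& sb,
                      bool truncate_first) {
  try {
    if (truncate_first) obj.truncate_to_zero();
    obj.write_at_zero(sb);
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```

In this model, calling store_superblock() on an object that is longer than the
new superblock and has unreadable old data fails when truncate_first is false
and succeeds when it is true, which mirrors the ordering of t.truncate() before
t.write() in the hunk above.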