From: Patrick Donnelly
Date: Tue, 28 Jan 2025 16:24:18 +0000 (-0500)
Subject: tools/cephfs/DataScan: create all ancestors during scan_inodes
X-Git-Tag: v20.0.0~226^2~5
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=7d59db1d9804b918512d1f0e88654fabcca1be7a;p=ceph.git

tools/cephfs/DataScan: create all ancestors during scan_inodes

When arbitrary PGs are lost, and random dirfrag objects are lost with
them, we may need to recover the full ancestry even when the immediate
parent exists. So always recover the ancestry, and fix up any duplicate
linkages to a directory during scan_links.

Fixes: https://tracker.ceph.com/issues/69692
Signed-off-by: Patrick Donnelly
---

diff --git a/src/tools/cephfs/DataScan.cc b/src/tools/cephfs/DataScan.cc
index b92b68829680..1a40cdefca2f 100644
--- a/src/tools/cephfs/DataScan.cc
+++ b/src/tools/cephfs/DataScan.cc
@@ -2046,19 +2046,13 @@ int MetadataDriver::inject_with_backtrace(
       }
     }
 
-    if (!created_dirfrag) {
-      // If the parent dirfrag already existed, then stop traversing the
-      // backtrace: assume that the other ancestors already exist too.  This
-      // is an assumption rather than a truth, but it's a convenient way
-      // to avoid the risk of creating multiply-linked directories while
-      // injecting data.  If there are in fact missing ancestors, this
-      // should be fixed up using a separate tool scanning the metadata
-      // pool.
-      break;
-    } else {
-      // Proceed up the backtrace, creating parents
-      ino = parent_ino;
-    }
+    // N.B.: when the metadata pool has suffered a partial loss (like one PG), then
+    // an arbitrary ancestor dirfrag may be missing. We need to traverse up the
+    // backtrace ancestry to create those missing dirfrags/links. There is a risk
+    // that we create duplicate primary links to a directory this way. scan_links
+    // will catch this and pick either a legitimate link (with a version >1) or
+    // an arbitrary injected link, removing the others.
+    ino = parent_ino;
   }
 
   return 0;
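
The behaviour described in the new comment can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Ceph code: `Metadata`, `Dentry`, `Ancestor`, `inject_ancestry` and `resolve_duplicate_links` are hypothetical names, the model tracks primary links only, and it takes "version > 1 means a legitimate link, version 1 means an injected link" directly from the comment in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for metadata pool contents.
using Ino = std::uint64_t;

struct Dentry {
  Ino ino;                // inode this dentry links to
  std::uint64_t version;  // assumption: 1 = injected by recovery, >1 = legitimate
};

using Dirfrag = std::map<std::string, Dentry>;  // dentry name -> dentry
using Metadata = std::map<Ino, Dirfrag>;        // directory inode -> its dirfrag

struct Ancestor {
  Ino dir;           // directory that holds the link
  std::string name;  // dentry name inside that directory
};

// Walk the whole backtrace (nearest parent first) and create every missing
// dirfrag and link. Unlike the pre-patch logic, the walk does not stop when a
// parent dirfrag already exists. Returns the number of links injected.
int inject_ancestry(Metadata& md, Ino ino, const std::vector<Ancestor>& backtrace) {
  int created = 0;
  for (const Ancestor& a : backtrace) {
    Dirfrag& frag = md[a.dir];  // creates an empty dirfrag if it was lost
    if (frag.emplace(a.name, Dentry{ino, 1}).second) {
      ++created;  // link was missing, injected with version 1
    }
    ino = a.dir;  // always proceed up the backtrace
  }
  return created;
}

// For each inode with more than one primary link, keep the first link whose
// version is > 1, or else an arbitrary (here: the first) injected link, and
// remove the rest. Returns the number of links removed.
int resolve_duplicate_links(Metadata& md) {
  struct Ref { Ino dir; std::string name; std::uint64_t version; };
  std::map<Ino, std::vector<Ref>> links;
  for (const auto& d : md)
    for (const auto& e : d.second)
      links[e.second.ino].push_back(Ref{d.first, e.first, e.second.version});

  int removed = 0;
  for (const auto& l : links) {
    const std::vector<Ref>& refs = l.second;
    if (refs.size() < 2) continue;
    std::size_t keep = 0;
    for (std::size_t i = 0; i < refs.size(); ++i)
      if (refs[i].version > 1) { keep = i; break; }
    for (std::size_t i = 0; i < refs.size(); ++i)
      if (i != keep) { md[refs[i].dir].erase(refs[i].name); ++removed; }
  }
  return removed;
}
```

In this model, a stale backtrace that still names an old parent causes `inject_ancestry` to add a second link to a directory; `resolve_duplicate_links` then keeps the link with version > 1 and drops the injected one, which mirrors the division of labour between scan_inodes and scan_links that the commit message describes.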