From 8b146ac820c21b2cda3b2d8f9faacdbe63c79a18 Mon Sep 17 00:00:00 2001
From: Zac Dover
Date: Fri, 5 May 2023 16:35:28 +1000
Subject: [PATCH] doc/cephfs: repairing inaccessible FSes

Add a procedure to doc/cephfs/troubleshooting.rst that explains how to
restore access to FileSystems that became inaccessible after
post-Nautilus upgrades. The procedure included here was written by
Harry G Coin and only lightly edited by me. I include him here as a
"co-author", but it should be noted that he did the heavy lifting on
this.

See the email thread here for more context:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/HS5FD3QFR77NAKJ43M2T5ZC25UYXFLNW/

Co-authored-by: Harry G Coin
Signed-off-by: Zac Dover
(cherry picked from commit 2430127c6e88834c5a6ec46fae15aad04d6d8551)
---
 doc/cephfs/troubleshooting.rst | 55 ++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/doc/cephfs/troubleshooting.rst b/doc/cephfs/troubleshooting.rst
index 78ad18ddeb4d1..e8754aedfd94a 100644
--- a/doc/cephfs/troubleshooting.rst
+++ b/doc/cephfs/troubleshooting.rst
@@ -188,6 +188,61 @@ You can enable dynamic debug against the CephFS module.
 
 Please see: https://github.com/ceph/ceph/blob/master/src/script/kcon_all.sh
 
+Filesystems Become Inaccessible After an Upgrade
+================================================
+
+.. note::
+   You can avoid ``operation not permitted`` errors by running this procedure
+   before an upgrade. As of May 2023, ``operation not permitted`` errors of
+   the kind discussed here have been seen after upgrades from Nautilus
+   (inclusive) to later major releases.
+
+IF
+
+you have CephFS file systems that have data and metadata pools that were
+created by a ``ceph fs new`` command (meaning that they were not created
+with the defaults)
+
+OR
+
+you have an existing CephFS file system and are upgrading to a new post-Nautilus
+major version of Ceph
+
+THEN
+
+in order for the documented ``ceph fs authorize...`` commands to function as
+documented (and to avoid ``operation not permitted`` errors when doing file
+I/O, or similar security-related problems for all users except the
+``client.admin`` user), you must first run::
+
+   ceph osd pool application set <metadata pool name> cephfs metadata <filesystem name>
+
+and::
+
+
+   ceph osd pool application set <data pool name> cephfs data <filesystem name>
+
+Otherwise, when the OSDs receive a request to read or write data (not the
+directory info, but file data) they will not know which Ceph file system name
+to look up. This is also true of pool names, because the 'defaults' themselves
+changed in the major releases, from::
+
+   data pool=fsname
+   metadata pool=fsname_metadata
+
+to::
+
+   data pool=fsname.data
+   metadata pool=fsname.meta
+
+Any setup that used ``client.admin`` for all mounts did not run into this
+problem, because the admin key gave blanket permissions.
+
+A temporary fix is to switch mount requests to the ``client.admin`` user and
+its associated key. A less drastic, but only partial, fix is to change the
+OSD cap for your user to just ``caps osd = "allow rw"`` and to delete
+``tag cephfs data=....``
+
 Reporting Issues
 ================
 
-- 
2.39.5
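
For illustration, here is a minimal worked sketch of the commands that the
added section describes, assuming a hypothetical file system named "myfs"
whose pools were created manually as "myfs_metadata" and "myfs_data" (all
three names are assumptions; substitute your own pool and file system names):

   # Hypothetical names: file system "myfs", pools "myfs_metadata" and "myfs_data".
   ceph osd pool application set myfs_metadata cephfs metadata myfs
   ceph osd pool application set myfs_data cephfs data myfs

   # Confirm that each pool's "cephfs" application now records the file system name.
   ceph osd pool application get myfs_metadata
   ceph osd pool application get myfs_data

   # The partial workaround mentioned at the end of the section: relax the OSD cap
   # for a hypothetical client.foo so it no longer depends on "tag cephfs data=...".
   # The mon and mds caps shown are assumptions; keep whatever caps your client
   # already has.
   ceph auth caps client.foo mon 'allow r' mds 'allow rw' osd 'allow rw'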