Patrick Donnelly [Tue, 27 Feb 2024 00:44:27 +0000 (19:44 -0500)]
mds: skip sr moves when target is an unlinked dir
A directory in the stray directory cannot have any HEAD inodes with caps so
there is no need to move anything to the snaprealm opened for the unlinked
directory.
Following the parent commit's reproducer, the behavior now looks as expected:
Discussions with Dan van der Ster led to the creation of this patch.
Fixes: https://tracker.ceph.com/issues/53192
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Dan van der Ster <dan.vanderster@clyso.com>
(cherry picked from commit c190a3f1633e9282772e5ec54fe10556856a2540)
Patrick Donnelly [Fri, 12 Nov 2021 00:43:27 +0000 (19:43 -0500)]
mds: memoize descendent results during realm splits
This change uses an unordered_map to memoize results of CInode::is_ancestor_of
so that subsequent invocations can skip directory inodes which are already
known to not be a descendent of the target directory.
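The memoization described above can be sketched in Python. This is only an illustration of the idea, not the MDS code: the Node class, its fields, and the dict-based memo are hypothetical stand-ins for CInode and the unordered_map, and the walk assumes each inode has a single parent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)  # eq=False keeps identity hashing, like keying a map by pointer
class Node:
    """Hypothetical stand-in for an inode; not a Ceph type."""
    name: str
    parent: Optional["Node"] = None
    is_dir: bool = True


def is_ancestor_of(target: Node, other: Optional[Node],
                   memo: Dict[Node, bool]) -> bool:
    """Return True if `target` is `other` or one of its ancestors.

    `memo` maps directories already examined to the answer for that
    directory, so later walks can stop as soon as they reach one.
    """
    walked: List[Node] = []  # directories seen on this walk, not yet memoized
    result = False
    node = other
    while node is not None:
        if node.is_dir:
            cached = memo.get(node)
            if cached is not None:
                result = cached  # everything below this directory shares its answer
                break
            walked.append(node)
        if node is target:
            result = True
            break
        node = node.parent
    for directory in walked:
        memo[directory] = result
    return result


# Example: a/b/c lives under the target "a"; x/y does not.
root = Node("root")
a = Node("a", root)
c = Node("c", Node("b", a))
y = Node("y", Node("x", root))
memo: Dict[Node, bool] = {}
print(is_ancestor_of(a, c, memo), is_ancestor_of(a, y, memo), len(memo))
```

Only directories are stored in the memo, which matches the worst case described in the commit message: the map approaches the number of in-memory inodes only when every inode is a directory.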
In the worst case, this unordered_map can grow to the number of inodes in
memory when all inodes are directories and at least one client has a cap for
each inode. However, in general this will not be the case. The size of each
entry in the map will be a 64-bit pointer and bool. The total size will vary
across platforms but we can say that with a conservative estimate of 192 bits /
entry overhead (including the entry linked list pointer in the bucket), the map
will grow to ~24MB / 1M inodes.
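As a quick check of the estimate above, the arithmetic can be written out. The 192 bits per entry figure is the conservative estimate from this commit message; the function name is made up for illustration.

```python
BITS_PER_ENTRY = 192  # conservative per-entry estimate quoted in the commit message
ENTRIES = 1_000_000   # one map entry per directory inode with caps


def memo_map_bytes(entries: int, bits_per_entry: int = BITS_PER_ENTRY) -> int:
    """Estimated map size in bytes for the given number of entries."""
    return entries * bits_per_entry // 8


total = memo_map_bytes(ENTRIES)
print(f"{total / 1e6:.0f} MB for {ENTRIES:,} entries")
```

192 bits is 24 bytes, so 1M entries come to roughly 24MB, in line with the figure quoted above.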
The result of this change is not eye-popping but it does have a significant performance advantage.
For an unpatched MDS with 1M inodes with caps in the global snaprealm (with debugging commits preceding this one):
or about 1,338ms. This caused a split of 100k inodes. This takes more time
because directories are actually moved to the snaprealm with a lot of list
twiddling for caps.
or about 840ms. This can be easily done by making a directory in one of the
trees created (see reproducer below).
Reproducing can be done with:
for ((i =0; i < 10; i++)); do (pushd $(mktemp -d -p . ); for ((j = 0; j < 30; ++j)); do mkdir "$j"; pushd "$j"; done; for ((j = 0; j < 10; ++j)); do for ((k = 0; k < 10000; ++k)); do mkdir $j.$k; done & done) & done
to make 1M directories. We put the majority of directories in a 30-deep nesting
to exercise CInode::is_ancestor_of with some worst-case type scenario.
Make sure all debugging configs are disabled for the MDS/clients. Make sure the
client has a cache size to accommodate 1M caps. Make at least one snapshot:
It is not necessary to delete any snapshots to reproduce this behavior. It's
only necessary to have a lot of inodes_with_caps in a snaprealm and effect a
split.
Fixes: https://tracker.ceph.com/issues/53192
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit a0ccb79fa0806792c7ee666c667328a8aeb09e97)
Aashish Sharma [Wed, 4 Oct 2023 09:07:42 +0000 (14:37 +0530)]
mgr/dashboard: upgrade from old 'graph' type panels to the new
'timeseries' panel
The graph panel type is deprecated, and disappears after Grafana v9.1 (current version is 10.0) to prevent more old type panels being created. These should be migrated to the timeseries panel type, to avoid potential problems with future Grafana versions.
* refs/pull/56479/head:
pybind/mgr/devicehealth: skip legacy objects that cannot be loaded
qa: test devicehealth legacy load of deleted snap obj
qa: allow failing whatever the active mgr is
qa: add unit tests for MgrMap down flag
mon/MgrMonitor: add "down" setting to simplify testing
Niklas Hambüchen [Sat, 30 Mar 2024 16:42:48 +0000 (17:42 +0100)]
doc/rados/operations: Improve crush_location docs
* Fix incorrect syntax
* Use underscores for config options, like other ceph docs did
* Fix incorrect statement that crush_location_hook adds fields; it replaces
* Explain `root=default host=HOSTNAME` is not set if `crush_location` is given
* Remove duplication across sections
* Point out that `root=default` is important
Afreen [Fri, 1 Mar 2024 07:26:25 +0000 (12:56 +0530)]
mgr/dashboard: Locking improvements in bucket create form
Fixes https://tracker.ceph.com/issues/64658
- Addition of help texts
- Addition of info/warnings related to modes and versioning
- change of Locking section layout
- renaming locking to 'Object Locking'
- changes default retention period to 10
- edit bucket only shows lock when it's enabled
Patrick Donnelly [Wed, 27 Mar 2024 13:02:43 +0000 (09:02 -0400)]
Merge PR #54468 into reef
* refs/pull/54468/head:
mds,client: update the oldest_client_tid via the renew caps
mds: add trim_completed_request_list() helper
client: return false if cannot link all the way to mountpoint
client: use the fs' full path instead of from mountpoint's root
qa/tasks/cephfs/test_admin: run root_squash tests only for FUSE client
qa/tasks/cephfs: Add reproducer for https://tracker.ceph.com/issues/56067
qa: add test for checking access in client side of root_squash
qa: add sudo paramter for read_file()
test/libcephfs: remove reduntant test for acccess
mds/Server: disallow clients that have root_squash
mds/Locker: remove session check access when doing cap updates
client: check the cephx mds auth access for open
client: always set the caller_uid/gid to -1
mds: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
client: check the cephx mds auth access for setattr
client: save the cap_auths in client when session being opened
client: add make_path_string() helpers support
client: add _get_root_ino() helper support
test/libcephfs: add a tag for each test unique directory
client: rename MAY_* to CLIENT_MAY_* to avoid conflicts
mds: send the cap_auths to clients when openning the sessions
mds: add cap_auths in MClientSession
mds: add MDSCapAuth support
mds: encode/decode the MDSCapMatch
mds: add assign operator support for MDSCapMatch
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Xiubo Li [Thu, 19 Oct 2023 02:20:55 +0000 (10:20 +0800)]
client: use the fs' full path instead of from mountpoint's root
The mountpoint's root ino# may not be the root of the full CephFS
filesystem; it's just the mountpoint of this particular client.
Just prepend the mountpoint path to the full path.
Introduced-by: c1bf8d88e9d client: check the cephx mds auth access for setattr
Introduced-by: ce216595c03 client: check the cephx mds auth access for open
Fixes: https://github.com/ceph/ceph/pull/48027#issuecomment-1741019086
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit e46dc20cdfb157f94781032451057d1e138535cc)
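A minimal sketch of the path handling this commit describes, assuming the client knows its mountpoint path within the filesystem and a path relative to that mountpoint. The function name and the example paths are hypothetical and do not come from the client code.

```python
import posixpath


def full_fs_path(mount_path: str, path_from_mount_root: str) -> str:
    """Prepend the mountpoint path so the result is rooted at the filesystem root."""
    joined = posixpath.join("/",
                            mount_path.strip("/"),
                            path_from_mount_root.lstrip("/"))
    return posixpath.normpath(joined)


# A client mounted at a subdirectory sees "/dir/file", but an auth check
# against path-restricted caps needs the path from the filesystem root.
print(full_fs_path("/volumes/g1", "/dir/file"))
```

When the mountpoint is the filesystem root itself, the result is simply the original path.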
Ramana Raja [Mon, 8 Aug 2022 18:33:06 +0000 (14:33 -0400)]
qa/tasks/cephfs: Add reproducer for https://tracker.ceph.com/issues/56067
A kernel CephFS client with MDS root_squash caps is able to write to a
file as non-root user. However, the data written is lost after clearing
the kernel client cache, or re-mounting the client. This issue is not
observed with a FUSE CephFS client.
Xiubo Li [Wed, 2 Nov 2022 01:12:16 +0000 (09:12 +0800)]
qa: add test for checking access in client side of root_squash
Test 'chown' and 'truncate', which call setattr, and 'cat', which
opens the file. Before each test, open the file as a non-root user
and keep it open to make sure the Fxw caps are issued, then use
'sudo' to run the tests, which sets the uid/gid to 0/0.
Ramana Raja [Tue, 15 Nov 2022 19:00:24 +0000 (14:00 -0500)]
mds/Server: disallow clients that have root_squash
... MDS auth caps but don't have CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK
feature bit (i.e., can't check the auth caps sent back to it by the
MDS) from establishing a session. Do this in
Server::handle_client_session(), and Server::handle_client_reconnect(),
where old clients try to reconnect to MDS servers after an upgrade.
If the client doesn't have the ability to authorize session access
based on the MDS auth caps sent back to it by the MDS, then the
client may buffer changes locally during open and setattr operations
when it's not supposed to, e.g., when enforcing root_squash MDS auth
caps.
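The admission rule described above can be sketched as follows. This is a simplified illustration rather than the logic in Server::handle_client_session(): the ClientInfo type, its fields, and the string used for the feature are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical label standing in for the CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK bit.
FEATURE_MDS_AUTH_CAPS_CHECK = "mds_auth_caps_check"


@dataclass(frozen=True)
class ClientInfo:
    """Hypothetical summary of what the MDS knows about a connecting client."""
    features: FrozenSet[str]
    has_root_squash_caps: bool


def may_establish_session(client: ClientInfo) -> bool:
    """Reject clients that hold root_squash caps but cannot check auth caps."""
    if (client.has_root_squash_caps
            and FEATURE_MDS_AUTH_CAPS_CHECK not in client.features):
        return False
    return True


old_client = ClientInfo(features=frozenset(), has_root_squash_caps=True)
new_client = ClientInfo(features=frozenset({FEATURE_MDS_AUTH_CAPS_CHECK}),
                        has_root_squash_caps=True)
print(may_establish_session(old_client), may_establish_session(new_client))
```

Clients without root_squash caps are unaffected in this sketch, whether or not they advertise the feature.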
Xiubo Li [Fri, 9 Sep 2022 04:17:06 +0000 (12:17 +0800)]
client: always set the caller_uid/gid to -1
Since setattr now checks the cephx mds auth access before
buffering the changes, it no longer makes sense to have the
cap update check the access in the MDS again.
Xiubo Li [Tue, 25 Apr 2023 09:31:25 +0000 (17:31 +0800)]
client: add make_path_string() helpers support
This will be used to get the path string for the mds auth check. It
may fail when there is no dentry in the local cache, which can
happen when the last dentry is unlinked while the inode is still
open and the caller then tries to change the mode.
Patrick Donnelly [Thu, 21 Dec 2023 13:48:33 +0000 (08:48 -0500)]
pybind/mgr/devicehealth: skip legacy objects that cannot be loaded
The log after the test looks like:
2023-12-21T16:09:28.804+0000 7fbe7fd86700 0 [devicehealth DEBUG root] loading object ABC_DEADB33F_FA
2023-12-21T16:09:28.805+0000 7fbe7fd86700 0 [devicehealth DEBUG root] object rados.Object(ioctx=<rados.Ioctx object at 0x7fbeee0c4668>,key=ABC_DEADB33F_FA,nspace=--default--,locator=None) does not exist because it is deleted in HEAD
2023-12-21T16:09:28.805+0000 7fbe7fd86700 0 [devicehealth DEBUG root] finished reading legacy pool, complete = True
Credit to Greg Farnum for postulating the cause.
Fixes: https://tracker.ceph.com/issues/63882
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 5e6fc0bf5f52732966d5cf2987e679abee8a384d)
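The skip behavior shown in the log above can be sketched like this. It is a self-contained illustration, not the devicehealth module's code: the ObjectNotFound class here is a local stand-in for rados.ObjectNotFound, and load_legacy_objects and read_object are hypothetical names.

```python
class ObjectNotFound(Exception):
    """Local stand-in for rados.ObjectNotFound so the sketch runs without librados."""


def load_legacy_objects(keys, read_object, log=print):
    """Load each legacy object, skipping any that cannot be read.

    `read_object` is a hypothetical callable that returns the object's
    data or raises ObjectNotFound.
    """
    loaded = []
    for key in keys:
        log(f"loading object {key}")
        try:
            loaded.append((key, read_object(key)))
        except ObjectNotFound:
            # Listed in the pool (for example, still present in a snapshot)
            # but deleted in HEAD: skip it instead of failing the whole load.
            log(f"object {key} does not exist because it is deleted in HEAD")
    return loaded


def fake_read(key):
    """Pretend store in which one listed object is no longer readable."""
    if key == "ABC_DEADB33F_FA":
        raise ObjectNotFound(key)
    return {"device": key}


result = load_legacy_objects(["ABC_DEADB33F_FA", "XYZ_1234"], fake_read)
print([key for key, _ in result])
```

Without the except clause, the first unreadable object would raise out of the loop and stop the whole load.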
Patrick Donnelly [Thu, 21 Dec 2023 15:39:03 +0000 (10:39 -0500)]
qa: test devicehealth legacy load of deleted snap obj
The failure without the fix looks like:
2023-12-21T16:05:55.737+0000 7fbe585b0700 0 [devicehealth DEBUG root] loading object ABC_DEADB33F_FA
2023-12-21T16:05:55.737+0000 7fbe585b0700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'devicehealth' while running on mgr.x: [errno 2] RADOS object not found (Failed to operate read op for oid ABC_DEADB33F_FA)
2023-12-21T16:05:55.737+0000 7fbe585b0700 -1 devicehealth.serve:
2023-12-21T16:05:55.737+0000 7fbe585b0700 -1 Traceback (most recent call last):
File "/home/pdonnell/ceph/src/pybind/mgr/devicehealth/module.py", line 394, in serve
self._do_serve()
File "/home/pdonnell/ceph/src/pybind/mgr/mgr_module.py", line 524, in check
return func(self, *args, **kwargs)
File "/home/pdonnell/ceph/src/pybind/mgr/devicehealth/module.py", line 354, in _do_serve
finished_loading_legacy = self.check_legacy_pool()
File "/home/pdonnell/ceph/src/pybind/mgr/devicehealth/module.py", line 326, in check_legacy_pool
if self._load_legacy_object(ioctx, obj.key):
File "/home/pdonnell/ceph/src/pybind/mgr/devicehealth/module.py", line 300, in _load_legacy_object
ioctx.operate_read_op(op, oid)
File "rados.pyx", line 3723, in rados.Ioctx.operate_read_op
rados.ObjectNotFound: [errno 2] RADOS object not found (Failed to operate read op for oid ABC_DEADB33F_FA)