This is the MVP for a driver for RGW that operates on top of a POSIX
filesystem. It supports get, put, list, copy, multipart, external
access via the filesystem itself, and ordered bucket listings via an
LRU-based cache.
Note that this is currently a Filter, intended to run on top of dbstore.
This is because it currently doesn't have any User implementation, so it
depends on dbstore's User. Everything else is implemented in
POSIXDriver. Once there is a User implementation, this will become a
Store, instead of a Filter.
Commit messages from bucket listing cache:
rgw/posixdriver: recycle lmdb database handles as required
While LMDB workflows often do not close/return database handles,
ours continually reuses them. This requires us to close each
handle (atomically) when a cache entry is recycled.
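The handle recycling described above can be illustrated with a small LRU sketch. This is not the driver's C++ code and does not use LMDB; `HandleCache`, `open_handle`, and `close_handle` are hypothetical names, and a lock stands in for whatever atomicity the real cache uses.

```python
from collections import OrderedDict
import threading


class HandleCache:
    """LRU cache that closes a handle when its entry is recycled (hypothetical)."""

    def __init__(self, capacity, open_handle, close_handle):
        self._capacity = capacity
        self._open = open_handle      # assumed callback: name -> handle
        self._close = close_handle    # assumed callback: handle -> None
        self._entries = OrderedDict()
        self._lock = threading.Lock()

    def get(self, name):
        # Lookup, eviction, and close happen under one lock so that no
        # caller can observe a handle that is being recycled.
        with self._lock:
            if name in self._entries:
                self._entries.move_to_end(name)  # mark as most recently used
                return self._entries[name]
            if len(self._entries) >= self._capacity:
                _, old = self._entries.popitem(last=False)  # least recently used
                self._close(old)
            handle = self._open(name)
            self._entries[name] = handle
            return handle
```

With a capacity of 2, requesting "a", "b", "a", "c" in turn recycles the entry for "b" and closes only its handle.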
rgw/posixdriver: don't instantiate bucket cache entries from notify events
rgw/posixdriver: incorporate lmdb-safe for now
The current inclusion is based on https://github.com/Martchus/lmdb-safe,
which is actively maintained but currently has some packaging issues the
author has agreed to accept fixes for.
For now, skip the submodule to save time and remove an external dependency.
rgw/posixdriver: fix listing of cached, empty bucket
* check lmdb enumeration result in all cases and w/better style
* add unit test for enumeration of an empty cached directory
rgw/posixdriver: nest lmdbs in a directory under the dbroot path to avoid cleanup issues
rgw/posixdriver: refactor for posix integration
* Derive BucketCache types as templates on a SAL driver and SAL
bucket pair.
* Integrate cache fills as callbacks into SAL layer (or mock, for
tests)
* Renaming and cleanups
rgw/posixdriver: add bucket cache implementation and tests
Adds free-standing cache of buckets and object names, with
bucket names (and listing attributes, upcoming) managed in
a hashed set of lmdb databases, which provides ordering and
a high-performance listing cache.
A framework for notification of new object creation (e.g.,
outside the S3 workflow) is provided, along with a Linux
implementation using inotify.
FindLMDB.cmake taken with attribution and license.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Milind Changire [Tue, 29 Aug 2023 14:43:50 +0000 (20:13 +0530)]
Merge PR #52686 into main
* refs/pull/52686/head:
PendingReleaseNotes: note about mandatory fs argument
doc/cephfs: add note about mandatory --fs argument to snap-schedule
qa: add test for mandatory fs argument to snap-schedule commands
mgr/snap-schedule: tweaks to keep mypy happy
mgr/snap_schedule: validate fs before execution
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Nizamudeen A [Tue, 29 Aug 2023 14:16:07 +0000 (19:46 +0530)]
mgr/dashboard: fix cephfs create form validator
The dashboard allowed creating a filesystem with "/" in its name, although
the CLI rejects such names with an error. A volume created this way in the
dashboard ended up as a stale volume.
Fixes: https://tracker.ceph.com/issues/62628
Signed-off-by: Nizamudeen A <nia@redhat.com>
Ilya Dryomov [Wed, 7 Jun 2023 19:34:07 +0000 (21:34 +0200)]
doc: drop mention of rbd_mirror_journal_max_fetch_bytes option
It was removed in commit 1ef12ea0d29f ("rbd-mirror: remove
rbd_mirror_journal_max_fetch_bytes option") in 2019. Commit 32375cb789d7 ("doc: misc clarity and capitalization") added a "tip"
mentioning it in 2020 in an attempt to capture advice [1].
Lucian Petrut [Thu, 24 Aug 2023 11:41:01 +0000 (11:41 +0000)]
rbd-wnbd: wait for disks to become available
After a WNBD mapping is created, the driver informs Storport that
the bus changed, expecting it to rescan the bus and expose the disk.
This is an asynchronous process and it usually takes a matter of
milliseconds, up to a few seconds under significant load.
The fsx librbd test occasionally fails to open the disk as it isn't
ready yet. Unlike the Python rbd-wnbd test, it doesn't perform
any polling.
For convenience, WNBD now includes a helper function called
WnbdPollDiskNumber. We're going to use it to wait for the new
disk attachments to become available.
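The waiting behaviour described above amounts to polling with a deadline. The sketch below is a generic illustration only: it does not call WnbdPollDiskNumber or any WNBD API, and `poll_until`, `probe`, and the default timeout and interval are assumptions chosen for the example.

```python
import time


def poll_until(probe, timeout=30.0, interval=0.1):
    """Call probe() until it returns a non-None value or the timeout expires.

    probe is a hypothetical callback that returns the disk number once the
    disk is visible, and None while it is not yet available.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("disk did not become available in time")
        time.sleep(interval)
```

A test would call something like `poll_until(lookup_disk_number)` right after creating the mapping and only then open the disk, where `lookup_disk_number` is again a hypothetical helper.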
Zac Dover [Wed, 23 Aug 2023 15:15:54 +0000 (01:15 +1000)]
doc/start: edit os-recommendations.rst
Improve the grammar in one sentence of the "Platforms" section of
doc/start/os-recommendations.rst. Improving that grammar involved
splitting the sentence into two sentences, but that's life. Update:
Anthony substantially rewrote this, so credit for this should rightly
go to him.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Milind Changire [Wed, 9 Aug 2023 11:20:40 +0000 (16:50 +0530)]
mgr/snap_schedule: validate fs before execution
Stop command execution if there is more than one filesystem and the --fs
argument is missing from the command line, i.e. do not fall back to the
first fs in the fsmap when more than one filesystem exists.
This ensures that the user doesn't mistakenly run the command against
the first fs by forgetting to specify the desired fs.
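The rule above can be sketched as a small resolver. This is an illustration under assumptions, not the snap_schedule module's code: `resolve_fs`, its parameters, and the error messages are hypothetical.

```python
def resolve_fs(fs_names, fs_arg=None):
    """Pick the filesystem a command should act on (hypothetical helper).

    fs_names: names of the filesystems currently in the fsmap.
    fs_arg:   value of --fs, or None if the argument was omitted.
    """
    if fs_arg is not None:
        if fs_arg not in fs_names:
            raise ValueError(f"no such filesystem: {fs_arg}")
        return fs_arg
    if not fs_names:
        raise ValueError("no filesystem available")
    if len(fs_names) > 1:
        # Refuse to guess: never silently fall back to the first fs.
        raise ValueError("multiple filesystems exist; --fs is required")
    return fs_names[0]
```

With a single filesystem the argument stays optional, so existing single-fs invocations keep working under this sketch.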
Venky Shankar [Fri, 25 Aug 2023 13:16:19 +0000 (18:46 +0530)]
Merge PR #52944 into main
* refs/pull/52944/head:
PendingReleaseNotes: add a note for `mds_session_metadata_threshold` mds config
test: add test to verify that a buggy client is blocklisted
mds: add perf counter to track number of sessions evicted due to metadata threshold being exceeded
mds: blocklist clients with "bloated" session metadata
Reviewed-by: Robin H. Johnson <robbat2@orbis-terrarum.net>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Nizamudeen A [Fri, 9 Jun 2023 14:06:41 +0000 (19:36 +0530)]
mgr/dashboard: adapt jest unit tests to angular 14
An important change is the introduction of `TypedFormControl`, which
is now stricter in typing FormControls and FormGroups.
Right now the dashboard has many untyped forms, which were migrated by
default to the `UntypedFormControl` class when I ran the angular
upgrade script.
Fixes: https://tracker.ceph.com/issues/61641
Signed-off-by: Nizamudeen A <nia@redhat.com>
Aashish Sharma [Wed, 23 Aug 2023 09:59:44 +0000 (15:29 +0530)]
mgr/dashboard: Create realm sets to default
On the Multisite page, when we create a realm, it is set as the default even if another realm is already the default and the default checkbox is unchecked during creation.
osd/SnapMapper:
Maintain the prefix_itr between calls to SnapMapper::get_next_objects_to_trim() to prevent searching depleted prefixes.
There are 8 distinct hash prefixes used for searching objects owned by a given PG.
On each call to SnapMapper::get_next_objects_to_trim() we start from the first prefix, even after all objects mapped to it have been depleted.
This means that we search 1 non-existing prefix after the first prefix is depleted, 2 after the first two prefixes are depleted, and so on, until we search 7 non-existing prefixes after the first 7 prefixes are depleted.
This is a performance improvement PR only!
It maintains the existing behavior and does not try to fix/change any of the TRIM logic.
I added an extra step after the last object is trimmed: a full scan of the DB is done, and ENOENT is returned only if no object is found.
This should make the new code no worse than the existing code, which returns ENOENT after a full scan finds no object.
It should not impact performance in real life snaps as it should only happen once per-snap.
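The idea of keeping the prefix position between calls, with one final full scan before reporting ENOENT, can be sketched as follows. This is a simplified model rather than the SnapMapper C++ code: `PrefixScanner`, the dict-backed `store`, the `lookups` counter, and the use of `LookupError` in place of ENOENT are all assumptions made for illustration.

```python
class PrefixScanner:
    """Remembers which prefix to search next across calls (hypothetical)."""

    def __init__(self, prefixes, store):
        self._prefixes = list(prefixes)
        self._store = store   # assumed layout: prefix -> list of object names
        self._idx = 0         # position kept between calls
        self.lookups = 0      # number of prefix searches, for illustration

    def _scan_from(self, start, max_objects):
        out = []
        idx = start
        while idx < len(self._prefixes) and len(out) < max_objects:
            bucket = self._store.get(self._prefixes[idx], [])
            self.lookups += 1
            room = max_objects - len(out)
            out.extend(bucket[:room])
            del bucket[:room]
            if bucket:
                break         # prefix still has objects; stay on it
            idx += 1          # prefix depleted; skip it on later calls
        self._idx = idx
        return out

    def next_objects(self, max_objects):
        out = self._scan_from(self._idx, max_objects)
        if not out:
            # One full rescan from the first prefix before giving up.
            out = self._scan_from(0, max_objects)
        if not out:
            raise LookupError("ENOENT")
        return out
```

In this model a depleted prefix is searched again only during the final full scan, which happens once, after everything has been trimmed.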
Added snap-mapper tests to rados-test-suite
Disabled osd_debug_trim_objects when running (SnapMapperTest, prefix_itr) to prevent asserts (as this code does illegal inserts into DELETED snaps)
Code beautification
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>