Adam King [Tue, 10 Oct 2023 18:00:27 +0000 (14:00 -0400)]
cephadm: remove --cleanup-on-failure flag
As discussed in the orch weekly, instead of having the
two flags, we'll just have the --no-cleanup-on-failure
flag on its own. This commit does not change the behavior
at all: cleanup still happens when --no-cleanup-on-failure
is not provided and is skipped when it is. This just
removes the additional flag.
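A minimal argparse sketch of the surviving flag (cephadm is
Python, but the wiring below is illustrative rather than the
actual cephadm source):

    import argparse

    parser = argparse.ArgumentParser(prog='cephadm bootstrap')
    parser.add_argument(
        '--no-cleanup-on-failure',
        action='store_true',
        default=False,
        help='do not delete cluster files on a failed bootstrap')
    args = parser.parse_args([])
    # Cleanup stays the default: it runs unless the flag was passed.
    cleanup_on_failure = not args.no_cleanup_on_failure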
Ville Ojamo [Fri, 3 Nov 2023 05:44:00 +0000 (12:44 +0700)]
doc/cephadm/services: remove excess rendered indentation in osd.rst
Start bash command blocks at the left margin, removing the
excessive padding/indentation that would render the
block too far to the right.
At the same time, indent the source consistently:
- Two spaces for command blocks and output blocks.
- Four spaces for notes, code blocks.
There seems to be no uniform style for this; sometimes
commands are indented with three spaces, but two spaces
appears to be the most common. In the end it all renders
the same anyway.
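For illustration, the resulting source convention looks roughly
like this (a hedged example; the actual commands in osd.rst
differ):

    .. prompt:: bash #

      ceph orch apply osd --all-available-devices

    .. note::

        Notes and code blocks are indented four spaces.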
Signed-off-by: Ville Ojamo <14869000+bluikko@users.noreply.github.com>
osd/SnapMapper: maintain the prefix_itr between calls to SnapMapper::get_next_objects_to_trim()
Maintain the prefix_itr between calls to SnapMapper::get_next_objects_to_trim() to prevent searching depleted prefixes.
We have 8 distinct hash prefixes used when searching for objects owned by a given PG.
On each call to SnapMapper::get_next_objects_to_trim() we started from the first prefix, even after all objects mapped to it had been depleted.
This means we would search 1 depleted prefix in vain after the first prefix was exhausted, 2 after the first two were exhausted, and so on, up to 7 wasted prefix searches after the first 7 prefixes were exhausted.
This is a performance improvement PR only!
It maintains the existing behavior and does not try to fix/change any of the TRIM logic.
I added an extra step after the last object is trimmed: a full scan of the DB, returning ENOENT only if no object was found.
This makes the new code no worse than the existing code, which returns ENOENT only after a full scan finds no object.
It should not impact performance in real-life snap trimming, as it only happens once per snap.
Added snap-mapper tests to the rados-test-suite.
Disabled osd_debug_trim_objects when running (SnapMapperTest, prefix_itr) to prevent asserts (as this code does illegal inserts into DELETED snaps).
Code beautifying.
Disabled the assert because of a corner case: when we retrieve the last valid object(s) in a snap,
the prefix_itr is advanced past the last valid value (as we have completed a full scan).
If the OSD calls get_next_objects_to_trim() again before the retrieved object(s) have been processed and removed from the SnapMapper DB, they won't be found by the next call (as the prefix_itr is invalid).
The objects will then be found in the second pass, which looks as if they were added after the trim started (which is illegal) and triggers an ASSERT.
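A minimal Python sketch of the prefix_itr idea (the real
implementation is C++ in SnapMapper; names and data layout here
are illustrative only):

    class PrefixScanner:
        NUM_PREFIXES = 8  # distinct hash prefixes per PG

        def __init__(self, objects_by_prefix):
            # objects_by_prefix: NUM_PREFIXES lists of objects still
            # awaiting trim, one list per hash prefix.
            self.objects_by_prefix = objects_by_prefix
            self.prefix_idx = 0  # kept between calls (the "prefix_itr")

        def get_next_objects_to_trim(self, max_objects):
            out = []
            # Resume from the saved position instead of prefix 0, so a
            # depleted prefix is never searched again.
            while self.prefix_idx < self.NUM_PREFIXES and len(out) < max_objects:
                bucket = self.objects_by_prefix[self.prefix_idx]
                while bucket and len(out) < max_objects:
                    out.append(bucket.pop(0))
                if not bucket:
                    self.prefix_idx += 1  # depleted; skip on next call
            # An empty result triggers the caller's final full scan;
            # only if that also finds nothing is ENOENT returned.
            return out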
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
Ionut Balutoiu [Wed, 1 Nov 2023 16:07:58 +0000 (18:07 +0200)]
rgw: fix cloud-sync multi-tenancy scenario
At the moment, we cannot set buckets prefixed with the tenant ID in the
`source_bucket` field of cloud-sync profiles (non-trivial config):
https://docs.ceph.com/en/latest/radosgw/cloud-sync-module/#non-trivial-configuration
This is because the `do_find_profile` function searches the configured
profiles by `bucket.name` only, ignoring `bucket.tenant`.
This is problematic in the RGW multi-tenancy scenario:
https://docs.ceph.com/en/latest/radosgw/multitenancy/#rgw-multi-tenancy
At the moment, we can only configure a bucket name in the profile's
`source_bucket` field. In the multi-tenancy scenario, this syncs
the matching buckets from all tenants.
Without this fix, we cannot configure a cloud-sync profile that syncs
all the buckets from a tenant to a particular S3 target.
For example, we cannot do this:
* `tenantA/test-bucket` -> S3 target A
* `tenantB/test-bucket` -> S3 target B
* `tenantC/test-bucket` -> S3 target C
We can only do this at the moment:
* `test-bucket` -> S3 target A
If `test-bucket` is present in both `tenantA` and `tenantB`, both
buckets will be synced to S3 target A.
The idea is to be able to do this:
* `tenantA/*` -> S3 target A
* `tenantB/*` -> S3 target B
* `tenantC/*` -> S3 target C
If `test-bucket` is present in all tenants, each tenant bucket is
synced to its own S3 target.
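A hedged Python sketch of a tenant-aware lookup; the real
do_find_profile() is C++ inside RGW, and the matching helper and
field names below are hypothetical:

    from fnmatch import fnmatch

    def find_profile(profiles, bucket):
        for p in profiles:
            tenant_pat, _, name_pat = p['source_bucket'].rpartition('/')
            if not fnmatch(bucket['name'], name_pat):
                continue
            # The fix: when the profile carries a tenant part, match it
            # against bucket.tenant instead of comparing names only.
            if tenant_pat and not fnmatch(bucket['tenant'], tenant_pat):
                continue
            return p
        return None

    profiles = [{'source_bucket': 'tenantA/*', 'target': 'S3 target A'},
                {'source_bucket': 'tenantB/*', 'target': 'S3 target B'}]
    match = find_profile(profiles, {'tenant': 'tenantB', 'name': 'test-bucket'})
    assert match['target'] == 'S3 target B'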
Zac Dover [Wed, 1 Nov 2023 01:53:59 +0000 (11:53 +1000)]
doc/cephadm: edit troubleshooting.rst (1 of x)
Edit doc/cephadm/troubleshooting.rst. This commit and the PR of which it
is a part were raised in response to
https://github.com/ceph/ceph/pull/53976. The limits of reStructuredText
are particularly visible here in every instance of a bash for-loop and
in every command that stretches over multiple lines.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
Rishabh Dave [Thu, 26 Oct 2023 10:48:31 +0000 (16:18 +0530)]
cmake: add --progress flag to git submodule update commands
Ceph has lots of submodules that need to be cloned before building
binaries from the repository. Seeing the progress while these submodules
are being cloned is useful, especially for developers and users with a
slow or unreliable network.
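With the flag added, the effective clone command becomes
something like:

    git submodule update --init --recursive --progress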
Zac Dover [Mon, 30 Oct 2023 02:37:39 +0000 (12:37 +1000)]
doc/glossary: improve "BlueStore" entry
Initially s/backend/back end/ but then I added a little more information
about BlueStore's use of RocksDB to map object names to block locations
on disk.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
There is no need for CreateSnapshotRequests.__del__(), which calls
CreateSnapshotRequests.wait_for_pending():
MirrorSnapshotScheduleHandler.shutdown() already calls
CreateSnapshotRequests.wait_for_pending().
Ramana Raja [Thu, 26 Oct 2023 17:18:52 +0000 (13:18 -0400)]
mgr/rbd_support: fix recursive locking on CreateSnapshotRequests lock
The MirrorSnapshotScheduleHandler's run thread issues asynchronous
create snapshot requests using a CreateSnapshotRequests instance. When
the thread invokes a CreateSnapshotRequests instance's get_ioctx(),
the instance's class variable lock is acquired. With the class
variable lock held, garbage collection of a CreateSnapshotRequests
instance may run on the same thread. The thread would then call
CreateSnapshotRequests.__del__(), which tries to acquire the class
variable lock that the thread already holds. Fix this
recursive deadlock by converting the CreateSnapshotRequests lock from
a class variable to an instance variable. There is no need to share
the lock across CreateSnapshotRequests instances.
Also convert MirrorSnapshotScheduleHandler, PerfHandler and
TrashPurgeScheduleHandler class variables to instance variables
that don't need to be shared across the instances.
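A minimal sketch of the deadlock and the fix; the actual code
lives in the rbd_support mgr module and differs in detail:

    import threading

    class Buggy:
        lock = threading.Lock()  # class variable: shared by all instances

        def get_ioctx(self):
            with Buggy.lock:
                pass  # if GC runs __del__ of another instance right
                      # here, on this same thread...

        def __del__(self):
            with Buggy.lock:  # ...this blocks forever: threading.Lock
                pass          # is not reentrant, and this thread
                              # already holds it

    class Fixed:
        def __init__(self):
            self.lock = threading.Lock()  # instance variable: per object

        def get_ioctx(self):
            with self.lock:
                pass  # __del__ of a different instance acquires a
                      # different lock, so no recursive acquisition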
Fixes: https://tracker.ceph.com/issues/62994
Signed-off-by: Ramana Raja <rraja@redhat.com>
Co-Authored-By: Ilya Dryomov <idryomov@gmail.com>
Adam Emerson [Sat, 28 Oct 2023 17:29:59 +0000 (13:29 -0400)]
build: Fix fmt version check
Currently, when attempting to build Ceph on a system with fmt
installed, we try to build against it regardless of its version. This
constantly breaks people's builds, since newer versions of fmt often
change the API.
This change requires the installed version to be at or above 8.1.1 and
below 10, so that on systems with a newer fmt we fall back to using
the submodule.
It also removes the `Findfmt.cmake` module, as it does not check
the installed version. Instead, we use the CMake config file installed
by the system package of fmt, which does support version checking.
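A hedged CMake sketch of the version-gated lookup (version
ranges in find_package need CMake >= 3.19; the real logic in
Ceph's build differs):

    find_package(fmt 8.1.1...<10 CONFIG)
    if(NOT fmt_FOUND)
      add_subdirectory(src/fmt)  # fall back to the bundled submodule
    endif()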
Aashish Sharma [Mon, 30 Oct 2023 07:47:37 +0000 (13:17 +0530)]
mgr/dashboard: update rgw multisite import form helper info
Change 'To obtain the token, generate it from your secondary Ceph cluster' to 'To obtain the token, generate it from your primary Ceph cluster' in the RGW multisite import form helper.