Kefu Chai [Fri, 26 Feb 2021 01:38:26 +0000 (09:38 +0800)]
ceph-kvstore-tool: define a noexcept non-default ctor for Deleter
the deleter of a unique_ptr<> should be value-initialized if we use
`unique_ptr()` for constructing the unique_ptr, but somehow, `Deleter`
does have a user-defined constructor which prevents the compiler from
creating a default constructor which could have made Deleter default
constructible.
in this change, a constructor that accepts no arguments is explicitly defined
to satisfy the requirements for creating `db` using `unique_ptr<>()`.
this change is not cherry-picked from master, as we don't define a
constructor at all for Deleter, so it is *always* value-initialized.
Kotresh HR [Tue, 1 Dec 2020 10:44:17 +0000 (16:14 +0530)]
tasks/cephfs/test_volume_client: Add tests for authorize/deauthorize
1. Add testcase for authorizing auth_id which is not added by
ceph_volume_client
2. Add testcase to test 'allow_existing_id' option
3. Add testcase for deauthorizing auth_id which has got its caps
updated out of band
Optionally allow authorizing auth-ids not created by ceph_volume_client
via the option 'allow_existing_id'. This can help existing deployers
of manila to disallow/allow authorization of pre-created auth IDs
via a manila driver config that sets 'allow_existing_id' to False/True.
Kotresh HR [Thu, 26 Nov 2020 09:18:16 +0000 (14:48 +0530)]
pybind/ceph_volume_client: Preserve existing caps while authorize/deauthorize auth-id
Authorize/Deauthorize used to overwrite the caps of auth-id which would
end up deleting existing caps. This patch fixes the same by retaining
the existing caps by appending or deleting the new caps as needed.
This patch also prevents ceph_volume_client from authorizing an auth_id
that was not created by ceph_volume_client. Such auth_ids could be
created by other means for other use cases which should not be modified
by ceph_volume_client.
Fixes: https://tracker.ceph.com/issues/48555
Signed-off-by: Ramana Raja <rraja@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 3a85d2d04028a323952a31d18cdbefb710be2e2b)
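The append-or-delete behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in ceph_volume_client: the helper name `merge_caps` and the assumption that a cap string is a comma-separated list of clauses are both made up for this example.

```python
def merge_caps(existing: str, wanted: str, remove: bool = False) -> str:
    """Add or remove one cap clause without touching the other clauses.

    Hypothetical helper: assumes caps are a comma-separated list of clauses.
    """
    clauses = [c.strip() for c in existing.split(",") if c.strip()]
    wanted = wanted.strip()
    if remove:
        # Deauthorize: drop only the matching clause, keep everything else.
        clauses = [c for c in clauses if c != wanted]
    elif wanted not in clauses:
        # Authorize: append instead of overwriting the existing caps.
        clauses.append(wanted)
    return ",".join(clauses)


if __name__ == "__main__":
    caps = merge_caps("allow rw path=/volumes/a", "allow rw path=/volumes/b")
    print(caps)
    print(merge_caps(caps, "allow rw path=/volumes/b", remove=True))
```

The point of the sketch is only that authorize and deauthorize become edits to the existing cap string rather than replacements of it.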
Adam Kupczyk [Wed, 1 Jul 2020 21:09:17 +0000 (23:09 +0200)]
os/bluestore: Add documentation for large bluefs log recovery
Adds additional paragraph to ceph-bluestore-tool documentation,
describing how to use *special* options --bluefs_replay_recovery
and --bluefs_replay_recovery_disable_compact to recover large
bluefs log.
shenhang [Thu, 27 Feb 2020 06:01:39 +0000 (14:01 +0800)]
mds: Using begin() and empty() to iterate the xlist
The item that p points to may be cleaned up while request_kill
processes the previous one.
Fixes: https://tracker.ceph.com/issues/44316
Signed-off-by: Shen Hang <harryshen18@gmail.com>
(cherry picked from 432ea90)
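The pattern behind this fix can be shown in Python as a rough analogy. This is not the MDS code: `kill_all` and `kill_with_dependents` are hypothetical names, and a deque stands in for the xlist. Instead of holding an iterator across a call that may remove other entries, the loop re-reads the head of the list each time until the list is empty.

```python
from collections import deque


def kill_all(requests, kill):
    """Drain `requests`; kill(req, requests) removes req and may remove others."""
    killed = []
    while requests:        # analogous to !xlist.empty()
        req = requests[0]  # analogous to xlist.begin(), re-read every pass
        kill(req, requests)
        killed.append(req)
    return killed


def kill_with_dependents(req, requests):
    """Hypothetical kill: also cleans up a dependent entry, if one is queued."""
    requests.remove(req)
    dependent = req + "-child"
    if dependent in requests:
        requests.remove(dependent)


if __name__ == "__main__":
    pending = deque(["a", "a-child", "b"])
    print(kill_all(pending, kill_with_dependents))
```

Because no position is cached between iterations, an entry that disappears as a side effect of killing the previous one is simply never visited.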
Sage Weil [Wed, 30 Aug 2017 02:07:05 +0000 (22:07 -0400)]
os/bluestore/BlueFS: compact log even when sync_metadata sees no work
It's possible that when sync_metadata() is called there won't be any new
log data to flush because it was already flushed for other reasons (e.g.,
because fsync was called). However, the log may still be large and in
need of compaction.
This is shown to corrupt otherwise healthy rocksdb databases. Rename to
make it clear that it is generally not safe to run and should only be used
as a last resort.
Conflicts:
PendingReleaseNotes: drop this change as "repair" command did
not exist in luminous before this change.
qa/workunits/cephtool/test_kvstore_tool.sh: drop this change,
as this test was not added before this change.
src/tools/ceph_kvstore_tool.cc: trivial resolution.
Kefu Chai [Thu, 15 Nov 2018 05:56:19 +0000 (13:56 +0800)]
tools/ceph_kvstore_tool: do not open rocksdb when repairing it
before this change, the `need_open_db` parameter is passed to the
constructor of BlueStore as `min_alloc_size`, and rocksdb will fail to
repair because Repairer::Run() also tries to acquire the db lock, and it
will fail to do so if the lock file is already acquired by
BlueStore::_mount().
Conflicts:
src/librbd/api/Mirror.cc
- C_ImageGetInfo ctor takes only two arguments in nautilus
- nautilus does not have LambdaContext as a class; use FunctionContext
instead
Jan Fajerski [Wed, 13 Nov 2019 09:13:01 +0000 (10:13 +0100)]
ceph-volume: assume msgrV1 for all branches containing mimic
With nautilus and newer, OSDs listen on both v1 and v2 ports. Assume
that if mimic (or luminous) occurs in the branch name, the OSDs are
running msgrv1 only.
Fixes: https://tracker.ceph.com/issues/42791
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit b8754919df61b118200e210e0bfc8d6df0261dfd)
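The branch-name heuristic can be sketched as below. The function name `assume_msgr_v1_only` is hypothetical and this is not the actual ceph-volume code; it only shows the substring check the message describes.

```python
def assume_msgr_v1_only(branch: str) -> bool:
    """Return True if the branch name suggests OSDs that only speak msgr v1."""
    # Per the commit message: mimic (or luminous) anywhere in the branch name.
    return any(release in branch for release in ("mimic", "luminous"))


if __name__ == "__main__":
    for name in ("mimic", "wip-mimic-fix", "luminous", "nautilus", "master"):
        print(name, assume_msgr_v1_only(name))
```

A substring match rather than an equality test is what lets work-in-progress branches such as a hypothetical `wip-mimic-fix` be treated like the release branch itself.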
This was broken by def50d04796, and implicitly fixed during
refactoring in the master (octopus) by adf1486e46c, hence it is a
direct commit to nautilus branch.
Conflicts:
src/tools/rbd_mirror/Mirror.cc (std::lock_guard vs Mutex::Locker, ceph_abort_msgf does not exist)
src/tools/rbd_mirror/PoolReplayer.cc (std::lock_guard vs Mutex::Locker, PoolReplayer is not a template)
Conflicts:
src/test/cli/osdmaptool/help.t (some options not present)
src/tools/osdmaptool.cc (ceph_assert is assert here)
src/test/cli/osdmaptool/missing-argument.t (usage included here)
Igor Fedotov [Thu, 16 Aug 2018 11:51:06 +0000 (14:51 +0300)]
os/bluestore: fix assertion in StupidAllocator::get_fragmentation
One might face an assertion (assert(intervals <= max_intervals))
in StupidAllocator::get_fragmentation method for clusters created
by early Luminous releases and before. The root cause is that block
volume size wasn't aligned with min_alloc_size and hence we missed
that last fraction interval during max_interval calculation.
Fixes: https://tracker.ceph.com/issues/43297
Note: This was a clean cherry-pick from master, but p2roundup was
only introduced in the mimic release, so P2ROUNDUP is used instead.
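To illustrate why an unaligned size loses the last fractional interval, here is a small sketch of power-of-two round-up. The helper names `p2roundup` and `alloc_units` and the sample sizes are assumptions for this example; the actual max_intervals formula in StupidAllocator is not reproduced.

```python
def p2roundup(x: int, align: int) -> int:
    """Round x up to the next multiple of align (align must be a power of two)."""
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a positive power of two")
    return (x + align - 1) & ~(align - 1)


def alloc_units(size: int, min_alloc_size: int, round_up: bool = True) -> int:
    """Count allocation units in a device, optionally including the partial tail."""
    if round_up:
        size = p2roundup(size, min_alloc_size)
    return size // min_alloc_size


if __name__ == "__main__":
    size, unit = 1_000_000, 65536  # hypothetical unaligned device size
    print(alloc_units(size, unit, round_up=False))  # tail unit is not counted
    print(alloc_units(size, unit))                  # tail unit is counted
```

With these sample numbers the truncating count is one unit short of the rounded-up count, which is the kind of off-by-one that could let an observed interval count exceed a computed maximum.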
Ali Maredia [Mon, 25 Nov 2019 02:30:03 +0000 (21:30 -0500)]
luminous: update s3-test download code for s3-test tasks
- Ensure the download code for all tasks running
s3-tests is consistent.
- Simplify download code to only use the config
variable 'force-branch' for the branch being
cloned.
- Make ceph-luminous the force-branch for all
suites using s3-tests.
- Add force-branch to suites running s3readwrite
& s3roundtrip tasks
osd/MissingLoc.cc: do not rely on missing_loc_sources only
In 624ade487ea4aeaf988cc1767e0b293f76addd5b, we relied on missing_loc_sources
to check for strays and remove an OSD from missing_loc. However, it is
possible that missing_loc_sources is empty while there are still OSDs
present in missing_loc. Since the aim is to just remove a stray OSD from
missing_loc, we do not need to rely on missing_loc_sources. We still
clean missing_loc_sources if any stray is present in it.
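A simplified model of the change, assuming missing_loc is a mapping from object id to a set of OSD ids and missing_loc_sources is a set of OSD ids. These Python structures and the name `remove_stray` are illustrative stand-ins, not the C++ types in the OSD.

```python
def remove_stray(missing_loc, missing_loc_sources, stray):
    """Remove a stray OSD from every location set, whatever the sources set says."""
    for locations in missing_loc.values():
        # Do not gate this on missing_loc_sources: it may be empty while
        # the stray is still listed as a location for some object.
        locations.discard(stray)
    # Still clean the sources set if the stray happens to be in it.
    missing_loc_sources.discard(stray)


if __name__ == "__main__":
    missing_loc = {"obj1": {1, 2}, "obj2": {2}}
    sources = set()  # empty, yet OSD 2 is still referenced above
    remove_stray(missing_loc, sources, 2)
    print(missing_loc, sources)
```

The sketch shows the case the message calls out: the sources set is empty, but the stray is still removed from the per-object locations.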
xie xingguo [Sat, 31 Aug 2019 02:17:57 +0000 (10:17 +0800)]
osd/PG: fix _finish_recovery vs repair race
On detecting a corrupted object, primary may automatically
repair that object by leveraging the existing recovery procedure,
which turned out to be racy with a previous unfinished _finish_recovery
callback - the problem would then be that _finish_recovery might
continue to purge some strays that we still want to pull data from.
Fix by re-checking if there are any newly added missing objects when
executing _finish_recovery.
Note that before https://github.com/ceph/ceph/pull/29756 we might
instead have to call needs_recovery to catch the race condition
since we did not evict pg from clean state when triggering an auto-repair.
Conflicts:
src/osd/PG.cc
- adjusted if conditional for luminous
- did not add the comment or state_clear(PG_STATE_REPAIR); those lines were
moved but don't exist in luminous.
Neha Ojha [Sat, 31 Aug 2019 01:15:58 +0000 (18:15 -0700)]
osd/MissingLoc, PeeringState: remove osd from missing loc in purge_strays()
We should always try to keep osds in missing_loc consistent with peer_missing
and peer_info. When we remove an osd from peer_missing and peer_info, we
should also remove it from missing_loc during purging strays.
Conflicts:
src/osd/MissingLoc.cc
src/osd/MissingLoc.h
src/osd/PeeringState.cc
- these files do not exist in luminous; made the changes manually to
src/osd/PG.cc and src/osd/PG.h
- ldout(cct, ...) -> ldout(pg->cct, ...)
We should have done this while cherry-picking from master, but we
didn't. And here we are now. It's simpler to apply this one-off patch
than going back to the cherry-picking maze to adjust this one thing.
Conflicts:
src/pybind/mgr/telemetry/module.py
Due to missing context resulting from missing patches.
PendingReleaseNotes
Dropped to prevent conflicts in the future
Note:
This commit was heavily modified. We wanted to provide the number of
ipv4 and ipv6 monitors in the report, so we rewrote that part so we
can report on it; but we had to drop everything else (msgr1 and
msgr2), as well as 'min_mon_release'. Those do not exist in
luminous. In the end, the commit message itself is misleading, but
we are somehow (*shrug*) opting for leaving the commit as the original.
Additionally, we removed PendingReleaseNotes changes to prevent
conflicts in the future.