crimson/os/seastore: measure transactional efforts that are discarded or committed
The efforts of a transaction include the number and bytes of its read,
mutate, retire and fresh extents, and the bytes of delta generated.
This helps to understand the following aspects:
* The ratio of discarded efforts vs committed efforts;
* The average efforts of a transaction;
* The distribution of read/mutate/delta/retire/fresh efforts;
* The memory overhead and potential disk overhead of a transaction;
* How early a transaction invalidation happens;
* The average extent length;
It is possible to extend the effort metrics to be labeled by extent
types, in case we want to distinguish and profile the efforts at the
sub-component level.
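As a rough illustration of the accounting described above, here is a minimal
Python sketch. It is not the seastore implementation (which is C++); the names
TransEfforts, EffortStats, touch() and discarded_byte_ratio() are hypothetical,
and summing extent bytes with delta bytes into one total is an assumption made
only to keep the example short.

```python
from dataclasses import dataclass, field

# Effort categories named in the commit message.
CATEGORIES = ("read", "mutate", "retire", "fresh")


@dataclass
class ExtentEffort:
    num: int = 0     # number of extents touched
    nbytes: int = 0  # total length of those extents


def _empty_extents():
    return {c: ExtentEffort() for c in CATEGORIES}


@dataclass
class TransEfforts:
    """Efforts of one transaction (hypothetical layout)."""
    extents: dict = field(default_factory=_empty_extents)
    delta_bytes: int = 0

    def touch(self, category, length):
        effort = self.extents[category]
        effort.num += 1
        effort.nbytes += length

    def merge(self, other):
        for c in CATEGORIES:
            self.extents[c].num += other.extents[c].num
            self.extents[c].nbytes += other.extents[c].nbytes
        self.delta_bytes += other.delta_bytes

    def total_bytes(self):
        return sum(e.nbytes for e in self.extents.values()) + self.delta_bytes


class EffortStats:
    """Accumulates efforts separately for committed and discarded transactions."""

    def __init__(self):
        self.committed = TransEfforts()
        self.discarded = TransEfforts()

    def record(self, efforts, committed):
        target = self.committed if committed else self.discarded
        target.merge(efforts)

    def discarded_byte_ratio(self):
        total = self.committed.total_bytes() + self.discarded.total_bytes()
        return self.discarded.total_bytes() / total if total else 0.0

    def avg_extent_length(self, category):
        effort = self.committed.extents[category]
        return effort.nbytes / effort.num if effort.num else 0.0
```

Labeling by extent type, as suggested above, would amount to keeping one such
accumulator per extent type.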
crimson/os/seastore: assert the committing delta is not empty
It makes no sense to commit an empty delta. An empty delta usually means
that the user forgot to generate a delta during mutation, or that there
were futile copy-on-write operations.
Sage Weil [Thu, 15 Jul 2021 15:05:22 +0000 (11:05 -0400)]
Merge PR #42073 into master
* refs/pull/42073/head:
doc/mgr/nfs: fix 'export apply', pool name
PendingReleaseNotes: document workaround for NFS storage change
qa/tasks/mgr/test_orchestrator_cli: fix test
qa/suites/orch/cephadm/mgr-nfs-upgrade: add test for nfs migration
mgr/cephadm: migrate nfs grace file
mgr/nfs: migrate pre-pacific nfs.ganesha-foo clusters to nfs.foo
doc/cephfs/fs-nfs-exports: document new export apply capabilities
qa/tasks/cephfs/test_nfs: define NFS_POOL_NAME
mgr/nfs: use NFS_POOL_NAME in test_nfs.py
mgr/nfs: test export apply on JSON list
mgr/nfs: add test for ganesha conf apply/import
qa/tasks/cephfs/test_nfs: retry mount a few times
mgr/cephadm: migrate all legacy nfs exports to new .nfs pool
mgr/nfs: adjust cephfs export caps if necessary
python-common: don't accept pool/ns for NFSServiceSpec
mgr/orchestrator: drop rados_config_location ServiceDescription property
mgr/cephadm: move rados_config_location() out of NFSServiceSpec
mgr/nfs: change nfs pool to .nfs
mgr/nfs/export: accept a JSON or ganesha EXPORT config
mgr/nfs: allow 'nfs export apply' to take a list of exports
python-common: remove pool + namespace from NFSServiceSpec
mgr/nfs: use fixed pool + ns
mgr/rook: use fixed pool + ns
mgr/dashboard: use fixed pool + ns
mgr/cephadm: always use fixed pool and namespace
mgr/nfs: adjust test to match pool name
mgr/nfs: always create ganesha pool with well-defined name
Sage Weil [Fri, 2 Jul 2021 19:53:15 +0000 (15:53 -0400)]
mgr/cephadm: migrate all legacy nfs exports to new .nfs pool
Migrate all past NFS pools, whether they were created by mgr/nfs or by
the dashboard, to the new mgr/nfs .nfs pool.
Since this migration relies on RADOS being available, we have to be a bit
careful here: we only attempt the migration from serve(), not during
module init.
After the exports are re-imported, we destroy existing ganesha daemons so
that new ones will get recreated. This ensures the (new) daemons have
cephx keys to access the new pool.
Note that no attempt is made to clean up the old NFS pools. This is out
of paranoia: if something goes wrong, the old NFS configuration data will
still be there.
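The "migrate from serve(), not from init" pattern can be sketched as follows.
This is a simplified stand-in, not the cephadm module: MigratingModule,
serve_once() and the rados_ready callback are hypothetical names used only for
illustration.

```python
import threading


class MigratingModule:
    """Hypothetical module that defers a one-time migration to its serve loop."""

    def __init__(self, migrate, rados_ready):
        # Nothing that needs RADOS happens here: the constructor only
        # remembers what to do later.
        self._migrate = migrate          # callable performing the migration
        self._rados_ready = rados_ready  # callable returning True once RADOS is usable
        self.migrated = False
        self._stop = threading.Event()

    def serve_once(self):
        # Attempt the migration at most once, and only when RADOS is available.
        if not self.migrated and self._rados_ready():
            self._migrate()
            self.migrated = True

    def serve(self, interval=1.0):
        # Main loop: retry on every iteration until the migration has run.
        while not self._stop.is_set():
            self.serve_once()
            self._stop.wait(interval)

    def shutdown(self):
        self._stop.set()
```

Because the migration flag is only set after the callback returns, an
unavailable cluster simply delays the migration to a later loop iteration.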
Sage Weil [Fri, 2 Jul 2021 16:39:29 +0000 (12:39 -0400)]
mgr/nfs: change nfs pool to .nfs
This is a new pool that we can migrate all past NFS configuration to,
simplifying the migration process (and also allowing us to pick a
.-prefixed name).
Sage Weil [Wed, 14 Jul 2021 18:38:59 +0000 (14:38 -0400)]
Merge PR #42041 into master
* refs/pull/42041/head:
mgr/restful: ignore min/max_size
test/crush: drop min/max_size refs
qa/workunits/mon/pool_ops: remove test for min/max_size check
qa: scrub a few remaining mentions of ruleset
qa/standalone/mon/osd-*: fix tests
PendingReleaseNotes: note min/max_size removal
mgr/dashboard: remove max/min_size and ruleset
mon/OSDMonitor: fix calls to CrushTester
crush: eliminate min_size and max_size
test/cli/crushtool: renumber rulesets in test maps
crushtool: require min/max or num-rep for --test
crush: remove last traces of 'ruleset'
test/cli/crushtool: use 'id' instead of 'ruleset' in crush inputs
crushtool: take --min-rep and --max-rep explicitly
crush/CrushTester: drop --ruleset
doc: scrub 'ruleset' from docs
src/erasure-code: rule, not ruleset
mon/OSDMonitor: remove check_crush_rule() callers
mon/OSDMonitor: rule, not ruleset
crushtool: remove check for overlapped rules
crush/CrushWrapper: get_osd_pool_default_crush_replicated_ruleset -> rule
crush: remove find_rule()
mon/OSDMonitor: use pool's crush rule directly
osd/OSDMap: drop checks for ruleset == ruleid
osd/OSDMap: use pool's crush rule_id directly
mon/PGMap: use pool's crush_rule directly
mon/OSDMonitor: remove crush ruleset->rule rewrite
Adam C. Emerson [Wed, 14 Jul 2021 14:57:02 +0000 (10:57 -0400)]
rgw: Rename REMOVE_OBJ to INVALIDATE_OBJ
Also rename ObjectCache::remove to ObjectCache::invalidate_remove
Since we're depending on these message types/functions having
invalidate semantics but NOT caching a negative result, rename and
leave a comment for clarity.
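The distinction between "invalidate" and "cache a negative result" can be
illustrated with a toy cache. This is a Python sketch, not RGW's C++
ObjectCache: the backend mapping, the _MISSING sentinel and is_cached() are
assumptions made for the example, as is the choice to cache misses in get().

```python
_MISSING = object()  # sentinel for a cached "object does not exist" result


class ObjectCache:
    """Toy cache in front of a dict-like backend (hypothetical)."""

    def __init__(self, backend):
        self._backend = backend
        self._entries = {}

    def get(self, name):
        # In this sketch, lookups cache both hits and misses.
        if name not in self._entries:
            self._entries[name] = self._backend.get(name, _MISSING)
        value = self._entries[name]
        return None if value is _MISSING else value

    def invalidate_remove(self, name):
        # Invalidate semantics: forget whatever we knew about the object.
        # We deliberately do NOT store _MISSING here, so the next get()
        # goes back to the backend instead of reporting "not found".
        self._entries.pop(name, None)

    def is_cached(self, name):
        return name in self._entries
```

With a negative result cached instead, a reader arriving after the
invalidation would be told the object is gone even if it had been rewritten.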
Fixes: https://tracker.ceph.com/issues/51674
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
When the mgr dashboard module isn't enabled, the iSCSI service deletion
gets stuck and the cluster state goes to ERR.
The `ceph dashboard` commands aren't available when the mgr dashboard module
isn't enabled.
Javier Cacheiro [Mon, 12 Jul 2021 14:03:27 +0000 (16:03 +0200)]
Fetch the actually running selinux status.
The HostFacts should return the **actual** selinux mode in which the
kernel is running.
The actual mode can be different from the one in the configuration
if the server has not been rebooted or if the mode was changed
after boot using setenforce.
Instead of reading _selinux_path_list we should look at the output of
sestatus or getenforce.
The _selinux_path_list attribute is no longer needed.
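A minimal sketch of reading the running mode from getenforce is shown below.
It is not the cephadm HostFacts code: the function names and the "Unknown"
fallback value are assumptions made for illustration. getenforce itself prints
one of Enforcing, Permissive or Disabled.

```python
import shutil
import subprocess

VALID_MODES = ("Enforcing", "Permissive", "Disabled")


def parse_getenforce(output):
    """Map raw getenforce output to a mode, or "Unknown" if unrecognized."""
    mode = output.strip()
    return mode if mode in VALID_MODES else "Unknown"


def running_selinux_mode():
    """Return the mode the kernel is running in right now.

    Falls back to "Unknown" (an assumption of this sketch) when getenforce
    is missing or fails.
    """
    exe = shutil.which("getenforce")
    if exe is None:
        return "Unknown"
    try:
        result = subprocess.run(
            [exe], capture_output=True, text=True, timeout=5, check=False
        )
    except (OSError, subprocess.SubprocessError):
        return "Unknown"
    if result.returncode != 0:
        return "Unknown"
    return parse_getenforce(result.stdout)
```

Unlike the configuration file, this reflects a mode changed with setenforce
after boot.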
Fixes: https://tracker.ceph.com/issues/51632
Signed-off-by: Javier Cacheiro <javier.cacheiro.lopez@cesga.es>
Adam C. Emerson [Tue, 13 Jul 2021 20:05:47 +0000 (16:05 -0400)]
rgw: Don't segfault on datalog trim
Synchronous or yielded trim (basically anything other than an
AioCompletion trim) would try to dereference the past-the-end iterator
if we were trimming to a point in the most recent generation.
https://tracker.ceph.com/issues/51661
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
librbd/cache/pwl/ssd: fix use-after-free on C_BlockIORequest
In the setup_schedule_append() function, the first expression
causes req to be deleted, so any subsequent use of the
variable req is an illegal operation. Because of the
delete, req->m_image_ctx will be empty, which leads to a
segfault in AbstractWriteLog::get_context().
So pass `req` into the `schedule_append()` function.
This PR improves the readability and format
of the troubleshooting.rst file. It also
changes the heading markup of one of the
sub-subsections so that it is made of tildes
(~) instead of carets (^), because that's
the RST standard.
common/LogEntry: drop support of LogSummary v2 encoding scheme
LogSummary's v3 encoding scheme was introduced in 648aaf271cb02c647f046288656c11f15a7799b2, which was in turn included
in Ceph v13.1.0 and all newer releases. Since LogSummary is persisted
by the monitor and trimmed regularly by it, there is no need
to read a LogSummary encoded by a monitor that is two releases older.
In this change, support for the LogSummary v2 encoding scheme is dropped.
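The effect of dropping an old encoding version can be sketched as a decoder
that rejects anything older than v3. This is an illustration only: the
one-byte version header and the names decode_header() and UnsupportedEncoding
are hypothetical and do not reflect Ceph's actual encoding framework.

```python
import struct

MIN_SUPPORTED_VERSION = 3  # v2 and older are no longer decoded


class UnsupportedEncoding(ValueError):
    """Raised when a buffer uses an encoding version we no longer read."""


def decode_header(buf):
    """Split a buffer into (version, payload).

    Assumes a hypothetical layout: one unsigned byte holding the encoding
    version, followed by the payload.
    """
    if len(buf) < 1:
        raise ValueError("buffer too short to contain a version byte")
    (version,) = struct.unpack_from("<B", buf, 0)
    if version < MIN_SUPPORTED_VERSION:
        raise UnsupportedEncoding(
            "encoding v%d is older than v%d" % (version, MIN_SUPPORTED_VERSION)
        )
    return version, buf[1:]
```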