Laura Flores [Thu, 12 Jan 2023 23:36:50 +0000 (23:36 +0000)]
test/osd: add read balancer unit tests
This commit adds unit test coverage to the read balancing
feature, including small vs. large osdmap scenarios,
random osdmap scenarios, and scenarios that involve
tweaking primary affinity on an OSD.
Laura Flores [Wed, 1 Feb 2023 23:52:17 +0000 (23:52 +0000)]
tools, test/cli: add read balancer to osdmaptool
This commit adds the capability to balance reads on a
given osdmap with the osdmaptool. The user has the option
of performing a "dry run" of read balancing OR taking it a
step further and applying the results to a live cluster.
Performing a "dry run" would involve simply running an
osdmaptool command and inspecting the results.
The template for the command is:
`osdmaptool <osd map file> --read <file for command output> --read-pool <pool name>`
An example command a user might run is:
`osdmaptool om --read out.txt --read-pool default.rgw.control`
This commit also adds a `--vstart` flag that allows a user to print ceph
commands in the outfile formatted for a vstart cluster. An example command
a user might run is:
`./bin/osdmaptool om --vstart --read out.txt --read-pool default.rgw.control`
The out.txt file would contain ceph commands prefixed with `./bin/`.
The `--vstart` flag may also be applied to an `--upmap` osdmaptool command.
If the user wants to apply read balancing results from their dry run to a
live cluster, they may either manually apply the ceph commands from the out
file, or run `source <outfile>`.
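The dry-run invocations quoted above can be assembled programmatically. This is a minimal sketch: the helper name `read_balancer_argv` is hypothetical, and choosing the `./bin/` path whenever `--vstart` is requested is an assumption drawn from the example command, not something the tool enforces.

```python
def read_balancer_argv(osdmap_file, out_file, pool, vstart=False):
    """Build the osdmaptool argument list for a read-balancer dry run.

    Hypothetical helper: only the flags quoted in the commit message
    (--read, --read-pool, --vstart) are used.
    """
    # Assumption: a vstart user runs the tool from the build directory,
    # as in the commit's own example command.
    binary = "./bin/osdmaptool" if vstart else "osdmaptool"
    argv = [binary, osdmap_file]
    if vstart:
        argv.append("--vstart")
    argv += ["--read", out_file, "--read-pool", pool]
    return argv


if __name__ == "__main__":
    print(" ".join(read_balancer_argv("om", "out.txt", "default.rgw.control")))
```

The resulting list can be passed to `subprocess.run` for the dry run; the generated out file would then be inspected before anything is applied to a live cluster.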
Laura Flores [Wed, 1 Feb 2023 23:50:53 +0000 (23:50 +0000)]
osd: implement read balancer
This commit implements two functions:
1. calc_desired_primary_distribution
Based on the number of pgs in a pool and the pool's
replica count, we calculate the ideal number of primary
pgs that should be assigned to each OSD on that pool in
order for reads to be balanced.
2. balance_primaries
This is the overall algorithm used to balance reads (primary
pgs) in a pool. Based on the first function, we re-distribute
primaries on the OSDs in a pool so each OSD has the ideal
number of primaries. This is done without data movement.
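The two steps can be illustrated with a small Python sketch. It is not the OSDMap implementation. The formula (PGs hosted on an OSD divided by the replica count), the greedy loop, and every name below are assumptions made for illustration. The only property taken directly from the commit message is that a new primary is always chosen from OSDs that already hold a replica of the PG, so no data moves.

```python
from collections import Counter


def desired_primaries(pg_acting, replica_count):
    """Assumed ideal: PGs hosted on an OSD divided by the replica count."""
    hosted = Counter(osd for acting in pg_acting.values() for osd in acting)
    return {osd: n / replica_count for osd, n in hosted.items()}


def balance_primaries_sketch(pg_acting, pg_primary, replica_count):
    """Greedy illustration: hand primaries to under-loaded replica holders."""
    desired = desired_primaries(pg_acting, replica_count)
    current = Counter(pg_primary.values())
    result = dict(pg_primary)
    for pg in sorted(pg_acting):
        old = result[pg]
        # Candidates are OSDs that already store this PG, so switching
        # the primary requires no data movement.
        best = min(pg_acting[pg], key=lambda osd: current[osd] - desired[osd])
        surplus_old = current[old] - desired[old]
        surplus_best = current[best] - desired[best]
        # Switch only when it strictly reduces the imbalance.
        if surplus_old - surplus_best > 1:
            result[pg] = best
            current[old] -= 1
            current[best] += 1
    return result


if __name__ == "__main__":
    acting = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}
    primaries = {0: 0, 1: 0, 2: 0}
    print(balance_primaries_sketch(acting, primaries, replica_count=3))
```

In the example, OSD 0 starts as primary for all three PGs; after the pass each OSD is primary for exactly one PG.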
Venky Shankar [Wed, 22 Feb 2023 09:11:09 +0000 (14:41 +0530)]
Merge PR #45669 into main
* refs/pull/45669/head:
client: switch to use 32 bits ext_num_fwd
client: switch to use 32 bits ext_num_retry
ceph_fs.h: add 32 bits extended num_retry and num_fwd support
ceph_fs.h: switch to use its own encode/decode helpers
Venky Shankar [Wed, 22 Feb 2023 06:11:44 +0000 (11:41 +0530)]
Merge PR #49934 into main
* refs/pull/49934/head:
qa: add test_fscrypt_dummy_encryption test case support
qa: add 'options' parameter support for write_local_config
qa: add ceph.exclude file to exclude individual tests
qa: add require_kernel_mount helper support
qa: rename test_fscrypt to test_fscrypt_encrypt
Zac Dover [Wed, 22 Feb 2023 03:36:40 +0000 (13:36 +1000)]
doc/rgw: clarify multisite.rst top matter
Improve the pragmatics of the top matter of multisite.rst. Organize the
text into sections, where doing so makes the nature of multi-site
configurations clearer.
Commit dc69033763cc116c6ccdf1f97149a74248691042 moves cephfs-shell from
"<CEPH-REPO-ROOT>/src/tools/cephfs/" to
"<CEPH-REPO-ROOT>/src/tools/cephfs/shell" but cephfs-shell's location in
src/vstart.sh and qa/tasks/cephfs/test_cephfs_shell.py is left
un-updated. This produces a broken vstart_environment.sh and broken
export command in test_cephfs_shell.py.
Introduced-by: dc69033763cc116c6ccdf1f97149a74248691042
Fixes: https://tracker.ceph.com/issues/58795
Signed-off-by: Rishabh Dave <ridave@redhat.com>
myoungwon oh [Tue, 14 Feb 2023 09:39:54 +0000 (18:39 +0900)]
crimson/os/seastore: fix traversal behavior in scan_mapped_space to avoid replay-allocation conflicts
1. Traverse the allocations from backref-tree leaf-node entries;
2. Traverse the pending alloc-deltas by sequence;
3. Traverse the allocations from backref-tree internal-node entries.
J. Eric Ivancich [Thu, 16 Feb 2023 15:29:31 +0000 (10:29 -0500)]
rgw/flight: don't access non-existent flight store during GetObj
The front end must be configured via ceph.conf to start up both the
flight_server and the flight_store. RGWGetObj needs to check for the
existence of a flight_store prior to trying to use it.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Ilya Dryomov [Thu, 16 Feb 2023 11:53:02 +0000 (12:53 +0100)]
qa/workunits/rbd-nbd: work around "rbd feature disable" hang
"rbd feature disable" appears to reliably hang if the corresponding
remote request is proxied to rbd-nbd (because rbd-nbd happens to own
the exclusive lock after a series of blkdiscard calls) [1]. Work
around it here by enabling journaling before the image is mapped
and disabling it after the image is unmapped.
Also, don't assert on the output of "rbd journal inspect --verbose"
having a certain number of entries. This is racy: if the script gets
delayed after the last blkdiscard call for some reason, there may be
fewer entries present in the journal or none at all.
Ilya Dryomov [Thu, 16 Feb 2023 11:51:04 +0000 (12:51 +0100)]
test/librbd: add LengthModifiedDiscardJournalAppendEnabled test
Currently nothing triggers the length_modified case in
ImageDiscardRequest::prune_object_extents() in isolation. It's only
triggered in the DiscardGranularityJournalAppendEnabled test together
with the prune_required case, so a bad refactoring could easily break
the length_modified logic again.
N Balachandran [Thu, 16 Feb 2023 04:57:02 +0000 (10:27 +0530)]
rbd-mirror: fix syncing_percent calculation logic in get_replay_status()
When a snapshot sync is resumed and the get_replay_status function
is called before handle_copy_image_progress, the syncing_percent
value may be greater than 100 as the m_local_object_count is still
set to zero. This commit sets the syncing_percent to 0 in such cases.
Fixes: https://tracker.ceph.com/issues/58706
Signed-off-by: N Balachandran <nibalach@redhat.com>
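The guard described above can be sketched in Python. The real formula in get_replay_status() is not shown in this log, so the ratio below (objects copied over the local object count) and both parameter names are assumptions. Only the "report 0 while the count is still zero" behavior comes from the commit message.

```python
def syncing_percent(objects_copied, local_object_count):
    """Hypothetical progress calculation with the zero-count guard."""
    # Before handle_copy_image_progress has run, the object count is
    # still zero and no meaningful percentage exists, so report 0.
    if local_object_count == 0:
        return 0
    # Assumed ratio; the actual rbd-mirror expression may differ.
    return 100 * objects_copied // local_object_count


if __name__ == "__main__":
    print(syncing_percent(5, 0))
    print(syncing_percent(50, 200))
```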
Xiubo Li [Tue, 5 Jul 2022 04:59:11 +0000 (12:59 +0800)]
client: switch to use 32 bits ext_num_fwd
The MDS increments the forward count each time it forwards a request.
If the count it reports is lower than the one the client saved in the
request, the MDS is an old version whose counter has overflowed, so
the client stops forwarding.
Fixes: https://tracker.ceph.com/issues/57854
Signed-off-by: Xiubo Li <xiubli@redhat.com>
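The overflow check can be illustrated with a short Python sketch. It assumes the old MDS keeps the forward count in an 8-bit field, which is inferred from the move to 32 bits rather than stated in this message. The function names are hypothetical.

```python
def bump_8bit(count):
    """Increment a counter that wraps at 8 bits (assumed old-MDS behavior)."""
    return (count + 1) & 0xFF


def forward_count_overflowed(saved_num_fwd, mds_num_fwd):
    """A count lower than the saved one means the counter wrapped around."""
    return mds_num_fwd < saved_num_fwd


if __name__ == "__main__":
    saved = 255
    reported = bump_8bit(saved)  # wraps to 0 on an 8-bit counter
    print(forward_count_overflowed(saved, reported))
```

When the check returns true, the client in this sketch would stop forwarding instead of trusting the wrapped value.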
Xiubo Li [Tue, 5 Jul 2022 04:59:11 +0000 (12:59 +0800)]
client: switch to use 32 bits ext_num_retry
Check the CEPHFS_FEATURE_32BITS_RETRY_FWD feature bit. If it is not
set, the client is connected to an old MDS and limits the maximum
number of retries to 256.
Fixes: https://tracker.ceph.com/issues/57854
Signed-off-by: Xiubo Li <xiubli@redhat.com>
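A minimal Python sketch of the feature-gated limit follows. The function name, the boolean parameter, and the exact boundary (whether attempt 256 itself is allowed) are assumptions. Only the feature-bit check and the limit of 256 come from the commit message.

```python
OLD_MDS_MAX_RETRY = 256  # limit stated in the commit message


def may_retry(retry_attempt, mds_has_32bit_retry_fwd):
    """Decide whether another retry is allowed (hypothetical helper)."""
    # With CEPHFS_FEATURE_32BITS_RETRY_FWD set, the extended 32-bit
    # counter is available and this sketch applies no cap.
    if mds_has_32bit_retry_fwd:
        return True
    # Old MDS: cap retries. The strict "<" boundary is an assumption.
    return retry_attempt < OLD_MDS_MAX_RETRY


if __name__ == "__main__":
    print(may_retry(255, False), may_retry(256, False), may_retry(1000, True))
```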
Xiubo Li [Tue, 31 Jan 2023 02:01:46 +0000 (10:01 +0800)]
qa: rename test_fscrypt to test_fscrypt_encrypt
test_fscrypt_encrypt runs only the 'encrypt'-related test cases,
without the 'test_dummy_encryption' option enabled. This tests
filename and content verification.
A follow-up commit will add the full set of test cases with the
'test_dummy_encryption' option enabled.
Yixin Jin [Wed, 15 Feb 2023 17:08:19 +0000 (17:08 +0000)]
rgw: Fix segfault due to concurrent socket use at timeout
This commit fixes a potential segfault that occurs when the
rgw timeout handler operates on a socket in one thread
while another thread is using it concurrently.
The details of the fix are:
1. Instead of calling socket close(), which resets
descriptor_data in boost::asio socket and risks
segfault due to concurrent use of the socket,
the timeout handler now calls cancel() to abort
all pending ops followed by shutdown() to disable
the underlying transport. The eventual closure of
the socket will be done in the socket destructor.
2. Connection now exposes the actual boost::asio socket via
get_socket() so that the timeout handler can call cancel()
and shutdown() on it. The socket data member is already
accessible, but the accessor allows a future change to hide
the member, even though it makes the existing close() less
useful.
Fixes: https://tracker.ceph.com/issues/58670
Signed-off-by: Yixin Jin <yjin77@yahoo.ca>
Venky Shankar [Wed, 15 Feb 2023 13:28:29 +0000 (18:58 +0530)]
Merge PR #48053 into main
* refs/pull/48053/head:
test/libcephfs: fix rebasing issues
libcephfs: replace errno.h errors with CEPHFS_E*
messages: avoid converting ceph errors on Windows
client: use CEPHFS_E*
test/libcephfs: use CEPHFS_E* errors
libcephfs: switch to CEPHFS_E* errors
test/libcephfs: disable flaky timestamp assertion on Windows
client: use _setattrx when changing timestamps
client: set nsec to 0 when converting stat struct on Windows
test/libcephfs: skip dirent inode check on Windows
client: avoid trimming inodes on Windows
test/libcephfs: address windows issues
test/libcephfs: include compat.h
test/libcephfs: enable the tests on Windows
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D Atri <anthony.datri@gmail.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
bryanmontalvan [Wed, 3 Aug 2022 01:39:05 +0000 (21:39 -0400)]
mgr/dashboard: dashboard-v3: status card
This commit is the bare-bones work of the status card. The only logic
written in this commit is the Cluster health status icon.
tracker: https://tracker.ceph.com/issues/58728
Signed-off-by: bryanmontalvan <bmontalv@redhat.com>
mgr/dashboard: introduce active alerts to status cards