Adam King [Wed, 28 Aug 2024 12:45:43 +0000 (08:45 -0400)]
Merge pull request #59419 from phlogistonjohn/jjm-smb-ctdb-vips
smb: cluster public ip addresses support
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Anoop C S <anoopcs@cryptolab.net>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Michael Adam <obnox@samba.org>
John Mulligan [Wed, 21 Aug 2024 15:31:52 +0000 (11:31 -0400)]
python-common/deployment: add a cluster public ip spec for smb
This spec can be used to define one or more public addresses that will
be automatically assigned to hosts by CTDB. The address can be specified
in the "interface" form - an address plus prefix length. Optionally,
networks to bind to can be specified. The network value will be
converted to a network device name later by cephadm.
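As a rough illustration only (the exact spec schema lives in python-common; the field names below are assumptions, not the real schema), the "interface" form parses cleanly with Python's stdlib:

```python
# Sketch: parsing an "interface" form public address (address plus
# prefix length) and an optional network to bind to. The dict layout
# here is hypothetical; see the actual SMB spec for the real fields.
import ipaddress

public_ip = {"address": "192.168.4.61/24", "destination": "192.168.4.0/24"}

iface = ipaddress.ip_interface(public_ip["address"])
print(iface.ip)       # 192.168.4.61 - the address CTDB will assign
print(iface.network)  # 192.168.4.0/24 - derived from the prefix length

# cephadm later resolves the destination network to a device name
dest = ipaddress.ip_network(public_ip["destination"])
assert iface.ip in dest
```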
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Patrick Donnelly [Tue, 27 Aug 2024 17:10:54 +0000 (13:10 -0400)]
Merge PR #58419 into main
* refs/pull/58419/head:
mds: generate correct path for unlinked snapped files
qa: add test for cephx path check on unlinked snapped dir tree
mds: add debugging for stray_prior_path
N Balachandran [Thu, 22 Aug 2024 08:15:36 +0000 (13:45 +0530)]
rbd-mirror: use correct ioctx for namespace
The PoolReplayer uses the ioctx for the default namespace
to check if other namespaces are enabled for mirroring, causing
it to incorrectly conclude that they are all enabled.
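The PoolReplayer itself is C++, but a minimal Python sketch of the corrected pattern - querying each namespace through an ioctx actually set to that namespace, rather than reusing the default-namespace ioctx - could look like this (assumes a reachable cluster and a pool named 'rbd'):

```python
# Sketch: check mirroring per namespace on a correctly-set ioctx.
import rados
import rbd

with rados.Rados(conffile="/etc/ceph/ceph.conf") as cluster:
    ioctx = cluster.open_ioctx("rbd")
    api = rbd.RBD()
    for ns in api.namespace_list(ioctx):
        ioctx.set_namespace(ns)  # switch to the namespace being checked
        mode = api.mirror_mode_get(ioctx)
        print(ns, mode != rbd.RBD_MIRROR_MODE_DISABLED)
    ioctx.close()
```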
Fixes: https://tracker.ceph.com/issues/67676
Signed-off-by: N Balachandran <nibalach@redhat.com>
John Mulligan [Wed, 21 Aug 2024 21:03:40 +0000 (17:03 -0400)]
cephadm: add support for cluster public ip addresses to smb daemon
When a list of public addresses (and optional network destination(s))
is supplied at deploy time, convert the networks to device names
and pass the result to the sambacc ctdb configuration.
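A rough sketch of that conversion, assuming a per-host network inventory shaped like the hypothetical dict below (not cephadm's actual data structure):

```python
# Sketch: resolve a destination network to a local device name.
# 'networks' maps network CIDR -> interface -> list of addresses,
# a hypothetical stand-in for cephadm's per-host network inventory.
from typing import Dict, List, Optional

def device_for_network(
    networks: Dict[str, Dict[str, List[str]]], destination: str
) -> Optional[str]:
    ifaces = networks.get(destination, {})
    return next(iter(ifaces), None)  # first device on that network

networks = {"192.168.4.0/24": {"eth1": ["192.168.4.10"]}}
print(device_for_network(networks, "192.168.4.0/24"))  # eth1
```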
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Wed, 21 Aug 2024 21:03:19 +0000 (17:03 -0400)]
mgr/smb: simplify orch backend enablement
We have a developer/debug module option that allows one to disable
triggering orchestration. When I tried to use it, I thought it was
buggy and had trouble diagnosing the problem. The mistake was on my
side, but this code change makes it much clearer what is being
enabled, so I want to keep it.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Oguzhan Ozmen [Thu, 22 Aug 2024 02:44:01 +0000 (22:44 -0400)]
doc/rgw/account: Handling notification topics when migrating an existing user into an account
Add a subsection under "Migrate an existing User into an Account" to
describe how a client can seamlessly migrate the notification topics
after account migration.
Casey Bodley [Fri, 23 Aug 2024 19:03:31 +0000 (15:03 -0400)]
rgw: ignore zoneless default realm when not configured
"default" zone/zonegroup deployments without a realm can be broken by
the creation of an unrelated realm, because that realm is (was)
automatically set as the default
when startup detects an incomplete default realm (one that doesn't have
a default zone), fall back to the realmless "default" zone/zonegroup
instead
Vallari Agrawal [Mon, 26 Aug 2024 04:23:07 +0000 (09:53 +0530)]
qa/tasks/nvmeof.py: add nvmeof gw-group to deployment
The group was made a required parameter of
`ceph orch apply nvmeof <pool> <group>` in
https://github.com/ceph/ceph/pull/58860.
That broke the `nvmeof` suite, so this commit fixes it.
Right now, all gateways are deployed in a single group.
Later, this can be changed to multiple groups for better test coverage.
Samuel Just [Wed, 14 Aug 2024 19:40:50 +0000 (12:40 -0700)]
include/ceph_features: add NVMEOFHA feature bit
Normally, we'd just use the SERVER_SQUID or SERVER_T flags instead of
using an extra feature bit. However, the nvmeof ha monitor paxos
service has had a more complex development journey. There are users
interested in using the nvmeof ha feature in squid, but it didn't make
the cutoff for backporting it. There's an upstream nvmeof-squid branch
in the ceph.git repository with the patches backported for anyone
interested in building it.
However, that means that users of our normal stable releases will see
the feature added to the monitor one release after anyone who chooses to
use the nvmeof-squid branch. We could disallow upgrades from
nvmeof-squid to T, but by adding a feature bit here we make such a
restriction unnecessary.
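As a toy illustration only (bit positions and names below are invented, not the real ceph_features.h encoding), a dedicated feature bit lets peers gate on the capability itself rather than on the release:

```python
# Toy model of feature-bit gating; positions are made up for this sketch.
SERVER_SQUID = 1 << 1   # hypothetical release bit
NVMEOFHA = 1 << 2       # hypothetical capability bit

def compatible(peer_features: int, required_features: int) -> bool:
    # A peer qualifies only if it advertises every required bit.
    return peer_features & required_features == required_features

# A monitor built from the nvmeof-squid branch advertises both bits,
# so no special-case upgrade restriction is needed.
print(compatible(SERVER_SQUID | NVMEOFHA, NVMEOFHA))  # True
```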
Nitzan Mordechai [Mon, 10 Jun 2024 10:51:03 +0000 (10:51 +0000)]
crimson: Add support for bench osd command
This commit adds support for the 'bench' admin command in the OSD,
allowing administrators to perform benchmark tests on the OSD. The
'bench' command accepts 4 optional parameters with the following
default values:
1. count - Total number of bytes to write (default: 1GB).
2. size - Block size for each write operation (default: 4MB).
3. object_size - Size of each object to write (default: 0).
4. object_num - Number of objects to write (default: 0).
The results of the benchmark are returned in a JSON formatted output,
which includes the following fields:
1. bytes_written - Total number of bytes written during the benchmark.
2. blocksize - Block size used for each write operation.
3. elapsed_sec - Total time taken to complete the benchmark in seconds.
4. bytes_per_sec - Write throughput in bytes per second.
5. iops - Number of input/output operations per second.
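A usage sketch, assuming the crimson OSD accepts the same invocation shape as the classic `ceph tell osd.<id> bench` (an assumption; the exact crimson invocation may differ):

```python
# Sketch: invoke the OSD bench command and read the JSON result.
import json
import subprocess

out = subprocess.check_output([
    "ceph", "tell", "osd.0", "bench",
    str(1 << 30),   # count: total bytes to write (1 GB)
    str(4 << 20),   # size: block size per write (4 MB)
    "--format", "json",
])
result = json.loads(out)
print(result["bytes_per_sec"], result["iops"])
```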
Ronen Friedman [Sat, 17 Aug 2024 16:08:19 +0000 (11:08 -0500)]
osd/scrub: delay both targets on some failures
If the failure of a scrub-job is due to a condition that affects
both targets, both should be delayed. Otherwise, we may end up
with the following bogus scenario:
A high-priority deep target is scheduled, but scrub session initiation
fails due to, for example, a concurrent snap trim. The deep target
will be delayed. A second initiation attempt may happen after the
snap trimming is done, but before the updated deep target's not-before
time. As a result, the lower-priority target will be scheduled before
the higher-priority one - which is a bug.
Ronen Friedman [Thu, 15 Aug 2024 13:17:48 +0000 (08:17 -0500)]
osd/scrub: reverse OSDRestrictions flags polarity
As most of the flags in OSDRestrictions are of 'true is bad' polarity,
reverse the two non-conforming flags - the CPU load and time-of-day
restrictions - to match.
This flag was used to indicate that a deep scrub should
be performed if a shallow scrub finds an error. It was
always set to true for regular shallow scrubs if the
can_autorepair flag was set. Thus, the ephemeral flag in
the requested_scrub_t object is not really needed.
Ronen Friedman [Tue, 6 Aug 2024 13:07:17 +0000 (08:07 -0500)]
qa/standalone/scrub: disable scrub_extended_sleep test
Disabling osd-scrub-test.sh::TEST_scrub_extended_sleep,
as the test is no longer valid (the updated code no longer
produces the same logs or the same behavior).
osd/scrub: OSD's scrub queue now holds SchedEntry-s
The OSD's scrub queue now holds SchedEntry-s, instead of ScrubJob-s.
The queue itself is implemented using the 'not_before_queue_t' class.
Note: this is not a stable state of the scrubber code. In the next
commits:
- modifying how sched targets are updated, to match the new queue
implementation.
- removing the 'planned scrub' flags.
Important note: the interaction of initiate_scrub() and pop_ready_pg()
is not changed by this commit. Namely:
Currently, pop_ready_pg() loops over all eligible jobs until it finds
one that matches the environment restrictions (which, most of the time,
as the concurrency limit is usually reached, means 'high-priority-only');
a toy sketch of this scanning approach appears after the list below.
The other option is to maintain Sam's clean 'not_before_q' interface: we
always pop the top, and if that top fails the precondition tests, we delay
and re-push it. This has the following troubling implications:
- it would take a long time to find a viable scrub job if the problem
is related to, for example, 'no scrub';
- a local resources failure (an inc_scrubs() failure) must be handled
separately, as we do not want to reshuffle the queue for this
very common case;
- but the real problem: unneeded shuffling of the queue, even when the
problem is not with the scrub job itself but with the environment
(esp. no-scrub etc.).
This is a common case, and it would be wrong to reshuffle the queue
for that;
- and remember that any change to a sched-entry must be done under the PG
lock.
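Below, the toy Python sketch (not the Ceph implementation) of the scanning approach referenced above: ready entries blocked only by the environment are skipped and restored unchanged, so the queue is not reshuffled:

```python
# Toy not-before queue; a real implementation would order the ready
# entries by priority rather than by not_before alone.
import heapq

class NotBeforeQueue:
    def __init__(self):
        self._heap = []  # entries: (not_before, priority, payload)

    def push(self, not_before, priority, payload):
        heapq.heappush(self._heap, (not_before, priority, payload))

    def pop_first_viable(self, now, is_viable):
        skipped = []
        found = None
        while self._heap and self._heap[0][0] <= now:
            entry = heapq.heappop(self._heap)
            if is_viable(entry[2]):
                found = entry[2]
                break
            skipped.append(entry)  # blocked by the environment only
        for entry in skipped:      # restore skipped entries unchanged
            heapq.heappush(self._heap, entry)
        return found
```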
osd/scrub: modify ScrubJob to hold two SchedTarget-s
ScrubJob will now hold two SchedTarget-s - two sets of scheduling
information (times, levels, etc.) for the next shallow and deep scrubs.
This is in preparation for the upcoming changes to the scheduling queue.
The change cannot stand on its own, as the partial implementation
creates some inconsistencies in the scheduling logic.
Specifically, here is what changes, and how it differs from the
desired implementation:
- The OSD still maintains a queue of scrub jobs - one object only per
PG. But now each queue element holds two SchedTarget-s.
- When a scrub is initiated, the Scrubber is handed a ScrubJob object.
Only in the next commit will it also receive the ID of the selected
level. That causes some issues when re-determining the level of the
initiated scrub: a failure to match the queue's "intent" results in
failures.
- The 'planned scrub' flags are still here, instead of directly
encoding the characteristics of the next scrub in the relevant
sched-entry.
- The 'urgency' levels do not cover the full required range of
behaviors and priorities.
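A hypothetical Python analogue of the new per-PG layout (field names are illustrative; the actual code is C++):

```python
# Illustrative only: one queued ScrubJob per PG, now carrying separate
# scheduling info for the next shallow and the next deep scrub.
from dataclasses import dataclass

@dataclass
class SchedTarget:
    not_before: float  # earliest time the scrub may start
    target: float      # time the scrub is nominally due
    urgency: int       # priority class of this target

@dataclass
class ScrubJob:
    pgid: str
    shallow: SchedTarget
    deep: SchedTarget
```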
Ilya Dryomov [Sun, 25 Aug 2024 11:22:08 +0000 (13:22 +0200)]
qa: drop XMLSTARLET variable, use xmlstarlet directly
The variable was added in commit 9b6b7c35d03f ("Handle
differently-named xmlstarlet binary for *suse") but this
compatibility business is long outdated:
Mon Oct 13 08:52:37 UTC 2014 - toms@opensuse.org
- SPEC file changes
- Added link from /usr/bin/xml to /usr/bin/xmlstarlet as other
distributions do the same
- Did the same for the manpage
Ilya Dryomov [Fri, 23 Aug 2024 21:00:24 +0000 (23:00 +0200)]
rbd: "rbd bench" always writes the same byte
It's expected that the buffer is filled with the same byte, but the
byte should differ from run to run:
memset(bp.c_str(), rand() & 0xff, io_size);
This was broken in commit c7f71d14a5d3 ("rbd: migrated existing command
logic to new namespaces") which inadvertently moved the call to srand(),
leaving rand() unseeded for the above memset().
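The fix itself is C++, but the pattern is easy to model: a generator consumed before it is seeded returns the same first value on every run (C's rand() behaves as if seeded with 1 until srand() is called). A toy Python model:

```python
# Toy model of the bug: consuming the RNG before seeding it.
import random
import time

def fill_byte_buggy() -> int:
    rng = random.Random(1)     # models C's implicit default seed of 1
    byte = rng.randrange(256)  # rand() consumed before seeding...
    rng.seed(time.time())      # ...so this srand() arrives too late
    return byte

# Identical on every run - exactly the symptom described above.
print(fill_byte_buggy())
```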
John Mulligan [Fri, 9 Aug 2024 18:37:43 +0000 (14:37 -0400)]
qa/tasks: add a new cephadm_from_container feature to cephadm task
The cephadm_from_container option allows one to do a single container
build and then point teuthology at that image as the "single source of
truth". I find this extremely convenient when running teuthology locally,
and I keep carrying this patch around - I figure having it upstream will
simplify my workflow. Maybe someday it'll benefit others too.
To use it I set up a yaml overrides file with the following content:
```yaml
overrides:
  cephadm:
    image: "quay.io/phlogistonjohn/ceph:dev"
    cephadm_from_container: true
    verify_ceph_hash: false
```
This lets me test my custom builds fairly easily!
Signed-off-by: John Mulligan <phlogistonjohn@asynchrono.us>