Dan Mick [Wed, 26 Jun 2024 02:07:41 +0000 (19:07 -0700)]
Add a Containerfile and a build.sh script to build the Ceph container image.
The intent is to replace ceph-container.git, at first for ci containers
only, and eventually production containers as well.
There is code present for production containers, including
a separate "make-manifest-list.py" to scan for and glue the two
arch-specific containers into a 'manifest-list' 'fat' container,
but that code is not yet fully tested.
This code will not be used until a corresponding change to the
Jenkins jobs in ceph-build.git is pushed.
Note that this tooling does not authenticate to the container repo;
it is assumed that will be done elsewhere. Authentication is
verified by pushing a minimal image to the requested repo.
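As a rough illustration of the manifest-list step described above, here is
a minimal sketch assuming podman as the container tool; the function name
and the repo/tag layout are illustrative, not the actual interface of
make-manifest-list.py:

    import subprocess

    def make_manifest_list(repo: str, tag: str) -> None:
        # Create an empty manifest list, attach the per-arch images to it,
        # then push the combined 'fat' image to the registry.
        manifest = f"{repo}:{tag}"
        subprocess.run(["podman", "manifest", "create", manifest], check=True)
        for arch in ("amd64", "arm64"):
            subprocess.run(
                ["podman", "manifest", "add", manifest, f"{repo}:{tag}-{arch}"],
                check=True,
            )
        subprocess.run(["podman", "manifest", "push", manifest, manifest], check=True)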
Zac Dover [Fri, 4 Oct 2024 13:21:32 +0000 (23:21 +1000)]
doc/governance: add exec council responsibilities
Add the Ceph Executive Council's responsibilities to the
doc/governance.rst document. It was decided during the weekly CLT
meeting on 30 Sep 2024 to add this to the ceph/ceph git repository.
qa: avoid a non-standard shell construct in rbd/iscsi_client.t
dash, which is used as /bin/sh on Ubuntu, interprets "2&> /dev/null" as
an instruction to launch iscsiadm in the background. While that is
mostly compensated for by the following sleep, stderr isn't redirected to
/dev/null either -- the output gets polluted and the test fails.
... since it's not available on Ubuntu. In this case mpathconf just
sets a couple of default values and defines an empty blacklist section,
so it's easy enough to replicate.
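A minimal sketch of that replication; the exact defaults written here are
assumptions about what mpathconf --enable would generate, not values taken
from the commit:

    import textwrap

    # Write the minimal /etc/multipath.conf that mpathconf would create:
    # a defaults section with a couple of settings and an empty blacklist.
    MULTIPATH_CONF = textwrap.dedent("""\
        defaults {
            user_friendly_names yes
            find_multipaths yes
        }
        blacklist {
        }
    """)

    with open("/etc/multipath.conf", "w") as f:
        f.write(MULTIPATH_CONF)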
mgr/dashboard: Allow adding all listeners under a subsystem
Issue:
- Currently a user cannot add all listeners under a subsystem
- This results into an error: `Failure adding nqn.2001-07.com.ceph:1725013182540 listener at 10.70.44.140:4420: Gateway's host name must match current host (dhcp47-54)`
Reason:
- The gateway address used when creating a listener is currently chosen at random in the nvmeof client
- After checking the gateway logs of each node, it was found that no gRPC request for adding the listener was received on the respective node; the request instead went to the node chosen by default in the nvmeof client.
- But the nvmeof backend checks that the current gateway matches the one sent in the request for adding a listener (ref: https://github.com/ceph/ceph-nvmeof/blob/devel/control/grpc.py#L2104)
Fix:
- Use `traddr` from the listener API to set the current gateway address
- Since `traddr` gives only the IP address, without the port, the full address is extracted from `NvmeofGatewaysConfig.get_gateways_config()`
- This ensures the correct gateway address is used (see the sketch below)
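A condensed sketch of that lookup; apart from
`NvmeofGatewaysConfig.get_gateways_config()`, the identifiers and the
assumed shape of the returned dict are illustrative, not the exact
dashboard code:

    # Assumed import path inside the dashboard module:
    # from ..services.nvmeof_conf import NvmeofGatewaysConfig

    def resolve_gateway_addr(traddr: str, gw_group: str) -> str:
        # Pick the gateway whose host part matches the listener's traddr,
        # instead of whichever gateway the nvmeof client chose by default.
        gateways = NvmeofGatewaysConfig.get_gateways_config()["gateways"][gw_group]
        for gateway in gateways:
            host = gateway["service_url"].split(":")[0]
            if host == traddr:
                return gateway["service_url"]
        raise LookupError(f"no nvmeof gateway found for traddr {traddr!r}")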
doc/rados: edit "Placement Groups Never Get Clean"
Make grammar improvements (and correct a verb disagreement) in the
section "Placement Groups Never Get Clean" in
doc/rados/troubleshooting/troubleshooting-pg.rst.
* the steps performed by the Windows CI job
* artifact structure
* frequently asked questions
The document is meant to assist Ceph developers in investigating
CI failures. This is especially important because the Windows CI job runs
integration tests that would otherwise be executed only by
Teuthology, thus helping catch potential regressions quickly.
Note that the identified regressions are not necessarily
Windows-specific, usually affecting Linux builds as well.
ceph-volume: add call to `ceph-bluestore-tool zap-device`
BlueStore now writes its metadata at multiple offsets on devices [1].
This means `ceph-volume lvm zap` doesn't remove the BlueStore signatures altogether.
This can confuse ceph-volume when redeploying an OSD on a previously
zapped device because there is still old BlueStore metadata on it.
ceph-volume should call `ceph-bluestore-tool zap-device` [2]
in addition to the existing calls when wiping a device.
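A minimal sketch of the additional call (the wrapper function is
illustrative; the flags are those referenced by this commit and its
conflict note):

    import subprocess

    def zap_bluestore(device: str) -> None:
        # Have BlueStore itself clear every copy of its metadata,
        # wherever on the device it lives, not just the label at offset 0.
        subprocess.run(
            [
                "ceph-bluestore-tool", "zap-device",
                "--dev", device,
                "--yes-i-really-really-mean-it",
            ],
            check=True,
        )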
Fixes: https://tracker.ceph.com/issues/67926
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit c266ef0f61f7de29b56119171625dd61b8c0f0a2)
Conflicts:
'trim' option was not present, but --yes-i-really-really-mean-it was
used
Naman Munet [Tue, 17 Sep 2024 06:59:37 +0000 (12:29 +0530)]
mgr/dashboard: multisite sync policy improvements
https://tracker.ceph.com/issues/68097
Changes in this PR include:
1) Populate the destination zones select with a predefined set of options for flow and pipe, so that the user can't enter an invalid zone name
2) Provide an 'All Zones (*)' option in pipe for users who want to select all zones for source and destination
3) Hide the UniqueId column on the sync policy table; it exists only to uniquely identify a row in the table internally and should not be displayed to users
After an OSD is successfully prepared, the activation step fails
because the mapper is left open, which makes `systemd-cryptsetup attach`
complain about that and prompt for a password.
In order to avoid any other potential issue that would make the activation
step hang forever, I'm adding `headless=true`.
mgr/dashboard: Cloning subvolume not listing _nogroup subvolumegroup if there are no subvols in _nogroup
Fixes: https://tracker.ceph.com/issues/67891
Signed-off-by: Dnyaneshwari talwekar <dtalweka@redhat.com>
(cherry picked from commit 5c6c4a07d8dcd7bde46057310fbd1c5580a0da2f)
doc/rados: add confval directives to health-checks
Add confval directives to doc/rados/operations/health-checks.rst, as
requested by Anthony D'Atri here: https://github.com/ceph/ceph/pull/59635#pullrequestreview-2286205705
orch: refactor boolean handling in drive group spec
The intent of 42721c03ee6f was to address an issue where boolean
parameters weren't handled correctly.
I noticed that a parameter (`tpm2`) was missed, which made me realize
that maintaining a list of these boolean parameters is necessary.
To simplify things, we should only accept `"true"` or `"false"` (in any case),
allowing us to avoid special-casing each boolean parameter during validation.
This change introduces a `list_drive_group_spec_bool_arg` to store boolean
arguments related to drive group specifications, simplifying the validation
process for boolean values by directly checking if the values are 'true' or 'false'.
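A minimal sketch of that validation, under the assumption that
`list_drive_group_spec_bool_arg` simply names the boolean parameters
(the entries shown here are an illustrative subset):

    # Illustrative subset; the real list lives in the orchestrator code.
    list_drive_group_spec_bool_arg = ["encrypted", "unmanaged", "tpm2"]

    def parse_drive_group_bool(name: str, value: str) -> bool:
        # Accept only the literal strings "true"/"false", case-insensitively.
        lowered = value.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(
                f"{name} must be 'true' or 'false' (any case), got {value!r}"
            )
        return lowered == "true"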
Rishabh Dave [Thu, 11 Jul 2024 18:28:22 +0000 (23:58 +0530)]
cephfs: disallow removing root_squash via "fs authorize" cmd
Removing root_squash from MDS auth caps through the "fs authorize" command
should not be allowed, as this command is not meant for
removing caps.
Fixes: https://tracker.ceph.com/issues/65808
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit c6e2c97c6e9cbf1e37c53d5d490d65091205928c)
Conflict:
- qa/tasks/cephfs/test_admin.py
Test test_idem_unaffected_root_squash (which was fixed by this commit)
was disabled on the main branch since it was buggy, but that wasn't
the case with the squid branch.
- enables mTLS support from dashboard
- adds unit tests related to mTLS support
- can enable mTLS
- can disable mTLS
- included refactoring from the previous commit
An indentation of five spaces relative to the previous line creates a
command that is copyable with a single mouse click. This commit adds
those copyable commands to the procedure in the section "Building
Ceph".
mgr/dashboard: Adding group and pool name to service name
- Pre-populates the service name field with the format `nvmeof.<pool_name>.<group_name>`.
- This can be changed by the user, but by default this value will be present.
- This helps the user fill the form quickly and proceed, improving usability.
- cephadm also uses this naming convention at present, so the UI stays aligned with the CLI experience
- updates unit tests to improve coverage
- hides the `count` value, as it is not needed: 'nvmeof' requires only hosts and labels
mgr/dashboard: Select no device by default in EC profile
Fixes: https://tracker.ceph.com/issues/67853
When EC pools are created with a device class specified, the pools are created with just 1 PG and the autoscaler does not work.
The PG autoscaler not working on a cluster where pools have multiple overlapping roots is a known issue, and a bug has been raised for it.
Also renames the "let ceph decide" option to "All devices" in the crush rule and EC profile components.
Updates unit tests for the EC profile modal.
Signed-off-by: Afreen Misbah <afreen23.git@gmail.com>
(cherry picked from commit 4af51349e5e1d21c200b6bf7db81fa18eb163a61)
Add a second method of changing the value of osd_deep_scrub_interval to
remedy the condition indicated by the "PGs not deep-scrubbed in time"
warning.
This procedure was developed by Eugen Block and, at the time of this
commit, is available on his blog at
https://heiterbiswolkig.blogs.nde.ag/2024/09/06/pgs-not-deep-scrubbed-in-time/
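For context, the first (cluster-wide) method is presumably along these
lines; a hedged sketch shelling out to the ceph CLI, with the four-week
value chosen purely for illustration:

    import subprocess

    # Raise osd_deep_scrub_interval cluster-wide (the value is in seconds)
    # so that the "PGs not deep-scrubbed in time" warning stops firing
    # while deep scrubs catch up.
    subprocess.run(
        ["ceph", "config", "set", "osd", "osd_deep_scrub_interval",
         str(4 * 7 * 24 * 3600)],  # four weeks
        check=True,
    )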