mgr/volumes: Use snapshot root directory attrs when creating clone root
If a subvolume's mode or uid/gid values are changed after a snapshot is
taken, and a clone of a snapshot taken before the change is initiated, the
clone inherits the current source subvolume's attributes rather than the
snapshot's attributes.
Fix this by using the snapshot's subvolume root attributes to create
the clone subvolume's root.
The following attributes are picked from the source subvolume snapshot:
- uid, gid, mode, data pool, pool namespace, quota
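The selection above can be sketched as follows. This is only an illustration, not the mgr/volumes code: the helper name `clone_root_attrs`, the dictionary keys, and the sample values are all assumptions.

```python
# Hypothetical illustration: the clone root takes its attributes from the
# snapshot's view of the source subvolume, not from the subvolume's
# current state.
CLONE_ATTRS = ("uid", "gid", "mode", "data_pool", "pool_namespace", "quota")


def clone_root_attrs(snapshot_attrs):
    """Return the attributes to apply to the clone root.

    snapshot_attrs is a plain dict describing the snapshot root directory;
    attributes missing from it are returned as None.
    """
    return {name: snapshot_attrs.get(name) for name in CLONE_ATTRS}


# Attributes as they were when the snapshot was taken (sample values).
snap = {"uid": 1000, "gid": 1000, "mode": 0o755,
        "data_pool": "cephfs_data", "pool_namespace": None, "quota": 1 << 30}
# The source subvolume was changed afterwards; the clone must not see this.
current = dict(snap, uid=0, mode=0o700)

attrs = clone_root_attrs(snap)
```

With this shape, later changes to the source subvolume (`current` here) never reach the clone.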
Patrick Donnelly [Wed, 26 Aug 2020 23:35:08 +0000 (16:35 -0700)]
Merge PR #36804 into nautilus
* refs/pull/36804/head:
qa/workunits/fs: add test for subvolume
mds: don't move inode with nlink > 1 to global snaprealm if it's in subvolume
mds: disallow hardlink across subvolume
mds: disallow across subvolume rename
mds: disallow creating snapshot on descendent directory of subvolume
mds: add vxattr that marks/clears subvolume flag
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Fri, 7 Aug 2020 16:26:21 +0000 (00:26 +0800)]
rgw: hold reloader using unique_ptr
Instead of using optional<> for holding the reloader, use unique_ptr<>.
`RGWRealmReloader` is neither
trivially_copy_{assignable,constructible} nor
is_trivially_move_{assignable, constructible}, because of the `Cond`
member variable, but Clang++ and libc++ still try to rely on a
delegating copy constructor for constructing the
optional<RGWRealmReloader> instance even when the optional object is
created `in_place`.
To work around this issue, this change constructs the reloader
using make_unique<> instead.
Yaarit Hatuka [Thu, 20 Aug 2020 18:21:11 +0000 (18:21 +0000)]
mgr/devicehealth: fix daemon filtering before scraping device
Scraping health metrics of mon devices was introduced in Nautilus, then
disabled (only in Nautilus) since the 'tell' mechanism of mons was not
reliable.
This commit fixes a bug in filtering the daemons on the device to be
scraped (and allows scraping OSD devices only).
When:
$ ceph device scrape-health-metrics seagate_123
Error EAGAIN: device seagate_123 not claimed by any active OSD daemons
But:
$ ceph device ls
DEVICE HOST:DEV DAEMONS LIFE EXPECTANCY
seagate_123 hostname:sdc osd.1
Patrick Donnelly [Wed, 19 Aug 2020 20:49:00 +0000 (13:49 -0700)]
Merge PR #36448 into nautilus
* refs/pull/36448/head:
mgr: Add python-enum34 dependency to package for older distributions
mgr/volumes: Add documentation regarding --retain-snapshots option
mgr/volumes: Avoid trashing retained subvolume on create errors
mgr/volumes: Add subvolume v2 test cases
mgr/volumes: Derive v2 from v1 to leverage common methods
mgr/volumes: Introduce v2 subvolumes
mgr/volumes: Use operation type during subvolume open
Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Jason Dillaman [Tue, 28 Jul 2020 01:14:18 +0000 (21:14 -0400)]
librbd: update hidden global config when setting pool config override
The new "dev"-level global config setting will be updated when any
pool-level config override is updated. librbd clients will detect
the new global-level config update and trigger a refresh. This avoids
the need for potentially tens of thousands of librbd clients
registering a watch on the pool metadata object or periodically polling
the pool metadata object for updates.
Fixes: https://tracker.ceph.com/issues/46694 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f45df9fe786e8057c491c082e840483759d67e9e)
Conflicts:
src/common/options.cc
- "rbd_quiesce_notification_attempts", "rbd_default_snapshot_quiesce_mode", and
"rbd_plugins" options have not been backported to Octopus, yet
Jason Dillaman [Mon, 27 Jul 2020 19:31:09 +0000 (15:31 -0400)]
librbd: initial config watcher implementation
The config watcher will initially observe all "rbd_" configuration
updates received from the MON that have not been locally overridden
at the pool and/or image level.
root [Fri, 26 Jun 2020 10:44:45 +0000 (12:44 +0200)]
rgw: fix double slash (//) killing the gateway
When a bucket is initialized as a static website, a curl request on the bucket with a double slash kills the gateway.
The problem is in the URL handling of the subdirectory, which tries to remove the last slash of any URL, so when only / is given as a sub-directory, this results in an empty string.
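The failure mode described above can be sketched in a few lines. This is an illustration in Python rather than the gateway's C++ code, and the function name `strip_trailing_slash` is hypothetical; the actual fix in rgw may be structured differently.

```python
def strip_trailing_slash(subdir):
    """Remove one trailing slash, but never reduce "/" to an empty string."""
    # Guard on length: a bare "/" must survive, otherwise later code that
    # assumes a non-empty sub-directory name receives "".
    if len(subdir) > 1 and subdir.endswith("/"):
        return subdir[:-1]
    return subdir
```

An unguarded version (`subdir[:-1]` whenever the string ends in a slash) turns a lone "/" into "", which is the empty-string case the commit message describes.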
rgw: policy: reuse eval_principal to evaluate the policy principal
The other edge case, when no Principal or a NotPrincipal is supplied, must
also be accounted for, and the eval_principal function already does this. Also
re-raise the error as Effect::Pass in line with the previous output, though an
Effect::Deny would also work here.
monclient: use "is_connected" check when scheduling tick
When schedule_tick is called for the first time (in init) we are
still not in hunting mode so mon_client_ping_interval and
mon_client_log_interval were used to schedule the tick.
Mike Christie [Thu, 9 Jan 2020 00:03:40 +0000 (18:03 -0600)]
selinux: Fix ceph-iscsi configfs access
This fixes the following selinux error when using ceph-iscsi's
rbd-target-api daemon (rbd-target-gw has the same issue). The error is
a result of a python library, rtslib, which the daemons use.
Additional Information:
Source Context system_u:system_r:ceph_t:s0
Target Context system_u:object_r:configfs_t:s0
Target Objects /sys/kernel/config/target/iscsi/iqn.2003-01.com.redhat:ceph-iscsi/tpgt_1/attrib/authentication [ file ]
Source rbd-target-api
Source Path /usr/libexec/platform-python3.6
Port <Unknown>
Host ans8
Source RPM Packages platform-python-3.6.8-15.1.el8.x86_64
Target RPM Packages
Policy RPM selinux-policy-3.14.3-20.el8.noarch
Selinux Enabled True
Policy Type targeted
Enforcing Mode Enforcing
Host Name ans8
Platform Linux ans8 4.18.0-147.el8.x86_64 #1 SMP Thu Sep 26 15:52:44 UTC 2019 x86_64 x86_64
Alert Count 1
First Seen 2020-01-08 18:39:47 EST
Last Seen 2020-01-08 18:39:47 EST
Local ID 6f8c3415-7a50-4dc8-b3d2-2621e1d00ca3
mgr: Add python-enum34 dependency to package for older distributions
Subvolumes v2 introduces Enum usage, which is not part of python < 3.4 and is
provided by the extra package python-enum34 for versions back to python 2.4.
Added this as an explicit dependency to packaging, to ensure it is installed
in required distributions.
python-enum34 is also used by mgr-dashboard
(src/pybind/mgr/dashboard/plugins/feature_toggles.py) but the dependency is
not called out explicitly and is satisfied by python-pyopenssl, which
depends on python-cryptography, which depends on python-enum34.
Platform availability notes:
- centos7 comes with python-enum34
- centos8 does not require this, as python versions are higher, and Enum is part
of the language itself
- openSUSE 15.1/2 comes with python-enum34
- Ubuntu Xenial/Bionic comes with python-enum34
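For context, the import that drives this dependency looks the same on every interpreter: on Python 3.4 and later it resolves to the standard library, and on older interpreters it is satisfied by the enum34 backport. The class and member names below are hypothetical and are not taken from the mgr/volumes sources.

```python
# Resolved by the standard library on Python >= 3.4, and by the enum34
# backport package on older interpreters.
from enum import Enum, unique


@unique
class ExampleFeature(Enum):  # hypothetical name, for illustration only
    SNAPSHOT_RETENTION = "snapshot-retention"
    SNAPSHOT_CLONE = "snapshot-clone"


# Members can be looked up by value, e.g. when parsing stored metadata.
feature = ExampleFeature("snapshot-retention")
```

Without an explicit package dependency, this import only works on older distributions if some other package happens to pull in python-enum34.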
mgr/volumes: Avoid trashing retained subvolume on create errors
On any create or create_clone errors the entire subvolume was being
removed. This should be conditional and remove only the incarnation
if the subvolume was in the retained state.
The following support is added:
- Ability to retain snapshots on subvolume deletion
- Modify directory where snapshot is created to the subvolume
- "features" added to subvolume info output, specifically the ability
for a subvolume to retain snapshots
- Current state of subvolume in info output
- Auto upgrade to v2 from eligible v1 subvolumes
- Adjust other functions as needed to support the changes
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Conflicts:
src/pybind/mgr/volumes/fs/volume.py
CephfsClient implementation is not backported, required resolution
Correct single instance of deprecated log.warn invocation
mgr/volumes: Use operation type during subvolume open
Subvolume open currently takes in 2 optional parameters to
denote desired state and type. This enables the open to
allow the operation to succeed based on the (type, state)
tuple.
Instead, pass an operation type to be performed on a subvolume
during open, and decide internal to a subvolume version if the
operation is allowed based on its state and type.
Also modifies the state machine code to be more amenable to
modifications, and improves readability.
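The design described above, where the caller passes an operation type and the subvolume decides whether it is allowed, might look roughly like this. All names here (`State`, `Op`, `ALLOWED_OPS`, `check_open`, `OpNotAllowed`) and the specific allowed combinations are assumptions made for illustration, not the actual mgr/volumes definitions.

```python
from enum import Enum, unique


@unique
class State(Enum):  # hypothetical subvolume states
    COMPLETE = "complete"
    RETAINED = "snapshot-retained"


@unique
class Op(Enum):  # hypothetical operation types passed to open
    INFO = "info"
    REMOVE = "remove"
    SNAPSHOT_CREATE = "snapshot-create"
    SNAPSHOT_LIST = "snapshot-list"


# Which operations each state permits (illustrative table only).
ALLOWED_OPS = {
    State.COMPLETE: {Op.INFO, Op.REMOVE, Op.SNAPSHOT_CREATE, Op.SNAPSHOT_LIST},
    State.RETAINED: {Op.INFO, Op.REMOVE, Op.SNAPSHOT_LIST},
}


class OpNotAllowed(Exception):
    """Raised when an operation is not permitted in the current state."""


def check_open(state, op):
    """Decide, inside the subvolume, whether op may proceed in state."""
    if op not in ALLOWED_OPS[state]:
        raise OpNotAllowed("%s not allowed in state %s" % (op.value, state.value))
    return True
```

Keeping the table inside the subvolume version means callers no longer need to know which (type, state) tuples are valid for a given operation.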
Conflicts:
src/pybind/mgr/volumes/fs/operations/op_sm.py
subvol pin operation is not backported, required manual resolution
src/pybind/mgr/volumes/fs/volume.py
mypy type checks are not backported, required manual resolution
David Zafman [Wed, 10 Jun 2020 02:24:00 +0000 (19:24 -0700)]
mgr: Warn when too many reads are repaired on an OSD
Include test case
Configurable by setting mon_osd_warn_num_repaired (default 10)
Ignore new health warning with random eio injection test
Fixes: https://tracker.ceph.com/issues/41564 Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 661996d4342c427209b1eae4b0247f8210a00fc3)
Conflicts:
PendingReleaseNotes
- add pending release note under version heading that makes sense for
nautilus
src/mon/PGMap.cc
- d0eb22f3ba557b9e98836995c813ea77c5e7c2a5 is not backported to
nautilus, so the "health_check_t& add()" function takes only three
arguments: omit the fourth
A client may hold many inodes pinned in its cache for open files. That
client may be unable to release those caps to respond to cache pressure
from the MDS (or quiescent client cap recall). We should not complain if
that number of capabilities is reasonable (< 10k by default).
Fixes: https://tracker.ceph.com/issues/46830 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 63392e1b65fbead6ef8c7acd6a70e6ef5b322390)
Otherwise the files generated are not actually under the sub-directory!
This is correcting a confusing aspect of the test infrastructure but
doesn't actually require any changes to the tests.
so we can reuse run-make.sh for building the artifact used by other
tests than "make check", for instance, dashboard's E2E test and
crimson's performance test.
The `ceph-volume lvm zap` command fails under certain conditions.
When passing `--osd-id` or `--osd-fsid` to the `ceph-volume lvm zap` command,
it tries to zap additional devices that have nothing to do with the OSD
being zapped.
When calling `api.get_lvs()` in `ensure_associated_lvs()` we have to
pass the osd-id/osd-fsid information so that only related devices are
returned by the `get_lvs()` method.
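The filtering idea can be sketched as below. This is not the ceph-volume API: `filter_lvs`, the dict layout, and the tag names `ceph.osd_id` and `ceph.osd_fsid` are assumptions used only to show why the OSD identity has to be part of the query.

```python
def filter_lvs(lvs, osd_id=None, osd_fsid=None):
    """Return only the LVs whose tags match the given OSD id and/or fsid.

    lvs is a list of dicts, each with a "tags" dict (hypothetical layout).
    With no filters given, every LV is returned, which is the behaviour
    that led to unrelated devices being zapped.
    """
    wanted = {}
    if osd_id is not None:
        wanted["ceph.osd_id"] = str(osd_id)
    if osd_fsid is not None:
        wanted["ceph.osd_fsid"] = osd_fsid
    return [
        lv for lv in lvs
        if all(lv["tags"].get(key) == value for key, value in wanted.items())
    ]


lvs = [
    {"name": "osd-block-a", "tags": {"ceph.osd_id": "1", "ceph.osd_fsid": "aaaa"}},
    {"name": "osd-block-b", "tags": {"ceph.osd_id": "2", "ceph.osd_fsid": "bbbb"}},
]
only_osd_1 = filter_lvs(lvs, osd_id=1)
```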
Zac Dover [Fri, 12 Jun 2020 08:35:54 +0000 (18:35 +1000)]
doc/rbd: add rbd-target-gw enable and start
This commit adds the following commands to the "Configuring the iSCSI Target Using the Command Line" page: "systemctl enable rbd-target-gw" and "systemctl start rbd-target-gw"
client: expose ceph.quota.max_bytes xattr within snapshots
For directories within snapshots, expose the ceph.quota.max_bytes
extended attribute information. This enables fetching quota
information when the snapshot was taken and is particularly useful
when cloning subvolume snapshots, to enforce the quota on the
clone subvolume as well.
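A minimal sketch of reading that attribute from Python follows, assuming a Linux host (where `os.getxattr` is available) and assuming the value is returned as an ASCII decimal string. The helper name `quota_max_bytes` and its injectable `getxattr` parameter are hypothetical conveniences so the logic can be exercised without a CephFS mount.

```python
import os


def quota_max_bytes(path, getxattr=None):
    """Return ceph.quota.max_bytes for path as an int, or None if unset.

    getxattr defaults to os.getxattr (Linux only); a replacement can be
    passed in for testing.
    """
    if getxattr is None:
        getxattr = os.getxattr
    try:
        raw = getxattr(path, "ceph.quota.max_bytes")
    except OSError:
        # Attribute missing or not supported on this path.
        return None
    return int(raw.decode("ascii"))
```

Pointed at a directory inside a snapshot, this would return the quota in effect when the snapshot was taken, which is the value a clone needs.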
append object tail gets GC'ed otherwise as the state has a manifest similar to
atomic obj processor, but if the manifest exists and the position is correct, it
is not an overwrite and shouldn't be GC'ed
Andrew Schoen [Fri, 24 Apr 2020 17:11:52 +0000 (12:11 -0500)]
ceph-volume: handle idempotency with batch and explicit scenarios
If you used --wal-devices or --db-devices with batch and too
many devices were filtered out, then a RuntimeError was raised.
However, if --report and --format=json are used, then we
should return valid JSON indicating that no OSDs will be created
so that ceph-ansible and other systems can use that for idempotency checks.
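The intended behaviour can be sketched as follows. The function name `render_report` and the shape of the JSON document are assumptions; the real ceph-volume report format is not shown in this log.

```python
import json


def render_report(planned_osds, as_json):
    """Render a batch report for the OSDs that would be created.

    planned_osds is a list of dicts (hypothetical layout). In JSON mode an
    empty plan is still rendered as valid JSON so callers can treat it as
    "nothing to do"; otherwise an empty plan is an error.
    """
    if as_json:
        return json.dumps({"changed": bool(planned_osds), "osds": planned_osds})
    if not planned_osds:
        raise RuntimeError("no devices left after filtering")
    return "\n".join(osd["data"] for osd in planned_osds)


empty_report = json.loads(render_report([], as_json=True))
```

A caller such as a configuration-management tool can then check the parsed report instead of handling a traceback.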