Jan Fajerski [Wed, 9 Dec 2020 16:01:39 +0000 (17:01 +0100)]
Merge PR #38205 into octopus
* refs/pull/38205/head:
ceph-volume: pass *-slots arguments to LV creation
use extent count for slots conversion instead of free count
ceph-volume: available_lvm: vg space takes precedence
Dimitri Savineau [Mon, 26 Oct 2020 19:12:59 +0000 (15:12 -0400)]
ceph-volume: consume mount opt in simple activate
When running the ceph-volume simple activate command on a Filestore
OSD, the data device is mounted without any specific options, so the
ones from the ceph configuration file are ignored.
When deploying Filestore with the lvm subcommand, everything is fine
because the filestore_activate method uses mount_osd, which relies on
the mount options defined in the ceph configuration file (if any).
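For illustration, mounting with the configured options might look like
the sketch below. The option name "osd mount options xfs" is a real
ceph option; the helper and the dict-like config object are
assumptions, not the actual ceph-volume API:

    import subprocess

    def mount_data_device(device, mount_point, fs_type="xfs", ceph_conf=None):
        # Look up e.g. "osd mount options xfs" from the parsed ceph.conf
        # (a dict-like stand-in here); fall back to a default when absent.
        opts = None
        if ceph_conf is not None:
            opts = ceph_conf.get("osd mount options %s" % fs_type)
        opts = opts or "rw,noatime,inode64"
        subprocess.check_call(
            ["mount", "-t", fs_type, "-o", opts, device, mount_point])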
Casey Bodley [Mon, 23 Nov 2020 23:06:26 +0000 (18:06 -0500)]
rgw: temporarily disable calls to defer_gc() in RGWGetObj
cls_rgw_gc_queue_update_entry() is known to cause data loss when called
on objects that have not actually been scheduled for garbage collection.
RGWGetObj is the only caller, and uses defer_gc() when reads are taking
a long time compared to rgw_gc_obj_min_wait. If an object has since been
deleted and submitted for garbage collection, this allows RGWGetObj to
defer that gc until the entire read completes.
By disabling these calls to defer_gc(), very long reads (longer than 1hr
with the default configuration) may fail if the object gets deleted, and
a retry will result in a 404 Not Found error, as expected.
J. Eric Ivancich [Sat, 21 Nov 2020 16:10:35 +0000 (11:10 -0500)]
rgw: during GC defer, prevent new GC enqueue
With the new queue-based GC code, when a GC defer operation is
performed, it adds an "urgent" record to prevent GC from removing
objects that are still being read. It does not check whether the
objects are on the GC queue or not and that's OK for the urgent
record.
The code *also* adds a new GC entry to the queue to cause GC to occur
at a later time. This would be incorrect if there was no GC entry to
begin with, however. In such a case this would cause GC to delete tail
objects when no user-initiated remove has happened. In other words, a
READ could cause a DELETE of tail objects and therefore data loss.
This fix prevents such a new GC entry from being enqueued, thus
preventing the data loss in this rare case. There is a new risk that
tail object orphans may be created, but as an immediate fix to prevent
data loss this is appropriate, and it is a rare event. A follow-on PR
that will handle these cases is likely.
This PR adds a level 0 log entry as a way to potentially confirm this
case is being triggered in real-world use. In time, this log entry
should be deleted.
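A minimal sketch of the guard described above, rendered in Python (the
real logic lives in the C++ cls_rgw code; all names here are
illustrative):

    def defer_gc(gc_queue, urgent_records, tag, new_expiration):
        # Always write the urgent record; it protects the in-progress read.
        urgent_records[tag] = new_expiration
        if tag in gc_queue:
            # The object really was scheduled for GC: push its entry back.
            gc_queue[tag] = new_expiration
        else:
            # No user-initiated remove ever enqueued this object. Adding a
            # fresh entry would later let GC delete the tail objects of a
            # live object, so skip the enqueue and log at level 0.
            print("defer_gc: %s not in GC queue, not enqueueing" % tag)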
Jan Fajerski [Wed, 18 Nov 2020 08:37:48 +0000 (09:37 +0100)]
ceph-volume inventory: make libstoragemgmt data retrieval optional
Default to not retrieving libstoragemgmt data, since it seems this can
cause serious issues on older hardware. The safest way is to retrieve
lsm data only when the user opts in.
Fixes: https://tracker.ceph.com/issues/48270
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit b29a54d21e314db7a9d681cf5cc089dcfcbf6dc0)
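As a sketch, the opt-in could be wired up with argparse roughly as
follows; the flag name and help text are assumptions, check
`ceph-volume inventory --help` for the actual interface.

    import argparse

    parser = argparse.ArgumentParser(prog="ceph-volume inventory")
    parser.add_argument(
        "--with-lsm",
        action="store_true",
        default=False,
        help="also gather libstoragemgmt data (may misbehave on older hardware)",
    )
    args = parser.parse_args(["--with-lsm"])
    print("retrieve lsm data:", args.with_lsm)  # True only when opted in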
Jan Fajerski [Wed, 4 Mar 2020 10:39:40 +0000 (11:39 +0100)]
ceph-volume: available_lvm: vg space takes precedence
This changes available_lvm to check for generic reasons only if no VGs
were found. A VG can contain a (mounted) LV, which triggers the
ro/locked test despite the VG having space available.
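In outline, the new check order might look like this hedged sketch; the
real implementation lives in ceph-volume's device handling and differs
in detail:

    def available_lvm(device):
        # VG free space takes precedence; the generic ro/locked checks only
        # apply when the device carries no VGs at all.
        reasons = []
        if device.vgs:
            if not any(vg.free_extents > 0 for vg in device.vgs):
                reasons.append("no free space in any VG")
        else:
            if device.read_only:
                reasons.append("read-only")
            if device.locked:
                reasons.append("locked")
        return len(reasons) == 0, reasons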
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/iscsi-target-list/iscsi-target-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-namespace-list/rbd-namespace-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-snapshot-list/rbd-snapshot-actions.model.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/hosts/hosts.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/mgr-modules/mgr-module-list/mgr-module-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.ts
- `$localize` calls are not available in Angular 8. They are replaced with i18n.
- Optional chaining syntax is not supported in TypeScript 3.5.3. Statements with optional chaining are re-coded.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health-pie/health-pie.component.scss
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.ts
src/pybind/mgr/dashboard/frontend/src/styles/defaults/_bootstrap-defaults.scss
Discarded all changes except the relevant code part. The rest was
successfully backported by b2360b1a6101b5cc61c236047ce7c757fd02c93d.
Fix the configuration created by cephadm to prevent any "many-to-many
matching not allowed: matching labels must be unique on one side"
issues. The mgr/prometheus exporter exports suitable instance labels
itself, which can be adopted when `honor_labels` is set to `true` in
Prometheus.
Fixes: https://tracker.ceph.com/issues/47997
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit ea8a3aca02f2adc1e68a055ab95ced207da1561a)
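For illustration, a scrape config with honor_labels enabled could be
generated like this; the job name and target are placeholders, and the
real config is rendered from cephadm's templates:

    import yaml  # PyYAML

    scrape_config = {
        "scrape_configs": [{
            "job_name": "ceph",  # placeholder
            # Keep the instance labels exported by mgr/prometheus instead of
            # overwriting them with Prometheus' own target labels.
            "honor_labels": True,
            "static_configs": [{"targets": ["mgr.example:9283"]}],
        }]
    }
    print(yaml.safe_dump(scrape_config, sort_keys=False))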
Volker Theile [Fri, 30 Oct 2020 08:22:30 +0000 (09:22 +0100)]
mgr/cephadm: Allow customizing mgr/cephadm/lsmcli_blink_lights_cmd per host
* Rename the key from 'lsmcli_blink_lights_cmd' to 'blink_device_light_cmd'.
* Refactor the TemplateMgr::render() method to use the common Ceph
  behavior for naming store/module option keys. The old implementation
  required a key like 'mgr/cephadm/services_nfs_ganesha.conf' instead of
  'mgr/cephadm/services/nfs/ganesha.conf', or
  'mgr/cephadm/mgr0_blink_device_light_cmd' instead of
  'mgr/cephadm/mgr0/blink_device_light_cmd' (see the sketch below).
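A toy version of that key naming (purely illustrative; the actual store
keys are managed by the mgr module framework):

    def store_key(module, *parts):
        # Build a hierarchical store key such as
        # mgr/cephadm/services/nfs/ganesha.conf
        return "/".join(["mgr", module] + [p.strip("/") for p in parts])

    assert store_key("cephadm", "services/nfs/ganesha.conf") == \
        "mgr/cephadm/services/nfs/ganesha.conf"
    assert store_key("cephadm", "mgr0", "blink_device_light_cmd") == \
        "mgr/cephadm/mgr0/blink_device_light_cmd"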
Varsha Rao [Wed, 28 Oct 2020 13:37:35 +0000 (19:07 +0530)]
doc/mgr/orchestrator: Update about "{mds, rgw} add" status in rook
"mds add" and "rgw add" are no longer supported in rook. Their implementation
was removed by commits 56cfeb6 and 0580297. Instead "apply mds" and "apply rgw"
is preferred.
diwilli [Wed, 28 Oct 2020 17:43:05 +0000 (17:43 +0000)]
cephadm: Set listen-addresses on alertmanager container
This explicitly passes web.listen-address and cluster.listen-address to
the alertmanager container, allowing the use of public IP addresses.
Fixes: https://tracker.ceph.com/issues/48031
Signed-off-by: Dan Williams <dw@adventsol.co.uk>
(cherry picked from commit 29730a4bc168913d5dad6d9e487d2dc58a0e3c86)
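Conceptually, the container's daemon arguments gain explicit bind
addresses, e.g. (a sketch; the address is a placeholder, and 9093/9094
are alertmanager's usual web and cluster ports):

    def alertmanager_args(bind_ip):
        return [
            "--web.listen-address=%s:9093" % bind_ip,
            "--cluster.listen-address=%s:9094" % bind_ip,
        ]

    print(alertmanager_args("0.0.0.0"))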
Tim Serong [Mon, 5 Oct 2020 09:14:42 +0000 (20:14 +1100)]
cephadm: allow uid/gid == 0 in copy_tree, copy_files, move_files
If the uid or gid passed to copy_tree(), copy_files() or
move_files() is 0 (the root user), the current check for
`if not uid or not gid` does the wrong thing, i.e. it
thinks the uid and/or gid aren't set, then calls out to
extract_uid_gid(), which fails when run against
prometheus/grafana/alertmanager containers.
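The pitfall is plain Python truthiness. A sketch of the corrected check
(extract_uid_gid is stubbed out here; in cephadm it inspects the
container image):

    def extract_uid_gid():
        return 167, 167  # stand-in for cephadm's real helper

    def copy_tree(src, dst, uid=None, gid=None):
        # 0 (root) is falsy, so `if not uid or not gid` wrongly treated an
        # explicit uid/gid of 0 as "unset"; compare against None instead.
        if uid is None or gid is None:
            uid, gid = extract_uid_gid()
        print("would chown %s to %d:%d" % (dst, uid, gid))

    copy_tree("/src", "/dst", uid=0, gid=0)  # keeps 0:0, no image lookup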
Adam King [Wed, 23 Sep 2020 16:52:37 +0000 (12:52 -0400)]
mgr/cephadm: get rbd-mirror daemon-id when checking for strays
Currently, list_servers() gets the rbd-mirror service-id instead of
the daemon-id, so the daemon is marked as stray. This PR uses that
service-id to find the daemon-id and uses that to check whether the
daemon is stray.
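Schematically (names and shapes are illustrative, not the mgr/cephadm
API):

    def is_stray(service_id, id_map, managed_daemons):
        # id_map: service-id -> daemon-id, derived from the daemon inventory
        daemon_id = id_map.get(service_id, service_id)
        return ("rbd-mirror.%s" % daemon_id) not in managed_daemons

    # placeholder ids; the daemon is known, so it is not stray
    print(is_stray("4215", {"4215": "host1"}, {"rbd-mirror.host1"}))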
Ilya Dryomov [Fri, 16 Oct 2020 10:57:50 +0000 (12:57 +0200)]
mon/MonClient: bring back CEPHX_V2 authorizer challenges
Commit c58c5754dfd2 ("msg/async/ProtocolV1: use AuthServer and
AuthClient") introduced a backwards compatibility issue into msgr1.
To fix it, commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it") set out to skip authorizer
challenges for peers that don't support CEPHX_V2. However, it
made it so that authorizer challenges are skipped for all peers in
both msgr1 and msgr2 cases, effectively disabling the protection
against replay attacks that was put in place in commit f80b848d3f83
("auth/cephx: add authorizer challenge", CVE-2018-1128).
This is because con->get_features() always returns 0 at that
point. In msgr1 case, the peer shares its features along with the
authorizer, but while they are available in connect_msg.features they
aren't assigned to con until ProtocolV1::open(). In msgr2 case, the
peer doesn't share its features until much later (in CLIENT_IDENT
frame, i.e. after the authentication phase). The result is that the
!CEPHX_V2 branch is taken in all cases and replay attack protection
is lost.
Only clusters with cephx_service_require_version set to 2 on the
service daemons would not be silently downgraded. But, since the
default is 1 and there are no reports of looping on BADAUTHORIZER
faults, I'm pretty sure that no one has ever done that. Note that
cephx_require_version set to 2 would have no effect even though it
is supposed to be stronger than cephx_service_require_version
because MonClient::handle_auth_request() didn't check it.
To fix:
- For msgr1, check connect_msg.features (as was done before commit
  c58c5754dfd2) and challenge if CEPHX_V2 is supported. Together with
  two preceding patches that resurrect proper cephx_* option handling
  in msgr1, this covers both "I want old clients to work" and "I wish
  to require better authentication" use cases.
- For msgr2, don't check anything and always challenge. CEPHX_V2
  predates msgr2, so anyone speaking msgr2 must support it. (A sketch
  of this decision follows below.)
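Reduced to a sketch, the fixed decision looks like this (the real code
is C++ in the messenger; the feature-bit value below is a placeholder,
not the real CEPH_FEATURE_CEPHX_V2 bit):

    CEPHX_V2 = 1 << 61  # placeholder feature bit

    def should_challenge(msgr_version, peer_features):
        if msgr_version == 2:
            return True  # anyone speaking msgr2 must support CEPHX_V2
        # msgr1: check the features shared in connect_msg.features
        return bool(peer_features & CEPHX_V2)

    print(should_challenge(2, 0))         # True
    print(should_challenge(1, CEPHX_V2))  # True
    print(should_challenge(1, 0))         # False -> skip the challenge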
This was added in commit 9bcbc2a3621f ("mon,msg: implement
cephx_*_require_version options") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, service daemons don't enforce cephx_require_version
and cephx_cluster_require_version options and connections without
CEPH_FEATURE_CEPHX_V2 are allowed through.
(cephx_service_require_version enforcement was brought back a
year later in commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it"), although the peer gets
TAG_BADAUTHORIZER instead of TAG_FEATURES.)
Resurrect the original behaviour: all cephx_*require_version
options are enforced and the peer gets TAG_FEATURES, signifying
that it is missing a required feature.
Ilya Dryomov [Fri, 16 Oct 2020 09:33:32 +0000 (11:33 +0200)]
msg/async/ProtocolV1: resurrect "include MGR as service when applying cephx settings"
This was added in commit 0ec7d6bbc4af ("msg/async,simple: include MGR
as service when applying cephx settings") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, mgr daemons are miscategorized as clients when enforcing
cephx_*require_signatures options.
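A rough sketch of the categorization at stake (option semantics
simplified; the real checks live in the C++ messenger code):

    def require_signatures(peer_is_cluster_member, opts):
        # mgr is a cluster daemon like mon/osd/mds; after the refactoring it
        # was treated as a client, so the wrong option pair was consulted.
        if peer_is_cluster_member:
            return (opts["cephx_require_signatures"]
                    or opts["cephx_cluster_require_signatures"])
        return (opts["cephx_require_signatures"]
                or opts["cephx_service_require_signatures"])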