Kotresh HR [Tue, 1 Dec 2020 10:44:17 +0000 (16:14 +0530)]
tasks/cephfs/test_volume_client: Add tests for authorize/deauthorize
1. Add testcase for authorizing auth_id which is not added by
ceph_volume_client
2. Add testcase to test 'allow_existing_id' option
3. Add testcase for deauthorizing auth_id which has got it's caps
updated out of band
Optionally allow authorizing auth-ids not created by ceph_volume_client
via the option 'allow_existing_id'. This can help existing deployers
of manila to disallow/allow authorization of pre-created auth IDs
via a manila driver config that sets 'allow_existing_id' to False/True.
Kotresh HR [Thu, 26 Nov 2020 09:18:16 +0000 (14:48 +0530)]
pybind/ceph_volume_client: Preserve existing caps while authorize/deauthorize auth-id
Authorize/Deauthorize used to overwrite the caps of auth-id which would
end up deleting existing caps. This patch fixes the same by retaining
the existing caps by appending or deleting the new caps as needed.
This patch disallow the ceph_volume_client to authorize the auth_id
which is not created by ceph_volume_client. Those auth_ids could be
created by other means for other use cases which should not be modified
by ceph_volume_client.
Fixes: https://tracker.ceph.com/issues/48555 Signed-off-by: Ramana Raja <rraja@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 3a85d2d04028a323952a31d18cdbefb710be2e2b)
Jan Fajerski [Wed, 9 Dec 2020 16:01:39 +0000 (17:01 +0100)]
Merge PR #38205 into octopus
* refs/pull/38205/head:
ceph-volume: pass *-slots arguments to LV creation
use extent count for slots conversion instead of free count
ceph-volume: available_lvm: vg space takes precedence
Xiubo Li [Mon, 23 Nov 2020 12:55:01 +0000 (20:55 +0800)]
common: do not dup the options when reexpanding
The old code will store all the options, which has `$pid` in them,
in may_reexpand_meta map. And when reexpanding later, the reexpand
code will dup them with a higher priority(CONF_OVERRIDE).
This will be a problem, if the default value has `$pid` and be
stored in the may_reexpand_meta map, and then the code set a new
different value, which may have no `$pid`, from CLI or config file.
The reexpand will override it with the default value always.
This will do not duplicate the options with CONF_OVERRIDE priority
when reexpanding, just refresh them and call the observers. And the
finalize_reexpand_meta() will always be called after the fork() is
done in child processes.
Xiubo Li [Fri, 13 Nov 2020 08:08:31 +0000 (16:08 +0800)]
global: reexpand the conf meta in all the child processes
Especially for the tools or the daemons whose config options need
to expand the '$pid', they will be always expanded with the parent
processes. We need to reexpand them in child processes just after
the fork is done.
Dimitri Savineau [Mon, 26 Oct 2020 19:12:59 +0000 (15:12 -0400)]
ceph-volume: consume mount opt in simple activate
When running ceph-volume simple activate command on a Filestore OSD
then the data device is mounted without any specific options so the
one from the ceph configuration file are ignored.
When deploying Filestore with the lvm subcommand then everything is
fine because the filestore_activate method uses mount_osd which relies
on the mount options defined in the ceph configuration file (if any).
Kevin Meijer [Sat, 14 Nov 2020 18:44:07 +0000 (19:44 +0100)]
mgr/dashboard: Disable sso without python3-saml
Removed the requirement for the python3-saml package when wanting to disable SSO for the dashboard, this is currently relevant since the official container that runs Ceph mgr does not have this package installed.
So when upgrading from an older, non-containerized version, you would be stuck using a non-functional dashboard.
This pull requests changes that and allows the ceph dashboard sso disable command without the requirement of the library so that we SSO can always be disabled again.
Fixes: https://tracker.ceph.com/issues/48237 Signed-off-by: Kevin Meijer <admin@kevinmeijer.nl>
(cherry picked from commit 0c18437d2c786ef1ade8b89e42dbf4b0e163aafe)
Casey Bodley [Mon, 23 Nov 2020 23:06:26 +0000 (18:06 -0500)]
rgw: temporarily disable calls to defer_gc() in RGWGetObj
cls_rgw_gc_queue_update_entry() is known to cause data loss when called
on objects that have not actually been scheduled for garbage collection
RGWGetObj is the only caller, and uses defer_gc() when reads are taking
a long time compared to rgw_gc_obj_min_wait. if an object has since been
deleted and submitted for garbage collection, this allows RGWGetObj to
defer that gc until the entire read completes
by disabling these calls to defer_gc(), very long reads (longer than 1hr,
with default configuration) may fail if the object gets deleted, and a
retry will result in a 404 Not Found error as expected
J. Eric Ivancich [Sat, 21 Nov 2020 16:10:35 +0000 (11:10 -0500)]
rgw: during GC defer, prevent new GC enqueue
With the new queue-based GC code, when a GC defer operation is
performed, it adds an "urgent" record to prevent GC from removing
objects that are still being read. It does not check whether the
objects are on the GC queue or not and that's OK for the urgent
record.
The code *also* adds a new GC entry to the queue to cause GC to occur
at a later time. This would be incorrect if there was no GC entry to
begin with, however. In such a case this would cause GC to delete tail
objects when no user-initiated remove has happend. In other words a
READ could cause a DELETE of tail objects and therefore data loss.
This fix prevents such a new GC entry from being enqueued, thus
preventing the data loss in this rare case. There is a new risk that
tail object orphans to be created, but as an immediate fix to prevent
data loss, this is appropriate and it is a rare event. A follow-on PR
that will handle these cases is likely.
This PR adds a level 0 log entry as a way to potentially confirm this
case is being triggered in real-world cases. In time, this log entry
should be deleted.
Jason Dillaman [Thu, 19 Nov 2020 20:59:34 +0000 (15:59 -0500)]
pybind/mgr/rbd_support: delay creation of progress event
Create the progress module event upon receipt of the first
progress callback from the librbd API. This will help to ensure
that all prereqs have been validated for retryable errors like
scheduling an image to be removed while it still has attached
cloned children.
Fixes: https://tracker.ceph.com/issues/48296 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c0069b00e4748974d4bf5cfa1bdab68d6f043abb)
Jan Fajerski [Wed, 18 Nov 2020 08:37:48 +0000 (09:37 +0100)]
ceph-volume inventory: make libstoragemgmt data retrieval optional
Default to not retrieving libstoragemgmt data since it seems this can
cause serious issues on older hardware. Safest way is to only retrieve
lsm data when the user opts in..
Fixes: https://tracker.ceph.com/issues/48270 Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit b29a54d21e314db7a9d681cf5cc089dcfcbf6dc0)
Casey Bodley [Mon, 23 Nov 2020 23:06:26 +0000 (18:06 -0500)]
rgw: temporarily disable calls to defer_gc() in RGWGetObj
cls_rgw_gc_queue_update_entry() is known to cause data loss when called
on objects that have not actually been scheduled for garbage collection
RGWGetObj is the only caller, and uses defer_gc() when reads are taking
a long time compared to rgw_gc_obj_min_wait. if an object has since been
deleted and submitted for garbage collection, this allows RGWGetObj to
defer that gc until the entire read completes
by disabling these calls to defer_gc(), very long reads (longer than 1hr,
with default configuration) may fail if the object gets deleted, and a
retry will result in a 404 Not Found error as expected
J. Eric Ivancich [Sat, 21 Nov 2020 16:10:35 +0000 (11:10 -0500)]
rgw: during GC defer, prevent new GC enqueue
With the new queue-based GC code, when a GC defer operation is
performed, it adds an "urgent" record to prevent GC from removing
objects that are still being read. It does not check whether the
objects are on the GC queue or not and that's OK for the urgent
record.
The code *also* adds a new GC entry to the queue to cause GC to occur
at a later time. This would be incorrect if there was no GC entry to
begin with, however. In such a case this would cause GC to delete tail
objects when no user-initiated remove has happend. In other words a
READ could cause a DELETE of tail objects and therefore data loss.
This fix prevents such a new GC entry from being enqueued, thus
preventing the data loss in this rare case. There is a new risk that
tail object orphans to be created, but as an immediate fix to prevent
data loss, this is appropriate and it is a rare event. A follow-on PR
that will handle these cases is likely.
This PR adds a level 0 log entry as a way to potentially confirm this
case is being triggered in real-world cases. In time, this log entry
should be deleted.
Jan Fajerski [Wed, 4 Mar 2020 10:39:40 +0000 (11:39 +0100)]
ceph-volume: available_lvm: vg space takes precedence
This changes available_lvm to check for generic reasons only if no VGs
were found. A VG can contain a (mounted) lv, which triggers the
ro/locked test, despite the VG having space available.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/iscsi-target-list/iscsi-target-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-namespace-list/rbd-namespace-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-snapshot-list/rbd-snapshot-actions.model.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/hosts/hosts.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/mgr-modules/mgr-module-list/mgr-module-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.ts
- `$localize` calls are not available in Angular 8. They are replaced with i18n.
- Optional chaining syntax is not supported in typescript 3.5.3. Statements with optional chaining are re-coded.
Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json
src/pybind/mgr/dashboard/frontend/package.json
- The master has different packages dependencies.
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.spec.ts
- Imports are refactored: https://github.com/ceph/ceph/pull/37918.
src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/ceph-shared.module.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.spec.ts
- We migrated from ngx-bootstrap to ng-bootstrap.
src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.ts
- I18n services is replaced with $localize function.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health-pie/health-pie.component.scss
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.ts
src/pybind/mgr/dashboard/frontend/src/styles/defaults/_bootstrap-defaults.scss
Discarded all changes except the relevant code part. The rest was sucessfully backported by b2360b1a6101b5cc61c236047ce7c757fd02c93d.