git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 years agoceph-volume inventory: make libstoragemgmt data retrieval optional 38299/head
Jan Fajerski [Wed, 18 Nov 2020 08:37:48 +0000 (09:37 +0100)]
ceph-volume inventory: make libstoragemgmt data retrieval optional

Default to not retrieving libstoragemgmt data since it seems this can
cause serious issues on older hardware. Safest way is to only retrieve
lsm data when the user opts in.

Fixes: https://tracker.ceph.com/issues/48270
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit b29a54d21e314db7a9d681cf5cc089dcfcbf6dc0)
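
The opt-in behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the ceph-volume code: the flag name `--with-lsm`, the helper `collect_lsm_data()` and the report layout are assumptions made for the example.

```python
import argparse


def collect_lsm_data(device):
    # Hypothetical stand-in for the libstoragemgmt query.
    return {"device": device, "health": "unknown"}


def build_parser():
    parser = argparse.ArgumentParser(prog="inventory-sketch")
    parser.add_argument("device")
    # Off by default: lsm data is only retrieved when the user opts in.
    parser.add_argument("--with-lsm", action="store_true", default=False)
    return parser


def inventory(argv):
    args = build_parser().parse_args(argv)
    report = {"path": args.device}
    if args.with_lsm:
        report["lsm_data"] = collect_lsm_data(args.device)
    return report
```

With no flag the report carries no `lsm_data` key at all, so older hardware is never queried unless the user asks for it.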

4 years agoMerge pull request #38249 from ivancich/wip-48331-octopus
Yuri Weinstein [Wed, 25 Nov 2020 16:12:23 +0000 (08:12 -0800)]
Merge pull request #38249 from ivancich/wip-48331-octopus

octopus: rgw: during GC defer, prevent new GC enqueue

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37604 from smithfarm/wip-47802-octopus
Yuri Weinstein [Tue, 24 Nov 2020 20:53:55 +0000 (12:53 -0800)]
Merge pull request #37604 from smithfarm/wip-47802-octopus

octopus: test/librados: fix endian bugs in checksum test cases

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #37863 from ideepika/add-stringio
Yuri Weinstein [Tue, 24 Nov 2020 18:03:28 +0000 (10:03 -0800)]
Merge pull request #37863 from ideepika/add-stringio

octopus: qa/tasks/{ceph,ceph_manager}: drop py2 support

Reviewed-by: Kyr Shatskyy <kyrylo.shatskyy@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
4 years agorgw: temporarily disable calls to defer_gc() in RGWGetObj 38249/head
Casey Bodley [Mon, 23 Nov 2020 23:06:26 +0000 (18:06 -0500)]
rgw: temporarily disable calls to defer_gc() in RGWGetObj

cls_rgw_gc_queue_update_entry() is known to cause data loss when called
on objects that have not actually been scheduled for garbage collection

RGWGetObj is the only caller, and uses defer_gc() when reads are taking
a long time compared to rgw_gc_obj_min_wait. if an object has since been
deleted and submitted for garbage collection, this allows RGWGetObj to
defer that gc until the entire read completes

by disabling these calls to defer_gc(), very long reads (longer than 1hr,
with default configuration) may fail if the object gets deleted, and a
retry will result in a 404 Not Found error as expected

Fixes: https://tracker.ceph.com/issues/47866
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 94df9cd37a1ca457130f90803281b166a5fa7eef)

4 years agorgw: during GC defer, prevent new GC enqueue
J. Eric Ivancich [Sat, 21 Nov 2020 16:10:35 +0000 (11:10 -0500)]
rgw: during GC defer, prevent new GC enqueue

With the new queue-based GC code, when a GC defer operation is
performed, it adds an "urgent" record to prevent GC from removing
objects that are still being read. It does not check whether the
objects are on the GC queue or not and that's OK for the urgent
record.

The code *also* adds a new GC entry to the queue to cause GC to occur
at a later time. This would be incorrect if there was no GC entry to
begin with, however. In such a case this would cause GC to delete tail
objects when no user-initiated remove has happened. In other words a
READ could cause a DELETE of tail objects and therefore data loss.

This fix prevents such a new GC entry from being enqueued, thus
preventing the data loss in this rare case. There is a new risk that
tail object orphans will be created, but as an immediate fix to prevent
data loss, this is appropriate and it is a rare event. A follow-on PR
that will handle these cases is likely.

This PR adds a level 0 log entry as a way to potentially confirm this
case is being triggered in real-world cases. In time, this log entry
should be deleted.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 2603485bcb4402260e0f7aadd2f2c8ab05b07399)
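
The data-loss scenario and the fix can be pictured with a toy model. This is only a sketch under simplifying assumptions: `ToyGcQueue`, its fields and its `collect()` policy are invented for illustration and do not mirror the RGW queue implementation.

```python
class ToyGcQueue:
    def __init__(self):
        self.queue = []   # (tag, run_at) entries that GC will process
        self.urgent = {}  # tag -> time until which GC must skip the tag

    def enqueue(self, tag, run_at):
        # Only a user-initiated remove should ever call this.
        self.queue.append((tag, run_at))

    def defer(self, tag, until):
        # After the fix: record the urgent marker, but do NOT enqueue a
        # new GC entry, since the tag may never have been scheduled.
        self.urgent[tag] = until

    def collect(self, now):
        # Toy policy: remove entries that are due and not protected.
        removed, remaining = [], []
        for tag, run_at in self.queue:
            if run_at <= now and self.urgent.get(tag, 0) <= now:
                removed.append(tag)
            else:
                remaining.append((tag, run_at))
        self.queue = remaining
        return removed
```

In this model a long read that defers GC on an object that was never deleted leaves the queue empty, so a later `collect()` has nothing to remove.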

4 years agoMerge branch 'octopus-saved' into octopus
Josh Durgin [Thu, 19 Nov 2020 03:30:13 +0000 (19:30 -0800)]
Merge branch 'octopus-saved' into octopus

4 years ago15.2.6 v15.2.6
Jenkins Build Slave User [Tue, 17 Nov 2020 18:12:53 +0000 (18:12 +0000)]
15.2.6

4 years agomon/MonClient: bring back CEPHX_V2 authorizer challenges
Ilya Dryomov [Fri, 16 Oct 2020 10:57:50 +0000 (12:57 +0200)]
mon/MonClient: bring back CEPHX_V2 authorizer challenges

Commit c58c5754dfd2 ("msg/async/ProtocolV1: use AuthServer and
AuthClient") introduced a backwards compatibility issue into msgr1.
To fix it, commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it") set out to skip authorizer
challenges for peers that don't support CEPHX_V2.  However, it
made it so that authorizer challenges are skipped for all peers in
both msgr1 and msgr2 cases, effectively disabling the protection
against replay attacks that was put in place in commit f80b848d3f83
("auth/cephx: add authorizer challenge", CVE-2018-1128).

This is because con->get_features() always returns 0 at that
point.  In msgr1 case, the peer shares its features along with the
authorizer, but while they are available in connect_msg.features they
aren't assigned to con until ProtocolV1::open().  In msgr2 case, the
peer doesn't share its features until much later (in CLIENT_IDENT
frame, i.e. after the authentication phase).  The result is that
!CEPHX_V2 branch is taken in all cases and replay attack protection
is lost.

Only clusters with cephx_service_require_version set to 2 on the
service daemons would not be silently downgraded.  But, since the
default is 1 and there are no reports of looping on BADAUTHORIZER
faults, I'm pretty sure that no one has ever done that.  Note that
cephx_require_version set to 2 would have no effect even though it
is supposed to be stronger than cephx_service_require_version
because MonClient::handle_auth_request() didn't check it.

To fix:

- for msgr1, check connect_msg.features (as was done before commit
  c58c5754dfd2) and challenge if CEPHX_V2 is supported.  Together
  with two preceding patches that resurrect proper cephx_* option
  handling in msgr1, this covers both "I want old clients to work"
  and "I wish to require better authentication" use cases.

- for msgr2, don't check anything and always challenge.  CEPHX_V2
  predates msgr2, anyone speaking msgr2 must support it.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 4a82c72e3bdddcb625933e83af8b50a444b961f1)
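
The decision described under "To fix" can be summarised in a few lines. This is a hedged sketch, not the MonClient code: the bit value of `CEPHX_V2` and both function names are placeholders chosen for the example.

```python
CEPHX_V2 = 1 << 0  # placeholder bit, not the real feature value


def buggy_should_challenge(con_features):
    # The regression: con->get_features() is still 0 at this point,
    # so this test never succeeds and the challenge is always skipped.
    return bool(con_features & CEPHX_V2)


def should_challenge(protocol, connect_msg_features=0):
    if protocol == "msgr2":
        # CEPHX_V2 predates msgr2, so always challenge.
        return True
    if protocol == "msgr1":
        # Use the features the peer sent with its authorizer.
        return bool(connect_msg_features & CEPHX_V2)
    raise ValueError("unknown protocol: %r" % (protocol,))
```

The point of the contrast is which feature set is consulted: the connection's (not yet populated) or the one carried in the connect message.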

4 years agomsg/async/ProtocolV1: resurrect "implement cephx_*require_version options"
Ilya Dryomov [Fri, 16 Oct 2020 09:35:27 +0000 (11:35 +0200)]
msg/async/ProtocolV1: resurrect "implement cephx_*require_version options"

This was added in commit 9bcbc2a3621f ("mon,msg: implement
cephx_*_require_version options") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, service daemons don't enforce cephx_require_version
and cephx_cluster_require_version options and connections without
CEPH_FEATURE_CEPHX_V2 are allowed through.

(cephx_service_require_version enforcement was brought back a
year later in commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it"), although the peer gets
TAG_BADAUTHORIZER instead of TAG_FEATURES.)

Resurrect the original behaviour: all cephx_*require_version
options are enforced and the peer gets TAG_FEATURES, signifying
that it is missing a required feature.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6f5c4152ca2c6423e665cde2196c6301f76043a2)
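
A rough sketch of the restored enforcement follows. The grouping of peer types into "cluster" peers versus everything else, the dictionary-based configuration and the function names are assumptions for illustration; only the option names and `TAG_FEATURES` come from the commit message.

```python
# Assumed grouping: daemons are checked against the cluster option,
# all other peers against the service option.
CLUSTER_PEERS = {"osd", "mds", "mgr"}


def required_version(peer_type, conf):
    if peer_type in CLUSTER_PEERS:
        scoped = conf["cephx_cluster_require_version"]
    else:
        scoped = conf["cephx_service_require_version"]
    # The global option applies to every peer.
    return max(conf["cephx_require_version"], scoped)


def connect_reply(peer_type, peer_supports_v2, conf):
    # Returns "TAG_FEATURES" when a required feature is missing,
    # or None when this check has no objection.
    if required_version(peer_type, conf) >= 2 and not peer_supports_v2:
        return "TAG_FEATURES"
    return None
```

Replying with `TAG_FEATURES` rather than `TAG_BADAUTHORIZER` tells the peer that it lacks a required feature instead of suggesting that its authorizer was wrong.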

4 years agomsg/async/ProtocolV1: resurrect "include MGR as service when applying cephx settings"
Ilya Dryomov [Fri, 16 Oct 2020 09:33:32 +0000 (11:33 +0200)]
msg/async/ProtocolV1: resurrect "include MGR as service when applying cephx settings"

This was added in commit 0ec7d6bbc4af ("msg/async,simple: include MGR
as service when applying cephx settings") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, mgr daemons are miscategorized as clients when enforcing
cephx_*require_signatures options.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 949e2e595eda553aa68f697cee1dcfff3c09cf3f)

4 years agoMerge pull request #38045 from dsavineau/wip-48184-octopus
Yuri Weinstein [Fri, 13 Nov 2020 20:01:34 +0000 (12:01 -0800)]
Merge pull request #38045 from dsavineau/wip-48184-octopus

octopus: ceph-volume: fix lvm batch auto with full SSDs

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
4 years agoceph-volume: add a unit tests to lvm batch 38045/head
Guillaume Abrioux [Wed, 4 Nov 2020 14:11:58 +0000 (15:11 +0100)]
ceph-volume: add a unit tests to lvm batch

This commit adds unit tests in order to cover `_sort_rotational_disks()`
call when deploying with full hdd/ssd or mixed hdd/ssd scenarios.

Fixes: https://tracker.ceph.com/issues/48150
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 13514a24cfdc32d67cfbc1201aa427168a926978)

4 years agoceph-volume: fix lvm batch auto with full SSDs
Dimitri Savineau [Tue, 3 Nov 2020 23:21:35 +0000 (18:21 -0500)]
ceph-volume: fix lvm batch auto with full SSDs

The ceph-volume lvm batch --auto introduced by [1] breaks backward
compatibility when using only non-rotational devices (SSD and/or NVMe).
Those devices are reassigned as bluestore db or filestore journal
devices, whereas we want them as data devices.

Fixes: https://tracker.ceph.com/issues/48106
[1] https://github.com/ceph/ceph/pull/34740

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2a854ca373fadef099a1d037930eb241e757b2c3)
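
The intended device split can be illustrated as below. This is a sketch of the behaviour the message describes, not the ceph-volume implementation: `split_devices()` and the dictionary layout of a device are hypothetical.

```python
def split_devices(devices):
    """Return (data_devices, fast_devices) for an auto batch."""
    rotational = [d for d in devices if d["rotational"]]
    solid_state = [d for d in devices if not d["rotational"]]
    if rotational and solid_state:
        # Mixed setup: HDDs hold data, SSD/NVMe hold db or journal.
        return rotational, solid_state
    # Homogeneous setup (all HDD or all SSD/NVMe): everything is data.
    return rotational or solid_state, []
```

The second branch is the case the fix restores: with only SSD or NVMe devices present, none of them is set aside as a db or journal device.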

4 years agoMerge pull request #37553 from Vicente-Cheng/wip-47747-octopus
Yuri Weinstein [Tue, 10 Nov 2020 19:32:18 +0000 (11:32 -0800)]
Merge pull request #37553 from Vicente-Cheng/wip-47747-octopus

octopus: mon: set session_timeout when adding to session_map

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
4 years agoMerge pull request #37885 from bk201/wip-47944-octopus
Lenz Grimmer [Mon, 9 Nov 2020 09:47:59 +0000 (10:47 +0100)]
Merge pull request #37885 from bk201/wip-47944-octopus

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>
4 years agoMerge pull request #37962 from votdev/custom_container_docs
Lenz Grimmer [Mon, 9 Nov 2020 08:15:08 +0000 (09:15 +0100)]
Merge pull request #37962 from votdev/custom_container_docs

octopus: doc/mgr/orchestrator: Add hints related to custom containers to the docs

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
4 years agoMerge pull request #37857 from smithfarm/wip-47940-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:06:38 +0000 (11:06 -0800)]
Merge pull request #37857 from smithfarm/wip-47940-octopus

octopus: mon/MDSMonitor: divide mds identifier and mds real name with dot

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37856 from smithfarm/wip-47936-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:06:00 +0000 (11:06 -0800)]
Merge pull request #37856 from smithfarm/wip-47936-octopus

octopus: mds: account for closing sessions in hit_session

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37855 from smithfarm/wip-47891-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:05:32 +0000 (11:05 -0800)]
Merge pull request #37855 from smithfarm/wip-47891-octopus

octopus: mgr/volumes/nfs: Fix wrong error message for pseudo path

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37854 from smithfarm/wip-46959-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:05:01 +0000 (11:05 -0800)]
Merge pull request #37854 from smithfarm/wip-46959-octopus

octopus: cephfs-journal-tool: fix incorrect read_offset when finding missing objects

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37841 from smithfarm/wip-47991-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:04:33 +0000 (11:04 -0800)]
Merge pull request #37841 from smithfarm/wip-47991-octopus

octopus: qa/cephfs: add session_timeout option support

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37839 from smithfarm/wip-47989-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:04:01 +0000 (11:04 -0800)]
Merge pull request #37839 from smithfarm/wip-47989-octopus

octopus: cephfs: client: fix inode ll_ref reference count leak

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
4 years agoMerge pull request #37837 from smithfarm/wip-47954-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:03:04 +0000 (11:03 -0800)]
Merge pull request #37837 from smithfarm/wip-47954-octopus

octopus: vstart.sh: fix fs set max_mds bug

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #37724 from rishabh-d-dave/wip-46610-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:02:22 +0000 (11:02 -0800)]
Merge pull request #37724 from rishabh-d-dave/wip-46610-octopus

octopus: pybind/cephfs: add special values for not reading conffile

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37671 from Vicente-Cheng/wip-47824-octopus
Yuri Weinstein [Fri, 6 Nov 2020 19:01:36 +0000 (11:01 -0800)]
Merge pull request #37671 from Vicente-Cheng/wip-47824-octopus

octopus: mgr/volumes: Make number of cloner threads configurable

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37620 from rhcs-dashboard/wip-47811-octopus
Laura Paduano [Fri, 6 Nov 2020 09:57:30 +0000 (10:57 +0100)]
Merge pull request #37620 from rhcs-dashboard/wip-47811-octopus

octopus: mgr/dashboard: get rgw daemon zonegroup name from mgr

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puertat <epuertat@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
4 years agoMerge pull request #37858 from smithfarm/wip-47958-octopus
Yuri Weinstein [Thu, 5 Nov 2020 16:31:28 +0000 (08:31 -0800)]
Merge pull request #37858 from smithfarm/wip-47958-octopus

octopus: mon/MDSMonitor do not ignore mds's down:dne request

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
4 years agoMerge pull request #37853 from smithfarm/wip-47826-octopus
Yuri Weinstein [Thu, 5 Nov 2020 16:27:09 +0000 (08:27 -0800)]
Merge pull request #37853 from smithfarm/wip-47826-octopus

octopus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning:  return 1

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
4 years agoMerge pull request #37835 from smithfarm/wip-47934-octopus
Yuri Weinstein [Thu, 5 Nov 2020 16:24:10 +0000 (08:24 -0800)]
Merge pull request #37835 from smithfarm/wip-47934-octopus

octopus: tools/rados: flush formatter periodically during json output of "rados ls"

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Adam Emerson <aemerson@redhat.com>
4 years agoMerge pull request #37819 from smithfarm/wip-47994-octopus
Yuri Weinstein [Thu, 5 Nov 2020 16:22:47 +0000 (08:22 -0800)]
Merge pull request #37819 from smithfarm/wip-47994-octopus

octopus: test/store_test: use 'threadsafe' style for death tests

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #37817 from smithfarm/wip-47987-octopus
Yuri Weinstein [Thu, 5 Nov 2020 16:21:55 +0000 (08:21 -0800)]
Merge pull request #37817 from smithfarm/wip-47987-octopus

octopus: mon/MonMap: fix unconditional failure for init_with_hosts

Reviewed-by: Wido den Hollander <wido@widodh.nl>
4 years agoMerge pull request #37784 from bk201/wip-47657-octopus
Lenz Grimmer [Thu, 5 Nov 2020 10:52:54 +0000 (11:52 +0100)]
Merge pull request #37784 from bk201/wip-47657-octopus

octopus: mgr/dashboard: display devices' health information within a tabset

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Patrick Seidensal <pnawracay@suse.com>
4 years agodoc/mgr/orchestrator: Add hints related to custom containers to the docs 37962/head
Volker Theile [Thu, 5 Nov 2020 10:16:35 +0000 (11:16 +0100)]
doc/mgr/orchestrator: Add hints related to custom containers to the docs

Fixes: https://tracker.ceph.com/issues/48113
Signed-off-by: Volker Theile <vtheile@suse.com>
(cherry picked from commit 1927809b0b58243dbe84756b9cec7c29bd0a7494)

4 years agoMerge pull request #37239 from rhcs-dashboard/read_only_backport
Lenz Grimmer [Tue, 3 Nov 2020 12:59:36 +0000 (13:59 +0100)]
Merge pull request #37239 from rhcs-dashboard/read_only_backport

octopus: mgr/dashboard: Disabling the form inputs for the read_only modals

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
4 years agoqa/tasks/ceph_manager: use StringIO for capturing COT output 37863/head
Kefu Chai [Sun, 8 Mar 2020 06:00:53 +0000 (14:00 +0800)]
qa/tasks/ceph_manager: use StringIO for capturing COT output

there are a couple of factors we should consider when choosing between
BytesIO and StringIO:

- if the producer is producing binary
- if we are expecting binary
- if the layers in between them are doing the decoding/encoding
  automatically.

in our case, the producer is either the ChannelFile instances returned
by paramiko.SSHClient or subprocess.CompletedProcess instances returned
by subprocess.run(). the former are file-like objects opened in "r" mode,
but their contents are decoded with utf-8 when reading if
ChannelFile.FLAG_BINARY is not specified. that's why we always try to
add this flag in orchestra/run.py when collecting the stdout and stderr
from paramiko.SSHClient after executing a command.

back in python2, this worked just fine, as we didn't differentiate bytes
from str back then.

but in python3, we have to make a decision. in the case of
ceph-objectstore-tool (COT for short), it does not produce binary and
we don't check its output with binary, so, if neither Remote.run() nor
LocalRemote.run() decodes/encodes for us, it's fine.

so it boils down to `copy_to_log()`:

i think we should respect the consumer's expectation, and only decode
the output if a StringIO is passed in as stdout or stderr.

as we always log the output with logging, we could either set
`ChannelFile.FLAG_BINARY` depending on the type of `capture` or not.
if it's not set, paramiko will return str (bytes) on python2, and str on
python3. if it's set, paramiko will return str (bytes) on python2,
and bytes on python3.

if there is non-ASCII in the output, logging will fail with a
`UnicodeDecodeError` exception. and paramiko throws the same exception
when trying to decode for us if `ChannelFile.FLAG_BINARY` is not
specified.

so to ensure that we always have logging messages no matter if the
producer follows the rule of "use StringIO if you only emit text" or
not, we have to use `ChannelFile.FLAG_BINARY`, and force paramiko
to send us the bytes. but we still have the luxury to use StringIO
and do the decode when the caller asks for str explicitly. that'd save
the pain of using `str.decode()` or `six.ensure_str()` everywhere
even if we can assure that the program does not write binary.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d8d44ed1566b19eec055e07da2a0fed88fed4152)
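
The rule "always read bytes, decode only when the caller passed a StringIO" can be sketched like this. It is a simplified illustration, not the teuthology `copy_to_log()`: the function name `copy_to_capture()` and the use of an in-memory list of chunks in place of a paramiko channel are assumptions.

```python
import io


def copy_to_capture(raw_chunks, capture):
    """Copy byte chunks from a producer into the caller's capture buffer."""
    for chunk in raw_chunks:
        if isinstance(capture, io.StringIO):
            # The caller asked for text, so decode on their behalf.
            capture.write(chunk.decode("utf-8"))
        else:
            # BytesIO (or any binary sink): pass the bytes through untouched.
            capture.write(chunk)
```

A caller that only expects text passes `io.StringIO()` and gets `str` back from `getvalue()`, while a caller that may receive binary passes `io.BytesIO()` and never triggers a decode.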

4 years agoqa/tasks/ceph_manager: capture stderr for COT
Kefu Chai [Sun, 8 Mar 2020 05:39:59 +0000 (13:39 +0800)]
qa/tasks/ceph_manager: capture stderr for COT

as we are expecting the error message written to stderr, and we need to
check for the error messages in it.

this change addresses the regression introduced by
204ceee156cbb8a20bdf56efb0cd0610ee4c107e

Fixes: https://tracker.ceph.com/issues/44500
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 78308f7207a7129aaba01ea8c27e2f563d688318)

4 years agoMerge pull request #37530 from batrick/i47734-octopus
Yuri Weinstein [Thu, 29 Oct 2020 20:08:37 +0000 (13:08 -0700)]
Merge pull request #37530 from batrick/i47734-octopus

octopus: osdc: add timeout configs for mons/osds

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37256 from batrick/i47249
Yuri Weinstein [Thu, 29 Oct 2020 20:05:53 +0000 (13:05 -0700)]
Merge pull request #37256 from batrick/i47249

octopus: mon: deleting a CephFS and its pools causes MONs to crash

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agoMerge pull request #37852 from smithfarm/wip-47889-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:50:05 +0000 (11:50 -0700)]
Merge pull request #37852 from smithfarm/wip-47889-octopus

octopus: rbd: librbd: ignore -ENOENT error when disabling object-map

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years agoMerge pull request #37851 from smithfarm/wip-47888-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:49:42 +0000 (11:49 -0700)]
Merge pull request #37851 from smithfarm/wip-47888-octopus

octopus: rbd: librbd: update AioCompletion return value before evaluating pending count

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years agoMerge pull request #37850 from smithfarm/wip-47886-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:49:13 +0000 (11:49 -0700)]
Merge pull request #37850 from smithfarm/wip-47886-octopus

octopus: rbd: journal: possible race condition between flush and append callback

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years agoMerge pull request #37849 from smithfarm/wip-48003-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:48:31 +0000 (11:48 -0700)]
Merge pull request #37849 from smithfarm/wip-48003-octopus

octopus: rgw: fix: S3 API KeyCount incorrect return.

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37847 from smithfarm/wip-47956-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:48:02 +0000 (11:48 -0700)]
Merge pull request #37847 from smithfarm/wip-47956-octopus

octopus: rgw/gc: fix for incrementing the perf counter 'gc_retire_object'

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37846 from smithfarm/wip-47955-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:47:19 +0000 (11:47 -0700)]
Merge pull request #37846 from smithfarm/wip-47955-octopus

octopus: rgw/gc: fixing the condition when marker for a queue is

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37845 from smithfarm/wip-47896-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:46:49 +0000 (11:46 -0700)]
Merge pull request #37845 from smithfarm/wip-47896-octopus

octopus: rgw: use yum rather than dnf for teuthology testing of rgw-orphan-list

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
4 years agoMerge pull request #37812 from smithfarm/wip-48007-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:46:11 +0000 (11:46 -0700)]
Merge pull request #37812 from smithfarm/wip-48007-octopus

octopus: rbd: rbd-nbd: don't ignore namespace when unmapping by image spec

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
4 years agoMerge pull request #37809 from smithfarm/wip-47962-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:45:23 +0000 (11:45 -0700)]
Merge pull request #37809 from smithfarm/wip-47962-octopus

octopus: rgw: Add request timeout to beast

Reviewed-by: Friedmann <ofriedma@redhat.com>
4 years agoMerge pull request #37807 from smithfarm/wip-47960-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:44:45 +0000 (11:44 -0700)]
Merge pull request #37807 from smithfarm/wip-47960-octopus

octopus: rgw: fix expiration header returned even if there is only one tag in the object the same as the rule

Reviewed-by: Friedmann <ofriedma@redhat.com>
4 years agoMerge pull request #37803 from smithfarm/wip-47819-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:43:55 +0000 (11:43 -0700)]
Merge pull request #37803 from smithfarm/wip-47819-octopus

octopus: rgw: radosgw-admin should paginate internally when listing bucket

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37800 from smithfarm/wip-47817-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:42:22 +0000 (11:42 -0700)]
Merge pull request #37800 from smithfarm/wip-47817-octopus

octopus: rgw: allow rgw-orphan-list to note when rados objects are in namespace

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
4 years agoMerge pull request #37779 from smithfarm/wip-47037-octopus
Yuri Weinstein [Thu, 29 Oct 2020 18:40:50 +0000 (11:40 -0700)]
Merge pull request #37779 from smithfarm/wip-47037-octopus

octopus: rgw: fix user stats iterative increment

Reviewed-by: Mark Kogan <mkogan@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agomgr/cephadm: do not configure Dashboard Ganesha settings 37885/head
Kiefer Chang [Wed, 2 Sep 2020 12:48:02 +0000 (20:48 +0800)]
mgr/cephadm: do not configure Dashboard Ganesha settings

The Dashboard can get cluster information from the Orchestrator.
For settings that are set by previous revisions, the Dashboard will
check them and ask the user to remove them.

Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit 99e421065748c03da1fc468b2a09bf22f7bc31b0)

Conflicts:
      src/pybind/mgr/cephadm/services/nfs.py

4 years agodoc/dashboard: add information for Orchestrator NFS-Ganesha clusters
Kiefer Chang [Thu, 3 Sep 2020 14:32:12 +0000 (22:32 +0800)]
doc/dashboard: add information for Orchestrator NFS-Ganesha clusters

Fixes: https://tracker.ceph.com/issues/46492
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit a5aaaa69cc82af2a6f4b7f0bd60ce2da5015c8c2)

4 years agomgr/dashboard: support Orchestrator and user-defined Ganesha clusters
Kiefer Chang [Wed, 2 Sep 2020 12:28:36 +0000 (20:28 +0800)]
mgr/dashboard: support Orchestrator and user-defined Ganesha clusters

This change makes the Dashboard support two types of Ganesha clusters:

- Orchestrator clusters (Since Octopus)
  - Deployed by the Orchestrator.
  - The Dashboard gets the pool/namespace that stores Ganesha
    configuration objects from the Orchestrator.
  - The Dashboard gets the daemons in a cluster from the Orchestrator.

- User-defined clusters (Since Nautilus)
  - Clusters defined by using the `ceph dashboard
    set-ganesha-clusters-rados-pool-namespace` command are treated as
    user-defined clusters.
  - Each daemon has its own RADOS configuration objects. The
    Dashboard uses these objects to deduce daemons.

Fixes: https://tracker.ceph.com/issues/46492
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit a9accaeccf88e1b0ee4688ef2ae9ddbd3bd3dc5e)

Conflicts:
      src/pybind/mgr/dashboard/openapi.yaml
          - We don't have the openapi-check feature in Octopus. The file
            is removed in the backport.
      src/pybind/mgr/dashboard/services/ganesha.py
      src/pybind/mgr/dashboard/tests/test_ganesha.py
          - The conflicts are mainly caused by code re-formatting in
            master.

4 years agomgr/dashboard: refator orchestrator service and daemon APIs
Kiefer Chang [Wed, 2 Sep 2020 12:25:52 +0000 (20:25 +0800)]
mgr/dashboard: refator orchestrator service and daemon APIs

- Allow listing services by service_type.
- Allow listing daemons by daemon_type.

Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit b88638873bd738af1ce258549abb6c25e0683907)

4 years agoMerge pull request #37551 from Vicente-Cheng/wip-47736-octopus
Yuri Weinstein [Wed, 28 Oct 2020 19:41:53 +0000 (12:41 -0700)]
Merge pull request #37551 from Vicente-Cheng/wip-47736-octopus

octopus: rgw: rgw_file: avoid long-ish delay on shutdown

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years agoMerge pull request #37688 from bk201/wip-47822-octopus
Lenz Grimmer [Wed, 28 Oct 2020 14:56:50 +0000 (15:56 +0100)]
Merge pull request #37688 from bk201/wip-47822-octopus

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
4 years agoMerge pull request #37597 from rhcs-dashboard/wip-47792-octopus
Lenz Grimmer [Wed, 28 Oct 2020 14:54:22 +0000 (15:54 +0100)]
Merge pull request #37597 from rhcs-dashboard/wip-47792-octopus

octopus: mgr/dashboard: Add short descriptions to the telemetry report preview

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Tiago Melo <tmelo@suse.com>
4 years agodoc/PendingReleaseNotes: clean up for 15.2.6 37817/head
Nathan Cutler [Tue, 27 Oct 2020 20:40:38 +0000 (21:40 +0100)]
doc/PendingReleaseNotes: clean up for 15.2.6

This commit drops release notes that have already been published and
organizes the remaining release notes under a heading so it is clear
they are targeting the 15.2.6 release.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
4 years agomon/MonMap: fix unconditional failure for init_with_hosts
Patrick Donnelly [Thu, 22 Oct 2020 17:08:26 +0000 (10:08 -0700)]
mon/MonMap: fix unconditional failure for init_with_hosts

This bug prevents setting mon_host to a DNS name.

Fixes: https://tracker.ceph.com/issues/47951
Fixes: 7a1f02acfe6b5d8a760efd16bb594a0656b39eac
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 21d9f8333b8c76979bbe90d60a8ad6ebb5e36c76)

4 years agotest/mon: add tests for mon_host build by hostname
Patrick Donnelly [Fri, 23 Oct 2020 23:33:23 +0000 (16:33 -0700)]
test/mon: add tests for mon_host build by hostname

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 4022c1f1fb4c37e63bf884e36a2b533735c56f94)

Conflicts:
    src/test/mon/MonMap.cc
- do not attempt to introduce boost::intrusive_ptr into Nautilus
- monmap.build_initial takes bare cct in nautilus (master: cct.get())

4 years agotest/mon: fix compiler errors in MonMap unittest
Patrick Donnelly [Fri, 23 Oct 2020 23:28:08 +0000 (16:28 -0700)]
test/mon: fix compiler errors in MonMap unittest

The code atrophied. Clean this up.

The tests are disabled because they SIGSEGV during SetUp.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 06f44cbf12c20ce8f1862111340f3b3f132577d0)

Conflicts:
    src/test/mon/MonMap.cc
- do not attempt to introduce boost::intrusive_ptr into nautilus
- monmap.build_initial takes bare cct in nautilus (master: cct.get())

4 years agoqa/tasks/{ceph,ceph_manager}: drop py2 support
Kefu Chai [Sun, 28 Jun 2020 11:43:09 +0000 (19:43 +0800)]
qa/tasks/{ceph,ceph_manager}: drop py2 support

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a7f18e46b911d9047ebb3ac0f9de4d4c2e59c704)

4 years agoMerge pull request #37691 from smithfarm/wip-47877-octopus
Nathan Cutler [Tue, 27 Oct 2020 16:58:32 +0000 (17:58 +0100)]
Merge pull request #37691 from smithfarm/wip-47877-octopus

octopus: doc: cephfs: improve documentation of "ceph nfs cluster create" and "ceph fs volume create" commands

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years agostrtol: Add parse/consume for string_view friendly interface 37809/head
Adam C. Emerson [Fri, 6 Mar 2020 04:13:47 +0000 (23:13 -0500)]
strtol: Add parse/consume for string_view friendly interface

Also these don't have the stringstream overhead.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit a29695e82ec8a93b000322773949f30694abf3d3)

4 years agomon/MDSMonitor: do not ignore mds's down:dne request 37858/head
chencan [Sat, 17 Oct 2020 07:26:51 +0000 (15:26 +0800)]
mon/MDSMonitor: do not ignore mds's down:dne request

Fixes: https://tracker.ceph.com/issues/47881
Signed-off-by: chencan <chen.can2@zte.com.cn>
(cherry picked from commit 768d7fc4e8b74c88ea2a623ee4d21ac1f20d8c7a)

4 years agomon/MDSMonitor: divide mds identifier and mds real name with dot 37857/head
Zhi Zhang [Fri, 9 Oct 2020 03:02:44 +0000 (11:02 +0800)]
mon/MDSMonitor: divide mds identifier and mds real name with dot

Fixes: https://tracker.ceph.com/issues/47806
Signed-off-by: Zhi Zhang <zhangz.david@outlook.com>
(cherry picked from commit 4400d70c15e8fb4f013ca22db9fd5fe60c99dc32)

4 years agomds: account for closing sessions in hit_session 37856/head
Dan van der Ster [Tue, 13 Oct 2020 07:08:12 +0000 (09:08 +0200)]
mds: account for closing sessions in hit_session

While stopping an mds we can reply to a request while all client
sessions are closing. We shouldn't assert in this case.

Fixes: https://tracker.ceph.com/issues/47833
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 6823d8fb619c07b4e749ae564df565eadc59c187)

4 years agomgr/volumes/nfs: Fix wrong error message for pseudo path 37855/head
Varsha Rao [Wed, 7 Oct 2020 17:01:01 +0000 (22:31 +0530)]
mgr/volumes/nfs: Fix wrong error message for pseudo path

Fixes: https://tracker.ceph.com/issues/47783
Signed-off-by: Varsha Rao <varao@redhat.com>
(cherry picked from commit 1552f7239c0e2dc4f661cd80f17369422c919c50)

4 years agocephfs-journal-tool: fix wrong read_offset when get missing objects 37854/head
jhonxue [Thu, 30 Jul 2020 06:40:16 +0000 (14:40 +0800)]
cephfs-journal-tool: fix wrong read_offset when get missing objects

Fixes: https://tracker.ceph.com/issues/45575
Signed-off-by: Xue Yantao <jhonxue@tencent.com>
(cherry picked from commit bfa63666bb40c7939aa4da3c2c8f43a7022a78e8)

4 years agotest: Avoid races by waiting for PGs go clean before query 37853/head
David Zafman [Tue, 29 Sep 2020 18:03:10 +0000 (18:03 +0000)]
test: Avoid races by waiting for PGs go clean before query

Fixes: https://tracker.ceph.com/issues/46405
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 3ba7ebd3e299587b3828a7f794f070d3d01da4c4)

4 years agotest: Inconsequential change to get object names as desired
David Zafman [Tue, 29 Sep 2020 18:01:24 +0000 (18:01 +0000)]
test: Inconsequential change to get object names as desired

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit b20a277f0546b951df8c29650d1d699afd43e498)

4 years agolibrbd: ignore -ENOENT error when disabling object-map 37852/head
Jason Dillaman [Mon, 12 Oct 2020 19:28:52 +0000 (15:28 -0400)]
librbd: ignore -ENOENT error when disabling object-map

Fixes: https://tracker.ceph.com/issues/47840
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 8e88224d8d1e7602392a81ed4da8139a79600d8f)

4 years agolibrbd: update AioCompletion return value before evaluating pending count 37851/head
Jason Dillaman [Tue, 13 Oct 2020 01:34:25 +0000 (21:34 -0400)]
librbd: update AioCompletion return value before evaluating pending count

If the pending count is decremented before the return value is updated,
there is a possibility of two ASIO threads concurrently decrementing the
pending count down from 2 -> 1 -> 0. In the second thread (the one that
performs the final decrement from 1 -> 0), it can finalize the completion
before the first thread has had a chance to update the return value.

Fixes: https://tracker.ceph.com/issues/47847
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 94f3bce53c39017028ce44a80697f55af2a82e68)
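
The ordering described above can be illustrated with a minimal sketch. This is not the librbd code: the `Completion` class, its field names, and the "keep the first error" policy are assumptions made for illustration. The point it shows is only that each sub-request publishes its result before it gives up its share of the pending count, so whichever caller performs the final decrement sees every earlier result.

```python
import threading


class Completion:
    """Toy completion shared by several sub-requests (hypothetical names)."""

    def __init__(self, pending):
        self._rval_lock = threading.Lock()
        self._count_lock = threading.Lock()
        self._pending = pending      # sub-requests still outstanding
        self.rval = 0                # first error seen, or 0 (assumed policy)
        self.finalized_rval = None   # value handed to the caller on completion

    def complete_request(self, r):
        # Step 1: publish this sub-request's result.
        if r < 0:
            with self._rval_lock:
                if self.rval == 0:
                    self.rval = r
        # Step 2: only now decrement the pending count.
        with self._count_lock:
            self._pending -= 1
            last = self._pending == 0
        # Step 3: the caller that reached zero finalizes the completion.
        if last:
            self.finalized_rval = self.rval
```

If steps 1 and 2 were swapped, a second caller could reach zero and finalize while the first caller was still between its decrement and its result update.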

4 years agojournal: possible race condition between flush and append callback 37850/head
Jason Dillaman [Fri, 16 Oct 2020 15:25:39 +0000 (11:25 -0400)]
journal: possible race condition between flush and append callback

When notifying the journal recorder of an overflow or if the object
close request has completed due to no more in-flight IO, it was
possible for a race between a flush request and the processing of
an append completion to attempt to kick off duplicate notifications.
Since the overflowed and closed callbacks are properly protected from
duplicates, use a counter instead of a boolean to track possible
in-flight handler callbacks.

Fixes: https://tracker.ceph.com/issues/47880
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 458ab997fe77ea78803a34c6c9715225aa3413ba)
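
A rough sketch of why a counter is used instead of a boolean, under the assumption that two handler callbacks can overlap. The `InFlightTracker` class and its method names are hypothetical and do not appear in the journal code; real code would also hold a lock around these updates.

```python
class InFlightTracker:
    """Counts handler callbacks that have started but not yet finished."""

    def __init__(self):
        self._in_flight = 0

    def start(self):
        self._in_flight += 1

    def finish(self):
        assert self._in_flight > 0, "finish() without matching start()"
        self._in_flight -= 1

    @property
    def busy(self):
        # A boolean flag would be cleared by the first finish(), even though
        # a second callback is still running; the counter stays non-zero.
        return self._in_flight > 0
```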

4 years agorgw: fix: S3 API KeyCount incorrect return. 37849/head
胡玮文 [Thu, 24 Sep 2020 15:34:43 +0000 (23:34 +0800)]
rgw: fix: S3 API KeyCount incorrect return.

KeyCount should return object count + common prefix count.
see S3 example: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html#API_ListObjectsV2_Example_5

Related: https://github.com/docker/distribution/issues/3200

Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit f96a6fdad16da2a7093f538ee577248dfbc65ca1)
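
The rule stated above (KeyCount is the object count plus the common prefix count) can be sketched as follows. This is an illustrative helper, not the RGW implementation; the function name and the flat list of keys are assumptions, and truncation by max-keys is ignored.

```python
def list_objects_v2(keys, prefix="", delimiter="/"):
    """Return (contents, common_prefixes, key_count) for a flat list of keys."""
    contents = []
    common_prefixes = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        idx = rest.find(delimiter) if delimiter else -1
        if idx >= 0:
            # Everything up to and including the delimiter rolls up into
            # a single common prefix.
            common_prefixes.add(prefix + rest[:idx + len(delimiter)])
        else:
            contents.append(key)
    key_count = len(contents) + len(common_prefixes)
    return contents, sorted(common_prefixes), key_count
```

For a bucket holding `a.txt`, `photos/1.jpg`, `photos/2.jpg` and `docs/x`, listing with delimiter `/` gives one object and two common prefixes, so the key count is 3.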

4 years agorgw/gc: fix for incrementing the perf counter 'gc_retire_object' 37847/head
Pritha Srivastava [Fri, 25 Sep 2020 07:51:18 +0000 (13:21 +0530)]
rgw/gc: fix for incrementing the perf counter 'gc_retire_object'
in the new gc queue code for omap offload, when gc objects from queue
are deleted. This was missed out initially.

Fixes: https://tracker.ceph.com/issues/47908
Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit bde4c5bf9123bfa03189accd064b813a0d3179b9)

4 years agorgw/gc: fixing the condition when marker for a queue is 37846/head
Pritha Srivastava [Wed, 14 Oct 2020 11:05:50 +0000 (16:35 +0530)]
rgw/gc: fixing the condition when marker for a queue is
always reset to empty which causes RGWGC::list to get stuck in
a loop, which ultimately is broken out of when the queue's truncated
flag is false.

1. Check for entries size also while evaluating whether objects cache for
a gc object should be marked as 'transitioned' in case of cls_rgw_gc_list.
When there are no entries, we get back a return value of 0, and the
object cache is not marked as 'transitioned'.

2. Also for the last gc object, we need to check whether the queue is still
under process and set the correct flag.

Missing the two conditions above causes the GC::list to loop continuously
over the same gc object.

Fixes: https://tracker.ceph.com/issues/47909
Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit bf3f3ba675d092f48e403826fc0813e23c07045d)

4 years agorgw: use yum rather than dnf for testing rgw-orphan-list 37845/head
J. Eric Ivancich [Thu, 15 Oct 2020 18:14:04 +0000 (14:14 -0400)]
rgw: use yum rather than dnf for testing rgw-orphan-list

The teuthology testing for rgw-orphan-list needs to install
`s3cmd`. Switch from using dnf to yum to work on a wider variety of
platforms.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 11a08a5bb867e05d033f126d9de7c370770ee63f)

4 years agoqa/cephfs: add session_timeout option support 37841/head
Xiubo Li [Tue, 20 Oct 2020 05:26:33 +0000 (01:26 -0400)]
qa/cephfs: add session_timeout option support

When the MDS revokes the Fwbl caps, the clients need to flush
the dirty data back to the OSDs, but the flush may overload and
slow down the OSDs, so it can take more than 60 seconds to
finish. The MDS daemons will then report WRN messages.

For the teuthology test cases, let's just increase the timeout
value to make it work.

Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 0422673b6150df851a4ea1662637a77585cde52d)

4 years agoqa/cephfs: move the cephfs's opertions setting to create()
Xiubo Li [Mon, 12 Oct 2020 02:13:43 +0000 (10:13 +0800)]
qa/cephfs: move the cephfs's opertions setting to create()

Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit cb8081ce7f4e0897cb2047d409ac2865afb3227c)

4 years agoqa/cephfs: add 'cephfs:' section support
Xiubo Li [Tue, 20 Oct 2020 05:06:25 +0000 (01:06 -0400)]
qa/cephfs: add 'cephfs:' section support

Fixes: https://tracker.ceph.com/issues/47565
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 3b5303482ff77667d48b85f4a3bc54e3ff6a60a6)

4 years agoclient: fix inode ll_ref reference count leak 37839/head
sepia-liu [Tue, 20 Oct 2020 09:24:57 +0000 (09:24 +0000)]
client: fix inode ll_ref reference count leak

Fixes: https://tracker.ceph.com/issues/47918
Signed-off-by: sepia-liu <liuwei_coder@163.com>
(cherry picked from commit 019ba52c8f3ba8263b67b4d1a3bfd6d20e98eeda)

4 years agovstart.sh: fix fs set max_mds bug 37837/head
jinmyeonglee [Thu, 22 Oct 2020 08:25:51 +0000 (17:25 +0900)]
vstart.sh: fix fs set max_mds bug

Fix a bug where the name used when creating a volume and the name used when setting max_mds were different.
Fixes: https://tracker.ceph.com/issues/47946
Signed-off-by: Jinmyeong Lee <jinmyeong.lee@linecorp.com>
(cherry picked from commit 6a9445c2cbe6c0c7045bfaed007cc1920ad132ed)

4 years agotools/rados: flush formatter periodically during json output of `rados ls` 37835/head
J. Eric Ivancich [Wed, 21 Oct 2020 14:32:30 +0000 (10:32 -0400)]
tools/rados: flush formatter periodically during json output of `rados ls`

While `rados ls` is emitting object info through a json formatter,
flush the formatter once at least 4096 bytes are buffered
for output.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 1548ef7a97559f17023f17842dab51d47cef89df)
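
A minimal sketch of threshold-based flushing, assuming a simple in-memory buffer. The `dump_objects` function and its bookkeeping are hypothetical and only mirror the idea in the message above; the 4096-byte figure is the one the message gives.

```python
import io
import json

FLUSH_THRESHOLD = 4096  # bytes, the figure given in the commit message


def dump_objects(names, out, threshold=FLUSH_THRESHOLD):
    """Write names as a JSON array to `out`; return how many flushes happened."""
    pending = []          # pieces buffered since the last flush
    pending_bytes = 0
    flushes = 0

    def emit(piece):
        nonlocal pending_bytes
        pending.append(piece)
        pending_bytes += len(piece.encode("utf-8"))

    def flush():
        nonlocal pending_bytes, flushes
        out.write("".join(pending))
        pending.clear()
        pending_bytes = 0
        flushes += 1

    emit("[")
    for i, name in enumerate(names):
        emit(("," if i else "") + json.dumps({"name": name}))
        # Flush as soon as the buffer reaches the threshold, so memory use
        # stays bounded no matter how many objects are listed.
        if pending_bytes >= threshold:
            flush()
    emit("]")
    flush()
    return flushes
```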

4 years agotest/store_test: use 'threadsafe' style for death tests. 37819/head
Igor Fedotov [Wed, 21 Oct 2020 13:24:30 +0000 (16:24 +0300)]
test/store_test: use 'threadsafe' style for death tests.

Hopefully Fixes: https://tracker.ceph.com/issues/47328
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 99ac34cbfeb98c36ffcc3e1b5b65174930273c4c)

4 years agotest/mon: build MonMap unittest
Patrick Donnelly [Fri, 23 Oct 2020 23:27:39 +0000 (16:27 -0700)]
test/mon: build MonMap unittest

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 8408b63e908e3f7145b2df5520f28ac12d615967)

4 years agorbd-nbd: don't ignore namespace when unmapping by image spec 37812/head
Mykola Golub [Sun, 27 Sep 2020 16:59:49 +0000 (17:59 +0100)]
rbd-nbd: don't ignore namespace when unmapping by image spec

Fixes: https://tracker.ceph.com/issues/47665
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit b360186eb6463811ce19f21e8d24ab7c44dc7279)

4 years agoqa/workunits/rbd: yet another attempt to improve rbd-nbd unmap
Mykola Golub [Thu, 10 Sep 2020 13:08:42 +0000 (14:08 +0100)]
qa/workunits/rbd: yet another attempt to improve rbd-nbd unmap

Previously it still could race when unmap_device returned success
because the device was not found in `rbd-nbd list-mapped` (the nbd
device was removed) but the test failed because the process was still
found in the ps table.

Fixes: https://tracker.ceph.com/issues/47394
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit f0c69761c8036a57319ead5cdf97cebb0f0fb5d7)

Conflicts:
qa/workunits/rbd/rbd-nbd.sh
- omit changes in tests that are not in octopus

4 years agorgw: Add request timeout beast
Or Friedmann [Wed, 27 May 2020 15:57:44 +0000 (18:57 +0300)]
rgw: Add request timeout beast

Add request timeout beast

The beast frontend will use the same parameter as the civetweb one, request_timeout_ms, and will be configured to 65 seconds by default.

Fixes: https://tracker.ceph.com/issues/45431
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
(cherry picked from commit 5d5f9a0d41721f08b19f8425149fd13f4ef13696)

4 years agorgw: radosgw-admin should paginate internally when listing bucket 37803/head
J. Eric Ivancich [Thu, 1 Oct 2020 17:33:01 +0000 (13:33 -0400)]
rgw: radosgw-admin should paginate internally when listing bucket

Currently `radosgw-admin bucket list ...`, when listing a bucket, asks
for the value of "--max-entries" internally. To list a large bucket
entirely the user would have to set "--max-entries" to a large value
(e.g., 10000000). Internally this doesn't paginate, so it will try to
produce the entire list at once. This can consume a lot of memory, and
there are known cases where this induces an out-of-memory crash.

So now we'll set a maximum pagination size of 10,000. So even with
large values of "--max-entries" it will still be able to produce the
full listing without stressing memory, because it will ask for at most
10,000 entries at a time.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 6d033061bf9eaebf3dab37b9ed45de22ce6fa6b7)

Conflicts:
src/rgw/rgw_admin.cc
- formatter does not have get() method in octopus
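
The internal pagination described above can be sketched as a loop that never asks for more than 10,000 entries per request. This is an illustration only: `list_bucket` and the `fetch_page(marker, count)` callback, with its `(entries, next_marker, truncated)` return shape, are assumptions and not the radosgw-admin code.

```python
MAX_PAGE = 10_000  # per-request cap from the commit message


def list_bucket(fetch_page, max_entries, page_cap=MAX_PAGE):
    """Yield up to max_entries entries, requesting at most page_cap at a time.

    fetch_page(marker, count) -> (entries, next_marker, truncated) is a
    hypothetical backend callback.
    """
    marker = ""
    remaining = max_entries
    while remaining > 0:
        entries, marker, truncated = fetch_page(marker, min(remaining, page_cap))
        yield from entries
        remaining -= len(entries)
        # Stop when the backend reports no more data (or returns nothing).
        if not truncated or not entries:
            break
```

Because the function is a generator, a caller that passes a very large `max_entries` still holds at most one page in memory at a time.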

4 years agorgw: fix expiration header returned even if there is only one tag in the object the... 37807/head
Or Friedmann [Thu, 23 Jul 2020 15:36:07 +0000 (18:36 +0300)]
rgw: fix expiration header returned even if there is only one tag in the object the same as the rule

Expiration header returned even if there is only one tag in the object the same as the rule

Signed-off-by: Or Friedmann <ofriedma@redhat.com>
Reported-by: Avi Mor <avmor@redhat.com>
Fixes: https://tracker.ceph.com/issues/46614
(cherry picked from commit bf7c7e59f390afb53cb1e30a440ab26bb093c11c)

4 years agorgw: rgw-orphan-list should use "plain" formatted `rados ls` output 37800/head
J. Eric Ivancich [Fri, 9 Oct 2020 20:06:55 +0000 (16:06 -0400)]
rgw: rgw-orphan-list should use "plain" formatted `rados ls` output

The previous version that used "json-pretty" output for `rados ls`
added complications due to json's escaping of special characters. So
this version returns to the "plain" output for `rados ls` but deals
with entries (oids) that might have namespaces and/or locators as
well.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 5b994f90594208dab81045732099a03686819b30)

4 years agorgw: allow rgw-orphan-list to note when rados objects are in namespace
J. Eric Ivancich [Tue, 6 Oct 2020 19:21:02 +0000 (15:21 -0400)]
rgw: allow rgw-orphan-list to note when rados objects are in namespace

Currently namespaces and locators are ignored when `rados ls` is run
by rgw-orphan-list to record RADOS's known objects.

However there have been cases where RADOS objects have a locator, and
when one is included in the listing, the script does not handle it
correctly. Now when objects have locators, we will prevent their
output from entering the .intermediate file.

Additionally we do not expect RGW data objects to be in RADOS
namespaces, so when a namespaced object is detected, we'll error out
with a message.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit ddf52016fa03ba192f242ad641a5c8e5a95035a1)
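
The two rules above (skip objects that carry a locator, error out on a namespaced object) can be sketched as a small filter. The `(namespace, oid, locator)` tuple shape and the function name are assumptions for illustration; the sketch does not parse real `rados ls` output and is not the rgw-orphan-list script.

```python
def filter_rados_entries(entries):
    """Return the oids to record, given (namespace, oid, locator) tuples."""
    kept = []
    for namespace, oid, locator in entries:
        if namespace:
            # RGW data objects are not expected to live in a RADOS namespace.
            raise ValueError(f"unexpected namespaced object: {namespace!r}/{oid!r}")
        if locator:
            # Objects with a locator are kept out of the intermediate list.
            continue
        kept.append(oid)
    return kept
```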

4 years agomgr/dashboard: display devices' health information within a tabset 37784/head
Kiefer Chang [Thu, 17 Sep 2020 12:58:47 +0000 (20:58 +0800)]
mgr/dashboard: display devices' health information within a tabset

Wrap all devices' health information within a tabset
instead of displaying them from top to bottom.

Add more guards in the HTML template to prevent referencing undefined
variables.

Fixes: https://tracker.ceph.com/issues/47494
Fixes: https://tracker.ceph.com/issues/43177
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit ba3350c7e8c755d5c84c1a027a3a173191cb898d)

Conflicts:
      src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.html
      src/pybind/mgr/dashboard/frontend/src/app/ceph/shared/smart-list/smart-list.component.ts
      - Use ngx-bootstrap tabset for tabs.

4 years agorgw: fix user stats iterative increment 37779/head
Mark Kogan [Mon, 10 Aug 2020 10:19:19 +0000 (13:19 +0300)]
rgw: fix user stats iterative increment

The RGWBucketCtl::sync_user_stats() function can increment or reset the
stats [1][2]

[1]https://github.com/ceph/ceph/blob/master/src/rgw/rgw_bucket.cc#L3745
[2]https://github.com/ceph/ceph/blob/master/src/rgw/services/svc_bi_rados.cc#L379-L381

Fixes: https://tracker.ceph.com/issues/46400

Signed-off-by: Mark Kogan <mkogan@redhat.com>
(cherry picked from commit 21e877ca67db7840026b1768751b167e2c0a53da)

Conflicts:
src/rgw/rgw_sal.cc
- master's owner->get_id() becomes user.info.user_id in octopus

4 years agoMerge pull request #37722 from rishabh-d-dave/wip-47845-octopus
Yuri Weinstein [Fri, 23 Oct 2020 18:51:21 +0000 (11:51 -0700)]
Merge pull request #37722 from rishabh-d-dave/wip-47845-octopus

octopus: ceph-volume: add no-systemd argument to zap

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
4 years agoMerge pull request #37705 from smithfarm/wip-47898-octopus
Yuri Weinstein [Fri, 23 Oct 2020 14:03:21 +0000 (07:03 -0700)]
Merge pull request #37705 from smithfarm/wip-47898-octopus

octopus: mon: have 'mon stat' output json as well

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
4 years agoMerge pull request #37697 from s0nea/wip-47602-octopus
Yuri Weinstein [Fri, 23 Oct 2020 14:02:49 +0000 (07:02 -0700)]
Merge pull request #37697 from s0nea/wip-47602-octopus

octopus: Enable per-RBD image monitoring

Reviewed-by: Patrick Seidensal <pnawracay@suse.com>
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>