git-server-git.apps.pok.os.sepia.ceph.com Git

rgw: Disable prefetch of entire head object when GET request with range header

Disable prefetching of the entire head object when a GET request carries a Range header.
Currently RGW fetches the whole head object even when the client asks for only a small byte range.
For example, if the client asks for bytes=0-1, RGW still fetches bytes 0-4194304.
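
A minimal sketch of the behavior described above, not the RGW code itself.
HEAD_SIZE and bytes_to_fetch() are hypothetical names; the 4194304 figure
is taken from the example in this message.

```python
HEAD_SIZE = 4194304  # upper offset of the head object, from the example above


def bytes_to_fetch(range_start, range_end, prefetch_head):
    """Return the (first, last) offsets read for a ranged GET.

    range_start and range_end are the offsets from the client's Range
    header. prefetch_head=True models the old behavior, False the new one.
    """
    if prefetch_head:
        # Old behavior: read the whole head object regardless of the range.
        return (0, HEAD_SIZE)
    # New behavior: read only the span the client asked for.
    return (range_start, min(range_end, HEAD_SIZE))
```

With bytes=0-1, the old path reads (0, 4194304) while the new path reads
only (0, 1).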

Fixes: https://tracker.ceph.com/issues/44508
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
(cherry picked from commit 2be5af0006169cb54547034aa98b7eacb8751d59)

Merge pull request #38354 from ifed01/wip-ifed-fix-statfs-out-nau

nautilus: mgr: don't update osd stat which is already out

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #38334 from b-ranto/wip-prom-fixes-nautilus

nautilus: mgr/prometheus: Make module more stable

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #38085 from orztt/wip-rgw-versioning-nautilus

nautilus: rgw: cls/rgw/cls_rgw.cc: fix multiple latest version problem

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #37895 from smithfarm/wip-48040-nautilus

nautilus: rbd: librbd: ensure that thread pool lock is held when processing throttled IOs

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

mgr/prometheus: don't store exception as e

Python's logging module's exception() method will log the full exception
and stack trace for us, so we do not need to store the exception in the
"e" variable here.

Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
(cherry picked from commit a17c603effd3367dc64c87a1d6c53d6d3d794fc7)
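
The point about logging's exception() method can be illustrated with a
small standalone sketch. The logger name and the collect() function are
hypothetical stand-ins and are not taken from the prometheus module.

```python
import io
import logging


def collect():
    # Stand-in for a call that fails during metrics collection.
    raise RuntimeError("collection failed")


stream = io.StringIO()
logger = logging.getLogger("prometheus_example")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

try:
    collect()
except Exception:  # no "as e" needed
    # exception() logs at ERROR level and appends the active traceback.
    logger.exception("failed to collect metrics")

output = stream.getvalue()
```

The captured output contains the message, the traceback, and the
exception text, even though the exception was never bound to a name.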

Merge pull request #38416 from kamoltat/wip-fix-bug-48434

nautilus: mgr/progress: delete all events over the wire

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #38411 from dzafman/wip-48444

nautilus: osd: Check for noscrub/nodeep-scrub in between chunks, to avoid races

Reviewed-by: Neha Ojha <nojha@redhat.com>

librbd: ensure that thread pool lock is held when processing throttled IOs

There previously was a potential race for throttled IOs to complete prior
to the main worker thread finishing the processing of the blocked IO.

Fixes: https://tracker.ceph.com/issues/47371
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 2d86e0935aa6f0c392df428676d9ab0a338fccae)

Conflicts:
    src/test/librbd/io/test_mock_ImageRequestWQ.cc
- in Octopus, commit 792d6c53fedc695199cc18916347c1b545fe42c2 did a global
  replace of Mutex to ceph::mutex, so to fix this for Nautilus, we just need to
  do that in test_mock_ImageRequestWQ.cc since the get_pool_lock() method is
  returning a Mutex instead of a ceph::mutex

Merge pull request #37959 from callithea/wip-47995-nautilus

nautilus: monitoring: Use null yaxes min for OSD read latency

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>

mgr/progress: 'progress clear' command should clear events in 'ceph -s'

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6f60d33115d2f583331d31b95a0a33b96a614f09)

osd: Check for noscrub/nodeep-scrub in between chunks, to avoid races

Fixes: https://tracker.ceph.com/issues/47767
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 9b7f911d9a139cc347f2d3ac3068fc1d212058c7)

Conflicts:
src/osd/PG.cc (manual merge due to code rearrangement)

Merge pull request #38362 from badone/wip-nautilus-mon-scrub-testing

nautilus: mon scrub testing

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge PR #38372 into nautilus

* refs/pull/38372/head:
ceph-volume: implement the --log-level flag

Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge PR #38371 into nautilus

* refs/pull/38371/head:
lvm/create.py: fix a typo in the help message

Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>

Merge pull request #38382 from badone/wip-nautilus-run-tox-mgr-insights-six-missing

nautilus: mgr/insights: Test environment requires 'six'

Reviewed-by: Kefu Chai <kchai@redhat.com>

mgr/insights: Test environment requires 'six'

Not a backport because python2 support was dropped in master and only
nautilus seems to be affected at this time.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

cls/rgw/cls_rgw.cc: fix multiple latest version problem

Fixes: https://tracker.ceph.com/issues/47919
Signed-off-by: Ruan Zitao <ruanzitao@kuaishou.com>
Signed-off-by: Yang Honggang <yanghonggang@kuaishou.com>
(cherry picked from commit f60f9ace1a4bceeda256373cf4603058e1947fa8)

Conflicts:
src/cls/rgw/cls_rgw.cc
- nautilus does not have "rgw_bucket_dir_entry::FLAG_VER"; use "RGW_BUCKET_DIRENT_FLAG_VER" instead

qa/suites/rados/monthrash: Exercise mon scrub error injectors

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit a5bcca7f415790521a76213620ae079318e7bee1)

Conflicts:
qa/suites/rados/monthrash/ceph.yaml - whitelist vs. ignorelist

ceph-volume: implement the --log-level flag

The --log-level flag was being ignored and
the file log level was always set to DEBUG.
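
A rough sketch of honoring such a flag, assuming an argparse-based CLI.
The parser, the list of choices, the "debug" default, and the
file_log_level() helper are illustrative assumptions, not ceph-volume's
actual code.

```python
import argparse
import logging


def file_log_level(argv):
    """Parse argv and return the numeric logging level for the file log."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--log-level",
        default="debug",  # assumed default, matching the old fixed level
        choices=["debug", "info", "warning", "error", "critical"],
    )
    args = parser.parse_args(argv)
    # Map the name to the numeric level instead of hard-coding DEBUG.
    return getattr(logging, args.log_level.upper())
```

The returned value can then be passed to a file handler's setLevel().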

Fixes: https://tracker.ceph.com/issues/48045
Resolves: rhbz#1867717

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ecbd6c13f116b390c782c9ae14b5becd0bdecc8e)

lvm/create.py: fix a typo in the help message
ceph_volume/devices/lvm/create.py: correct a typo in the help text shown by ceph-volume lvm create -h

Fixes: https://tracker.ceph.com/issues/48273
Signed-off-by: ZhenLiu94 <zhenliu94@163.com>
(cherry picked from commit e3c7d6ff4cec80ee0135abb50d795411c5dc2283)

Merge PR #38279 into nautilus

* refs/pull/38279/head:
ceph-volume batch: reject partitions in argparser

Reviewed-by: Rishabh Dave <ridave@redhat.com>

qa/config/rados.yaml: Test mon scrub

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit f85001e5d5fb11718ab2fd8b708402cd2db951d4)

Merge pull request #37840 from smithfarm/wip-47990-nautilus

nautilus: qa/cephfs: add session_timeout option support

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #37838 from smithfarm/wip-47988-nautilus

nautilus: cephfs: client: fix inode ll_ref reference count leak

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #37836 from smithfarm/wip-47953-nautilus

nautilus: vstart.sh: fix fs set max_mds bug

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #37822 from smithfarm/wip-47957-nautilus

nautilus: mon/MDSMonitor do not ignore mds's down:dne request

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #37821 from smithfarm/wip-47939-nautilus

nautilus: mon/MDSMonitor: divide mds identifier and mds real name with dot

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #37820 from smithfarm/wip-47935-nautilus

nautilus: mds: account for closing sessions in hit_session

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #37725 from rishabh-d-dave/wip-46611-nautilus

nautilus: pybind/cephfs: add special values for not reading conffile

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Merge pull request #38118 from neha-ojha/wip-48227-nautilus

nautilus: mon: Log "ceph health detail" periodically in cluster log

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

mgr: don't update osd stat which is already out

Ceph status still reports slow requests on an OSD that is already out.
When the original PG monitor handled a PGSTATS msg, it did not update the
osd stat if the OSD was not in the OSD map, but the current MGR has no
such check.

Fixes: https://tracker.ceph.com/issues/46440
Signed-off-by: Zhi Zhang <zhangz.david@outlook.com>
(cherry picked from commit 493ec9d3acd3f57eed3e4b96ad7c6739c2089ff1)

mgr/prometheus: use threading.Event instead of sleep

This allows us to avoid waiting for the sleep to finish when waiting for
the thread to finish.
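
A minimal sketch of the pattern, assuming a plain worker thread. The
Collector class and its attributes are hypothetical and do not mirror the
module's code.

```python
import threading
import time


class Collector(threading.Thread):
    def __init__(self, interval):
        super().__init__()
        self.interval = interval
        self.stop_event = threading.Event()
        self.runs = 0

    def run(self):
        while not self.stop_event.is_set():
            self.runs += 1  # stand-in for one collection pass
            # wait() returns as soon as the event is set, unlike sleep().
            self.stop_event.wait(self.interval)

    def stop(self):
        self.stop_event.set()
        self.join()


collector = Collector(interval=60)
collector.start()
started = time.monotonic()
collector.stop()  # returns promptly despite the 60 second interval
elapsed = time.monotonic() - started
```

With time.sleep(60) in place of wait(), stop() would block until the
sleep ran out.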

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit dd5886c3c006e388283df50cc87addeffb3b2b52)

mgr/prometheus: Log collection issues

Log any issues encountered during the data collection and continue to
collect the data anyway (after a sleep).

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 28a5c13bf993679e3098d73df27ded249f34dc99)

mgr/prometheus: Use mgr.release_name for always on modules

The host_version is not populated properly in the early stages of ceph
mgr start up process. We can use mgr.release_name instead. It is more
stable and it provides the data even if mgr_map does not contain the
versions, yet.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit aa0650092da3cbf1a73151999874001352cfb9ef)

mgr/prometheus: Clean up collection thread

We need to clean up the metrics collection thread.

Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 03fcaccafc877d10a894b1c39af5547f172c1ed3)

Conflicts:
prometheus/module.py: Pass _global_instance as an argument to
MetricCollectionThread, collect can't be a static function
anymore

Merge pull request #38295 from badone/wip-nautilus-dont-run-tests-if-build-fails

nautilus: run-make-check.sh: Don't run tests if build fails

Reviewed-by: Kefu Chai <kchai@redhat.com>

run-make-check.sh: Make sure a build failure will exit

We 'set -e' but that is ignored because 'build tests' is executed in a
'&&' list (see 'man set') so move the echo to the following line.

Follow-up to 03ff2146f95

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit e70483133db87a3f04bc1fff31d8472465c305b3)

Conflicts:
run-make-check.sh: cmake call differences and trivial logging
output change.

run-make-check.sh: Don't run tests if build fails

When run-make was taken out we lost the 'set -e' call and therefore
continue after an error.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 03ff2146f95c7e03a84df1f8c3b38bbbb315b708)

Merge pull request #38268 from idryomov/wip-relax-preauth-asserts-again-nautilus

nautilus: msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing, again

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #38024 from rhcs-dashboard/wip-48180-nautilus

nautilus: mgr/dashboard: Display users current bucket quota usage

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

Merge pull request #37995 from callithea/wip-48133-nautilus

nautilus: mgr/dashboard: disable cluster selection in NFS export editing form

Reviewed-by: Sebastian Krah <skrah@suse.com>
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>

Merge pull request #37756 from tspmelo/wip-47198-nautilus

nautilus: mgr/dashboard: Datatable catches select events from other datatables

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>

Merge pull request #38188 from rakeshgm/rhel_nautilus_repos

nautilus: qa/distros: add rhel 7.9

qa/distros: add rhel 7.9

Signed-off-by: rakeshgm <rakeshgm@redhat.com>

Merge pull request #38296 from badone/wip-nautilus-pin-importlib_metadata

nautilus: mgr: Pin importlib_metadata version 2.1.0

Reviewed-by: Kefu Chai <kchai@redhat.com>

pybind/mgr: Pin importlib_metadata version 2.1.0

Latest release of importlib_metadata breaks the Nautilus build.

Master does not appear to be affected, probably because it uses
python3.6 or greater which is compatible with the latest
importlib_metadata version.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

Merge pull request #37605 from smithfarm/wip-47803-nautilus

nautilus: test/librados: fix endian bugs in checksum test cases

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #38015 from dsavineau/wip-48087-nautilus

nautilus: ceph-volume: consume mount opt in simple activate

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>

Merge PR #38048 into nautilus

* refs/pull/38048/head:
ceph-volume: fix lvm help test
ceph-volume: remove mention of dmcache from docs and help text

Reviewed-by: Jan Fajerski <jfajerski@suse.com>

Merge PR #37723 into nautilus

* refs/pull/37723/head:
ceph-volume: add no-systemd argument to zap

Reviewed-by: Jan Fajerski <jfajerski@suse.com>

ceph-volume batch: reject partitions in argparser

Fixes: https://tracker.ceph.com/issues/47966
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 9742efa907aa54b3135f5daf73080b7be12534eb)

mon: Log "ceph health detail" periodically in cluster log

change mon_health_to_clog_interval from 1_hr -> 10_min to
log health summary or detail more frequently.

No HealthMonitor class in nautilus.

Fixes: https://tracker.ceph.com/issues/48042
Signed-off-by: Prashant Dhange <pdhange@redhat.com>
(cherry picked from commit f45712c19077c5cf5a9938fc3fd17b64ffe3a4ec)

Conflicts:
PendingReleaseNotes - add and restructure 14.2.16

Merge pull request #37961 from callithea/wip-47620-nautilus

nautilus: mgr/dashboard: fix security scopes of some NFS-Ganesha endpoints

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #37834 from smithfarm/wip-47933-nautilus

nautilus: tools/rados: flush formatter periodically during json output of "rados ls"

Reviewed-by: Adam Emerson <aemerson@redhat.com>
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #37706 from smithfarm/wip-47899-nautilus

nautilus: mon: have 'mon stat' output json as well

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #37693 from smithfarm/wip-47878-nautilus

nautilus: build-integration-branch: take PRs in chronological order

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #37659 from smithfarm/wip-46014-nautilus

nautilus: log: fix timestamp precision of log can't set to millisecond.

Reviewed-by: Adam Emerson <aemerson@redhat.com>

msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing, again

With CEPHX_V2 authorizer challenges brought back in commit
4a82c72e3bdd, these need to be bumped again, as two authorizers
(without and then with the challenge) are transmitted and signed
instead of one (without the challenge). See commit 94953dd9398a
("msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing")
for details.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 422f922c4acdd0a0db3be41f2d55663c864df59d)

Merge pull request #38198 from jan--f/wip-48302-nautilus

nautilus: ceph-volume: fix filestore/dmcrypt activate

Reviewed-by: Dimitri Savineau <dsavinea@redhat.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

ceph-volume: fix test_setup_device_device_name_is_none

Let's call this function using the same syntax as the other tests.
This will make it work with py2 in nautilus branch.

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 02e6f33f08e392513aaded4bde61cf15b2fcfb0c)

mon: make mon summary more concise in 'ceph -s'

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3e7c185bd4b2312f9801b8664fb175801c904871)

Conflicts:
PendingReleaseNotes
- drop already published release notes
src/mon/MonMap.cc
- change ceph::to_string to std::to_string

qa/cephtool: test 'mon stat' commands

Signed-off-by: Joao Eduardo Luis <joao@suse.com>
(cherry picked from commit 122388429d01ef2f294dc2846d16d88aa0bdba68)

Conflicts:
qa/workunits/cephtool/test.sh
- drop unrelated "# test elector" comment (elector test not backported)
- no "test_mon_priority_and_weight" function in nautilus

mon: have 'mon stat' output json as well

Fixes: https://tracker.ceph.com/issues/46816
Signed-off-by: Joao Eduardo Luis <joao@suse.com>
(cherry picked from commit c148a3cde5c256576d0a67a40321e543fdf891bf)

Merge pull request #38173 from kamoltat/wip-ksirivad-nautilus-backports2

nautilus: mgr/progress: introduce turn off/on feature

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #38100 from trociny/wip-48244-nautilus

nautilus: os/bluestore: fix "end reached" check in collection_list_legacy

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #38046 from dsavineau/wip-48185-nautilus

nautilus: ceph-volume: fix lvm batch auto with full SSDs

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

mgr/progress: introduce turn off/on feature

progress module can be turned off/on by using
the commands: 'progress off' and 'progress on'

Also refactor the teuthology test suite
to prevent possible future bugs

fixes: https://tracker.ceph.com/issues/47238

Signed-off-by: kamoltat <ksirivad@redhat.com>
(cherry picked from commit 993bb02b30cf73a1c1c70da1ef266be8373d56dd)

Conflicts:
PendingReleaseNotes - add release notes about this feature
qa/tasks/mgr/test_progress.py - replace helper functions that are needed
for dealing with more than one type of event
- remove `period` in wait_until_equal()
src/pybind/mgr/progress/module.py - remove code that deals with mypy in master
qa/suites/rados/singleton/all/pg-autoscaler-progress-off.yaml - remove log-ignorelist

update some files

Signed-off-by: kamoltat <ksirivad@redhat.com>

mgr/dashboard: Display users current bucket quota usage
Fixes: https://tracker.ceph.com/issues/45011
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 4fabba0bb772d480dcddc83272c83e7714726fc1)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-list/osd-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-list/rgw-bucket-list.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-list/rgw-bucket-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/components/usage-bar/usage-bar.component.ts
- Resolved conflicts due to variable name change and few other import conflicts.

ceph-volume: cover devices.lvm.prepare.setup_device

Add some unit tests to cover setup_device() in devices.lvm.prepare

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 9e2a0a3edd12cce51913f4b2982c26464e77e12c)

ceph-volume: fix filestore/dmcrypt activate

The uuid set for tags['ceph.journal_uuid'] should point to its
corresponding lv_uuid instead of the uuid generated for the lv_name.

The variable name 'uuid' used so far was probably too confusing so let's
change it to make it more clear.

Closes: https://tracker.ceph.com/issues/48271
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit ee3aece808fd22e659c2c30c0674f7ec200f411b)

ceph-volume: add a unit tests to lvm batch

This commit adds unit tests in order to cover `_sort_rotational_disks()`
call when deploying with full hdd/ssd or mixed hdd/ssd scenarios.

Fixes: https://tracker.ceph.com/issues/48150
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Co-authored-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 13514a24cfdc32d67cfbc1201aa427168a926978)

ceph-volume: fix lvm batch auto with full SSDs

The ceph-volume lvm batch --auto introduced by [1] breaks backward
compatibility when using only non-rotational devices (SSD and/or NVMe).
Those devices are reassigned as bluestore db or filestore journal
devices, whereas we want them as data devices.
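
A rough sketch of the intended assignment, under stated assumptions.
assign_devices() and its tuple-based input are hypothetical and are not
the ceph-volume implementation; the mixed case (rotational devices for
data, non-rotational for db/journal) is assumed from the description
of --auto above.

```python
def assign_devices(devices):
    """Split devices into (data, fast) lists.

    devices is a list of (path, rotational) tuples, where rotational is
    True for HDDs and False for SSD/NVMe.
    """
    rotational = [path for path, rot in devices if rot]
    non_rotational = [path for path, rot in devices if not rot]
    if not rotational:
        # Only SSD/NVMe present: they must stay data devices.
        return non_rotational, []
    # Mixed case (assumed): HDDs hold data, SSD/NVMe hold db or journal.
    return rotational, non_rotational
```

For example, two NVMe devices alone yield two data devices and no
db/journal devices.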

Fixes: https://tracker.ceph.com/issues/48106
[1] https://github.com/ceph/ceph/pull/34740

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 2a854ca373fadef099a1d037930eb241e757b2c3)

mgr/dashboard: disable cluster selection in NFS export editing form

We should not allow changing an export's cluster because an export ID
might live in one cluster but not in another one. Editing a non-existing
export in a cluster causes an error.

Fixes: https://tracker.ceph.com/issues/47373
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit d678d8076c2a4c5edfe489d553e3c8770462f023)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/nfs/nfs-form/nfs-form.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/nfs/nfs-list/nfs-list.component.spec.ts
- Some imports differ from master; remove SummaryService and CephReleaseNamePipe from providers
and no longer needed test case; also remove not working docsUrl (and related lines) from
nfs-form.component.ts

Merge remote-tracking branch 'security/wip-resurrect-authorizer-challenges-nautilus' into nautilus

Merge pull request #38076 from badone/wip-admin_socket_output-invalidated-iterator-crash

nautilus: test/admin_socket_output: Don't invalidate 'target' iterator

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

os/bluestore: fix "end reached" check in collection_list_legacy

To preserve the old bluestore behavior it should compare the
current object with the end using bluestore keys, not oids.

Fixes: https://tracker.ceph.com/issues/48153
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit e63489f249f9ba3bc9cb1806568f860effd8a0b6)

Merge pull request #38069 from smithfarm/wip-48233-nautilus

nautilus: mgr: avoid false alarm of MGR_MODULE_ERROR

Reviewed-by: Neha Ojha <nojha@redhat.com>

test/admin_socket_output: Don't invalidate 'target' iterator

Fixes: https://tracker.ceph.com/issues/48204
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

mgr: avoid false alarm of MGR_MODULE_ERROR

mgr sends a health report periodically; the report includes
information on whether the always-on modules are loaded or not. but the
modules are loaded in two steps:

1. load the options and commands exposed by modules. the options and
   commands are registered using static methods of the subclass of
   MgrModule.
2. create an instance of the subclass of MgrModule. this is performed
   in the background by a Finisher thread. upon finishing the construction
   of the instance, ActivePyModules::start_one() adds the module which
   successfully creates the class to `modules`.

but there is a chance that when mgr sends the health report, the always-on
module is still creating its instance of the MgrModule subclass, or that
task is still pending in the finisher thread. in that case, mgr would
add a false error message like
```
4 mgr modules have failed (MGR_MODULE_ERROR)
```
to the health report.

in this change, the number of modules in pending state is tracked,
and mgr will not take the missing always-on modules into account unless
the number of pending modules is 0.
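
the idea can be sketched as follows. this is an illustration only: the
real change lives in the C++ mgr code, and the ModuleTracker class and its
method names are hypothetical.

```python
class ModuleTracker:
    def __init__(self, always_on):
        self.always_on = set(always_on)
        self.loaded = set()
        self.pending = 0  # instances still being constructed

    def begin_load(self):
        # Called when construction is queued on the finisher thread.
        self.pending += 1

    def finish_load(self, name, ok):
        # Called when construction completes, successfully or not.
        self.pending -= 1
        if ok:
            self.loaded.add(name)

    def failed_modules(self):
        # Do not report missing always-on modules while loads are pending.
        if self.pending > 0:
            return set()
        return self.always_on - self.loaded
```

while any load is still pending, failed_modules() returns an empty set,
so a report sent at that moment raises no MGR_MODULE_ERROR.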

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 2d9b3abd1fc50e5fcd9ce2c05e8fac41d389b052)

mgr/PyModuleRegistry: ignore 'obsolete' modules

Old modules may be in the mgrmap (and always_on) but no longer exist. Do
not try to load those or raise errors about them.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit a59f4e5deb49536af23473658e0f04d0f495829f)

ceph-volume: fix lvm help test

ed5ceb0 changed the LVM help code but not the associated test.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 879ed30984de2b94879959de1c3611083c85bd99)

ceph-volume: remove mention of dmcache from docs and help text

With the introduction of bluestore dmcache is no longer needed and
is no longer supported with `ceph-volume lvm`.

Resolves: rhbz#1876827
Fixes: https://tracker.ceph.com/issues/48039
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ed5ceb04fc8ff57c5f7e2b5fa5e859c2cdbf2ffd)

Merge pull request #37554 from Vicente-Cheng/wip-47748-nautilus

nautilus: mon: set session_timeout when adding to session_map

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

ceph-volume: consume mount opt in simple activate

When running the ceph-volume simple activate command on a Filestore OSD,
the data device is mounted without any specific options, so the
ones from the ceph configuration file are ignored.
When deploying Filestore with the lvm subcommand then everything is
fine because the filestore_activate method uses mount_osd which relies
on the mount options defined in the ceph configuration file (if any).

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1891557
Fixes: https://tracker.ceph.com/issues/48018
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1f4301a15df82bf31468d76fbcccc1c5fa192e38)

mon/MonClient: bring back CEPHX_V2 authorizer challenges

Commit c58c5754dfd2 ("msg/async/ProtocolV1: use AuthServer and
AuthClient") introduced a backwards compatibility issue into msgr1.
To fix it, commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it") set out to skip authorizer
challenges for peers that don't support CEPHX_V2.  However, it
made it so that authorizer challenges are skipped for all peers in
both msgr1 and msgr2 cases, effectively disabling the protection
against replay attacks that was put in place in commit f80b848d3f83
("auth/cephx: add authorizer challenge", CVE-2018-1128).

This is because con->get_features() always returns 0 at that
point.  In msgr1 case, the peer shares its features along with the
authorizer, but while they are available in connect_msg.features they
aren't assigned to con until ProtocolV1::open().  In msgr2 case, the
peer doesn't share its features until much later (in CLIENT_IDENT
frame, i.e. after the authentication phase).  The result is that
!CEPHX_V2 branch is taken in all cases and replay attack protection
is lost.

Only clusters with cephx_service_require_version set to 2 on the
service daemons would not be silently downgraded.  But, since the
default is 1 and there are no reports of looping on BADAUTHORIZER
faults, I'm pretty sure that no one has ever done that.  Note that
cephx_require_version set to 2 would have no effect even though it
is supposed to be stronger than cephx_service_require_version
because MonClient::handle_auth_request() didn't check it.

To fix:

- for msgr1, check connect_msg.features (as was done before commit
  c58c5754dfd2) and challenge if CEPHX_V2 is supported.  Together
  with two preceding patches that resurrect proper cephx_* option
  handling in msgr1, this covers both "I want old clients to work"
  and "I wish to require better authentication" use cases.

- for msgr2, don't check anything and always challenge.  CEPHX_V2
  predates msgr2, anyone speaking msgr2 must support it.
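
The two rules above can be summarized in a short sketch. This is not the
MonClient code: should_challenge() is a hypothetical helper, and the
CEPHX_V2 bit value below is a placeholder, not the real feature bit.

```python
CEPHX_V2 = 1 << 0  # placeholder bit value, for illustration only


def should_challenge(protocol, connect_features):
    """Decide whether to send an authorizer challenge to the peer.

    protocol is "msgr1" or "msgr2"; connect_features stands for the
    features the peer sent along with its authorizer (msgr1 only).
    """
    if protocol == "msgr2":
        # CEPHX_V2 predates msgr2, so always challenge.
        return True
    # msgr1: challenge only if the peer advertises CEPHX_V2.
    return bool(connect_features & CEPHX_V2)
```

Checking a feature mask that is still 0 at this point, as the broken code
did with con->get_features(), would make the msgr1 branch return False for
every peer.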

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 4a82c72e3bdddcb625933e83af8b50a444b961f1)

Conflicts:
src/msg/async/ProtocolV1.cc [ commit c58c5754dfd2
  ("msg/async/ProtocolV1: use AuthServer and AuthClient") not
  in nautilus.  This means that only msgr2 is affected, so drop
  ProtocolV1.cc hunk.  As a result, skip_authorizer_challenge is
  never set, but this is fine because msgr1 still uses old ms_*
  auth methods and tests CEPHX_V2 appropriately. ]

msg/async/ProtocolV1: resurrect "implement cephx_*require_version options"

This was added in commit 9bcbc2a3621f ("mon,msg: implement
cephx_*_require_version options") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, service daemons don't enforce cephx_require_version
and cephx_cluster_require_version options and connections without
CEPH_FEATURE_CEPHX_V2 are allowed through.

(cephx_service_require_version enforcement was brought back a
year later in commit 321548010578 ("mon/MonClient: skip CEPHX_V2
challenge if client doesn't support it"), although the peer gets
TAG_BADAUTHORIZER instead of TAG_FEATURES.)

Resurrect the original behaviour: all cephx_*require_version
options are enforced and the peer gets TAG_FEATURES, signifying
that it is missing a required feature.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6f5c4152ca2c6423e665cde2196c6301f76043a2)

Conflicts:
src/msg/async/ProtocolV1.cc [ drop nautilus-only commit
89ffece49097 ("msg/async/ProtocolV1: require CEPHX_V2 if
cephx_service_require_version >= 2") ]

msg/async/ProtocolV1: resurrect "include MGR as service when applying cephx settings"

This was added in commit 0ec7d6bbc4af ("msg/async,simple: include MGR
as service when applying cephx settings") and inadvertently dropped in
commit e6f043f7d2dc ("msgr/async: huge refactoring of protocol V1").
As a result, mgr daemons are miscategorized as clients when enforcing
cephx_*require_signatures options.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 949e2e595eda553aa68f697cee1dcfff3c09cf3f)

Merge pull request #37844 from smithfarm/wip-46118-nautilus

nautilus: mgr: fix race between module load and notify

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #37843 from smithfarm/wip-47894-nautilus

nautilus: bluestore: attach csum for compressed blobs

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #37842 from smithfarm/wip-47707-nautilus

nautilus: bluestore: Support flock retry

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #37824 from smithfarm/wip-46008-nautilus

nautilus: bluestore: test/objectstore/store_test: kill ExcessiveFragmentation test case

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #37823 from smithfarm/wip-46628-nautilus

nautilus: bluestore: BlockDevice.cc: use pending_aios instead of iovec size as ios num

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #37818 from smithfarm/wip-47993-nautilus

nautilus: test/store_test: use 'threadsafe' style for death tests

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #37815 from smithfarm/wip-47825-nautilus

nautilus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #37563 from smithfarm/wip-47761-nautilus

nautilus: mgr/prometheus: add pool compression stats

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Boris Ranto <branto@redhat.com>

Merge pull request #37333 from callithea/wip-46975-nautilus

nautilus: mgr/dashboard: Strange iSCSI discovery auth behavior

Reviewed-by: Tiago Melo <tmelo@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>

mgr/dashboard: fix security scopes of some NFS-Ganesha endpoints

Apply NFS_GANESHA scope to these endpoints:
- `/api/nfs-ganesha/daemon`.
- `/ui-api/nfs-ganesha/*`.

Otherwise, any valid users can access them.

Fixes: https://tracker.ceph.com/issues/47356
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
(cherry picked from commit ed123e493cf43e71cb608a31ac8f2a9136f6febf)

Conflicts:
src/pybind/mgr/dashboard/controllers/nfsganesha.py
- ReadPermissions between Endpoint and def lsdir;
def lsdir pylint addition