mgr/dashboard: Simplify some complex calculations in test_alerts.yml

run-promtool-unittests is failing because of differences in floating-point values in some complex calculations. This PR simplifies those calculations to fix the issue.
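
As a general illustration of why exact comparisons on computed floating-point values are brittle (this is not one of the actual expressions from test_alerts.yml), consider the following minimal sketch. The variable names are hypothetical.

```python
import math

# Two expressions that are mathematically equal can still produce
# different doubles because each operation rounds its result.
direct = 0.3
computed = 0.1 + 0.2

# An exact comparison is sensitive to that rounding.
exact_match = (computed == direct)

# A tolerance-based comparison is not.
close_match = math.isclose(computed, direct, rel_tol=1e-9)
```

Here `exact_match` is false while `close_match` is true, which is the kind of mismatch a test with hard-coded expected values can run into.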

Fixes: https://tracker.ceph.com/issues/49952
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 8d2f39e6c568afb6880689160212bcc93057e194)

ceph.spec,install-deps: use golang-github-prometheus for promtools

instead of installing docker for using promtools, install
golang-github-prometheus.

Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e33e3a931db97d01318643ec686fe63fdd614082)

Conflicts:
install-deps.sh (changed dnf to yumdnf)

test: run promtool test without docker on ubuntu/focal

before this change, we used docker to run promtool from a docker
image, but this is not efficient, and quite a few developers do not
want to use docker for running "make check". the docker-based approach
was introduced by #39246. the reason was that Ceph's CI process uses
Ubuntu/Bionic for running "make check" jobs, but the prometheus
packaged by Bionic does not offer the "test rules" command. so, to
address this problem, we used the "dnanexus/promtool:2.9.2" docker
image for verifying monitoring/prometheus/alerts/test_alerts.yml.

after this change, we use prometheus packaged by debian derivatives
instead of pulling a docker image.

* debian/control: add prometheus as a "make check" dependency
* install-deps.sh: partially revert
  53a5816deda0874a3a37e131e9bc22d88bb2a588, as we don't need to
  pull docker or start docker service for using promtool anymore.
* cmake: check if promtool is capable of running "test rules"
  command, bail out if it is not.

see also: https://tracker.ceph.com/issues/49653

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f381aa8bf0e175940153975fa1534ef0559ecadd)

mgr/dashboard: test prometheus rules through promtool

This PR adds unit testing for prometheus rules using promtool. To run the tests, run the 'run-promtool-unittests.sh' script.

Fixes: https://tracker.ceph.com/issues/45415
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 53a5816deda0874a3a37e131e9bc22d88bb2a588)

Conflicts:
install-deps.sh (changed dnf to yumdnf)

Merge pull request #40790 from smithfarm/wip-50081-octopus

octopus: rbd-mirror: fix UB while registering perf counters

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #40666 from idryomov/wip-require-ceph-common-for-ioc-octopus

octopus: packaging: require ceph-common for immutable object cache daemon

Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #40958 from rhcs-dashboard/wip-50457-octopus

octopus: vstart.sh: disable "auth_allow_insecure_global_id_reclaim"

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

vstart.sh: disable "auth_allow_insecure_global_id_reclaim"

to silence the health warning of "mons are allowing insecure global_id
reclaim", which prevents the cluster from being active+clean. a couple
of tests expect a warning-free cluster before they start.

this option is enabled by default to appease old clients, but for
most upstream testing we can just disable it.

Fixes: https://tracker.ceph.com/issues/50374
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 77a8376d0731c24e7bbf24523d3d7450e9f978af)

Merge branch 'octopus-saved' into octopus

15.2.11

auth/cephx: make KeyServer::build_session_auth_info() less confusing

The second KeyServer::build_session_auth_info() overload is used only
by the monitor, for mon <-> mon authentication.  The monitor passes in
service_secret (mon secret) and secret_id (-1).  The TTL is irrelevant
because there is no rotation.

However, the signature doesn't make this obvious.  Clarify that
service_secret and secret_id are input parameters and info is the only
output parameter.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6f12cd3688b753633c8ff29fb3bd64758f960b2b)

auth/cephx: cap ticket validity by expiration of "next" key

If auth_mon_ticket_ttl is increased by several times as done in
commit 522a52e6c258 ("auth/cephx: rotate auth tickets less often"),
active clients eventually get stuck because the monitor sends out an
auth ticket with a bogus validity.  The ticket is secured with the
"current" secret that is scheduled to expire according to the old TTL,
but the validity of the ticket is set to the new TTL.  As a result,
the client simply doesn't attempt to renew, letting the secrets rotate
potentially more than once.  When that happens, the client first hits
auth authorizer errors as it tries to renew service tickets and when
it finally gets to renewing the auth ticket, it hits the insecure
global_id reclaim wall.

Cap TTL by expiration of "next" key -- the "current" key may be
milliseconds away from expiration and still be used, legitimately.
Do it in KeyServerData alongside key rotation code and propagate the
capped TTL to the upper layer.
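
The capping rule described above can be sketched as follows. This is a simplified model, not the KeyServerData code: the function name and parameters are hypothetical, and times are plain seconds.

```python
def cap_ticket_ttl(now, requested_ttl, next_key_expiration):
    """Return the ticket validity, capped by the expiration of the "next" key.

    now                  -- current time in seconds (hypothetical parameter)
    requested_ttl        -- configured TTL in seconds
    next_key_expiration  -- absolute expiration time of the "next" key
    """
    # Time left until the "next" key expires; never negative.
    remaining = max(0, next_key_expiration - now)
    # The ticket must not outlive the "next" key.
    return min(requested_ttl, remaining)
```

With a TTL raised to 72 hours but a "next" key that expires in 24 hours, the sketch yields a 24 hour validity, so the client renews before the secrets rotate away.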

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 370c9b13970d47a55b1b20ef983c6f01236c9565)

auth/cephx: drop redundant KeyServerData::get_service_secret() overload

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 3078af716505ae754723864786a41a6d6af0534c)

qa/standalone: default to disable insecure global id reclaim

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 72c4fc75ad301980baebc7789ed6391444057e5b)

qa/suites/upgrade/octopus-x: disable insecure global_id reclaim health warnings

These will trigger on upgrade; suppress them so that our health gates
will still work.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 3e80f61efeafc186ea8130984d64c05b2707d6ba)

Conflicts:
qa/suites/rados/cephadm/upgrade/3-start-upgrade.yaml [ commit
  04a3d4c927e7 ("qa/suites/rados/cephadm/upgrade: deploy a legacy
  r.z-style rgw") not in octopus ]
qa/suites/upgrade/octopus-x/parallel/1-tasks.yaml [ no octopus-x
  upgrade suite in octopus ]
qa/suites/upgrade/octopus-x/rgw-multisite/overrides.yaml [ ditto ]
qa/suites/upgrade/octopus-x/stress-split/1-start.yaml [ ditto ]

qa/tasks/ceph[adm].conf[.template]: disable insecure global_id reclaim health alerts

Turn these off everywhere for our tests so they don't interfere with our health checks.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9f6fd4fe563c9cd4cf65316921d511b677c972e4)

cephadm: set auth_allow_insecure_global_id_reclaim for mon on bootstrap

If this is a fresh pacific cluster, let's assume that there won't be
legacy clients connecting. (And if there are, let's put the burden on
the user to enable them to do so insecurely.)

This is in contrast to upgrades, where our focus is on not breaking
anything.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 7ca74183226b1125b29f4ea8f324ae9e38b46795)

Conflicts:
src/cephadm/cephadm [ commit 369989ebf90c ("cephadm: split-off
config work on bootstrap") not in octopus ]

mon/HealthMonitor: raise AUTH_INSECURE_GLOBAL_ID_RENEWAL[_ALLOWED]

Two new alerts:

- AUTH_INSECURE_GLOBAL_ID_RENEWAL_ALLOWED if we are allowing clients to reclaim
global_ids in an insecure manner (for backwards compatibility until
clients are upgraded)

- AUTH_INSECURE_GLOBAL_ID_RENEWAL if there are currently clients connected that
do not know how to securely renew their global_id, as exposed by
auth_expose_insecure_global_id_reclaim=true. The client auth names and IPs
are listed in the alert details (up to a limit, at least).

The docs recommend operators mute these alerts instead of silencing, but
we still include options that allow the alerts to be disabled entirely.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 18b343b06e5dd904af425dc99e2c848e12f3b552)

Conflicts:
src/mon/HealthMonitor.cc [ commit e4bf716bfa07 ("mon: store
a reference as member variable") not in octopus ]

auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys

When handling CEPHX_GET_AUTH_SESSION_KEY requests from nautilus+
clients, ignore CEPH_ENTITY_TYPE_AUTH in CephXAuthenticate::other_keys.
Similarly, when handling CEPHX_GET_PRINCIPAL_SESSION_KEY requests,
ignore CEPH_ENTITY_TYPE_AUTH in CephXServiceTicketRequest::keys.
These fields are intended for requesting service tickets; the auth
ticket (which is really a ticket granting ticket) must not be shared
this way.

Otherwise we end up sharing an auth ticket that a) isn't encrypted
with the old session key even if needed (should_enc_ticket == true)
and b) has the wrong validity, namely auth_service_ticket_ttl instead
of auth_mon_ticket_ttl. In the CEPHX_GET_AUTH_SESSION_KEY case, this
undue ticket immediately supersedes the actual auth ticket already
encoded in the same reply (the reply frame ends up containing two auth
tickets).
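
Ignoring an entity type in a requested-keys bitmask amounts to clearing its bit. A minimal sketch, assuming the entity types are single-bit flags with the values shown (the constants mirror Ceph's names, but the values and the helper function are assumptions made for illustration):

```python
# Assumed single-bit entity type flags (values are illustrative).
CEPH_ENTITY_TYPE_MON = 0x01
CEPH_ENTITY_TYPE_MDS = 0x02
CEPH_ENTITY_TYPE_OSD = 0x04
CEPH_ENTITY_TYPE_AUTH = 0x20


def strip_auth_from_requested_keys(requested_keys):
    """Drop the AUTH bit so only service tickets are handed out.

    Hypothetical helper; the real handling lives in the cephx
    service handler and is not reproduced here.
    """
    return requested_keys & ~CEPH_ENTITY_TYPE_AUTH
```

A request for `OSD | AUTH` is then treated as a request for `OSD` only, and a request without the AUTH bit is unchanged.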

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 05772ab6127bdd9ed2f63fceef840f197ecd9ea8)

auth/cephx: rotate auth tickets less often

If unauthorized global_id (re)use is disallowed, a client that has
been disconnected from the network long enough for keys to rotate
and its auth ticket to expire (i.e. become invalid/unverifiable)
would not be able to reconnect.

The default TTL is 12 hours, resulting in a 12-24 hour reconnect
window (the previous key is kept around, so the actual window can be
up to double the TTL). The setting has stayed the same since 2009,
but it also hasn't been enforced. Bump it to get a 72 hour reconnect
window to cover for something breaking on Friday and not getting fixed
until Monday.
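
The window arithmetic above can be written out as a small sketch. It assumes the simplified model from the text (the previous key is kept, so the window spans from one TTL to two TTLs); the function name is hypothetical.

```python
def reconnect_window_hours(ttl_hours):
    """Return (minimum, maximum) reconnect window for a given ticket TTL.

    Simplified model: a ticket stays verifiable for at least one TTL and,
    because the previous key is kept around, for at most two TTLs.
    """
    return (ttl_hours, 2 * ttl_hours)
```

Under this model a 12 hour TTL gives a 12-24 hour window, and a guaranteed minimum of 72 hours requires a 72 hour TTL.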

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 522a52e6c258932274f0753feb623ce008519216)

mon: fail fast when unauthorized global_id (re)use is disallowed

When unauthorized global_id (re)use is disallowed, we don't want to
let unpatched clients in because they wouldn't be able to reestablish
their monitor session later, resulting in subtle hangs and disrupted
user workloads.

Denying the initial connect for all legacy (CephXAuthenticate < v3)
clients is not feasible because a large subset of them never stopped
presenting their ticket on reconnects and are therefore compatible with
enforcing mode: most notably all kernel clients but also pre-luminous
userspace clients.  They don't need to be patched and excluding them
would significantly hamper the adoption of enforcing mode.

Instead, force clients that we are not sure about to reconnect shortly
after they go through authentication and obtain global_id.  This is
done in Monitor::dispatch_op() to capture both msgr1 and msgr2, most
likely instead of dispatching mon_subscribe.

We need to let mon_getmap through for "ceph ping" and "ceph tell" to
work.  This does mean that we share the monmap, which lets the client
return from MonClient::authenticate() considering authentication to be
finished and causing the potential reconnect error to not propagate to
the user -- the client would hang waiting for remaining cluster maps.
For msgr1, this is unavoidable because the monmap is sent immediately
after the final MAuthReply.  But for msgr2 this is rare: most of the
time we get to their mon_subscribe and cut the connection before they
process the monmap!

Regardless, the user doesn't get a chance to start a workload since
there is no proper higher-level session at that point.

To help with identifying clients that need patching, add global_id and
global_id_status to "sessions" output.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 08766a17edebb7450cd9b17cc2dc01efc068bb94)

auth/cephx: option to disallow unauthorized global_id (re)use

global_id is a cluster-wide unique id that must remain stable for the
lifetime of the client instance.  The cephx protocol has a facility to
allow clients to preserve their global_id across reconnects:

(1) the client should provide its global_id in the initial handshake
    message/frame and later include its auth ticket proving previous
    possession of that global_id in CEPHX_GET_AUTH_SESSION_KEY request

(2) the monitor should verify that the included auth ticket is valid
    and has the same global_id and, if so, allow the reclaim

(3) if the reclaim is allowed, the new auth ticket should be
    encrypted with the session key of the included auth ticket to
    ensure authenticity of the client performing reclaim.  (The
    included auth ticket could have been snooped when the monitor
    originally shared it with the client or any time the client
    provided it back to the monitor as part of requesting service
    tickets, but only the genuine client would have its session key
    and be able to decrypt.)

Unfortunately, all (1), (2) and (3) have been broken for a while:

- (1) was broken in 2016 by commit a2eb6ae3fb57 ("mon/monclient:
  hunt for multiple monitor in parallel") and is addressed in patch
  "mon/MonClient: preserve auth state on reconnects"

- it turns out that (2) has never been enforced.  When cephx was
  being designed and implemented in 2009, two changes to the protocol
  raced with each other pulling it in different directions: commits
  0669ca21f4f7 ("auth: reuse global_id when requesting tickets")
  and fec31964a12b ("auth: when renewing session, encrypt ticket")
  added the reclaim mechanism based strictly on auth tickets, while
  commit 5eeb711b6b2b ("auth: change server side negotiation a bit")
  allowed the client to provide global_id in the initial handshake.
  These changes didn't get reconciled and as a result a malicious
  client can assign itself any global_id of its choosing by simply
  passing something other than 0 in MAuth message or AUTH_REQUEST
  frame and not even bother supplying any ticket.  This includes
  getting a global_id that is being used by another client.

- (3) was broken in 2019 with addition of support for msgr2, where
  the new auth ticket ends up being shared unencrypted.  However the
  root cause is deeper and a malicious client can coerce msgr1 into
  the same.  This also goes back to 2009 and is addressed in patch
  "auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys".

Because (2) has never been enforced, no one noticed when (1) got
broken and we began to rely on this flaw for normal operation in
the face of reconnects due to network hiccups or otherwise.  As of
today, only pre-luminous userspace clients and kernel clients are
not exercising it on a daily basis.

Bump CephXAuthenticate version and use a dummy v3 to distinguish
between legacy clients that don't (may not) include their auth ticket
and new clients.  For new clients, unconditionally disallow claiming
global_id without a corresponding auth ticket.  For legacy clients,
introduce a choice between permissive (current behavior, default for
the foreseeable future) and enforcing mode.

If the reclaim is disallowed, return EACCES.  While MonClient does
have some provision for global_id changes and we could conceivably
implement enforcement by handing out a fresh global_id instead of
the provided one, those code paths have never been tested and there
are too many ways a sudden global_id change could go wrong.
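
The decision described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the monitor's code: the function, its parameters, and the idea of passing in an already validated ticket's global_id are all hypothetical.

```python
import errno


def check_global_id_reclaim(client_version, provided_global_id,
                            ticket_global_id, allow_insecure_reclaim):
    """Return 0 to allow the connection, -EACCES to deny it.

    client_version         -- CephXAuthenticate version sent by the client
    provided_global_id     -- global_id from the initial handshake (0 = none)
    ticket_global_id       -- global_id from a *validated* old auth ticket,
                              or None if no valid ticket was presented
    allow_insecure_reclaim -- True in permissive mode, False in enforcing mode
    """
    if provided_global_id == 0:
        # Nothing is being reclaimed; a fresh global_id gets assigned.
        return 0
    if ticket_global_id is not None and ticket_global_id == provided_global_id:
        # Valid ticket proving previous possession of this global_id.
        return 0
    if client_version >= 3:
        # New clients must always back the claim with a ticket.
        return -errno.EACCES
    # Legacy clients: depends on permissive vs enforcing mode.
    return 0 if allow_insecure_reclaim else -errno.EACCES
```

The sketch returns an error rather than handing out a fresh global_id, matching the reasoning in the paragraph above.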

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit abebd643cc60fa8a7cb82dc29a9d5041fb3c3d36)

Conflicts:
src/auth/cephx/CephxProtocol.h [ bufferlist vs
  ceph::buffer::list ]
src/auth/cephx/CephxServiceHandler.h [ ditto ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

auth/cephx: make cephx_decode_ticket() take a const ticket_blob

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6b860684c6e59b11c727206819805f89f0518575)

auth/AuthServiceHandler: keep track of global_id and whether it is new

AuthServiceHandler already has global_id field, but it is unused.
Revive it and let the handler know whether global_id is newly assigned
by the monitor or provided by the client.

Lift the setting of entity_name into AuthServiceHandler.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b50b6abd60e730176a7ef602bdd25d789a3c467d)

Conflicts:
src/auth/cephx/CephxServiceHandler.cc [ bufferlist vs
ceph::buffer::list ]
src/auth/cephx/CephxServiceHandler.h [ ditto ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

auth/AuthServiceHandler: build_cephx_response_header() is cephx-specific

Make the one in CephxServiceHandler private and drop the stub in
AuthNoneServiceHandler.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 49cba02a750d4c1ab68399401f0c04f9c9be5b9e)

Conflicts:
src/auth/cephx/CephxServiceHandler.h [ bufferlist vs
ceph::buffer::list ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

auth/AuthServiceHandler: drop unused start_session() args

session_key, connection_secret and connection_secret_required_length
aren't material for start_session() across all three implementations.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c151c9659bdb71f30b520bbd62f91cc009ec51cd)

Conflicts:
src/auth/cephx/CephxServiceHandler.h [ bufferlist vs
ceph::buffer::list ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

mon/MonClient: drop global_id arg from _add_conn() and _add_conns()

Passing anything but MonClient instance's global_id doesn't make
sense.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit a71f6e90d43cca5a79db92ca6a640598796ae7ee)

Conflicts:
src/mon/MonClient.cc [ commit 1e9b18008c5e ("mon: set
MonClient::_add_conn return type to void") not in octopus ]
src/mon/MonClient.h [ ditto ]

mon/MonClient: reset auth state in shutdown()

Destroying AuthClientHandler and not resetting global_id is another
way to get MonClient to send CEPHX_GET_AUTH_SESSION_KEY requests with
CephXAuthenticate::old_ticket not populated. This is particularly
pertinent to get_monmap_and_config() which shuts down the bootstrap
MonClient between retry attempts.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c9b022e07392979e7f9ea6c11484a7dd872cc235)

mon/MonClient: preserve auth state on reconnects

Commit a2eb6ae3fb57 ("mon/monclient: hunt for multiple monitor in
parallel") introduced a regression where auth state (global_id and
AuthClientHandler) was no longer preserved on reconnects.  The ensuing
breakage was quickly noticed and prompted a follow-on fix 8bb6193c8f53
("mon/MonClient: persist global_id across re-connecting").

However, as evident from the subject, the follow-on fix only took
care of the global_id part.  AuthClientHandler is still destroyed
and all cephx tickets are discarded.  A new from-scratch instance
is created for each MonConnection and CEPHX_GET_AUTH_SESSION_KEY
requests end up with CephXAuthenticate::old_ticket not populated.
The bug is in MonClient, so both msgr1 and msgr2 are affected.

This should have resulted in a similar sort of breakage but didn't
because of a much larger bug.  The monitor should have denied the
attempt to reclaim global_id with no valid ticket proving previous
possession of that global_id presented.  Alas, it appears that this
aspect of the cephx protocol has never been enforced.  This is dealt
with in the next patch.

To fix the issue at hand, clone AuthClientHandler into each
MonConnection so that each respective CEPHX_GET_AUTH_SESSION_KEY
request gets a copy of the current auth ticket.
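
The idea of giving each connection its own copy of the current auth state can be sketched like this. The classes below are hypothetical stand-ins, not the MonClient or AuthClientHandler types.

```python
import copy
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AuthState:
    """Hypothetical stand-in for the client's auth handler state."""
    global_id: int = 0
    auth_ticket: Optional[bytes] = None


@dataclass
class MonConnection:
    """Hypothetical stand-in for one connection attempt to a monitor."""
    auth: AuthState


def open_connections(current, count):
    """Create `count` connections, each with a clone of the current auth state."""
    # deepcopy so every connection can present the existing ticket
    # without sharing mutable state with its siblings.
    return [MonConnection(auth=copy.deepcopy(current)) for _ in range(count)]
```

Each connection then carries the existing ticket, rather than starting from an empty, from-scratch handler.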

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 236b536b28482ec9d8b872de03da7d702ce4787b)

Conflicts:
src/mon/MonClient.cc [ commit 1e9b18008c5e ("mon: set
  MonClient::_add_conn return type to void") not in octopus ]

mon/MonClient: claim active_con's auth explicitly

Eliminate confusion by moving auth from active_con into MonClient
instead of swapping them.

The existing MonClient::auth can be destroyed right away -- I don't
see why active_con would need it or a reason to delay its destruction
(which is what stashing in active_con effectively does).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit eec24e4d119c57c7eb5119dc0083616a61b33b89)

mon/MonClient: resurrect "waiting for monmap|config" timeouts

This fixes a regression introduced in commit 85157d5aae3d ("mon:
s/Mutex/ceph::mutex/"). Waiting for monmap and config indefinitely
is not just bad UX, it actually masks other more serious bugs.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6faa18e0a8e8efba6bd2978942eb9909b6568d5c)

qa/tasks/ceph.conf: shorten cephx TTL for testing

Rotate tickets frequently to exercise those code paths during testing.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 94df76244798cdc0bafd74c9e5197adb5aa990c0)

Merge pull request #39949 from sebastian-philipp/octopus-remove-18.04_podman

octopus: qa/suites/rados/cephadm: rm ubuntu_18.04_podman

Reviewed-by: Sage Weil <sage@redhat.com>

Merge pull request #40399 from rhcs-dashboard/wip-49971-octopus

octopus: mgr/dashboard: Fix for broken User management role cloning

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>

Merge pull request #40297 from rhcs-dashboard/split-tenant-octopus

octopus: mgr/dashboard: Splitting tenant$user when creating rgw user

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

Merge pull request #40784 from tchaikov/octopus-boost-cmake

octopus: cmake: define BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT globally

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>

rbd-mirror: fix UB while registering perf counters

register_perf_counters was called before m_image_spec initialization
resulting in UB in the perf counters' name.

This moves the register_perf_counters() call to the init function
after the m_image_spec initialization.

Fixes: https://tracker.ceph.com/issues/49959
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit 5e3b9d29b3a81923fed51248aa21749dbecfcd73)

cmake: define BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT globally

turns out we also need it for compiling librados tests with libboost
1.75, so just define it globally

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7ce3ee6f346889d4d87d6424c6a1ad18badd139b)

Conflicts:
src/CMakeLists.txt
src/librbd/CMakeLists.txt: trivial resolution

cmake: define BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT for rgw tests

otherwise unittest_rgw_iam_policy does not compile with boost v1.75

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 36d2f006c6cf309d60857ce85325489865e8374c)

Merge pull request #40673 from smithfarm/wip-50159-octopus

octopus: test/rgw: test_datalog_autotrim filters out new entries

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #40672 from smithfarm/wip-50096-octopus

octopus: rgw: return error when trying to copy encrypted object without key

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #40384 from singuliere/wip-49091-octopus

octopus: rgw/http: add timeout to http client

Reviewed-by: Casey Bodley <cbodley@redhat.com>

test/rgw: test_datalog_autotrim filters out new entries

if other sync activity is racing with test_datalog_autotrim, it can
create new datalog entries after the 'datalog autotrim' command runs

instead of asserting that the datalog is empty after trim, assert that
any entries have a marker larger than the max-marker reported by
'datalog status' before the trim
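
The relaxed assertion can be sketched as below. The entry layout (a dict with a "marker" key) and the assumption that markers compare correctly as strings are illustrative only; the function name is hypothetical.

```python
def only_new_entries_remain(entries, max_marker_before_trim):
    """True if every remaining datalog entry is newer than the trim point.

    entries                -- remaining entries, each assumed to be a dict
                              with a "marker" key
    max_marker_before_trim -- max-marker captured before the trim ran
    """
    # An empty datalog passes trivially; entries created by racing sync
    # activity pass as long as their markers are past the old max-marker.
    return all(e["marker"] > max_marker_before_trim for e in entries)
```

This tolerates entries written by concurrent sync activity while still catching entries that the trim should have removed.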

Fixes: https://tracker.ceph.com/issues/45626
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit abd08f1843642e318d74dfadb0f9cf1f6b86d827)

rgw: return error when trying to copy encrypted object without key

Fixes: https://tracker.ceph.com/issues/48554
Signed-off-by: Ilsoo Byun <ilsoobyun@linecorp.com>
(cherry picked from commit dde1303c92b39daa2d760a110f48dc9655e7765f)

packaging: require ceph-common for immutable object cache daemon

This daemon has a systemd service which starts it with --setuser ceph
--setgroup ceph. "ceph" user and group are created by ceph-common and
won't be there unless ceph-common is installed.

Fixes: https://tracker.ceph.com/issues/50207
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit dc55f0bb43226259068545c6e13c2921d225ddbe)

Merge pull request #40392 from neha-ojha/wip-49964-octopus

octopus: common/options: bluefs_buffered_io=true by default

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #40476 from tchaikov/octopus-pr-38665

octopus: pybind/ceph_argparse.py: use a safe value for timeout

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #40441 from neha-ojha/wip-49990-octopus

octopus: os/bluestore: Make Onode::put/get resilient to split_cache

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #40424 from k0ste/wip-49995-octopus

octopus: common/ipaddr: skip loopback interfaces named 'lo' and test it

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #39219 from k0ste/wip-49004-octopus

octopus: mgr: update mon metadata when monmap is updated

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #40127 from neha-ojha/wip-49761-octopus

octopus: pybind/mgr/balancer/module.py: assign weight-sets to all buckets before balancing

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #40278 from singuliere/wip-48596-octopus

octopus: test: cancelling both noscrub *and* nodeep-scrub

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #40277 from singuliere/wip-49009-octopus

octopus: osd: fix potential null pointer dereference when sending ping

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #40276 from singuliere/wip-49527-octopus

octopus: mon/OSDMonitor: fix safety/idempotency of {set,rm}-device-class

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Sage Weil <sage@redhat.com>

Merge pull request #40275 from singuliere/wip-49730-octopus

octopus: debian/ceph-common.postinst: do not chown cephadm log dirs

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #40274 from singuliere/wip-49795-octopus

octopus: osd: propagate base pool application_metadata to tiers

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #40013 from mfoliveira/wip-49681-octopus

octopus: osd: add osd_fast_shutdown_notify_mon option (default false)

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #39970 from singuliere/wip-48985-octopus

octopus: osd/OSDMap: An empty bucket or OSD is not an error

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>

Merge pull request #39716 from sebastian-philipp/octopus-backport-39373

octopus: mgr/rook: Add timezone info

Reviewed-by: Varsha Rao <varao@redhat.com>

Merge pull request #39487 from ChristinaMeno/wip-49297-octopus

octopus: ceph.spec.in: Enable tcmalloc on IBM Power and Z

Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #40298 from tchaikov/octopus-48381

octopus: mon/ConfigMap: fix stray option leak

Reviewed-by: Sage Weil <sage@redhat.com>

Merge pull request #40534 from tchaikov/octopus-pr-40505

octopus: mgr/PyModule: put mgr_module_path before Py_GetPath()

Reviewed-by: Neha Ojha <nojha@redhat.com>

mgr/PyModule: put mgr_module_path before Py_GetPath()

pip comes with _vendor/progress, so there is a chance of importing the vendored
version of the "progress" module instead of the "progress" mgr module, and
failing to import the latter.

in this change, the order of paths is rearranged so the configured
`mgr_module_path` is put before the return value of `Py_GetPath()`.
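
Why the order matters can be shown with a small model of first-match module lookup. Both helpers and the directory contents are hypothetical; real import resolution is more involved.

```python
import os


def build_module_search_path(mgr_module_path, default_py_path):
    """Join the search path with the mgr module directory first."""
    return os.pathsep.join([mgr_module_path, default_py_path])


def first_provider(module, search_path, dir_contents):
    """Return the first directory on the path that provides `module`.

    dir_contents maps a directory to the set of module names it holds
    (a stand-in for looking at the filesystem).
    """
    for directory in search_path.split(os.pathsep):
        if module in dir_contents.get(directory, ()):
            return directory
    return None
```

With the mgr directory first, a "progress" module that exists in both locations resolves to the mgr one; with the order reversed, the other copy wins.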

Fixes: https://tracker.ceph.com/issues/50058
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 8638f526a9d04c3dfd758073980d709165070336)

Conflicts:
src/mgr/PyModule.cc: trivial resolution

Merge pull request #40492 from tchaikov/octopus-flake8

octopus: pybind/mgr/dashboard: bump flake8 to 3.9.0

Reviewed-by: Alfonso Martínez <almartin@redhat.com>

pybind/mgr/dashboard: bump up requests to 2.25.1

requests 2.20 is not compatible with urllib3 v1.25.2 and up. this causes
incompatibility with other python modules. for instance, we
now have the following error:

ERROR: pip's dependency resolver does not currently take into account
all the packages that are installed. This behaviour is the source of the
following dependency conflicts.
botocore 1.20.14 requires urllib3<1.27,>=1.25.4, but you have urllib3
1.24.3 which is incompatible.

see also https://github.com/psf/requests/pull/5092

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 8bf07cd1408d0f407ef5e32717bfea159188670d)

mgr/dashboard: update pylint to 2.6.0

* Update pylint to 2.6.0.
* Fix pylint issues.

Fixes: https://tracker.ceph.com/issues/47647
Signed-off-by: Volker Theile <vtheile@suse.com>
(cherry picked from commit 298c91958a41674a928d53f010b20f174f16d68f)

Conflicts:
src/pybind/mgr/dashboard/requirements-lint.txt
src/pybind/mgr/dashboard/services/ceph_service.py
src/pybind/mgr/dashboard/services/ganesha.py
src/pybind/mgr/dashboard/services/rgw_client.py
src/pybind/mgr/dashboard/tests/test_access_control.py
src/pybind/mgr/dashboard/tests/test_ganesha.py
src/pybind/mgr/dashboard/tests/test_iscsi.py
src/pybind/mgr/dashboard/tests/test_rgw.py
src/pybind/mgr/dashboard/tests/test_settings.py

admin/build-doc: stop passing --use-feature=2020-resolver to pip

to silence the warning of

WARNING: --use-feature=2020-resolver no longer has any effect, since it is now the default dependency resolver in pip. This will become an error in pip 21.0.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5cb9d5458745046aaea58cf4af50579925fbb1d0)

Conflicts:
admin/build-doc: trivial resolution

pybind/mgr/dashboard: bump flake8 to 3.9.0

to address the failure of

ERROR: Cannot install -r requirements-lint.txt (line 2) and -r requirements-lint.txt (line 8) because these package versions have conflicting dependencies.

The conflict is caused by:
    flake8 3.8.4 depends on pycodestyle<2.7.0 and >=2.6.0a1
    autopep8 1.5.6 depends on pycodestyle>=2.7.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

also, loosen the version of pytest:

The conflict is caused by:
    The user requested pytest<4
    The user requested pytest<4
    pytest-cov 2.11.1 depends on pytest>=4.6

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency
   conflict

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 152964ca360293d9accd18f435efcd66d145063e)

pybind/ceph_argparse.py: use a safe value for timeout

we have reports that on arm32 machines, it timed out immediately, so
to prevent an int overflow, use a safer value instead of
(1 << (32 - 1)) - 1.
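
A rough sketch of the problem and of one way to pick a safer value follows. The source does not say which conversion overflowed on arm32; scaling seconds to milliseconds is only one plausible example, and the cap of one day and the helper names are assumptions.

```python
INT32_MAX = (1 << 31) - 1

# The original "effectively infinite" timeout, in seconds.
NAIVE_TIMEOUT = (1 << (32 - 1)) - 1

# Hypothetical cap: long enough in practice, far from any 32-bit limit.
SAFE_TIMEOUT_CAP = 24 * 60 * 60


def fits_in_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return -INT32_MAX - 1 <= value <= INT32_MAX


def safe_timeout(requested=None, cap=SAFE_TIMEOUT_CAP):
    """Clamp a requested timeout (seconds) into the range [0, cap]."""
    if requested is None or requested > cap:
        return cap
    return max(0, requested)
```

The naive value itself fits in 32 bits, but any scaling of it (for example to milliseconds) does not, whereas the capped value leaves ample headroom.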

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e59693fd96c34672c4c743514bd173fc70a3a544)

Merge pull request #39914 from mgfritch/octopus-backport-37764-39739

octopus: cephadm: run containers using `--init`

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #40138 from neha-ojha/wip-49402-octopus

octopus: qa/suites/rados/singleton: whitelist MON_DOWN when injecting msgr errors

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #40009 from mgfritch/octopus-backport-39825

octopus: mgr/cephadm: alias rgw-nfs -> nfs

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>

Merge pull request #39940 from smithfarm/wip-49663-octopus

octopus: src/global/signal_handler.h: fix preprocessor logic for alpine

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #39922 from smithfarm/wip-49636-octopus

octopus: mgr/telemetry: check if 'ident' channel is active

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>

Merge pull request #39919 from smithfarm/wip-49530-octopus

octopus: crush/CrushWrapper: update shadow trees on update_item()

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #39884 from singuliere/wip-49386-octopus

octopus: os/bluestore/BlueFS: use iterator_impl::copy instead of bufferlist::c_str() to avoid bufferlist rebuild

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #37972 from callithea/wip-48131-octopus

octopus: mgr/dashboard: additional logging to SMART data retrieval

Reviewed-by: Patrick Seidensal <pnawracay@suse.com>

os/bluestore: acquire proper lock in split_cache()

Fixes: https://tracker.ceph.com/issues/49900
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 35a3f7be8f2f204ad3b5e720d0534ca3e2a8587c)

os/bluestore: Make Onode::put/get resilient to split_cache

In
    OnodeCacheShard* ocs = c->get_onode_cache();
    std::lock_guard l(ocs->lock);
split_cache might change the collection's OnodeCacheShard while we wait for
the lock. The Onode is then added to the wrong OnodeCacheShard, and we will
later operate (at least once) on a different OnodeCacheShard than the one we
hold the lock for. The _trim and split_cache functions are particularly
sensitive to this, as they iterate over elements.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 343b049a1328d39a69a8c4c9e9cb93ac6ac77280)
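
The race described above can be sketched with a lock-then-recheck loop. This
is a minimal Python illustration of the general pattern, not the actual
BlueStore C++ fix; the classes below are simplified stand-ins, and `add_onode`
is a hypothetical helper.

```python
import threading


class OnodeCacheShard:
    """Simplified stand-in for a cache shard guarded by its own lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.onodes = set()


class Collection:
    """Simplified stand-in for a collection that points at one shard."""

    def __init__(self, shard):
        self._shard = shard

    def get_onode_cache(self):
        return self._shard

    def split_cache(self, new_shard):
        # Simplified: only repoints the collection at another shard.
        self._shard = new_shard


def add_onode(collection, onode):
    """Add onode to the shard the collection points at once the lock is held."""
    while True:
        ocs = collection.get_onode_cache()
        with ocs.lock:
            if collection.get_onode_cache() is not ocs:
                # The shard changed while we waited for the lock: retry.
                continue
            ocs.onodes.add(onode)
            return ocs
```

Re-reading the shard pointer after acquiring the lock ensures the element is
added to the shard whose lock is actually held.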

mgr: add mon metadata using type of "mon"

this change addresses a regression introduced by
c037f4cb5d7436879d58c34748ef516b5269781f

also remove the "P" before the json command.

see also: https://tracker.ceph.com/issues/48905

Fixes: https://tracker.ceph.com/issues/49661
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 8fc290bfba4d71a60d30c2374ce4bcba37e649de)

mgr: update mon metadata when monmap is updated

there is a chance that some monitors are updated / upgraded in a single
monmap update without being removed from the cluster state's metadata first.
without this change, we would not update the metadata associated with
those monitors, so the mgr modules which consume the metadata would not be
updated accordingly and would keep reporting stale information.

in this change, we always update the metadata associated with all monitors
included in the latest monmap. multiple "mon metadata" commands are sent
to the monitors for retrieving their updated metadata, instead of sending a
single one, so that we can reuse "MetadataUpdate" to update the metadata
of a given daemon. as the number of monitors in a typical cluster is
relatively small, and the frequency of monmap updates is low, this
overhead should be fine.

unlike other places where we ask mon for metadata in Mgr class, the code
sending the mon command for updated monitor metadata is located outside of
the `cluster_state.with_monmap()` block. the reason is that `with_monmap()`
is guarded by the monc_lock under the hood, while `start_mon_command()`
also needs to acquire the monc_lock, which is not a recursive lock. so we
have to do this outside of the `with_monmap()` block.

Fixes: https://tracker.ceph.com/issues/48905
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit c037f4cb5d7436879d58c34748ef516b5269781f)

backport:
- path: src/mgr/Mgr.cc
  comment: octopus doesn't declare `fmt`
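
The locking constraint described in the commit above can be sketched as
follows. This is a Python illustration under stated assumptions, not the
Mgr.cc code: `ClusterState` here is a simplified stand-in, the `sent` list
replaces real message sending, and the command dictionary layout is assumed.

```python
import threading


class ClusterState:
    """Simplified stand-in; one non-recursive lock guards everything."""

    def __init__(self, mon_names):
        self._monc_lock = threading.Lock()  # not re-entrant
        self._mon_names = list(mon_names)

    def with_monmap(self, callback):
        # Runs the callback while holding the lock.
        with self._monc_lock:
            return callback(self._mon_names)

    def start_mon_command(self, cmd, sent):
        # Needs the same lock, so calling this from inside with_monmap()
        # would deadlock on a non-recursive lock.
        with self._monc_lock:
            sent.append(cmd)


def update_mon_metadata(cluster_state):
    # Step 1: copy the monitor names while the lock is held.
    names = cluster_state.with_monmap(lambda mons: list(mons))
    # Step 2: send one command per monitor after the lock is released.
    sent = []
    for name in names:
        cmd = {"prefix": "mon metadata", "id": name}
        cluster_state.start_mon_command(cmd, sent)
    return sent


print(update_mon_metadata(ClusterState(["a", "b", "c"])))
```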

common/ipaddr: also skip just `lo`

Skip interfaces named 'lo' or of the form 'lo:0', 'lo:1'. This
brings back the original behavior from b6d0fc9e0e515e50894c08217d688a8c94db7570.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: https://tracker.ceph.com/issues/49938
(cherry picked from commit 6147c0917157efd2d35610e759685656a4989abb)

test_ipaddr: check that we correctly skip loopback

We should skip devices named 'lo' or of the form 'lo:0' regardless
of their IP address.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/49938
(cherry picked from commit 780125d1ed93cd7b17172752b3e76186a524103b)
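
The name check described in the two loopback commits above can be sketched
like this. It is a Python illustration only (the real code is C++), and the
function names are hypothetical.

```python
def is_loopback_name(ifname):
    """True for 'lo' itself and for aliases of the form 'lo:N'."""
    return ifname == "lo" or ifname.startswith("lo:")


def candidate_interfaces(ifnames):
    """Drop loopback devices by name, regardless of their IP address."""
    return [name for name in ifnames if not is_loopback_name(name)]


print(candidate_interfaces(["lo", "lo:0", "lo:1", "eth0", "bond0"]))
```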

Merge pull request #40406 from tchaikov/octopus-pr-40400

octopus: run-make-check.sh: let ctest generate XML output

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

run-make-check.sh: let ctest generate XML output

to enable XUnit plugin of jenkins to consume the ctest output and
publish it in the dashboard, we need to

* let ctest generate XML output instead of plain text output
* do not fail the test if any test case fails. this allows the publisher
  to do its job by checking the XML output.
* prevent ctest from compressing the output. see
  https://issues.jenkins.io/browse/JENKINS-21737

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 48ba39987d3958531589d7969750ea749e6a6d30)
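
The three points in the commit above map onto a ctest invocation roughly as
sketched below. The exact flags used by run-make-check.sh are not quoted in
the commit message, so this command line is an assumption; the helper names
are hypothetical.

```python
import subprocess


def build_ctest_command(jobs=1):
    """Assemble a ctest command line that writes uncompressed XML results."""
    return [
        "ctest",
        "-j", str(jobs),
        "-T", "Test",             # run the Test step, which writes Test.xml
        "--no-compress-output",   # keep test output readable for the publisher
        "--output-on-failure",
    ]


def run_ctest(build_dir, jobs=1):
    """Run ctest without raising when individual test cases fail."""
    # check=False: a non-zero exit code is returned instead of raised, so a
    # later publishing step can still inspect the XML output.
    result = subprocess.run(build_ctest_command(jobs), cwd=build_dir, check=False)
    return result.returncode


print(" ".join(build_ctest_command(jobs=4)))
```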

mgr/dashboard: Fix for broken User management role cloning

Cloning a role in user management fails with a 415 error.

Fixes: https://tracker.ceph.com/issues/49880
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 5a105bea820d4154aeff039e08cf9ff0c96dc95e)

common/options: bluefs_buffered_io=true by default

Enable bluefs_buffered_io again because it makes a huge user-visible
improvement in metadata intensive scenarios, such as but not limited to
PG deletion.

In our environment, deleting PGs from 4 hybrid OSDs (sharing one SATA SSD
block.db) saturates the block.db at 350MB/s reads and causes slow reqs and
flapping on the OSDs. Those OSDs have a 3GB osd_memory_target.
Enabling bluefs_buffered_io drops the SSD IO down to <1MB/s and the OSDs
are performant again. (The underlying PG deletion inefficiency is being
solved separately, but the page cache is so much more effective than
the bluestore cache in this scenario.)

Lastly, remove the comment about swap. We should separately advise
operators to disable swap on OSD machines, as it is much better in
our experience to OOM and restart than to chug along swapping.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/45765
Related-to: https://tracker.ceph.com/issues/47044
(cherry picked from commit 5ec8e8e63d409860c35e24a192090ac2b70af8f6)

rgw/http: add timeout to http client

also, prevent "Expect: 100-continue" from being sent
when not needed

Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
(cherry picked from commit dd49cc83078c7e268ce3de7ab0bfbf3035ed5d50)

Merge pull request #39360 from kamoltat/wip-octupus-del-period-arg

octopus: qa/tasks/mgr/test_progress: fix wait_until_equal

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #40225 from kamoltat/wip-fix-39289-incomplete-backport

octopus: qa/tasks/mgr/test_progress.py: remove calling of _osd_in_out_completed_events_count()

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #40001 from rhcs-dashboard/wip-49703-octopus

octopus: mgr/dashboard: fix dashboard instance ssl certificate functionality

Reviewed-by: Nizamudeen A <nia@redhat.com>

qa/tasks/mgr/test_progress: fix wait_until_equal

Octopus ceph_test_case doesn't have a period arg,
so remove it from the wait_until_equal call. Also increase
the time to wait for complete events by using RECOVERY_PERIOD
instead of EVENT_CREATION_PERIOD.

Not needed in master because only octopus and nautilus
lack the period argument of the wait_until_equal() function
called in qa/tasks/mgr/test_progress.py.

Fixes: https://tracker.ceph.com/issues/48824
Signed-off-by: Kamoltat <ksirivad@redhat.com>

PendingReleaseNotes: document option osd_fast_shutdown_notify_mon

Let's add the ``osd_fast_shutdown_notify_mon`` option to PendingReleaseNotes
so it is documented.

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit 7f5aaef5d4585d74535192192c56549bd023bc1f)

Conflicts:
PendingReleaseNotes
- Move snippet into new 15.2.11 section.

osd: add osd_fast_shutdown_notify_mon option (default false)

The osd_fast_shutdown option may cause the cluster log to receive
too many entries of 'osd.X reported immediately failed by osd.Y',
depending on cluster scale.

This might be an issue for LMA stacks/tools that check ceph logs
for failed lines: they then require additional logic to filter out
an intended OSD (fast) shutdown, which might not be possible and
may require an admin to analyze.

So, add the osd_fast_shutdown_notify_mon option so that the OSD also
tells the monitor it is shutting down under osd_fast_shutdown (as is
already done in slow/non-fast shutdown).

This introduces minimal delay (the ack from the mon is required
to prevent the messages), and addresses the cluster log issue.
Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time waiting for the monitor ack to arrive.

Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit c75734729764868c5c501722fc8de08dac9ebd4a)
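
The interaction of the options described in the commit above can be sketched
as follows. This is a Python illustration of the described behavior only, not
the OSD code; the function name, the step strings, and the use of a
`threading.Event` to model the monitor ack are all assumptions.

```python
import threading


def osd_shutdown_steps(fast_shutdown, notify_mon, mon_ack, mon_shutdown_timeout):
    """Return the ordered steps an OSD would take under the given options."""
    steps = []
    # The monitor is told on a slow shutdown, and on a fast shutdown only
    # when the notify option is enabled.
    if not fast_shutdown or notify_mon:
        steps.append("tell monitor: shutting down")
        # Wait for the ack, but never longer than the configured timeout.
        if mon_ack.wait(timeout=mon_shutdown_timeout):
            steps.append("monitor ack received")
        else:
            steps.append("gave up waiting for ack")
    steps.append("exit immediately" if fast_shutdown else "full teardown")
    return steps


ack = threading.Event()
ack.set()  # pretend the monitor has already acknowledged
print(osd_shutdown_steps(True, True, ack, mon_shutdown_timeout=0.1))
print(osd_shutdown_steps(True, False, threading.Event(), mon_shutdown_timeout=0.1))
```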

mon/ConfigMap: fix stray option leak

The const Option* needs to remain alive only until the next clear(). Keep
the reference in ConfigMap and clean it up then.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9397a46aec88e287d56a6286ed4319f65d9c1f31)

Fixes: https://tracker.ceph.com/issues/48381
Conflicts:
src/mon/ConfigMap.h: trivial resolution

mgr/dashboard: Splitting tenant$user when creating rgw user

Fixes: https://tracker.ceph.com/issues/47378
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 7f4387d34fd073a3b0d8c828fecdc5df4b498122)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-user-details/rgw-user-details.component.html
    -  Accepted the current changes and pasted the new change
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-user-form/rgw-user-form.component.ts
    -  Accepted incoming change and changed const modalRef =
       this.ModalService.show(RgwUserSubuserModalComponent); to const modalRef = this.bsModalService.show(RgwUserSubuserModalComponent);
    -  Made some modification in getUID() function to adapt with octopus
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-user-list/rgw-user-list.component.ts
    -  Accepted the current change and pasted the new change. Changed
       the $localize to this.i18n to match the octopus way

Merge pull request #40286 from tchaikov/octopus-pr-40272

octopus: install-deps.sh: remove existing ceph-libboost of different version

Reviewed-by: David Galloway <dgallowa@redhat.com>

install-deps.sh: remove existing ceph-libboost of different version

we install different versions of precompiled ceph-libboost packages
for different branches when building and testing them on ubuntu test
nodes. for instance,

- nautilus, octopus: v1.72
- pacific: v1.73

they share the same set of test nodes. and these ceph-libboost packages
conflict with each other, because they install files to the same places.

in order to avoid the conflict, we should uninstall existing packages
before installing a different version of the ceph-libboost packages.

ceph-libboost${version}-dev is a package providing the shared headers of
boost library, so, in this change we check if it is installed before
returning or removing the existing packages.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 939b147a55192c21e98d21cb380d0ec0b2ca84d5)

Conflicts:
install-deps.sh: use 1.72
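
The check described in the install-deps.sh commit above can be sketched as a
small decision function. This is a Python illustration (the real script is
shell); `plan_boost_cleanup` and the package names other than the
ceph-libboost${version}-dev pattern are hypothetical.

```python
def plan_boost_cleanup(installed, wanted_version):
    """Return the ceph-libboost packages to remove before installing.

    If the -dev package of the wanted version is already present, nothing
    needs to be removed and installation can be skipped.
    """
    dev_pkg = f"ceph-libboost{wanted_version}-dev"
    if dev_pkg in installed:
        return []
    # Otherwise remove every existing ceph-libboost package, since packages
    # of different versions install files to the same places.
    return sorted(p for p in installed if p.startswith("ceph-libboost"))


# A node last used for a branch that needs 1.73, now building one that needs 1.72.
installed = {"ceph-libboost1.73-dev", "ceph-libboost-atomic1.73.0", "gcc"}
print(plan_boost_cleanup(installed, "1.72"))
```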

rpm: re-disable SUSE lttng build on s390x

This partially reverts 2b1e646f7aade3135a98c505111ac7ebef5e93a6 which
mistakenly changed a line inside an "%if 0%{?suse_version}" conditional.

Fixes: 2b1e646f7aade3135a98c505111ac7ebef5e93a6
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit ffd202a08619fc535df593eb41c0769577a1586a)

test: cancelling both noscrub *and* nodeep-scrub

as part of osd-scrub-test.sh.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 43b1129030823817e0b7a21c85de5d3da841510a)