git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 years ago vstart.sh: disable "auth_allow_insecure_global_id_reclaim" 40959/head
Kefu Chai [Thu, 15 Apr 2021 13:07:53 +0000 (21:07 +0800)]
vstart.sh: disable "auth_allow_insecure_global_id_reclaim"

to silence the health warning of "mons are allowing insecure global_id
reclaim", which prevents the cluster from being active+clean. a couple of
tests expect a warning-free cluster before they start.

this option is enabled by default to appease old clients, but for most
upstream testing we can just disable it.

Fixes: https://tracker.ceph.com/issues/50374
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 77a8376d0731c24e7bbf24523d3d7450e9f978af)

4 years ago Merge pull request #40270 from kotreshhr/wip-49903-nautilus
Yuri Weinstein [Tue, 20 Apr 2021 18:11:55 +0000 (11:11 -0700)]
Merge pull request #40270 from kotreshhr/wip-49903-nautilus

nautilus: mgr/volumes: Retain suid guid bits in clone

Reviewed-by: Ramana Raja <rraja@redhat.com>

4 years ago Merge branch 'nautilus-saved' into nautilus
Ilya Dryomov [Tue, 20 Apr 2021 08:56:25 +0000 (10:56 +0200)]
Merge branch 'nautilus-saved' into nautilus

4 years ago 14.2.20 v14.2.20
Jenkins Build Slave User [Mon, 19 Apr 2021 14:11:15 +0000 (14:11 +0000)]
14.2.20

4 years ago auth/cephx: make KeyServer::build_session_auth_info() less confusing
Ilya Dryomov [Thu, 15 Apr 2021 13:18:58 +0000 (15:18 +0200)]
auth/cephx: make KeyServer::build_session_auth_info() less confusing

The second KeyServer::build_session_auth_info() overload is used only
by the monitor, for mon <-> mon authentication.  The monitor passes in
service_secret (mon secret) and secret_id (-1).  The TTL is irrelevant
because there is no rotation.

However the signature doesn't make it obvious.  Clarify that
service_secret and secret_id are input parameters and info is the only
output parameter.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6f12cd3688b753633c8ff29fb3bd64758f960b2b)

4 years ago auth/cephx: cap ticket validity by expiration of "next" key
Ilya Dryomov [Thu, 15 Apr 2021 07:48:13 +0000 (09:48 +0200)]
auth/cephx: cap ticket validity by expiration of "next" key

If auth_mon_ticket_ttl is increased severalfold, as done in
commit 522a52e6c258 ("auth/cephx: rotate auth tickets less often"),
active clients eventually get stuck because the monitor sends out an
auth ticket with a bogus validity.  The ticket is secured with the
"current" secret that is scheduled to expire according to the old TTL,
but the validity of the ticket is set to the new TTL.  As a result,
the client simply doesn't attempt to renew, letting the secrets rotate
potentially more than once.  When that happens, the client first hits
auth authorizer errors as it tries to renew service tickets and when
it finally gets to renewing the auth ticket, it hits the insecure
global_id reclaim wall.

Cap TTL by expiration of "next" key -- the "current" key may be
milliseconds away from expiration and still be used, legitimately.
Do it in KeyServerData alongside key rotation code and propagate the
capped TTL to the upper layer.
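
The capping described above can be sketched as follows. This is only an illustration, not the code in KeyServerData: the function name, its parameters, and the numbers in the example are assumptions made for this sketch.

```python
HOUR = 3600.0


def cap_ticket_ttl(now: float, requested_ttl: float, next_key_expires: float) -> float:
    """Return the ticket validity, capped by the expiration of the "next" key.

    All arguments are hypothetical stand-ins: `now` and `next_key_expires`
    are timestamps in seconds, `requested_ttl` is the configured TTL.
    """
    remaining = next_key_expires - now
    # Never hand out a validity that outlives the "next" key.
    return max(0.0, min(requested_ttl, remaining))


# Illustrative numbers: the TTL was raised to 72 hours, but the "next" key
# was scheduled under the old TTL and expires 20 hours from now.
capped = cap_ticket_ttl(now=0.0, requested_ttl=72 * HOUR, next_key_expires=20 * HOUR)
print(capped / HOUR)
```

With such a cap the client is told a validity that matches the keys actually in rotation, so it attempts renewal before the secrets have rotated away.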

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 370c9b13970d47a55b1b20ef983c6f01236c9565)

Conflicts:
src/auth/cephx/CephxKeyServer.cc [ commit ef3c42cd6481 ("auth:
  EACCES, not EPERM") not in nautilus ]

4 years ago auth/cephx: drop redundant KeyServerData::get_service_secret() overload
Ilya Dryomov [Thu, 15 Apr 2021 07:47:50 +0000 (09:47 +0200)]
auth/cephx: drop redundant KeyServerData::get_service_secret() overload

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 3078af716505ae754723864786a41a6d6af0534c)

4 years ago mgr/volumes: Retain suid/guid bits in subvolume clone 40270/head
Kotresh HR [Thu, 18 Mar 2021 12:54:44 +0000 (18:24 +0530)]
mgr/volumes: Retain suid/guid bits in subvolume clone

Fixes: https://tracker.ceph.com/issues/49882
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 92dc982318fa7d49c3185615b84a7a7764c6ed42)

Conflicts:
  qa/tasks/cephfs/test_volumes.py: A few of the test cases are not present
      in octopus, hence the conflicts.

4 years ago pybind/cephfs: Add lchmod python binding
Kotresh HR [Thu, 18 Mar 2021 12:51:05 +0000 (18:21 +0530)]
pybind/cephfs: Add lchmod python binding

Fixes: https://tracker.ceph.com/issues/49882
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit b2375adce085e98bef521422441b80a945e38c80)

Conflicts:
  src/pybind/cephfs/mock_cephfs.pxi : Not present in octopus
  src/pybind/cephfs/c_cephfs.pxd : Not present in octopus
  src/pybind/cephfs/cephfs.pyx : A few of the fops are not part of octopus;
      they got pulled in as part of this backport
  src/test/pybind/test_cephfs.py : A few of the fops are not part of
      octopus; they got pulled in as part of this backport. Added missing
      stat import.

4 years ago client/libcephfs: Add lchmod
Kotresh HR [Thu, 18 Mar 2021 12:51:05 +0000 (18:21 +0530)]
client/libcephfs: Add lchmod

Fixes: https://tracker.ceph.com/issues/49882
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit bb1fd87e3bc45b20f377438cddde6c6307299a29)

4 years ago qa/standalone: default to disable insecure global id reclaim
Sage Weil [Sun, 28 Mar 2021 22:07:57 +0000 (18:07 -0400)]
qa/standalone: default to disable insecure global id reclaim

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 72c4fc75ad301980baebc7789ed6391444057e5b)

4 years ago qa/tasks/ceph[adm].conf[.template]: disable insecure global_id reclaim health alerts
Sage Weil [Fri, 26 Mar 2021 22:08:46 +0000 (18:08 -0400)]
qa/tasks/ceph[adm].conf[.template]: disable insecure global_id reclaim health alerts

Turn these off everywhere for our tests so they don't interfere with our health checks.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9f6fd4fe563c9cd4cf65316921d511b677c972e4)

Conflicts:
qa/tasks/cephadm.conf [ no cephadm in nautilus ]

4 years ago mon/HealthMonitor: raise AUTH_INSECURE_GLOBAL_ID_RENEWAL[_ALLOWED]
Sage Weil [Thu, 25 Mar 2021 22:07:53 +0000 (18:07 -0400)]
mon/HealthMonitor: raise AUTH_INSECURE_GLOBAL_ID_RENEWAL[_ALLOWED]

Two new alerts:

- AUTH_INSECURE_GLOBAL_ID_RENEWAL_ALLOWED if we are allowing clients to reclaim
global_ids in an insecure manner (for backwards compatibility until
clients are upgraded)

- AUTH_INSECURE_GLOBAL_ID_RENEWAL if there are currently clients connected that
do not know how to securely renew their global_id, as exposed by
auth_expose_insecure_global_id_reclaim=true.  The client auth names and IPs
are listed in the alert details (up to a limit, at least).

The docs recommend that operators mute these alerts instead of silencing them,
but we still include options that allow the alerts to be disabled entirely.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 18b343b06e5dd904af425dc99e2c848e12f3b552)

Conflicts:
doc/rados/operations/health-checks.rst [ MON_DISK_* alerts
  present but not documented in nautilus; "ceph health mute"
  not in nautilus -- silencing temporarily is not possible ]
src/mon/HealthMonitor.cc [ commits e4bf716bfa07 ("mon: store
  a reference as member variable") and d0eb22f3ba55
  ("mon/health_checks: associate a count with health_alert_t")
  not in nautilus ]

4 years ago auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys
Ilya Dryomov [Tue, 2 Mar 2021 14:09:26 +0000 (15:09 +0100)]
auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys

When handling CEPHX_GET_AUTH_SESSION_KEY requests from nautilus+
clients, ignore CEPH_ENTITY_TYPE_AUTH in CephXAuthenticate::other_keys.
Similarly, when handling CEPHX_GET_PRINCIPAL_SESSION_KEY requests,
ignore CEPH_ENTITY_TYPE_AUTH in CephXServiceTicketRequest::keys.
These fields are intended for requesting service tickets, the auth
ticket (which is really a ticket granting ticket) must not be shared
this way.

Otherwise we end up sharing an auth ticket that a) isn't encrypted
with the old session key even if needed (should_enc_ticket == true)
and b) has the wrong validity, namely auth_service_ticket_ttl instead
of auth_mon_ticket_ttl.  In the CEPHX_GET_AUTH_SESSION_KEY case, this
undue ticket immediately supersedes the actual auth ticket already
encoded in the same reply (the reply frame ends up containing two auth
tickets).
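
Ignoring CEPH_ENTITY_TYPE_AUTH in a requested-keys bitmask amounts to clearing one bit before the mask is used. The sketch below is a rough illustration only: the numeric values of the entity type constants are assumptions for this example, and the helper name is hypothetical.

```python
# Assumed bit values for illustration; check the Ceph headers for the real ones.
CEPH_ENTITY_TYPE_MON = 0x01
CEPH_ENTITY_TYPE_MDS = 0x02
CEPH_ENTITY_TYPE_OSD = 0x04
CEPH_ENTITY_TYPE_AUTH = 0x20


def strip_auth_from_requested_keys(keys: int) -> int:
    """Drop the AUTH bit so only service tickets can be requested this way."""
    return keys & ~CEPH_ENTITY_TYPE_AUTH


requested = CEPH_ENTITY_TYPE_OSD | CEPH_ENTITY_TYPE_MDS | CEPH_ENTITY_TYPE_AUTH
print(hex(strip_auth_from_requested_keys(requested)))
```

The remaining bits are untouched, so ordinary service ticket requests behave as before.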

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 05772ab6127bdd9ed2f63fceef840f197ecd9ea8)

4 years ago auth/cephx: rotate auth tickets less often
Ilya Dryomov [Mon, 22 Mar 2021 18:16:32 +0000 (19:16 +0100)]
auth/cephx: rotate auth tickets less often

If unauthorized global_id (re)use is disallowed, a client that has
been disconnected from the network long enough for keys to rotate
and its auth ticket to expire (i.e. become invalid/unverifiable)
would not be able to reconnect.

The default TTL is 12 hours, resulting in a 12-24 hour reconnect
window (the previous key is kept around, so the actual window can be
up to double the TTL).  The setting has stayed the same since 2009,
but it also hasn't been enforced.  Bump it to get a 72 hour reconnect
window to cover for something breaking on Friday and not getting fixed
until Monday.
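
The window arithmetic above can be written out as a small helper. This is a sketch of the reasoning only (the function name is hypothetical): because the previous key is kept around, a ticket stays verifiable for somewhere between one and two TTLs.

```python
from typing import Tuple


def reconnect_window(ttl_hours: float) -> Tuple[float, float]:
    """Return the (minimum, maximum) reconnect window for a given TTL.

    The minimum is one TTL; the maximum is two TTLs, since the previous
    key is still accepted after a rotation.
    """
    return (ttl_hours, 2 * ttl_hours)


low, high = reconnect_window(12)
print(f"{low}-{high} hour reconnect window")
```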

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 522a52e6c258932274f0753feb623ce008519216)

4 years ago mon: fail fast when unauthorized global_id (re)use is disallowed
Ilya Dryomov [Thu, 25 Mar 2021 19:59:13 +0000 (20:59 +0100)]
mon: fail fast when unauthorized global_id (re)use is disallowed

When unauthorized global_id (re)use is disallowed, we don't want to
let unpatched clients in because they wouldn't be able to reestablish
their monitor session later, resulting in subtle hangs and disrupted
user workloads.

Denying the initial connect for all legacy (CephXAuthenticate < v3)
clients is not feasible because a large subset of them never stopped
presenting their ticket on reconnects and are therefore compatible with
enforcing mode: most notably all kernel clients but also pre-luminous
userspace clients.  They don't need to be patched and excluding them
would significantly hamper the adoption of enforcing mode.

Instead, force clients that we are not sure about to reconnect shortly
after they go through authentication and obtain global_id.  This is
done in Monitor::dispatch_op() to capture both msgr1 and msgr2, most
likely instead of dispatching mon_subscribe.

We need to let mon_getmap through for "ceph ping" and "ceph tell" to
work.  This does mean that we share the monmap, which lets the client
return from MonClient::authenticate() considering authentication to be
finished and causing the potential reconnect error to not propagate to
the user -- the client would hang waiting for remaining cluster maps.
For msgr1, this is unavoidable because the monmap is sent immediately
after the final MAuthReply.  But for msgr2 this is rare: most of the
time we get to their mon_subscribe and cut the connection before they
process the monmap!

Regardless, the user doesn't get a chance to start a workload since
there is no proper higher-level session at that point.

To help with identifying clients that need patching, add global_id and
global_id_status to "sessions" output.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 08766a17edebb7450cd9b17cc2dc01efc068bb94)

Conflicts:
src/mon/Monitor.cc [ commit e1163b445bbf ("mon: print
  entity_name along with caps to debug log") not in nautilus ]

4 years ago auth/cephx: option to disallow unauthorized global_id (re)use
Ilya Dryomov [Sat, 13 Mar 2021 13:53:52 +0000 (14:53 +0100)]
auth/cephx: option to disallow unauthorized global_id (re)use

global_id is a cluster-wide unique id that must remain stable for the
lifetime of the client instance.  The cephx protocol has a facility to
allow clients to preserve their global_id across reconnects:

(1) the client should provide its global_id in the initial handshake
    message/frame and later include its auth ticket proving previous
    possession of that global_id in CEPHX_GET_AUTH_SESSION_KEY request

(2) the monitor should verify that the included auth ticket is valid
    and has the same global_id and, if so, allow the reclaim

(3) if the reclaim is allowed, the new auth ticket should be
    encrypted with the session key of the included auth ticket to
    ensure authenticity of the client performing reclaim.  (The
    included auth ticket could have been snooped when the monitor
    originally shared it with the client or any time the client
    provided it back to the monitor as part of requesting service
    tickets, but only the genuine client would have its session key
    and be able to decrypt.)
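
Steps (2) and (3) above can be modeled roughly as follows. This is a toy model under stated assumptions, not the cephx implementation: the AuthTicket class, its fields, and both helper functions are hypothetical, and the `valid` flag stands in for "the ticket decrypts and has not expired".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuthTicket:
    global_id: int
    session_key: str
    valid: bool  # stand-in for real ticket verification


def may_reclaim(claimed_global_id: int, old_ticket: Optional[AuthTicket]) -> bool:
    """Step (2): allow the reclaim only with a valid ticket for the same global_id."""
    return (
        old_ticket is not None
        and old_ticket.valid
        and old_ticket.global_id == claimed_global_id
    )


def key_for_new_ticket(claimed_global_id: int,
                       old_ticket: Optional[AuthTicket]) -> Optional[str]:
    """Step (3): on reclaim, encrypt the new ticket with the old session key."""
    if may_reclaim(claimed_global_id, old_ticket):
        return old_ticket.session_key
    return None


genuine = AuthTicket(global_id=4242, session_key="old-session-key", valid=True)
print(may_reclaim(4242, genuine))   # ticket matches the claimed id
print(may_reclaim(4242, None))      # no ticket presented
print(may_reclaim(9999, genuine))   # ticket is for a different id
```

Only a client holding the old session key can decrypt the new ticket, which is what ties the reclaim to the original owner of the global_id.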

Unfortunately, all (1), (2) and (3) have been broken for a while:

- (1) was broken in 2016 by commit a2eb6ae3fb57 ("mon/monclient:
  hunt for multiple monitor in parallel") and is addressed in patch
  "mon/MonClient: preserve auth state on reconnects"

- it turns out that (2) has never been enforced.  When cephx was
  being designed and implemented in 2009, two changes to the protocol
  raced with each other pulling it in different directions: commits
  0669ca21f4f7 ("auth: reuse global_id when requesting tickets")
  and fec31964a12b ("auth: when renewing session, encrypt ticket")
  added the reclaim mechanism based strictly on auth tickets, while
  commit 5eeb711b6b2b ("auth: change server side negotiation a bit")
  allowed the client to provide global_id in the initial handshake.
  These changes didn't get reconciled and as a result a malicious
  client can assign itself any global_id of its choosing by simply
  passing something other than 0 in MAuth message or AUTH_REQUEST
  frame and not even bother supplying any ticket.  This includes
  getting a global_id that is being used by another client.

- (3) was broken in 2019 with addition of support for msgr2, where
  the new auth ticket ends up being shared unencrypted.  However the
  root cause is deeper and a malicious client can coerce msgr1 into
  the same.  This also goes back to 2009 and is addressed in patch
  "auth/cephx: ignore CEPH_ENTITY_TYPE_AUTH in requested keys".

Because (2) has never been enforced, no one noticed when (1) got
broken and we began to rely on this flaw for normal operation in
the face of reconnects due to network hiccups or otherwise.  As of
today, only pre-luminous userspace clients and kernel clients are
not exercising it on a daily basis.

Bump CephXAuthenticate version and use a dummy v3 to distinguish
between legacy clients that don't (may not) include their auth ticket
and new clients.  For new clients, unconditionally disallow claiming
global_id without a corresponding auth ticket.  For legacy clients,
introduce a choice between permissive (current behavior, default for
the foreseeable future) and enforcing mode.

If the reclaim is disallowed, return EACCES.  While MonClient does
have some provision for global_id changes and we could conceivably
implement enforcement by handing out a fresh global_id instead of
the provided one, those code paths have never been tested and there
are too many ways a sudden global_id change could go wrong.
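
The resulting policy for a client that claims an existing global_id can be summarized as a small decision function. This is a hedged sketch, not the monitor's code: the function and parameter names are hypothetical, with `allow_insecure_reclaim` standing in for the permissive/enforcing choice.

```python
import errno


def check_global_id_claim(authenticate_version: int,
                          has_valid_ticket: bool,
                          allow_insecure_reclaim: bool) -> int:
    """Return 0 if the claim is allowed, -EACCES otherwise."""
    if has_valid_ticket:
        return 0
    if authenticate_version >= 3:
        # New clients: never allowed to claim a global_id without a ticket.
        return -errno.EACCES
    # Legacy clients: permissive mode lets them through, enforcing mode does not.
    return 0 if allow_insecure_reclaim else -errno.EACCES


print(check_global_id_claim(3, False, True))
print(check_global_id_claim(2, False, True))
print(check_global_id_claim(2, False, False))
```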

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit abebd643cc60fa8a7cb82dc29a9d5041fb3c3d36)

Conflicts:
src/auth/AuthServiceHandler.h [ bufferlist vs
  ceph::buffer::list ]
src/auth/cephx/CephxProtocol.h [ ditto ]
src/auth/cephx/CephxServiceHandler.h [ ditto ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

4 years ago auth/cephx: make cephx_decode_ticket() take a const ticket_blob
Ilya Dryomov [Tue, 30 Mar 2021 09:10:17 +0000 (11:10 +0200)]
auth/cephx: make cephx_decode_ticket() take a const ticket_blob

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6b860684c6e59b11c727206819805f89f0518575)

4 years ago auth/AuthServiceHandler: keep track of global_id and whether it is new
Ilya Dryomov [Tue, 9 Mar 2021 15:33:55 +0000 (16:33 +0100)]
auth/AuthServiceHandler: keep track of global_id and whether it is new

AuthServiceHandler already has global_id field, but it is unused.
Revive it and let the handler know whether global_id is newly assigned
by the monitor or provided by the client.

Lift the setting of entity_name into AuthServiceHandler.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b50b6abd60e730176a7ef602bdd25d789a3c467d)

Conflicts:
src/auth/AuthServiceHandler.h [ bufferlist vs
  ceph::buffer::list ]
src/auth/cephx/CephxServiceHandler.cc [ ditto ]
src/auth/cephx/CephxServiceHandler.h [ ditto ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

4 years ago auth/AuthServiceHandler: build_cephx_response_header() is cephx-specific
Ilya Dryomov [Tue, 9 Mar 2021 13:36:39 +0000 (14:36 +0100)]
auth/AuthServiceHandler: build_cephx_response_header() is cephx-specific

Make the one in CephxServiceHandler private and drop the stub in
AuthNoneServiceHandler.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 49cba02a750d4c1ab68399401f0c04f9c9be5b9e)

Conflicts:
src/auth/cephx/CephxServiceHandler.h [ bufferlist vs
  ceph::buffer::list ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

4 years ago auth/AuthServiceHandler: drop unused start_session() args
Ilya Dryomov [Tue, 9 Mar 2021 13:25:39 +0000 (14:25 +0100)]
auth/AuthServiceHandler: drop unused start_session() args

session_key, connection_secret and connection_secret_required_length
aren't material for start_session() across all three implementations.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c151c9659bdb71f30b520bbd62f91cc009ec51cd)

Conflicts:
src/auth/AuthServiceHandler.h [ bufferlist vs
  ceph::buffer::list ]
src/auth/cephx/CephxServiceHandler.h [ ditto ]
src/auth/none/AuthNoneServiceHandler.h [ ditto ]

4 years ago mon/MonClient: drop global_id arg from _add_conn() and _add_conns()
Ilya Dryomov [Tue, 30 Mar 2021 13:19:41 +0000 (15:19 +0200)]
mon/MonClient: drop global_id arg from _add_conn() and _add_conns()

Passing anything but MonClient instance's global_id doesn't make
sense.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit a71f6e90d43cca5a79db92ca6a640598796ae7ee)

Conflicts:
src/mon/MonClient.cc [ commit 1e9b18008c5e ("mon: set
  MonClient::_add_conn return type to void") not in nautilus ]
src/mon/MonClient.h [ ditto ]

4 years ago mon/MonClient: reset auth state in shutdown()
Ilya Dryomov [Thu, 1 Apr 2021 08:55:36 +0000 (10:55 +0200)]
mon/MonClient: reset auth state in shutdown()

Destroying AuthClientHandler and not resetting global_id is another
way to get MonClient to send CEPHX_GET_AUTH_SESSION_KEY requests with
CephXAuthenticate::old_ticket not populated.  This is particularly
pertinent to get_monmap_and_config() which shuts down the bootstrap
MonClient between retry attempts.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c9b022e07392979e7f9ea6c11484a7dd872cc235)

4 years ago mon/MonClient: preserve auth state on reconnects
Ilya Dryomov [Mon, 8 Mar 2021 14:37:02 +0000 (15:37 +0100)]
mon/MonClient: preserve auth state on reconnects

Commit a2eb6ae3fb57 ("mon/monclient: hunt for multiple monitor in
parallel") introduced a regression where auth state (global_id and
AuthClientHandler) was no longer preserved on reconnects.  The ensuing
breakage was quickly noticed and prompted a follow-on fix 8bb6193c8f53
("mon/MonClient: persist global_id across re-connecting").

However, as evident from the subject, the follow-on fix only took
care of the global_id part.  AuthClientHandler is still destroyed
and all cephx tickets are discarded.  A new from-scratch instance
is created for each MonConnection and CEPHX_GET_AUTH_SESSION_KEY
requests end up with CephXAuthenticate::old_ticket not populated.
The bug is in MonClient, so both msgr1 and msgr2 are affected.

This should have resulted in a similar sort of breakage but didn't
because of a much larger bug.  The monitor should have denied the
attempt to reclaim global_id with no valid ticket proving previous
possession of that global_id presented.  Alas, it appears that this
aspect of the cephx protocol has never been enforced.  This is dealt
with in the next patch.

To fix the issue at hand, clone AuthClientHandler into each
MonConnection so that each respective CEPHX_GET_AUTH_SESSION_KEY
request gets a copy of the current auth ticket.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 236b536b28482ec9d8b872de03da7d702ce4787b)

Conflicts:
src/mon/MonClient.cc [ commit 1e9b18008c5e ("mon: set
  MonClient::_add_conn return type to void") not in nautilus ]

4 years ago mon/MonClient: claim active_con's auth explicitly
Ilya Dryomov [Sat, 6 Mar 2021 10:15:40 +0000 (11:15 +0100)]
mon/MonClient: claim active_con's auth explicitly

Eliminate confusion by moving auth from active_con into MonClient
instead of swapping them.

The existing MonClient::auth can be destroyed right away -- I don't
see why active_con would need it or a reason to delay its destruction
(which is what stashing in active_con effectively does).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit eec24e4d119c57c7eb5119dc0083616a61b33b89)

4 years ago mon: dump json from 'sessions' asok/tell command
Sage Weil [Wed, 29 Jan 2020 21:37:03 +0000 (15:37 -0600)]
mon: dump json from 'sessions' asok/tell command

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 71a0d8a568bd0034cc1e6329cd20269f11635697)

Conflicts:
src/mon/Monitor.cc [ commit adf1486e46cb ("common/admin_socket:
  pass Formatter from generic infrastructure") not in nautilus ]

4 years ago qa/tasks/ceph.conf: shorten cephx TTL for testing 40661/head
Sage Weil [Mon, 5 Apr 2021 18:08:30 +0000 (13:08 -0500)]
qa/tasks/ceph.conf: shorten cephx TTL for testing

Rotate tickets frequently to exercise those code paths during testing.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 94df76244798cdc0bafd74c9e5197adb5aa990c0)

Conflicts:
qa/tasks/cephadm.conf [ no cephadm in nautilus ]

4 years ago Merge pull request #40359 from tchaikov/nautilus-pr-39937
Yuri Weinstein [Mon, 12 Apr 2021 15:23:40 +0000 (08:23 -0700)]
Merge pull request #40359 from tchaikov/nautilus-pr-39937

nautilus: mgr: add mon metadata using type of "mon"

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

4 years ago Merge pull request #39397 from kamoltat/wip-nautilus-del-period-arg
Yuri Weinstein [Mon, 12 Apr 2021 15:22:34 +0000 (08:22 -0700)]
Merge pull request #39397 from kamoltat/wip-nautilus-del-period-arg

nautilus: qa/tasks/mgr/test_progress: fix wait_until_equal

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

4 years ago Merge pull request #40619 from smithfarm/wip-50164-nautilus
Yuri Weinstein [Mon, 12 Apr 2021 15:21:30 +0000 (08:21 -0700)]
Merge pull request #40619 from smithfarm/wip-50164-nautilus

nautilus: cmake: set empty RPATH for some test executables

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

4 years ago Merge pull request #40393 from neha-ojha/wip-49966-nautilus
Yuri Weinstein [Mon, 12 Apr 2021 15:21:05 +0000 (08:21 -0700)]
Merge pull request #40393 from neha-ojha/wip-49966-nautilus

nautilus: common/options: bluefs_buffered_io=true by default

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

4 years ago Merge pull request #40675 from smithfarm/wip-49768-nautilus
Yuri Weinstein [Mon, 12 Apr 2021 15:18:36 +0000 (08:18 -0700)]
Merge pull request #40675 from smithfarm/wip-49768-nautilus

nautilus: librbd: allow interrupted trash move request to be restarted

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

4 years ago Merge pull request #40567 from singuliere/wip-49991-nautilus
Yuri Weinstein [Mon, 12 Apr 2021 15:18:00 +0000 (08:18 -0700)]
Merge pull request #40567 from singuliere/wip-49991-nautilus

nautilus: common/mempool: only fail tests if sharding is very bad

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

4 years ago Merge pull request #40549 from idryomov/wip-remove-log-early-nautilus
Yuri Weinstein [Mon, 12 Apr 2021 15:17:33 +0000 (08:17 -0700)]
Merge pull request #40549 from idryomov/wip-remove-log-early-nautilus

nautilus: common: remove log_early configuration option

Reviewed-by: Sage Weil <sage@redhat.com>

4 years ago Merge pull request #40674 from smithfarm/wip-50158-nautilus
Yuri Weinstein [Fri, 9 Apr 2021 14:49:52 +0000 (07:49 -0700)]
Merge pull request #40674 from smithfarm/wip-50158-nautilus

nautilus: test/rgw: test_datalog_autotrim filters out new entries

Reviewed-by: Casey Bodley <cbodley@redhat.com>

4 years ago Merge pull request #40671 from smithfarm/wip-50095-nautilus
Yuri Weinstein [Fri, 9 Apr 2021 14:49:26 +0000 (07:49 -0700)]
Merge pull request #40671 from smithfarm/wip-50095-nautilus

nautilus: rgw: return error when trying to copy encrypted object without key

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

4 years ago Merge pull request #40670 from smithfarm/wip-49744-nautilus
Yuri Weinstein [Fri, 9 Apr 2021 14:48:57 +0000 (07:48 -0700)]
Merge pull request #40670 from smithfarm/wip-49744-nautilus

nautilus: rgw: limit rgw_gc_max_objs to RGW_SHARDS_PRIME_1

Reviewed-by: Casey Bodley <cbodley@redhat.com>

4 years ago Merge pull request #40668 from smithfarm/wip-50233-nautilus
Yuri Weinstein [Fri, 9 Apr 2021 14:48:27 +0000 (07:48 -0700)]
Merge pull request #40668 from smithfarm/wip-50233-nautilus

nautilus: rgw: return ERR_NO_SUCH_BUCKET early while evaluating bucket policy

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>

4 years ago rbd: clarify trash remove error code from interrupted move 40675/head
Jason Dillaman [Wed, 10 Mar 2021 20:31:22 +0000 (15:31 -0500)]
rbd: clarify trash remove error code from interrupted move

Fixes: https://tracker.ceph.com/issues/49716
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 138d71fb0635682510cadda8e4ad5aaab3f39e44)

4 years ago librbd/trash: don't return -ENOENT error from move state machine
Jason Dillaman [Wed, 10 Mar 2021 20:37:39 +0000 (15:37 -0500)]
librbd/trash: don't return -ENOENT error from move state machine

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f6ed98d682e562de1cad301696e918c52a4dba5d)

4 years ago librbd/api: trash remove/purge should indicate interrupted move
Jason Dillaman [Wed, 10 Mar 2021 20:29:11 +0000 (15:29 -0500)]
librbd/api: trash remove/purge should indicate interrupted move

This will help the user self-diagnose that a trash move operation
was interrupted and therefore the state is invalid.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c808abea64f00e25c6fd3bcaa7ebf9bc763e7ca0)

4 years ago librbd/api: allow an interrupted trash move to be restarted
Jason Dillaman [Wed, 10 Mar 2021 20:15:26 +0000 (15:15 -0500)]
librbd/api: allow an interrupted trash move to be restarted

Search the trash entries for a matching image name that is
still in the moving state and allow the operation to be
restarted.
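
The search described above can be sketched like this. The entry layout (dicts with "id", "name" and "state" keys) and the "moving" state label are assumptions made for illustration; they are not the librbd data structures.

```python
from typing import Dict, List, Optional


def find_interrupted_move(trash_entries: List[Dict[str, str]],
                          image_name: str) -> Optional[Dict[str, str]]:
    """Return the trash entry for image_name that is still mid-move, if any."""
    for entry in trash_entries:
        if entry["name"] == image_name and entry["state"] == "moving":
            return entry
    return None


trash = [
    {"id": "a1", "name": "vm-disk", "state": "normal"},
    {"id": "b2", "name": "scratch", "state": "moving"},
]
match = find_interrupted_move(trash, "scratch")
print(match["id"] if match else "no interrupted move found")
```

A hit means the earlier move never completed, so the operation can be resumed for that entry instead of failing.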

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ed2d696e1eafaa59d29ce6fac952e4e5f4f1e920)

4 years ago librbd/api: helper method for natively listing the trash
Jason Dillaman [Wed, 10 Mar 2021 19:44:36 +0000 (14:44 -0500)]
librbd/api: helper method for natively listing the trash

The existing list method converts the native TrashImageSpec to the
API's rbd_trash_image_info_t which is missing the source field.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 21adc927fe50ae37069d77482edd4c4e098433c9)

4 years ago test/rgw: test_datalog_autotrim filters out new entries 40674/head
Casey Bodley [Mon, 15 Jun 2020 15:45:11 +0000 (11:45 -0400)]
test/rgw: test_datalog_autotrim filters out new entries

if other sync activity is racing with test_datalog_autotrim, it can
create new datalog entries after the 'datalog autotrim' command runs

instead of asserting that the datalog is empty after trim, assert that
any entries have a marker larger than the max-marker reported by
'datalog status' before the trim
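
The relaxed assertion can be illustrated as follows. This is a sketch of the idea rather than the test's code: markers are modeled as plain integers and entries as dicts, both assumptions for this example, and the helper name is hypothetical.

```python
from typing import Dict, List


def untrimmed_entries(entries: List[Dict[str, int]],
                      max_marker_before_trim: int) -> List[Dict[str, int]]:
    """Return entries that should have been trimmed but are still present."""
    return [e for e in entries if e["marker"] <= max_marker_before_trim]


# The max-marker reported before the trim was 100. Entries created by racing
# sync activity after the trim have larger markers and are acceptable.
after_trim = [{"marker": 101}, {"marker": 117}]
leftover = untrimmed_entries(after_trim, max_marker_before_trim=100)
print(len(leftover))
```

An empty result means the trim did its job, even though the datalog itself is not empty.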

Fixes: https://tracker.ceph.com/issues/45626
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit abd08f1843642e318d74dfadb0f9cf1f6b86d827)

4 years ago rgw: return error when trying to copy encrypted object without key 40671/head
Ilsoo Byun [Fri, 11 Dec 2020 00:57:49 +0000 (09:57 +0900)]
rgw: return error when trying to copy encrypted object without key

Fixes: https://tracker.ceph.com/issues/48554
Signed-off-by: Ilsoo Byun <ilsoobyun@linecorp.com>
(cherry picked from commit dde1303c92b39daa2d760a110f48dc9655e7765f)

4 years ago rgw: limit rgw_gc_max_objs to RGW_SHARDS_PRIME_1 40670/head
Rafał Wądołowski [Wed, 17 Feb 2021 11:47:07 +0000 (12:47 +0100)]
rgw: limit rgw_gc_max_objs to RGW_SHARDS_PRIME_1

Don't allow GC to process more GC objects than RGW_SHARDS_PRIME_1
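
The limit amounts to clamping the configured value to an upper bound. In the sketch below the numeric value of RGW_SHARDS_PRIME_1 is an assumption for illustration and the helper name is hypothetical.

```python
RGW_SHARDS_PRIME_1 = 65521  # assumed value; check the rgw headers


def effective_gc_max_objs(rgw_gc_max_objs: int) -> int:
    """Clamp the configured number of GC objects to RGW_SHARDS_PRIME_1."""
    return min(rgw_gc_max_objs, RGW_SHARDS_PRIME_1)


print(effective_gc_max_objs(32))
print(effective_gc_max_objs(1_000_000))
```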

Fixes: https://tracker.ceph.com/issues/49321
Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
(cherry picked from commit 7b163048d93a078e2354665488a27225042d3f1a)

4 years ago rgw: return ERR_NO_SUCH_BUCKET early while evaluating bucket policy 40668/head
Abhishek Lekshmanan [Thu, 21 Feb 2019 16:06:52 +0000 (17:06 +0100)]
rgw: return ERR_NO_SUCH_BUCKET early while evaluating bucket policy

Right now we create an ERR_NO_SUCH_BUCKET ret code but continue further
processing. Since this ret code isn't returned at any stage, we end up creating a
bucket instance anyway, which shouldn't happen, and the client call then
succeeds in cases like put bucket versioning. Return an error code early in
these cases.

Fixes: http://tracker.ceph.com/issues/38420
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit cf66a6d5a7eee294313a1a08d0524daf939747e5)

4 years ago Merge pull request #40590 from rhcs-dashboard/wip-50069-nautilus
Ernesto Puerta [Thu, 8 Apr 2021 10:44:04 +0000 (12:44 +0200)]
Merge pull request #40590 from rhcs-dashboard/wip-50069-nautilus

nautilus: mgr/dashboard: Fix for alert notification message being undefined

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

4 years ago qa/tasks/mgr/test_progress: fix wait_until_equal 39397/head
Kamoltat [Mon, 8 Feb 2021 15:45:06 +0000 (15:45 +0000)]
qa/tasks/mgr/test_progress: fix wait_until_equal

Octopus ceph_test_case doesn't have a period arg,
so remove it from wait_until_equal. Also increase
the time to wait for complete events by using RECOVERY_PERIOD
instead of EVENT_CREATION_PERIOD.

Not needed in master because only octopus and nautilus
lack a period argument in the qa/tasks/mgr/test_progress.py
wait_until_equal() function.

Fixes: https://tracker.ceph.com/issues/48824
Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit df41ea7467db3b40776030865896af0102129283)

 Conflicts:
qa/tasks/mgr/test_progress.py - trivial fix

4 years ago qa/mgr/progress: fix timeout error when waiting for osd in event
Ricardo Dias [Tue, 3 Sep 2019 10:44:05 +0000 (11:44 +0100)]
qa/mgr/progress: fix timeout error when waiting for osd in event

Fixes: https://tracker.ceph.com/issues/40618
Signed-off-by: Ricardo Dias <rdias@suse.com>
(cherry picked from commit b03537949abd8d453aa06c1580a4578868904bd1)

 Conflicts:
qa/tasks/mgr/test_progress.py - trivial fix

4 years ago Merge pull request #40356 from tchaikov/wip-49537-nautilus
Yuri Weinstein [Tue, 6 Apr 2021 17:41:26 +0000 (10:41 -0700)]
Merge pull request #40356 from tchaikov/wip-49537-nautilus

nautilus: rgw : catch non int exception

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

4 years ago Merge pull request #40547 from rhcs-dashboard/nautilus-decouple-tests-from-build
Ernesto Puerta [Tue, 6 Apr 2021 17:19:08 +0000 (19:19 +0200)]
Merge pull request #40547 from rhcs-dashboard/nautilus-decouple-tests-from-build

mgr/dashboard: decouple unit tests from build artifacts

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #40610 from rhcs-dashboard/nautilus-fix-py2-unicode-password
Ernesto Puerta [Tue, 6 Apr 2021 16:24:29 +0000 (18:24 +0200)]
Merge pull request #40610 from rhcs-dashboard/nautilus-fix-py2-unicode-password

nautilus: mgr/dashboard: python 2: fix error when non-ASCII password

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #40621 from tchaikov/nautilus-osd-scrub-dst
Kefu Chai [Tue, 6 Apr 2021 14:19:10 +0000 (22:19 +0800)]
Merge pull request #40621 from tchaikov/nautilus-osd-scrub-dst

nautilus: test/TestOSDScrub: fix mktime() error

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
4 years agotest/TestOSDScrub: fix mktime() error 40621/head
luo rixin [Thu, 20 Feb 2020 12:07:39 +0000 (20:07 +0800)]
test/TestOSDScrub: fix mktime() error

The struct tm variable tm isn't initialized. When tm.tm_isdst happens to be a
positive value, mktime(&tm) returns -1, which makes the test fail
on Ubuntu 19.10 for aarch64 with glibc 2.30.

Signed-off-by: luo rixin <luorixin@huawei.com>
(cherry picked from commit 4806fce7f46899499549d9235ca87625d806f2da)
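
The actual fix is in the C++ test. As a rough Python analogue, assuming only the standard time module, the sketch below sets every field explicitly and passes -1 as the DST flag so the C library determines it. The date is an arbitrary example.

```python
import time

# Fields of a struct tm as a 9-tuple: year, month, day, hour, minute, second,
# weekday, yearday, isdst. Nothing is left uninitialized.
fields = (2020, 2, 20, 12, 0, 0, 0, 0, -1)  # isdst=-1: let the C library decide

timestamp = time.mktime(fields)

# Converting back to local time reproduces the wall-clock fields.
assert time.localtime(timestamp)[:6] == (2020, 2, 20, 12, 0, 0)
```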

4 years agobuild/ops: set empty RPATH for some test executables 40619/head
Nathan Cutler [Tue, 27 Aug 2019 08:24:11 +0000 (10:24 +0200)]
build/ops: set empty RPATH for some test executables

Fixes: https://tracker.ceph.com/issues/41524
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 51a1c621718e65a7062af7dcf12949bdd79a26d1)

4 years agocmake: SKIP_RPATH if RPATH is not necessary
Kefu Chai [Fri, 30 Aug 2019 08:21:03 +0000 (16:21 +0800)]
cmake: SKIP_RPATH if RPATH is not necessary

some executables like ceph_test_mon_memory_target do not link against
libraries built from the source tree, like librados and libceph-common, so
cmake does not set RPATH for them.

before this change, `CMAKE_INSTALL_RPATH` is set globally, so cmake is
asked to rewrite the RPATH for all installed targets. but this is not
needed, as some executables do not link against libceph-common. hence,
cmake complains when installing them, like:

CMake Error at src/test/mon/cmake_install.cmake:90 (file):
  file RPATH_CHANGE could not write new RPATH:
    /usr/lib64/ceph
   to the file:
    /home/abuild/rpmbuild/BUILDROOT/ceph-15.0.0-4347.g85a07b9.x86_64/usr/bin/ceph_test_log_rss_usage
   No valid ELF RPATH or RUNPATH entry exists in the file;

after this change, `SKIP_RPATH` is set for those executables which do
not link against any libraries created from ceph source tree. so we can
avoid setting the RPATH for these executables when `make install`.

the same applies to libceph-common.

Fixes: https://tracker.ceph.com/issues/41524
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 61708155b4d9d211b7da21aed8ccfe4c8ed3d932)

4 years agocmake: only link against necessary libs
Kefu Chai [Mon, 2 Sep 2019 08:07:49 +0000 (16:07 +0800)]
cmake: only link against necessary libs

some executables like ceph_test_mon_memory_target do not link against
gtest or gmock, so no need to link against them.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 45bab2cb1cf667c2c5c82a78a141fa5883dbd0e3)

4 years agonautilus: mgr/dashboard: decouple unit tests from build artifacts 40547/head
Alfonso Martínez [Tue, 6 Apr 2021 10:34:02 +0000 (12:34 +0200)]
nautilus: mgr/dashboard: decouple unit tests from build artifacts

Signed-off-by: Alfonso Martínez <almartin@redhat.com>
4 years agonautilus: mgr/dashboard: python 2: fix error when setting non-ASCII password 40610/head
Alfonso Martínez [Tue, 6 Apr 2021 07:42:27 +0000 (09:42 +0200)]
nautilus: mgr/dashboard: python 2: fix error when setting non-ASCII password

Fixes: https://tracker.ceph.com/issues/50155
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
4 years agoMerge pull request #40299 from tchaikov/nautilus-48381
Yuri Weinstein [Mon, 5 Apr 2021 15:54:31 +0000 (08:54 -0700)]
Merge pull request #40299 from tchaikov/nautilus-48381

nautilus: mon/ConfigMap: fix stray option leak

Reviewed-by: Sage Weil <sage@redhat.com>
4 years agoMerge pull request #40128 from neha-ojha/wip-49759-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:22:33 +0000 (08:22 -0700)]
Merge pull request #40128 from neha-ojha/wip-49759-nautilus

nautilus: pybind/mgr/balancer/module.py: assign weight-sets to all buckets before balancing

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
4 years agoMerge pull request #40283 from smithfarm/wip-39489-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:14:11 +0000 (08:14 -0700)]
Merge pull request #40283 from smithfarm/wip-39489-nautilus

nautilus: ceph.spec.in: Enable tcmalloc on IBM Power and Z

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
4 years agoMerge pull request #40047 from pponnuvel/wip-48713-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:13:06 +0000 (08:13 -0700)]
Merge pull request #40047 from pponnuvel/wip-48713-nautilus

nautilus: mgr/ActivePyModules.cc: always release GIL before attempting to acquire a lock

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #40014 from mfoliveira/wip-49682-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:12:26 +0000 (08:12 -0700)]
Merge pull request #40014 from mfoliveira/wip-49682-nautilus

nautilus: osd: add osd_fast_shutdown_notify_mon option (default false)

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years agoMerge pull request #39942 from smithfarm/wip-49664-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:11:54 +0000 (08:11 -0700)]
Merge pull request #39942 from smithfarm/wip-49664-nautilus

nautilus: src/global/signal_handler.h: fix preprocessor logic for alpine

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #39923 from smithfarm/wip-49637-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:11:25 +0000 (08:11 -0700)]
Merge pull request #39923 from smithfarm/wip-49637-nautilus

nautilus: mgr/telemetry: check if 'ident' channel is active

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
4 years agoMerge pull request #39883 from singuliere/wip-49385-nautilus
Yuri Weinstein [Mon, 5 Apr 2021 15:10:49 +0000 (08:10 -0700)]
Merge pull request #39883 from singuliere/wip-49385-nautilus

nautilus: os/bluestore/BlueFS: use iterator_impl::copy instead of bufferlist::c_str() to avoid bufferlist rebuild

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agomgr/dashboard: Fix for alert notification message being undefined 40590/head
Nizamudeen A [Tue, 23 Mar 2021 07:10:46 +0000 (12:40 +0530)]
mgr/dashboard: Fix for alert notification message being undefined

Prometheus alert notification message in the dashboard always comes up
as undefined. It's because we were showing alert.summary instead of
alert.description when displaying the message. I couldn't find the
summary field in the ceph_default_alerts.yml file, so I removed all the
summary fields from the dashboard code.

Fixes: https://tracker.ceph.com/issues/49342
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 2921b2e9a939e1ad52b07327fdf84885568384b9)

4 years agocommon/mempool: only fail tests if sharding is very bad 40567/head
singuliere [Wed, 17 Mar 2021 06:35:04 +0000 (07:35 +0100)]
common/mempool: only fail tests if sharding is very bad

Fixes: https://tracker.ceph.com/issues/49781
Signed-off-by: singuliere <singuliere@autistici.org>
(cherry picked from commit db79769d6d557acc021a434ff285db2d69458d0a)

4 years agocommon: remove log_early configuration option 40549/head
Changcheng Liu [Tue, 13 Oct 2020 01:47:16 +0000 (09:47 +0800)]
common: remove log_early configuration option

Now that log tracking is always enabled in the early phase, there's no
need to keep the "log_early" option, so remove it.

Suggested-by: Kefu Chai <kefu@redhat.com>
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
(cherry picked from commit dbdcb2535d0d463d92d90169175e0776a0ee58e3)

Conflicts:
qa/tasks/ceph_manager.py [ already dropped in nautilus, commit
  b8dd87424ecb ("qa/tasks/ceph_manager.py: don't use log-early
  in raw_cluster_cmd") ]

4 years agocommon: remove useless log_early check condition here
Changcheng Liu [Wed, 16 Sep 2020 01:50:54 +0000 (09:50 +0800)]
common: remove useless log_early check condition here

The code logic shows that the Log thread needs to be started even
if log_early is false.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
(cherry picked from commit 71d4178ff453fe3126258f11078048d51398886d)

4 years agoMerge pull request #40536 from tchaikov/nautilus-nose-py3
Kefu Chai [Thu, 1 Apr 2021 09:36:48 +0000 (17:36 +0800)]
Merge pull request #40536 from tchaikov/nautilus-nose-py3

nautilus: test/pybind: s/nosetests/python3/

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
4 years agopybind/mgr/dashboard/.pylintrc: silence more pylint warnings 40536/head
Kefu Chai [Thu, 1 Apr 2021 05:45:08 +0000 (13:45 +0800)]
pybind/mgr/dashboard/.pylintrc: silence more pylint warnings

this change is not cherry-picked from master, as we don't have the following
warnings in master:

************* Module dashboard.controllers.saml2
        intern-builtin,
controllers/saml2.py:57:8: R1720: Unnecessary "else" after "raise" (no-else-raise)

also bump up the versions of pylint and astroid, so they
can work with python3.8.

see
https://github.com/PyCQA/astroid/commit/28fc86f260f0cd9433d0603da81ee52045f9e4c3

Signed-off-by: Kefu Chai <kchai@redhat.com>
4 years agomgr/dashboard: do not import tools in access_control
Kefu Chai [Thu, 1 Apr 2021 04:14:31 +0000 (12:14 +0800)]
mgr/dashboard: do not import tools in access_control

this addresses a regression introduced by
2cd94293268116838c3ddcebdedde4fbd9cb93aa, which adds

from ..tools import ensure_str

and causes a recursive import.

so, in this change, a copy of ensure_str() is added
to access_control.py

this change is not cherry-picked from master, as the offending commit
which is fixed by this change is not included in master.

Signed-off-by: Kefu Chai <kchai@redhat.com>
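
For illustration, a local helper of this kind could look like the sketch below. It is a Python 3 only approximation and not the copy added to access_control.py, which also has to work on Python 2. The parameter names and defaults are assumptions.

```python
def ensure_str(value, encoding="utf-8", errors="strict"):
    """Return value as text, decoding bytes if needed (Python 3 only sketch)."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    if isinstance(value, str):
        return value
    raise TypeError("not expecting type {0}".format(type(value)))


# Usage: both inputs come back as the same text.
assert ensure_str(b"secret") == ensure_str("secret") == "secret"
```

Keeping the helper in the module that needs it avoids importing a module which, directly or indirectly, imports this one back.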
4 years agotest/pybind: s/nosetests/python3/
Kefu Chai [Thu, 19 Dec 2019 03:36:59 +0000 (11:36 +0800)]
test/pybind: s/nosetests/python3/

different distros package python3-nose in different ways, adding a
different suffix to "/usr/bin/nosetests" to differentiate it from
its python2 counterpart.

* on bionic, python3-nose offers "nosetests3"
* on el8, python3-nose offers "nosetests-3" and "nosetests-3.6"

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 717aaad9edb295d15998f37a5d92d11bc8345b33)

4 years agoMerge pull request #40516 from tchaikov/nautilus-zstd
Kefu Chai [Wed, 31 Mar 2021 15:20:31 +0000 (23:20 +0800)]
Merge pull request #40516 from tchaikov/nautilus-zstd

nautilus: cmake,zstd,debian: allow use libzstd in system

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
4 years agoMerge pull request #40522 from tchaikov/nautilus-dashboard-unicode-password
Kefu Chai [Wed, 31 Mar 2021 15:14:52 +0000 (23:14 +0800)]
Merge pull request #40522 from tchaikov/nautilus-dashboard-unicode-password

nautilus: mgr/dashboard: encode non-ascii string before passing it to exec_cmd()

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
4 years agomgr/dashboard: ensure password is a string before encoding it 40522/head
Kefu Chai [Wed, 31 Mar 2021 11:09:56 +0000 (19:09 +0800)]
mgr/dashboard: ensure password is a string before encoding it

otherwise we have the following failure:

AttributeError: 'bytes' object has no attribute 'encode'

this change is not cherry-picked from master, as master has dropped
python2 support.

Signed-off-by: Kefu Chai <kchai@redhat.com>
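
A minimal sketch of the guard described above, assuming a hypothetical helper named password_to_bytes(). It is not the dashboard code and only shows why the value is normalized to text before .encode() is called.

```python
def password_to_bytes(password, encoding="utf-8"):
    """Hypothetical helper: normalize to text first, then encode."""
    if isinstance(password, bytes):
        # On Python 3, bytes has no .encode(); turn it into text first.
        password = password.decode(encoding)
    return password.encode(encoding)


# Usage: text and already-encoded input give the same bytes.
assert password_to_bytes("abc") == password_to_bytes(b"abc") == b"abc"
```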
4 years agomgr/dashboard: encode non-ascii string before passing it to exec_cmd()
Kefu Chai [Wed, 31 Mar 2021 11:00:59 +0000 (19:00 +0800)]
mgr/dashboard: encode non-ascii string before passing it to exec_cmd()

because on Python3, tempfile.TemporaryFile() is opened in binary mode by
default, we need to encode a non-ascii string before writing it to the file.
otherwise, we have the following failure:

self = <dashboard.tests.test_access_control.AccessControlTest testMethod=test_unicode_password>

    def test_unicode_password(self):
        self.test_create_user()
        password = '\u7ae0\u9c7c\u4e0d\u662f\u5bc6\u7801'
        with tempfile.TemporaryFile(mode='w+') as pwd_file:
>           pwd_file.write(password)
E           UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-5: ordinal not in range(256)

tests/test_access_control.py:576: UnicodeEncodeError

this change is not cherry-picked from master, as master has dropped
python2 support.

Signed-off-by: Kefu Chai <kchai@redhat.com>
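
A short sketch of the approach, using the sample string from the failing test quoted above. The choice of UTF-8 is an assumption for the example, and this is not the code that calls exec_cmd().

```python
import tempfile

password = "\u7ae0\u9c7c\u4e0d\u662f\u5bc6\u7801"  # sample from the failing test

# TemporaryFile() defaults to binary mode ("w+b"), so write encoded bytes
# instead of relying on a locale-dependent text encoding.
with tempfile.TemporaryFile() as pwd_file:
    pwd_file.write(password.encode("utf-8"))
    pwd_file.seek(0)
    stored = pwd_file.read()

# Decoding the stored bytes gives the original password back.
assert stored.decode("utf-8") == password
```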
4 years agoscript/run-make: enable WITH_SYSTEM_ZSTD on focal 40516/head
Kefu Chai [Wed, 31 Mar 2021 04:27:44 +0000 (12:27 +0800)]
script/run-make: enable WITH_SYSTEM_ZSTD on focal

to speed up the build for "make check"

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f1eda0b99422348865b63826a31f2a6c7078adce)

Conflicts:
debian/control
src/script/run-make.sh: trivial resolution

4 years agocmake: allow use libzstd in system
Kefu Chai [Wed, 31 Mar 2021 04:15:17 +0000 (12:15 +0800)]
cmake: allow use libzstd in system

since we are moving the test nodes from bionic to focal, we are able to
use the prebuilt libzstd libraries when running "make check". to speed
up the build and test, in this change:

* add FindZstd.cmake which allows us to use the libzstd in system
* extract BuildZstd.cmake for better readability
* add an option named "WITH_SYSTEM_ZSTD", which defaults to "OFF",
  so users can enable it on demand.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 16fd07244dd25b46ab1b5a9a3180a354d13c9245)

Conflicts:
src/compressor/zstd/CMakeLists.txt: minor resolution

4 years agozstd: upgrade to v1.4.5
Bryan Stillwell [Fri, 11 Sep 2020 21:49:39 +0000 (15:49 -0600)]
zstd: upgrade to v1.4.5

Since the v1.4.0 release there have been a few improvements to Zstandard
including improved compression ratios, faster compression, and faster
decompression.

Signed-off-by: Bryan Stillwell <bstillwell@godaddy.com>
(cherry picked from commit 7f23cd611f61f41f9c439b7a6dfc91109df741de)

4 years agozstd: compat with v1.4.0
Dan van der Ster [Wed, 19 Jun 2019 14:57:13 +0000 (16:57 +0200)]
zstd: compat with v1.4.0

In zstd d8e215cbee03b038fffe74aebad63b625c42f23c
ZSTD_compress_generic() is renamed to ZSTD_compressStream2().

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit fa9cd3faad689898a12d10d86df6e06dd736497f)

4 years agozstd: upgrade to v1.4.0
Dan van der Ster [Wed, 19 Jun 2019 13:58:13 +0000 (15:58 +0200)]
zstd: upgrade to v1.4.0

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 2c2797a71f7aca8f86a758237f1ecbc8966e1b51)

4 years agoMerge branch 'nautilus-saved' into nautilus
Josh Durgin [Tue, 30 Mar 2021 18:38:17 +0000 (14:38 -0400)]
Merge branch 'nautilus-saved' into nautilus

4 years ago14.2.19 v14.2.19
Jenkins Build Slave User [Tue, 30 Mar 2021 16:19:18 +0000 (16:19 +0000)]
14.2.19

4 years agocommon/ipaddr: also skip just `lo`
Dan van der Ster [Tue, 23 Mar 2021 08:00:11 +0000 (09:00 +0100)]
common/ipaddr: also skip just `lo`

Skip ifaces named 'lo' or of the form 'lo:0', 'lo:1'. This
brings back the original behavior from b6d0fc9e0e515e50894c08217d688a8c94db7570

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: https://tracker.ceph.com/issues/49938
(cherry picked from commit 6147c0917157efd2d35610e759685656a4989abb)
(cherry picked from commit bed79d5bea3183b153ffb223d049074912947516)

4 years agoMerge pull request #40485 from tchaikov/nautilus-pr-focal
Kefu Chai [Tue, 30 Mar 2021 09:22:15 +0000 (17:22 +0800)]
Merge pull request #40485 from tchaikov/nautilus-pr-focal

debian/control: add missing commas, use python3 packages for "make check" on focal

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
4 years agodo_cmake: always pass -DWITH_PYTHON3 to cmake 40485/head
Kefu Chai [Thu, 4 Mar 2021 06:30:17 +0000 (14:30 +0800)]
do_cmake: always pass -DWITH_PYTHON3 to cmake

do not pretend that we support python2 anymore.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 6dbd4f59f3c5cd13df4237136134ad7a4fc46549)

Conflicts:
do_cmake.sh: nautilus still supports python2. so, to
run "make check" with python3 only, we should disable it explicitly.

4 years agodebian/control: install python3-* packages for "make check"
Kefu Chai [Tue, 30 Mar 2021 02:38:43 +0000 (10:38 +0800)]
debian/control: install python3-* packages for "make check"

Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts:
debian/control: this change is not cherry-picked from master,
the corresponding commit in master is
50162091461e42939375475f70ecfd0817f2551c, but that commit also includes
the changes to update the runtime dependencies to python3. but we only
need to update the dependencies for running "make check". so instead
of cherry-picking from master, a separate change is made here.

4 years agodebian: remove python >= 2.7 requirement
Alfredo Deza [Mon, 21 Oct 2019 17:06:25 +0000 (13:06 -0400)]
debian: remove python >= 2.7 requirement

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 68025b633567f0d7ec680d94021ae723618683a3)

Conflicts:
       debian/control: we still need python 2.7 at runtime
         so ignore that change, but we need to use tox instead of
         python-tox for running "make check" on focal, so we
         need the tox change in this commit.

4 years agodebian/control: fix Build-Depends
Kefu Chai [Tue, 27 Aug 2019 02:05:15 +0000 (10:05 +0800)]
debian/control: fix Build-Depends

it's a regression introduced by 5d6d770e

Fixes: https://tracker.ceph.com/issues/50040
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit f6d7fd28afc3173f023fc290ad57fbcd9e0f3789)

 Conflicts:
debian/control: Additional package in HEAD

4 years agoMerge pull request #40423 from k0ste/wip-49996-nautilus
Yuri Weinstein [Mon, 29 Mar 2021 15:47:52 +0000 (08:47 -0700)]
Merge pull request #40423 from k0ste/wip-49996-nautilus

nautilus: common/ipaddr: skip loopback interfaces named 'lo' and test it

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years agotest_ipaddr: check that we correctly skip loopback 40423/head
Dan van der Ster [Tue, 23 Mar 2021 10:28:37 +0000 (11:28 +0100)]
test_ipaddr: check that we correctly skip loopback

We should skip devices named 'lo' or of the form 'lo:0' regardless
of their IP address.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/49938
(cherry picked from commit 780125d1ed93cd7b17172752b3e76186a524103b)
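
The name rule can be sketched as follows. The real check lives in the C++ code under common/ipaddr; the function names and the interface list here are made up for illustration.

```python
def is_loopback_name(ifname):
    """True for 'lo' and for aliases of the form 'lo:0', 'lo:1', ..."""
    return ifname == "lo" or ifname.startswith("lo:")


def pick_interfaces(candidates):
    """Drop loopback devices by name, whatever address they carry."""
    return [(name, addr) for name, addr in candidates
            if not is_loopback_name(name)]


# Hypothetical interface list; names and addresses are illustrative only.
candidates = [
    ("lo", "127.0.0.1"),
    ("lo:0", "10.1.2.3"),  # non-127 address on a loopback alias: still skipped
    ("lop", "10.1.2.4"),   # merely starts with "lo": not treated as loopback
    ("eth0", "10.1.2.5"),
]
print(pick_interfaces(candidates))
```

Matching on the exact name or the "lo:" prefix, rather than on any name starting with "lo", keeps unrelated devices eligible.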

4 years agocommon/ipaddr: also skip just `lo`
Dan van der Ster [Tue, 23 Mar 2021 08:00:11 +0000 (09:00 +0100)]
common/ipaddr: also skip just `lo`

Skip ifaces named 'lo' or of the form 'lo:0', 'lo:1'. This
brings back the original behavior from b6d0fc9e0e515e50894c08217d688a8c94db7570

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: https://tracker.ceph.com/issues/49938
(cherry picked from commit 6147c0917157efd2d35610e759685656a4989abb)

4 years agoMerge pull request #40335 from tchaikov/nautilus-prettytable
Kefu Chai [Sat, 27 Mar 2021 18:03:22 +0000 (02:03 +0800)]
Merge pull request #40335 from tchaikov/nautilus-prettytable

nautilus: pybind/ceph_daemon: do not fail if prettytable is not available

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #40407 from tchaikov/nautilus-pr-40400
Kefu Chai [Fri, 26 Mar 2021 01:27:16 +0000 (09:27 +0800)]
Merge pull request #40407 from tchaikov/nautilus-pr-40400

nautilus: run-make-check.sh: let ctest generate XML output

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years agorun-make-check.sh: let ctest generate XML output 40407/head
Kefu Chai [Thu, 25 Mar 2021 09:08:48 +0000 (17:08 +0800)]
run-make-check.sh: let ctest generate XML output

to enable XUnit plugin of jenkins to consume the ctest output and
publish it in the dashboard, we need to

* let ctest generate XML output instead of plain text output
* do not fail the test if any test case fails. this allows the publisher
  to do its job by checking the XML output.
* prevent ctest from compressing the output. see
  https://issues.jenkins.io/browse/JENKINS-21737

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 48ba39987d3958531589d7969750ea749e6a6d30)

4 years agocommon/options: bluefs_buffered_io=true by default 40393/head
Dan van der Ster [Thu, 12 Nov 2020 16:14:37 +0000 (17:14 +0100)]
common/options: bluefs_buffered_io=true by default

Enable bluefs_buffered_io again because it makes a huge user-visible
improvement in metadata intensive scenarios, such as but not limited to
PG deletion.

In our environment, deleting PGs from 4 hybrid OSDs (sharing one SATA SSD block.db) saturates
the block.db at 350MB/s reads and causes slow reqs and flapping on the OSDs.
Those OSDs have a 3GB osd_memory_target.
Enabling bluefs_buffered_io drops the SSD IO down to <1MB/s and the OSDs
are performant again. (The underlying PG deletion inefficiency is being
solved separately, but the page cache is so much more effective than
the bluestore cache in this scenario).

Lastly, remove the comment about swap. We should separately advise
operators to disable swap on OSD machines, as it is much better in
our experience to OOM and restart than to chug along swapping.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/45765
Related-to: https://tracker.ceph.com/issues/47044
(cherry picked from commit 5ec8e8e63d409860c35e24a192090ac2b70af8f6)