
rbd-mirror: don't prune older mirror snapshots when pruning incomplete snapshot

Since we normally prune in order, we need to ensure that we don't prune older
snapshots when we need to delete an incomplete mirror snapshot since the
older snapshot might be the only remaining mirror snapshot.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 7ba9214ea5b73d0436af6c2896abf4836d741de9)

qa/workunits/rbd: show snapshot deltas during stress test failure

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f079116e87161b55acaa08c55bf8b8e79cee8670)

qa/suites/rbd: add snapshot-based mirroring stress test

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 094bfeaf8efe1c4794a2b301314eddedfa5246f9)

librbd/deep_copy: added new migrating flag to object copy

The migration operation and the copyup state machine will set
this flag when attempting to perform a deep-copy due to a
live-migration.

This flag will prevent a possible race condition between the
start of the object deep-copy when migration was enabled and
the writing portion of the deep-copy when migration might
have completed via external means.

Fixes: https://tracker.ceph.com/issues/45694
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 1baba64e213cb808804796575d3f7969cf37a3c6)

Conflicts:
src/librbd/deep_copy/ObjectCopyRequest.cc: trivial resolution

librbd/deep_copy: added bitwise flag parameter to object copy

This initial version subsumes the original "flatten" boolean flag.
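
A rough Python sketch of the idea of folding a boolean argument into a bitwise flag parameter. The enum and function names below are hypothetical and are not the identifiers used in librbd:

```python
from enum import IntFlag


class ObjectCopyFlag(IntFlag):
    """Hypothetical flag names; the real identifiers live in librbd."""
    NONE = 0
    FLATTEN = 1 << 0    # takes the place of the former boolean "flatten" argument
    MIGRATION = 1 << 1  # illustrates that later flags need no signature change


def describe_copy(flags: ObjectCopyFlag) -> str:
    """Report which behaviours a caller requested."""
    parts = []
    if flags & ObjectCopyFlag.FLATTEN:
        parts.append("flatten")
    if flags & ObjectCopyFlag.MIGRATION:
        parts.append("migration")
    return "+".join(parts) or "plain"
```

Callers combine flags with `|`, so adding a new behaviour does not require another positional boolean.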

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit e79f6b1c157e042f57b577bc510debb21e004ea7)

Conflicts:
src/librbd/deep_copy/ObjectCopyRequest.cc: trivial resolution
src/librbd/io/CopyupRequest.cc: trivial resolution
src/test/librbd/deep_copy/test_mock_ObjectCopyRequest.cc: trivial resolution

librbd/deep-copy: object-copy state machine must update object map

If there was no data to copy, the object-copy state machine was bypassing
the object-map update states and prematurely completing. Since the
object-map is default-initialized to all non-existent objects, this results
in incorrect state for OBJECT_EXISTS_CLEAN objects.

This commit was derived from ca0b9bfc28ef7287ca139ca9640c876223eda87b

Signed-off-by: Jason Dillaman <dillaman@redhat.com>

librbd: deep-copy should update object-map before writing to object

For the original use-case of RBD mirroring it was (maybe) more
acceptable to write to the object before updating the object map
because an interrupted sync will be retried. However, when using
the deep-copy object copy state machine as part of copyup, it's
more likely that the object-map has the potential to become
out-of-sync with reality if it's updated after the object is
written.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit e782b85bfda8ae6487c637af0059ab94fba332d6)

Conflicts:
src/librbd/deep_copy/ObjectCopyRequest.cc: trivial resolution
src/test/librbd/deep_copy/test_mock_ObjectCopyRequest.cc: trivial resolution

librbd/object_map: diff state machine should track object existence

The deep-copy snapshot-create state machine initializes the object-map
state to non-existent for all objects. There was an assumption that the
deep-copy object-copy state machine would always update the object map
but that was being skipped for clean objects as an optimization. This
change will support a future commit to run the object-copy state machine
for existing objects.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit b81cd2460de748c71210520f8c819895f257f0c7)

Conflicts:
src/librbd/api/DiffIterate.cc: trivial resolution due to renames

test/librbd: print difference if deep-copy or migration test fails

This may prove useful for tracking the sporadic test failures
observed on Jenkins that are not reproducible locally.

Previously it was disabled because the output could be too
large. But after the hexdump was improved to skip repeating bytes
the output will hopefully be much smaller.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit bb77f740df749de1bba0e91b03c4eb23d5586e43)

Merge pull request #39540 from tchaikov/octopus-pr-35352

octopus: qa/tasks/vstart_runner: do not teardown test_path if "create-cluster-only"

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

qa/tasks/vstart_runner: do not teardown test_path if "create-cluster-only"

Otherwise we could be removing a "None" directory when tearing down the cluster,
and hit the following failure:

Exception ignored in: <bound method LocalContext.__del__ of <__main__.LocalContext object at 0x7f99fd4a6cc0>>
Traceback (most recent call last):
  File "../qa/tasks/vstart_runner.py", line 1189, in __del__
    shutil.rmtree(self.teuthology_config['test_path'])
  File "/tmp/tmp.mmM2ugspuR/venv/lib/python3.6/shutil.py", line 477, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/tmp/tmp.mmM2ugspuR/venv/lib/python3.6/shutil.py", line 475, in rmtree
    orig_st = os.lstat(path)
TypeError: lstat: path should be string, bytes or os.PathLike, not NoneType
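
The failure mode in the traceback above can be avoided with a guard along these lines. This is only a minimal sketch, assuming the path is simply `None` when no test directory was created; `remove_test_path` is a hypothetical helper and not the code in vstart_runner.py:

```python
import os
import shutil
import tempfile
from typing import Optional


def remove_test_path(test_path: Optional[str]) -> bool:
    """Remove test_path if one was set; return True when something was removed."""
    if test_path is None:
        # No directory was created (e.g. a create-cluster-only run), so
        # calling shutil.rmtree(None) would raise the TypeError shown above.
        return False
    shutil.rmtree(test_path)
    return True


# Usage: a real directory is removed, a missing one is skipped quietly.
scratch = tempfile.mkdtemp()
removed = remove_test_path(scratch)
skipped = remove_test_path(None)
```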

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 954e9a6fa67cce8e3eb8105ee858340b60b84b15)

Merge pull request #39532 from liewegas/pr-39496-octopus

octopus: mgr/cephadm: fix host refresh

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

mgr/cephadm: fix host refresh

Fixes: 01f60cf4e0a751c314120c02956d4ff941eb71b4
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9df5a6d73ed21b394c01afe6c9800b6e50737c90)

Merge pull request #39393 from kamoltat/wip-ksirivad-octopus-release-notes

octopus: PendingReleaseNotes: mgr/pg_autoscaler

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #39230 from ifed01/wip-ifed-fix-pin-octopus

octopus: os/bluestore: fixing onode pinning and more

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #39004 from Vicente-Cheng/wip-48568-octopus

octopus: qa/tasks/cephfs/nfs: Check if host ip is in cluster info output

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Varsha Rao <varao@redhat.com>

PendingReleaseNotes: mgr/pg_autoscaler

Added details describing the changes that
occurred after merging #39248 into octopus
upstream.

Signed-off-by: Kamoltat <ksirivad@redhat.com>

Merge pull request #39000 from Vicente-Cheng/wip-48521-octopus

octopus: cephfs: client: add ceph.{cluster_fsid/client_id} vxattrs support

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #38949 from Vicente-Cheng/wip-48644-octopus

octopus: cephfs: client: ensure we take Fs caps when fetching directory link count from cached inode

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #38947 from Vicente-Cheng/wip-48642-octopus

octopus: cephfs: client: set CEPH_STAT_RSTAT mask for dir in readdir_r_cb

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #38612 from ShyamsundarR/wip-47158-octopus

octopus: mgr/volumes: Add a per subvolume trash

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #38466 from Vicente-Cheng/wip-48458-octopus

octopus: cephfs: client: do not use g_conf().get_val<>() in libcephfs

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #39161 from smithfarm/wip-48496-octopus

octopus: mon: paxos: Delete logger in destructor

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>

Merge pull request #39120 from rhcs-dashboard/wip-48739-octopus

octopus: mgr/dashboard: Use secure cookies to store JWT Token

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>

Merge pull request #39248 from kamoltat/wip-ksirivad-octopus-backports

octopus: mgr/pg_autoscaler: avoid scale-down until there is pressure

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #39122 from smithfarm/wip-48692-octopus

octopus: librbd: clear implicitly enabled feature bits when creating images

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #38474 from ifed01/wip-ifed-fix-avl-octopus

octopus: os/bluestore: fix inappropriate ENOSPC from avl/hybrid allocator

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #38040 from kshtsk/wip-octopuse-cephadm-bootstrap-remote

octopus: tests: qa/task/cephadm: run cephadm only on bootstrap_remote

Reviewed-by: Thomas Bechtold <tbechtold@suse.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #38184 from smithfarm/wip-48101-octopus

octopus: rgw/rgw-admin: fixes BucketInfo for missing buckets

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #39169 from sebastian-philipp/octopus-backport-38978

octopus: mgr/cephadm: raise HEALTH_WARN when cephadm daemon in 'error' state

Reviewed-by: Sage Weil <sage@redhat.com>

Merge pull request #39321 from idryomov/wip-krbd-stable-writes-attr-octopus

octopus: qa: krbd_stable_pages_required.sh: move to stable_writes attribute

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Merge pull request #38422 from smithfarm/wip-48285-octopus

octopus: qa: ignore evicted client warnings

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #39297 from sebastian-philipp/octopus-backport-39106

octopus: cephadm: use `apt-get` for package install/update

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Sage Weil <sage@redhat.com>

Merge pull request #39300 from sebastian-philipp/octopus-backport-38998-38927

octopus: mgr/cephadm: try again calling ceph-volume without --filter-for-batch

Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

qa: krbd_stable_pages_required.sh: move to stable_writes attribute

bdi/stable_pages_required attribute was deprecated in 5.10 and now
always returns 0. The replacement is queue/stable_writes. (It is
also writeable, so we can simplify these test cases somewhat in the
future.)

Fixes: https://tracker.ceph.com/issues/48232
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 5adfc15b873bc16d698e7398d3ef2c2a46e8a9df)

Merge pull request #39203 from idryomov/wip-krbd-msgr2-octopus

octopus: krbd: add support for msgr2 (kernel 5.11)

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Merge pull request #38893 from smithfarm/wip-48519-octopus

octopus: pybind/cephfs: fix missing terminating NULL char in readlink()'s C string

Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #38424 from smithfarm/wip-48375-octopus

octopus: cephfs: client: check rdonly file handle on truncate

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #38352 from smithfarm/wip-48370-octopus

octopus: cephfs: mds: dir->mark_new() should together with dir->mark_dirty()

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #38349 from smithfarm/wip-48129-octopus

octopus: cephfs: release client dentry_lease before send caps release to mds

Reviewed-by: Ramana Raja <rraja@redhat.com>

Merge pull request #39296 from sebastian-philipp/octopus-backport-39113

octopus: python-common: fix test_datetime_to_str_2 on non-UTC hosts

Reviewed-by: Sage Weil <sage@redhat.com>

mgr/cephadm: try again calling ceph-volume without --filter-for-batch

Fixes: https://tracker.ceph.com/issues/48870
This deals with a cephadm upgrade issue:

1. user calls `ceph orch upgrade`
2. mgr/cephadm calls `ceph orch config set mgr.x container_image <new-container>`
3. standby mgr gets upgraded
4. mgr failover to new mgr
5. mgr/cephadm calls `_refresh_host_devices`
6. `_refresh_host_devices` calls `ceph orch config get osd container_image`.
  But this returns the old image
7. `_refresh_host_devices` calls `ceph-volume ... --filter-for-batch`
  with an image that doesn't support `filter-for-batch`

The idea is to simply retry calling ceph-volume inventory without `--filter-for-batch`

(also removed `out` being used without being declared)
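
The retry described above might look roughly like the following. It is a sketch under stated assumptions: `run` is a hypothetical callable that executes ceph-volume with the given arguments and raises `RuntimeError` on failure; it is not the mgr/cephadm API.

```python
from typing import Callable, List


def inventory_with_fallback(run: Callable[[List[str]], str]) -> str:
    """Call ceph-volume inventory, retrying without --filter-for-batch on failure.

    `run` is a hypothetical stand-in for whatever executes ceph-volume.
    """
    args = ["inventory", "--format=json", "--filter-for-batch"]
    try:
        return run(args)
    except RuntimeError:
        # An older image may not know the flag; try again without it.
        return run([a for a in args if a != "--filter-for-batch"])
```

Only the second attempt drops the flag, so newer images keep the filtered output.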

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit ede3d6d147dd7b99d37aee5c5fb9340f2878db18)

Conflicts:
  src/pybind/mgr/cephadm/tests/test_cephadm.py

mgr/cephadm: Properly handle JSON Decode error

Fixes 6d759fb5deac0c52b3c738a2e695738228749420

I.e. don't use `out` until it is actually defined

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit eb1a40c3ea8e19259d8ab68a6eeb16d27e4cdbda)

Conflicts:
src/pybind/mgr/cephadm/serve.py

cephadm: use `apt-get` for package install/update

avoids errors during prepare-host:
```
apt: stderr WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
```

Fixes: https://tracker.ceph.com/issues/49032
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit fa4706039cfece36815df46cd4452dc25448c340)

python-common: fix test_datetime_to_str_2 on non-UTC hosts

The old test parsed to a datetime without a tz, which was interpreted as
the local time zone when rendering back to a string. Specify that it's a
UTC datetime so that behavior is consistent regardless of the test host
timezone.
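
A small illustration of the difference between a naive and a UTC-aware datetime, using only the standard library. The timestamp is an arbitrary example and this is not the test from python-common:

```python
from datetime import datetime, timezone

STAMP = "2021-01-18T11:21:56.024810"

# Parsing without a zone yields a naive datetime.
naive = datetime.strptime(STAMP, "%Y-%m-%dT%H:%M:%S.%f")
# Attaching UTC explicitly makes rendering independent of the host timezone.
aware = naive.replace(tzinfo=timezone.utc)

print(naive.tzinfo)       # None
print(aware.isoformat())  # 2021-01-18T11:21:56.024810+00:00
```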

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 78aca4db249c409d0cd5a24bfae81e55cf930bc3)

Merge pull request #39170 from sebastian-philipp/octopus-backport-38910

octopus: cephadm: fix rgw osd cap tag

Merge pull request #39166 from sebastian-philipp/octopus-backport-38804-39003

octopus: cephadm: silence "Failed to evict container" log msg

Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #39167 from sebastian-philipp/octopus-backport-38850

octopus: mgr/cephadm: tolerate old host inventory without 'hostname' key

Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #39168 from sebastian-philipp/octopus-backport-38945

octopus: qa/cephadm: Add yaml output to smoke test

Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #39171 from sebastian-philipp/octopus-backport-39083

octopus: python-common/drivegroups: avoid dropping "rotational: 0" from Device Selection

Reviewed-by: Michael Fritch <mfritch@suse.com>

mgr/pg_autoscaler: avoid scale-down until there is pressure

The autoscaler will start out by scaling each
pool to have a full complement of pgs from the start
and will only decrease it when other pools need more due to
increased usage.

Introduced a unit test that tests only the
function get_final_pg_target_and_ratio(), which
deals with the distribution of pgs amongst the
pools.

Edited workunit script to reflect the change
in how pgs are calculated and distributed.

Signed-off-by: Kamoltat <ksirivad@redhat.com>
(cherry picked from commit daeb6f6ac0c8f77ae07147f9d1e2ed18d6d8e4cc)

Conflicts:
src/pybind/mgr/pg_autoscaler/module.py - trivial fix

Merge pull request #38430 from smithfarm/wip-48281-octopus

octopus: osd: fix bluestore bitmap allocator calculate wrong last_pos with hint

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #38427 from smithfarm/wip-48283-octopus

octopus: rpm,deb: change sudoers file mode to 440

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #38333 from b-ranto/wip-prom-fixes-octopus

octopus: mgr/prometheus: Sync and backport prometheus fixes

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge PR #38425 into octopus

* refs/pull/38425/head:
lvm/create.py: fix a typo in the help message

Reviewed-by: Jan Fajerski <jfajerski@suse.com>

os/bluestore: fix a bug causing unexpected Onode's unpinned state.

There could be a race for Onodes put() and get() methods:

put()(pinned, nref=3)
  int n = --nref; (nref = 2)
  if (n == 2) {
    ..
    std::lock_guard l(ocs->lock);
    ...
    pinned = pinned && nref > 2; (= false)
    ...                                     get()
    if (r) {                                ++nref; (=3)
      n = --nref; (nref = 2)                return;
    }
    ...
    return

As a result nref = 2, pinned = false which is wrong

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit ea0fc57ef57eddc1fb9610850ae766c9fd582bae)

os/bluestore: Prevented erasure of element from onode_map during iteration

When onode.exists == false, getting a reference and then releasing it might delete the onode from the container.
This must not happen during iteration.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 6c8e8a757485d27bfa93d344f3e04aaf29c68cc4)

os/bluestore: Purge onode when it does exist

Added logic for erasing an onode from onode_map when it is the last reference and exists==false.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit aaebfe0abe78bf46c52ef6c4481c517353edf9ac)

Conflicts:
(trivial) src/os/bluestore/BlueStore.h

os/bluestore: Refactor pin() to get more control over its logic

Got rid of OnodeCacheShard pin() and unpin() functions.
Moved their validator logic right into Onode put and get functions.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit eaf1b2366aa7701b10eca4ed3e53d51909e8011b)

os/bluestore: Only pass that decremented nref to 0 deletes object

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit b0e2964ac8a8f3a273046ab8e87a62c1bc4db55c)

rgw : modify error message to NoSuchBucket when bucket doesn't exist in bucket info API

Fixes: https://tracker.ceph.com/issues/48073
Signed-off-by: caolei <halei15848934852@163.com>
(cherry picked from commit bc5ef5c9cf0ea89fc028332c39766eb8e7e1bd0b)

rgw: fixes BucketInfo for missing buckets

The admin api BucketInfo endpoint should now return 404 for buckets that
are not found where only the bucket name is passed as a parameter.

Fixes: https://tracker.ceph.com/issues/45193
Signed-off-by: Nick Janus <njanus@digitalocean.com>
(cherry picked from commit d70ca81502d25bd7a76dd2ed2a538bf5e6584822)

Merge pull request #38971 from smithfarm/wip-48743-octopus

octopus: rgw: distribute cache for exclusive put

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #38970 from smithfarm/wip-48693-octopus

octopus: rgw: adding user related web token claims to ops log

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #38826 from smithfarm/wip-48725-octopus

octopus: rgw: fix bucket limit check fill_status warnings

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #38821 from smithfarm/wip-48804-octopus

octopus: rgw: cls/user: set from_index for reset stats calls

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #39046 from smithfarm/wip-48968-octopus

octopus: ocf: add support for mapping images within an RBD namespace

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Merge pull request #38981 from smithfarm/wip-48864-octopus

octopus: rgw/multisite: Verify if the synced object is identical to source

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #38829 from smithfarm/wip-48544-octopus

octopus: rgw_file: return common_prefixes in lexical order

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>

Merge pull request #38824 from smithfarm/wip-48546-octopus

octopus: rgw: lc: correctly dimension lc shard index vector

Reviewed-by: Matt Benjamin <mbenjami@redhat.com>

Merge pull request #39059 from votdev/issue_48068_tz_octopus

octopus: cephadm: Various properties like 'last_refresh' do not contain timezone

Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>

qa/suites/krbd: add msgr2 modes to most subsuites

basic, rbd and rbd-nomount subsuites are expanded to run with each
of ms_mode=legacy, ms_mode=crc and ms_mode=secure. This increases
the total number of jobs in the suite from 100 to 220.

fsx, singleton and thrash subsuites choose ms_mode at random (from
the above plus ms_mode=prefer-crc).

unmap and wac subsuites remain msgr1-only.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 65948736a41f424d8152b208d013419f8d6038a4)

doc: deprecate [no]cephx_require_signatures map options

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fd5f5722a29964bb33b305a381bcf9a48cdcbb47)

krbd: add support for msgr2

Recognize ms_mode map option and filter initial monitor addresses
accordingly: if ms_mode is not given or ms_mode=legacy, discard v2
addresses, otherwise discard v1 addresses.

Note that nothing was discarded (i.e. v2 addresses were passed to
the kernel) previously.  The intent was to preserve that behaviour
in case ms_mode is not given, allowing the kernel default to be changed
in the future.  However, it turns out that mount.ceph helper has
been misguidedly discarding v2 addresses since commit eae01275134e
("mount.ceph: fork a child to get info from local configuration"),
so that ship has sailed.
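
The filtering rule stated at the top of this message can be sketched in Python as follows. The `(protocol, address)` tuple representation and the function name are assumptions made for illustration; the actual change is in the C++ rbd tool:

```python
from typing import List, Optional, Tuple

Addr = Tuple[str, str]  # (protocol, "host:port"), e.g. ("v2", "10.0.0.1:3300")


def filter_mon_addrs(addrs: List[Addr], ms_mode: Optional[str]) -> List[Addr]:
    """Keep v1 addresses when ms_mode is unset or legacy, v2 addresses otherwise."""
    wanted = "v1" if ms_mode in (None, "legacy") else "v2"
    return [a for a in addrs if a[0] == wanted]
```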

Fixes: https://tracker.ceph.com/issues/48976
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 08f714964b7fe5024504818f01328a41acc24965)

Conflicts:
src/tools/rbd/action/Kernel.cc [ commit 34f539d8af33 ("rbd:
  delay parsing of default kernel map options") not in octopus ]

Merge pull request #38336 from votdev/wip-48398-octopus

octopus: mgr/dashboard: display placement column in service table

Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>

python-common/drivegroups: avoid dropping "rotational: 0" from DeviceSelection

False is a legitimate value for the rotational setting and should be included in the JSON output, only None should be ignored.
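
A minimal sketch of the distinction, assuming the attributes are held in a plain dict; `selection_to_json` is a hypothetical helper, not the DeviceSelection method:

```python
from typing import Any, Dict


def selection_to_json(attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Drop only unset (None) attributes; falsy values such as False or 0 stay."""
    # Testing "if v" here instead would also drop rotational=False.
    return {k: v for k, v in attrs.items() if v is not None}
```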

Fixes: http://tracker.ceph.com/issues/49014
Fixes: cd6a488ab2ca036dd4fb36751b938f605e97e1c8
Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com>
(cherry picked from commit c32f6f5448e51d3196f7a2644ea97ecd22a04f92)

cephadm: fix rgw osd cap tag

The syntax is "allow rwx tag rgw *=*".

Sorry, I thought this would have gotten caught in testing :(

Fixes: 373cc847cf0f8b4ec7aefbfe64c01c3f18a4e021
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit de1efbd62b9258630c2a2d55bfd12034cc8b603f)

mgr/cephadm: raise HEALTH_WARN when cephadm daemon in 'error' state

If cephadm daemons are not happy we should raise a warning. Aside from
being an important part of the user experience, this will also help us
catch teuthology test errors.

Fixes: https://tracker.ceph.com/issues/45628
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 01f60cf4e0a751c314120c02956d4ff941eb71b4)

qa/cephadm: Add yaml output to smoke test

this will provide a more detailed output, like

```yaml
...snip...
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'
status:
  created: '2021-01-18T11:21:56.024810Z'
  last_refresh: '2021-01-18T11:23:24.477672Z'
  running: 0
  size: 1
events:
- "2021-01-18T11:23:09.602644Z service:node-exporter [ERROR] \"Failed while placing\
  \ node-exporter.ubuntuon ubuntu: cephadm exited with an error code: 1, stderr:Deploy\
  \ daemon node-exporter.ubuntu ...\nVerifying port 9100 ...\nTraceback (most recent\
  \ call last):\n  File \"<stdin>\", line 7274, in <module>\n  File \"<stdin>\", line\
  \ 1563, in _default_image\n  File \"<stdin>\", line 3698, in command_deploy\n  File\
  \ \"<stdin>\", line 2338, in deploy_daemon\n  File \"<stdin>\", line 1961, in create_daemon_dirs\n\
  AssertionError\""
...snip...
```

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit 88c6c34e2ba32e38c7fca93722737c3b4d31fe6c)

mgr/cephadm: tolerate old host inventory without 'hostname' key

Older cephadm clusters lack the 'hostname' key in the host spec. e.g.,

"cpach": {"addr": "cpach", "labels": ["mon"]}, "eutow": {"addr": "eutow", "labels": ["mon"]}, "stud": {"addr": "stud", "labels": ["mon"]}}

Populate hostname from the dict key if necessary for compatibility.
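
One way to express this compatibility step, as a sketch only; `fill_hostnames` is a hypothetical name and the real inventory class in mgr/cephadm may do this differently:

```python
from typing import Any, Dict


def fill_hostnames(inventory: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return a copy in which every host spec has a 'hostname' entry."""
    return {
        # The dict key is the default; an existing 'hostname' in spec wins.
        name: {"hostname": name, **spec}
        for name, spec in inventory.items()
    }
```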

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit ad4ea787d063ba65269f38f185ff18a327cb7bbb)

cephadm: fix 2> syntax in unit.run

We need a space between the command (which ends with a container name)
and the 2> or else the 2 is considered part of the command. E.g.,

! /usr/bin/podman rm -f ceph-a9a8c7ee-5b72-11eb-8f93-001a4aab830c-mon.a2> /dev/null

Fixes: 1bed46e4b0094863a119df59c6ae5f254c2e211d
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit ce4743f72e6c7eea9514b8c9b6c20309fea5d455)

Conflicts:
src/cephadm/cephadm

cephadm: silence "Failed to evict container" log msg

Right now, we're printing some evil looking messages in the log:

```
systemd[1]: Starting Ceph mgr.node2.ankmgz for ...
podman[32354]: Error: no container with name or ID ceph-... found: no such container
bash[32363]: Error: Failed to evict container: "": Failed to find container "ceph-..." in state: no container with name or ID ceph-... found: no such container
bash[32363]: Error: no container with ID or name "ceph-..." found: no such container
```

Also, the unit.run command already removes the container. No need
for ExecStartPre to do the same.

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit 1bed46e4b0094863a119df59c6ae5f254c2e211d)

cephadm: Various properties like 'last_refresh' do not contain timezone

Fixes: https://tracker.ceph.com/issues/48068
Signed-off-by: Volker Theile <vtheile@suse.com>
(cherry picked from commit 3fe715201c8c07cf4ea86b590f9682422eeccf33)

mon: paxos: Delete logger in destructor

reset() can race with shutdown() leading to a use-after-free on the
'logger' object.

Fixes: https://tracker.ceph.com/issues/48386
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit cc295d1c87552eb76b9188d88d7e6ab2f3108149)

Conflicts:
src/mon/Paxos.h
- stable branch has slightly different Paxos and get_name function declarations

mgr/dashboard: Use secure cookies to store JWT Token

This PR intends to store the jwt token in secure cookies instead of local storage

Fixes: https://tracker.ceph.com/issues/44591
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 36703c63381e6723fff57266235f8230e6af1d92)
(cherry picked from commit 3c72dc309936b23e413dc1aee8ca49c795c48a0f)

Conflicts:
qa/tasks/mgr/dashboard/helper.py
qa/tasks/mgr/dashboard/test_auth.py
src/pybind/mgr/dashboard/controllers/__init__.py
src/pybind/mgr/dashboard/controllers/auth.py
src/pybind/mgr/dashboard/controllers/saml2.py
src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/01-hosts.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/02-hosts-inventory.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/03-inventory.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/04-osds.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/ui/language.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/ui/navigation.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/package-lock.json
src/pybind/mgr/dashboard/frontend/package.json
src/pybind/mgr/dashboard/frontend/src/app/app.module.ts
src/pybind/mgr/dashboard/frontend/src/app/core/navigation/dashboard-help/dashboard-help.component.ts
- Adopting the changes from the master branch, ignoring a few e2e changes
as a few files don't exist in octopus.

Merge pull request #39018 from sebastian-philipp/octopus-backport-38766

octopus: cephadm: make "ceph orch {restart|...}" asynchronous

Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #39019 from sebastian-philipp/octopus-backport-38815

octopus: mgr/cephadm: lock multithreaded access to OSDRemovalQueue

Reviewed-by: Michael Fritch <mfritch@suse.com>

Merge pull request #39020 from sebastian-philipp/octopus-backport-38904

octopus: cephadm: Don't make sysctl spam the log file

Reviewed-by: Michael Fritch <mfritch@suse.com>

rgw: use static_ptr for etag verifiers

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 45a060612ec376110b84c2a2b7783c8a6aee191c)

rgw: add factory function create_etag_verifier()

move all of the etag verifier initialization into a helper function.
none of the errors there should be fatal and fail the download, they
should just turn etag verification off

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 8fa8974bbd63fbc8be9cdf929a875910e2147d65)

rgw: move etag verifiers to namespace rgw::putobj

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 6ad2e3eef3f9bbf41de471ef5bad9502023e113c)

rgw: simplify out SourceObjType

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 21cc9034410bb6e675b1b04888e1b85e3eb5d71f)

rgw: rgw_sync_obj_etag_verify accounts for compressed multipart uploads

the etag verifier for multipart uploads uses the manifest to get the
logical offsets for each part. but when compression is enabled, those
are offsets into the compressed data. use the source object's compression
info to translate those compressed part offsets back to their original
offsets
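
The offset translation can be pictured with the sketch below. It assumes the compression info is available as a sorted list of (original offset, compressed offset) block pairs and that each part starts on a block boundary; both the representation and the function name are illustrative assumptions, not the rgw data structures:

```python
from bisect import bisect_right
from typing import List, Tuple

Block = Tuple[int, int]  # (original_offset, compressed_offset), sorted ascending


def to_original_offset(blocks: List[Block], compressed_ofs: int) -> int:
    """Map an offset in the compressed stream back to the original stream.

    Assumes the offset falls exactly on a block boundary.
    """
    starts = [c for _, c in blocks]
    i = bisect_right(starts, compressed_ofs) - 1
    if i < 0 or starts[i] != compressed_ofs:
        raise ValueError("offset is not a block boundary")
    return blocks[i][0]
```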

Fixes: https://tracker.ceph.com/issues/45992
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 51f79fad8398d35e07f01fa45704124e16fadeec)

rgw: ETagVerifier_MPU takes existing offset vector

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 2c1934bd3746697249b95b3f79f4c05425d7b40e)

rgw: add helper to decode compression info from single attr

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a5520be135156c867a6502845603e2afdbb0a44a)

RGW:Multisite: Check rgw_sync_obj_etag_verify option only once

Signed-off-by: Prasad Krishnan <prasad.krishnan@flipkart.com>
(cherry picked from commit f92cfaf74f1eb8809653867b8c67a3ace37619f2)

RGW:Multisite: Convert is_mpu_obj into an enum SourceObjType

Signed-off-by: Prasad Krishnan <prasad.krishnan@flipkart.com>
(cherry picked from commit fa5422597837032d97f9afceff8b5a22fad0cda7)

RGW:Multisite: Rename rgw_sync_obj_integrity to rgw_sync_obj_etag_verify

Signed-off-by: Prasad Krishnan <prasad.krishnan@flipkart.com>
(cherry picked from commit 6c4262bfa7d54563ea8bf616154a6b3491d59347)

RGW:Multisite: Rename rgw_copy_verify_object to rgw_sync_obj_integrity

This patch renames the option rgw_copy_verify_object to
rgw_sync_obj_integrity and incorporates more changes suggested through
code-review comments.

Signed-off-by: Prasad Krishnan <prasad.krishnan@flipkart.com>
(cherry picked from commit 31e944fced60e47139973361cbb753aeaeb3c863)

RGW:Multisite: Create a new filter for ETag Verifier

This patch re-writes the ETag verifier into a filter that peeks into the
incoming stream of data and calculates MD5 checksum.
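
The pass-through idea can be sketched in Python with `hashlib`. The class below is a hypothetical illustration of a filter that hashes data as it flows by; it does not mirror the rgw C++ interfaces:

```python
import hashlib
from typing import Iterable, Iterator


class MD5PeekFilter:
    """Pass chunks through unchanged while hashing them on the way."""

    def __init__(self) -> None:
        self._md5 = hashlib.md5()

    def process(self, chunks: Iterable[bytes]) -> Iterator[bytes]:
        for chunk in chunks:
            self._md5.update(chunk)  # peek at the data
            yield chunk              # forward it untouched

    def etag(self) -> str:
        """Hex digest of everything seen so far."""
        return self._md5.hexdigest()
```

Because the filter only observes the stream, downstream consumers receive the same bytes they would have received without it.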

Signed-off-by: Prasad Krishnan <prasad.krishnan@flipkart.com>
(cherry picked from commit 2677c4b88806d4af6d525157e7006c1b0ca1b964)