rpm, debian: move smartmontools and nvme-cli to ceph-base

We wish to be able to scrape SMART and NVMe metrics from OSD and MON
nodes. For this we require / recommend smartmontools and nvme-cli
dependencies for both the ceph-osd and ceph-mon packages. However, the
sudoers file (which is required for invoking `smartctl` by user 'ceph')
was installed only in the ceph-osd package. Since different packages
cannot own the same file, and because we want to be able to scrape from
every daemon, we move the dependencies and the sudoers installation to
ceph-base. For generalization, we rename:
sudoers.d/ceph-osd-smartctl -> sudoers.d/ceph-smartctl

Fixes: https://tracker.ceph.com/issues/50657
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 7ca39fa92b47427af2f1c6000c653bb4dffc47fe)

Conflicts:
ceph.spec.in
debian/rules

Cherry-pick notes:
- conflict due to octopus not having jaeger dep
- conflict due to octopus not installing rbd-nbd_quiesce on debian

Merge pull request #42958 from ifed01/wip-ifed-fix-huge-omap-rename-oct

octopus: os/bluestore: cap omap naming scheme upgrade transaction.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #43959 from guits/wip-53283-octopus

octopus: ceph-volume: `get_first_lv()` refactor

os/bluestore: Fix omap upgrade to per-pg scheme

This fixes a regression introduced by the earlier omap upgrade fix: https://github.com/ceph/ceph/pull/43687
The problem was that we always skipped the first omap entry.
This worked fine for objects that have an omap header key.
For objects without a header key, we skipped the first actual omap key.

Fixes: https://tracker.ceph.com/issues/53260
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 65a3f374aa1c57c5bb9401e57dab98a643b4360a)

ceph-volume: `get_first_*()` refactor

As indicated by commit 17957d9beb42a04b8f180ccb7ba07d43179a41d3, those
functions were meant to avoid writing something like the following:

```
lvs = get_lvs()
if len(lvs) >= 1:
    lv = lvs[0]
```

These functions should return `None` if 0 or more than 1 item is returned.
The current names are confusing and can lead to thinking that we just
want the first item returned, even when more than 1 item is returned.
Let's rename them to `get_single_pv()`, `get_single_vg()` and
`get_single_lv()`.
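
A minimal sketch of the semantics described above, assuming a hypothetical
`get_single()` helper that works on an already-fetched list. It is not the
ceph-volume API and does not query LVM:

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def get_single(items: Sequence[T]) -> Optional[T]:
    """Return the only element of `items`, or None if there are 0 or >1.

    Hypothetical helper for illustration; not the ceph-volume API.
    """
    if len(items) == 1:
        return items[0]
    return None
```

The point of the rename is visible here: the caller gets a value only when
the match is unambiguous, never "whichever came first".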

Closes: https://tracker.ceph.com/issues/49643
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a5e4216b49704783c55fb83b3ae6dde35b0082ad)

Merge pull request #43952 from guits/wip-52597-octopus

octopus: ceph-volume: util/prepare fix osd_id_available()

Merge pull request #43950 from guits/wip-53189-octopus

octopus: ceph-volume: fix a typo causing AttributeError

ceph-volume: util/prepare fix osd_id_available()

The current check only allows requesting an OSD id that exists and is
marked as 'destroyed'.
With this small fix, we can now use `--osd-id` with an id that doesn't
exist at all.
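
The intended behaviour can be sketched as follows. The input shape (one
mapping per existing OSD with "id" and "status" keys) is an assumption made
for this example, not the real data ceph-volume inspects:

```python
from typing import Iterable, Mapping, Optional


def osd_id_available(osd_id: Optional[int], osds: Iterable[Mapping]) -> bool:
    """Decide whether `osd_id` may be requested with --osd-id.

    `osds` is an assumed, simplified view of the cluster: one mapping per
    existing OSD with "id" and "status" keys.
    """
    if osd_id is None:
        return False
    for osd in osds:
        if osd["id"] == osd_id:
            # The id exists: only reusable when marked destroyed.
            return osd["status"] == "destroyed"
    # The id does not exist at all: it is free to use.
    return True
```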

Fixes: https://tracker.ceph.com/issues/50880
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 73bfa5d2b0157f92721d8bf36619fd35ee265cdd)

ceph-volume: fix a typo causing AttributeError

Signed-off-by: Taha Jahangir <mtjahangir@gmail.com>
(cherry picked from commit 4cdbba3344fe26b6351e88ce00a8655890a02115)

Merge pull request #43747 from mfoliveira/wip-53100-octopus

octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

os/bluestore: fix invalid omap name conversion when upgrading to per-pg.

Fixes: https://tracker.ceph.com/issues/53062
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit cbc97018d883333f81ab9a3cfa99d2f68a9874cd)
(cherry picked from commit dc0a7e49434f76d97016934feed9a8ec806d1e42)

Merge pull request #43658 from badone/wip-octopus-ceph-ansible-systemd-bug

octopus: qa/ceph-ansible: Bump OS version for centos

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_bytes

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly-fragmented system, iterating the whole tree could hurt the
performance of a fast storage system a lot.

the idea comes from openzfs's metaslab allocator.
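
a rough sketch of the idea, not the AvlAllocator code: free extents are
modelled as a sorted Python list instead of an AVL tree, the parameter names
merely mirror the new options, and treating a cap of 0 as "no limit" is an
assumption of this example.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (offset, length) of a free extent


def pick_block(free: List[Range], cursor: int, size: int,
               max_search_count: int, max_search_bytes: int) -> Optional[int]:
    """Return the offset of a free extent of at least `size` bytes.

    `free` must be sorted by offset. A cap of 0 means "no limit"
    (an assumption of this sketch).
    """
    start = None
    for searched, (offset, length) in enumerate(
            (r for r in free if r[0] >= cursor), start=1):
        if start is None:
            start = offset
        if length >= size:
            return offset  # first fit
        if max_search_count and searched >= max_search_count:
            break
        if max_search_bytes and offset - start >= max_search_bytes:
            break
    # Fall back to best fit: the smallest extent that is large enough.
    fitting = [r for r in free if r[1] >= size]
    if not fitting:
        return None
    return min(fitting, key=lambda r: (r[1], r[0]))[0]
```

with a small cap the first-fit walk gives up early and the smallest adequate
extent is chosen instead, which bounds the cost of a search on a fragmented
map.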

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5a26875049d13130ffe5954428da0e1b9750359f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/common/options/global.yaml.in:
- Moved new option into src/common/options.cc

os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_count

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly-fragmented system, iterating the whole tree could hurt the
performance of a fast storage system a lot.

the idea comes from openzfs's metaslab allocator.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 40f05b971f5a8064cf9819f80fc3bbf21d5206da)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/common/options/global.yaml.in
- Moved new option into src/common/options.cc

os/bluestore/AvlAllocator: use cbit for counting the order of alignment

no need to calculate the alignment first, cbits() would suffice, as it
counts the first set bit and the following 0's in a number. the result
is identical to cbits(alignment of that number).

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 573cbb796e8ba2f433caa308925735101a8161a6)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore/AvlAllocator: use delegated ctor

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9b52ba1dd0a5e199833d7ab2561a7b388d85afc1)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/os/bluestore/AvlAllocator.cc
- Replace `std::string_view name` w/ `const std::string& name`.

os/bluestore/AvlAllocator: specialize _block_picker()

before this change AvlAllocator::_block_picker() is used by both the
best-fit mode and first-fit mode. but since we cannot achieve the
locality by searching in the area pointed to by the cursor in best-fit mode,
we just pass a dummy cursor to AvlAllocator::_block_picker() when
searching in the best-fit mode.

but since the range_size_tree is already sorted by the size of ranges,
if _block_picker() fails to find one by the size, we should just give
up right away, and instead try again using a smaller size.

after this change, instead of sharing AvlAllocator::_block_picker()
across both the first-fit mode and the best-fit mode, this method
is specialized into two different variants: one for first-fit, and the
simplified one for best-fit.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4837166f9e7a659742d4184f021ad12260247888)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore: Improve _block_picker function

Make _block_picker function scan (*cursor, end) + (begin, *cursor) instead of (*cursor, end) + (begin, end).
The second run over range (*cursor, end) could never yield any results.
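
A minimal sketch of the improved scan order, assuming free extents are kept
in a list sorted by offset (the real allocator uses a tree; the names here
are illustrative only):

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (offset, length) of a free extent


def first_fit_wrapped(free: List[Range], cursor: int, size: int) -> Optional[int]:
    """Scan [cursor, end) and then [begin, cursor), each range once."""
    tail = [r for r in free if r[0] >= cursor]
    head = [r for r in free if r[0] < cursor]
    for offset, length in tail + head:
        if length >= size:
            return offset
    return None
```

Every extent is visited at most once, so a failed search costs one pass over
the map rather than up to two.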

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit c732060d3e3ef96c6da06c9dde3ed8c064a50965)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore: do not call _block_picker() again if already searched from start()

Fixes: https://tracker.ceph.com/issues/48272
Signed-off-by: Xue Yantao <jhonxue@tencent.com>
(cherry picked from commit fd5ca26e4a23d6c8992ab5927ce85ade958e251f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

octopus: qa/ceph-ansible: Bump OS version for centos

The systemd version in the 8.3 image is buggy so use 8.4 instead.

Fixes: https://tracker.ceph.com/issues/52923
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

os/bluestore: cap omap naming scheme upgrade transaction.

We shouldn't use a single per-onode transaction for such an upgrade when
the onode's omap list is huge. This results in similarly sized WAL/SST
files, which are inefficient, might cause high memory usage and are
sometimes error-prone.
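
The capping can be pictured as splitting one large write into bounded
batches. The sketch below is illustrative only: `capped_batches()` and its
byte-based cap are assumptions of this example, not the BlueStore
implementation.

```python
from typing import Iterable, Iterator, List, Tuple

KeyValue = Tuple[bytes, bytes]


def capped_batches(entries: Iterable[KeyValue],
                   max_bytes: int) -> Iterator[List[KeyValue]]:
    """Yield batches whose payload stays at or under `max_bytes`.

    A single entry larger than the cap still gets a batch of its own.
    """
    batch: List[KeyValue] = []
    used = 0
    for key, value in entries:
        cost = len(key) + len(value)
        if batch and used + cost > max_bytes:
            yield batch
            batch, used = [], 0
        batch.append((key, value))
        used += cost
    if batch:
        yield batch
```

Each yielded batch would then be submitted as its own transaction.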

Fixes: https://tracker.ceph.com/issues/49170
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit e897fa243c1dd38329733b452872616023f14ac8)

Conflicts (caused by lack of per-pg omap scheme):
src/os/bluestore/BlueStore.cc
src/os/bluestore/BlueStore.h
src/os/bluestore/bluestore_types.h

Merge pull request #43758 from tchaikov/octopus-pr-43748

octopus: admin/doc-requirements.txt: pin Sphinx at 3.5.4

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>

test/rgw: use spawn library for test_rgw_dmclock_scheduler

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a8e3589a2c875b6fadc853c75f20cb9256f294ca)

Conflicts:
src/test/rgw/CMakeLists.txt
src/test/rgw/test_rgw_dmclock_scheduler.cc: trivial resolution

test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler

the AsyncScheduler uses an asio timer to dispatch work to its executor
with an optional delay. when no delay is requested, it waits on the
timer with an expiration time in the past (crimson::dmclock::TimeZero)

tests are failing here because poll() is returning without executing the
handlers of those expired timers

asio implements these timers with timerfd and epoll. debugging with
strace, i see that these timers armed with timerfd_settime() are not
always immediately ready according to epoll_wait():

  eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = 3
  epoll_create1(EPOLL_CLOEXEC)            = 4
  timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
  epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=14164052, u64=14164052}}) = 0
  epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=14164064, u64=14164064}}) = 0
  timerfd_settime(5, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}}) = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164052, u64=14164052}}], 128, 0) = 1
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164064, u64=14164064}}], 128, 0) = 1

in this example, it took 6 calls to context.poll() before it was ready
to execute the timer's handler

to work around this, replace calls to context.poll() with calls to
context.run_for() with a very short duration

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 21baed999e31c5e69c75f0cbb8757ef91585d917)

include/rados/librados.h: avoid redefinition of rados_object_list_item

doxygen complains when it sees rados_object_list_item defined twice,
so let's fix it.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4c53a02ac56e4e87631262fffc711e2e009561d7)

mgr/dashboard: pin a version for autopep8 and pyfakefs

Fixes: https://tracker.ceph.com/issues/53024
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 946dab4f608ec47e0a3cfefdf8e7d1afda69117f)

Conflicts:
src/pybind/mgr/dashboard/requirements-lint.txt: trivial
resolution

admin/doc-requirements.txt: pin Sphinx at 3.5.4

* pin Sphinx at 3.5.4
* pin docutils at 0.18

at least the combination of these two versions
is known to compile.

to address the bug reported at
https://sourceforge.net/p/docutils/bugs/431/

the backtrace looks like:

/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/sphinx/util/docutils.py:285:
RemovedInSphinx30Warning: function based directive support is now
deprecated. Use class based directive instead.
  warnings.warn('function based directive support is now deprecated. '

Exception occurred:
  File
"/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/docutils/writers/html5_polyglot/__init__.py",
line 445, in section_title_tags
    if (ids and self.settings.section_self_link
AttributeError: 'Values' object has no attribute 'section_self_link'

please note this change is not cherry-picked from
master, because master already bumped Sphinx to 3.5.4
in 4968baa2523bd2a5ca6be147b26bc28906a864c9.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 6362b554e2e57227c9fda51c7703f14674da8358)

Conflicts:
admin/doc-requirements.txt: trivial resolution

admin/doc-requirements.txt: require breathe >= 4.20

to be compatible with Sphinx 3.2
see https://github.com/michaeljones/breathe/tree/v4.20.0#change-log

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 2f1f16ed865114faedc3bdb3049da722284fab61)

doc/scripts/gen_state_diagram.py: Fix literal comparison syntax warnings

In Python 3.8, comparing strings using 'is' and 'is not' emits a syntax
warning. Use the equality operators instead.
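
For illustration, with a made-up function name (this is not code from
gen_state_diagram.py):

```python
def is_start_state(name: str) -> bool:
    # `name is "start"` compares object identity and triggers
    # 'SyntaxWarning: "is" with a literal' on Python 3.8+.
    # Equality compares the string contents, which is what is meant.
    return name == "start"
```

Identity comparison can also give the wrong answer for equal strings that
happen to be distinct objects, so the change is more than cosmetic.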

Signed-off-by: Varsha Rao <varao@redhat.com>
(cherry picked from commit 61e7bcded852e90e6249ab0f3c37ec2688537c83)

doc/conf.py: define CEPH_RADOS_API for breathe

otherwise we could get the following errors:

Invalid C declaration: Expected identifier in nested name, got keyword: int [error at 18]
CEPH_RADOS_API int rados_aio_append (rados_ioctx_t io, const char *oid, rados_completion_t completion, const char *buf, size_t len)
------------------^

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit cf357e17ab5cbd9af4f67c51117b5f368cb9913f)

Merge pull request #43441 from rhcs-dashboard/wip-52836-octopus

octopus: qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #43381 from nkshirsagar/octopus

octopus: rgw: clear buckets before calling list_buckets()

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Prashant D <pdhange@redhat.com>
Reviewed-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>

Merge pull request #43325 from gerald-yang/octopus-50482

octopus: msgr/async: fix unsafe access in unregister_conn()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #43310 from gerald-yang/octopus-51199

octopus: msg/async: allow connection reaping to be tuned; fix cephfs test

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #43557 from badone/wip-octopus-ceph-ansible-max-usable-version

octopus: qa/ceph-ansible: Pin to last compatible stable release

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #43607 from rhcs-dashboard/wip-52986-octopus

octopus: mgr/dashboard/api: set a UTF-8 locale when running pip

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

mgr/dashboard/api: set a UTF-8 locale when running pip

ansible-core started to include files whose names contain non-ASCII
characters, so we have to use a more capable encoding for the locale
in order to install this package. otherwise we'd get the following
error:

Collecting ansible-core<2.12,>=2.11.3
  Using cached ansible-core-2.11.4.tar.gz (6.8 MB)
ERROR: Exception:

Traceback (most recent call last):
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
...
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 137-140: ordinal not in range(256)
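
the failure mode can be reproduced in isolation. the file name below is made
up for the example; it only needs one character outside the latin-1 range:

```python
filename = "ansible-\u540d.txt"  # made-up name with a non-ASCII character


def can_encode(text: str, encoding: str) -> bool:
    """Report whether `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True
```

`can_encode(filename, "latin-1")` is false while
`can_encode(filename, "utf-8")` is true, which is why a UTF-8 locale lets pip
unpack the archive.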

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 05e4145856bb5ed19ecc879f2e50b5a88cb2045e)

15.2.15

octopus: qa/ceph-ansible: Pin to last compatible stable release

https://github.com/ceph/ceph-ansible/pull/6892 introduces a breaking
change so pin the CA version to stable-6.0 and ansible 2.9.

Fixes: https://tracker.ceph.com/issues/52943
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

rgw: clear buckets before calling list_buckets()

The radosgw-admin bucket limit check command has a bug in
octopus. Since we do not clear the bucket list before
list_buckets() returns the next max_entries, they are appended
to the existing list and we end up counting the first ones again.

This bug is triggered if bucket count exceeds max_entries and
causes duplicates in the output of radosgw-admin bucket limit check.

The fix clears the buckets structure before the list_buckets()
populates it again with the next lot of buckets to iterate through.
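
A simplified model of the paging loop, assuming a hypothetical
`list_page(marker, max_entries)` callable that returns the next page of
bucket names. It is not the radosgw-admin code:

```python
from typing import Callable, List


def count_buckets(list_page: Callable[[str, int], List[str]],
                  max_entries: int) -> int:
    """Count buckets page by page without counting any entry twice."""
    total = 0
    marker = ""
    buckets: List[str] = []
    while True:
        buckets.clear()  # the fix: drop the previous page first
        buckets.extend(list_page(marker, max_entries))
        total += len(buckets)
        if len(buckets) < max_entries:
            return total
        marker = buckets[-1]
```

Without the `clear()` call, each new page is appended to the previous ones
and the earlier entries are counted again.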

partial manual cherry-pick of 99f7c4aa1286edfea6961b92bb44bb8fe22bd599

Signed-off-by: Nikhil Kshirsagar <nkshirsagar@gmail.com>

Merge pull request #43418 from trociny/wip-51645-octopus

octopus: osd/OSD: mkfs need wait for transaction completely finish

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #43407 from trociny/wip-52809-octopus

octopus: tools/erasure-code: new tool to encode/decode files

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43268 from cfsnyder/wip-52587-octopus

octopus: ceph-volume: fix lvm activate --all --no-systemd

qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Fixes: https://tracker.ceph.com/issues/48845
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2283cb068b82033b14587c7bac6a28440221dcd8)

Merge pull request #43424 from cfsnyder/wip-51695-octopus

octopus: rgw: fail as expected when set/delete-bucket-website attempted on a non-exis…

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43314 from idryomov/wip-rbd-mirror-snapshot-rx-only-octopus

octopus: rbd-mirror: unbreak one-way snapshot-based mirroring

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #43312 from idryomov/wip-keyring-resolve-error-octopus

octopus: auth,mon: don't log "unable to find a keyring" error when key is given

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43263 from cfsnyder/wip-51552-octopus

octopus: ceph-monstore-tool: use a large enough paxos/{first,last}_committed

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #42986 from MrFreezeex/wip-52460-octopus

octopus: rbd-mirror: add perf counters to snapshot replayed

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #42837 from steveftaylor/48212

octopus: mon/OSDMonitor: account for PG merging in epoch_by_pg accounting

Reviewed-by: Neha Ojha <nojha@redhat.com>

rgw: fail as expected when set/delete-bucket-website attempted on a non-existent bucket, rgw should return HTTP 404 and NoSuchBucket.

Fixes: https://tracker.ceph.com/issues/51536
Signed-off-by: xiangrui meng <mengxr@chinatelecom.cn>
(cherry picked from commit c623aa45d35b269c6701a57e44ac05bb29a79dc8)

Conflicts:
- src/rgw/rgw_op.cc

Cherry-pick notes:
- rgw_op.cc forward_request_to_master takes different arguments in Octopus vs. Quincy

osd/OSD: mkfs need wait for transaction completely finish

when running ceph-osd mkfs, the block data is sometimes written
incompletely by the time the ceph-osd process exits. we need to
wait for the transaction to complete.

Signed-off-by: Chen Fan <fan.chen@easystack.cn>
(cherry picked from commit 0ffadad3a83b3ca634d7d58a80c84d1d8761e2ea)

Merge pull request #43349 from cfsnyder/wip-52351-octopus

octopus: rgw: fix sts memory leak

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43270 from cfsnyder/wip-51778-octopus

octopus: rgw : add check for tenant provided in RGWCreateRole

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #43265 from cfsnyder/wip-52331-octopus

octopus: cmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43234 from MrFreezeex/wip-51838-octopus

octopus: ceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #43369 from tchaikov/octopus-pr-39602

octopus: mgr/influx: use "N/A" for unknown hostname

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43352 from rhcs-dashboard/wip-52773-octopus

octopus: qa/mgr/dashboard: add extra wait to test

Reviewed-by: Nizamudeen A <nia@redhat.com>

test/erasure-code: remove ceph_erasure_code

Its functionality is moved to ceph-erasure-code-tool.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 56aaaf8a97574a0284bad37cc0ba5e1c262f33e0)

rpm,deb: add ceph-erasure-code-tool to ceph-osd package

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 53c75eebbdc9d85b07dd90608b45ad794a54f848)

ceph-erasure-code-tool: new tool to encode/decode files

E.g. it may be useful as a last resort when recovering an object from
a damaged PG: extract the encoded object chunks from the PG shards
with ceph-objectstore-tool and then decode with ceph-erasure-code-tool.

It also has functionality similar to what ceph_erasure_code test provides.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit b274357d29fb25a29f75c08283fec185902e7970)

mgr/influx: use "N/A" for unknown hostname

in theory, there is a chance that get_metadata() returns None, so let's use
"N/A" in this case.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e457ca50011f70cf01a62323998af233a484f338)

qa/mgr/dashboard: add extra wait to test

Fixes: https://tracker.ceph.com/issues/49344
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 9ff778cdaa1ef40fcfa04f221a1da786a0e19655)

rgw: fix sts memory leak

fix https://tracker.ceph.com/issues/52290

Signed-off-by: yuliyang_yewu <yuliyang_yewu@cmss.chinamobile.com>
(cherry picked from commit ef921bcdaa78d33ed0611a60ec58826d8e6ccb45)

rgw : add check for tenant provided in RGWCreateRole

Fixes: https://tracker.ceph.com/issues/51206
Signed-off-by: caolei <halei15848934852@163.com>
(cherry picked from commit 3c99ac14080c9f5b1611c9bbe4a223a9fd2927a0)

Conflicts:
src/rgw/rgw_rest_role.cc

- Octopus constructs role explicitly vs. using store->get_role(), and does not wrap in a unique_ptr

tasks/ceph_manager: ignore EACCES when waiting for quorum

mon_tick_interval is 5 seconds by default. monitors update their
rotating keys every mon_tick_interval. before the monitors form a
quorum, the auth requests from clients are put into the wait list.
these requests are re-enqueued once the monitors form a quorum. but
there is a small window, up to mon_tick_interval long, in which the
monitors cannot yet serve auth requests even though they claim to be
able to serve requests. if these re-enqueued requests happen to be
served in this window, and if authx is enabled, they will be greeted
with errors like

handle_auth_bad_method server allowed_methods [2] but i only support [2]

in the case of ceph cli, the error would look like:

[errno 13] RADOS permission denied (error connecting to the cluster)

so, to address this issue, the EACCES error is ignored when waiting
for a quorum.
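
a sketch of the retry pattern, assuming a hypothetical `get_quorum()`
callable that raises OSError with errno EACCES while the monitors cannot yet
authenticate clients. the real task uses its own helpers; none of these names
come from ceph_manager:

```python
import errno
import time
from typing import Callable, List


def wait_for_quorum(get_quorum: Callable[[], List[str]], want: int,
                    tries: int = 30, sleep: float = 0.0) -> List[str]:
    """Poll until `want` monitors are in quorum, ignoring EACCES."""
    for _ in range(tries):
        try:
            quorum = get_quorum()
        except OSError as e:
            if e.errno != errno.EACCES:
                raise
            quorum = []  # auth not ready yet; treat as "no quorum" and retry
        if len(quorum) == want:
            return quorum
        time.sleep(sleep)
    raise TimeoutError("quorum of %d not reached" % want)
```

any other error still propagates, so only the transient permission failure is
swallowed.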

Signed-off-by: Kefu Chai <kchai@redhat.com>

tasks/ceph_manager: use safe_while() to refactor the wait for quorum

for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>

ceph-monstore-tool: use a large enough paxos/{first,last}_committed

so the rebuild paxos transaction won't be overwritten by the ones
created before recovery completes.

when the quorum is recovering, the leader will collect the paxos
transactions from peons. if the quorum accepts the proposal for setting
the fingerprint, the peon will update the monitor with the paxos
transaction with a newer "last_committed" than the one created using
update_paxos() in ceph_monstore_tool.cc. the latter "last_committed" is
always 0.

so, to avoid this extra paxos proposal obsoleting the "rebuilding" paxos
transaction, we use a large enough number for {first,last}_committed.

Fixes: http://tracker.ceph.com/issues/38219
Signed-off-by: Kefu Chai <kchai@redhat.com>

Merge pull request #43094 from mgfritch/use-quay-octopus

octopus: cephadm: use quay, not docker

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>

Merge pull request #43266 from cfsnyder/wip-51555-octopus

octopus: mon: return -EINVAL when handling unknown option in 'ceph osd pool get'

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #42616 from neha-ojha/wip-51967-octopus

octopus: common/options: Set osd_client_message_cap to 256.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

qa/suites/rbd: test case for one-way snapshot-based mirroring

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 366e9c51a8d83a7c185a00d9fb9e4cde290145e4)

rbd-mirror: fix a couple of brainos in log messages

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fdcdeae2a26927b51ab8a480d48a52b896b5532b)

rbd-mirror: unbreak one-way snapshot-based mirroring

Snapshot replayer needs the remote's mirror peer uuid to find its
snapshots in the remote image. It is obtained by listing remote's
mirror peers but RemotePoolPoller::handle_mirror_peer_list() skips
tx-only (MIRROR_PEER_DIRECTION_TX) peers. In effect only rx-tx
(MIRROR_PEER_DIRECTION_RX_TX) peers are considered for matching
and snapshot replayer always fails with "failed to retrieve mirror
peer uuid from remote pool" error.

Instead, skip rx-only (MIRROR_PEER_DIRECTION_RX) peers as we are
definitely not interested in anything having to do with mirroring
_to_ the remote cluster.
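
The corrected filter can be sketched as below. The enum is a stand-in for the
MIRROR_PEER_DIRECTION_* constants with arbitrary values, and
`candidate_peers()` is a made-up name for this example:

```python
from enum import Enum
from typing import Iterable, List, Tuple


class Direction(Enum):
    # Stand-ins for MIRROR_PEER_DIRECTION_*; the values are arbitrary here.
    RX = "rx"
    TX = "tx"
    RX_TX = "rx-tx"


def candidate_peers(peers: Iterable[Tuple[str, Direction]]) -> List[str]:
    """Return uuids of remote peers that may describe mirroring to us."""
    return [uuid for uuid, direction in peers if direction is not Direction.RX]
```

Both tx-only and rx-tx peers on the remote side remain candidates; only
rx-only peers are dropped.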

Fixes: https://tracker.ceph.com/issues/52675
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b02d3b0c5aa59aa294de43f94c793f5abf71ac03)

auth,mon: don't log "unable to find a keyring" error when key is given

This error is logged even if --key or --keyring are specified and
confuses users because the command actually does its job and exits
with success.  This primarily affects "rbd mirror pool peer bootstrap
import" command and rbd-mirror and cephfs-mirror daemons which connect
to the remote cluster with just mon_host and key:

  $ rbd mirror pool peer bootstrap import mypool tokenfile
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory

Local cluster commands are affected too:

  $ rados --no-config-file --mon-host $MON_HOST --key $KEY lspools
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  device_health_metrics
  rbd

This was introduced in commit 98a2e5c59daa ("rados: translate errno to
str in CLI").

Fixes: https://tracker.ceph.com/issues/51628
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 70aa026b097d919b41b2a1221d73b326557f75e3)

qa/tasks/cephfs/test_sessionmap: reap connections immediately

We have to reap connections promptly for this test to work.

This test was broken indirectly by d51d80b3234e17690061f65dc7e1515f4244a5a3,
which moved the counter decrement to reap time instead of mark_down/stop
time.

The reaping is asynchronous, so allow for a delay in the count change.

Fixes: https://tracker.ceph.com/issues/50622
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit c8c5071dcd4b0b788f5e924a678095ce5dc1d7f8)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>

msg/async: configurable threshold for reaping dead connections

It is helpful to set this to 1 for tests.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 8129d6bb953015cc05db458afa6aa9b8f5f62614)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>

msgr/async: fix unsafe access in unregister_conn()

We were looking at anon_conns and accepting_conns without holding
the lock (deleted_lock is not sufficient).

Drop this test, and move the decrements:

- inc when we add to conns or anon_conns (no changes there)
- dec when we remove from deleted_conns (several different paths!)

Fixes: https://tracker.ceph.com/issues/49237
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit d51d80b3234e17690061f65dc7e1515f4244a5a3)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>

Merge pull request #43024 from ifed01/wip-ifed-fix-bluefs-replay-crc-oct

octopus: os/bluestore: accept undecodable multi-block bluefs transactions on log

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43133 from mgfritch/octopus-backport-43010

octopus: cephadm: add thread ident to log messages

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43186 from rhcs-dashboard/wip-52616-octopus

octopus: mgr/dashboard: Incorrect MTU mismatch warning

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #43140 from ifed01/wip-ifed-fix-migrate-oct

octopus: os/bluestore: fix bluefs migrate command

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43008 from ifed01/wip-ifed-fix-52311-oct

octopus: os/bluestore: fix using incomplete bluefs log when dumping it.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #42975 from idryomov/wip-51419-octopus

octopus: common/buffer: fix SIGABRT in rebuild_aligned_size_and_memory

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #43273 from cfsnyder/wip-52052-octopus

octopus: rgw: when deleted obj removed in versioned bucket, extra del-marker added

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #43272 from cfsnyder/wip-51330-octopus

octopus: rgw: avoid infinite loop when deleting a bucket

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #43271 from cfsnyder/wip-51012-octopus

octopus: rgw: remove quota soft threshold

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

rgw: remove quota soft threshold

Remove quota soft threshold, which causes expensive checks for sharded buckets

Fixes: 14eabd4aa7b8a2e2c0c43fe7f877ed2171277526
Signed-off-by: Zulai Wang <wangzl31@outlook.com>
(cherry picked from commit 32a39705765af0f87bec9101e5d337b797e05fea)

Conflicts:
src/common/options/rgw.yaml.in
src/rgw/rgw_quota.cc

Cherry-pick notes:
- Options are defined in src/common/options.cc in Octopus rather than src/common/options/rgw.yaml.in
- RGWQuotaCache::get_stats does not take optional_yield or DoutPrefixProvider arguments in Octopus

rgw: when deleted obj removed in versioned bucket, extra del-marker added

After initial checks are complete, this will read the OLH earlier than
previously to check the delete-marker flag and under the bug's
conditions will return -ENOENT rather than create a spurious delete
marker.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 69d7589fb1305b7d202ffd126c3c835e7cd0dda3)

Conflicts:
src/cls/rgw/cls_rgw_types.h
src/rgw/rgw_rados.cc

Cherry-pick notes:
- RGWRados::apply_olh_log does not take DoutPrefixProvider in Octopus
- change to use some namespace-qualified names in cls_rgw_types

rgw: avoid infinite loop when deleting a bucket

When deleting a bucket with an incomplete multipart upload that
had about 2000 parts uploaded, we noticed an infinite loop that
prevented s3cmd from ever deleting the bucket.
On investigation, when the bucket index was sharded (for example 128
shards), the original logic in
RGWRados::cls_bucket_list_unordered() did not calculate
the bucket shard ID correctly when the index key of a data
part was taken as the marker.

The issue does not necessarily reproduce every time; it depends
on the key of the object. To reproduce it in a 128-shard bucket,
we use 334 as the key for the incomplete multipart upload,
which is located in shard 127 (found by experiment). In this
setup, the original logic will usually compute a shard ID smaller
than 127 (since 127 is the largest one) from the marker, so
the listing cycles back, which results in an infinite loop.

PS: Sometimes the bucket shard ID calculation may incorrectly go forward
instead of backward. The check may then skip some shards
that hold regular keys. In such scenarios, some non-empty buckets may
be deleted by accident.
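
The cycle described above can be reproduced with a toy model. This is a sketch under stated assumptions, not the code in RGWRados: `list_page`, `list_all`, and the `shard_for_key` callback are hypothetical names, and the shard mapping is hard-coded instead of using the real hash.

```python
def list_page(shards, marker, shard_for_key, page_size):
    """Return up to page_size keys that follow marker, scanning shards in order."""
    start = 0 if marker is None else shard_for_key(marker)
    page = []
    for shard_id in range(start, len(shards)):
        for key in sorted(shards[shard_id]):
            # The marker only applies inside the shard we resumed in.
            if shard_id == start and marker is not None and key <= marker:
                continue
            page.append(key)
            if len(page) == page_size:
                return page
    return page


def list_all(shards, shard_for_key, page_size=2, max_pages=50):
    """Page through every shard; raise if the listing never finishes."""
    marker, keys = None, []
    for _ in range(max_pages):
        page = list_page(shards, marker, shard_for_key, page_size)
        if not page:
            return keys
        keys.extend(page)
        marker = page[-1]
    raise RuntimeError("listing did not terminate")


# Four shards; all multipart parts live in the last one (shard 3).
shards = [[], [], [], ["part.1", "part.2", "part.3"]]
```

If `shard_for_key` returns 3 for a part key, the listing resumes in the right shard and finishes. If it returns a smaller ID such as 1, every page rescans shard 3 from its beginning, returns the same keys, and produces the same marker again. A shard ID that is too large would instead skip shards entirely, which corresponds to the accidental-deletion case in the PS.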

Fixes: http://tracker.ceph.com/issues/49206
Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>
(cherry picked from commit 3cafe5774a5a453d58a3a6bed1f02d3200c4bb1d)

Conflicts:
src/rgw/rgw_rados.cc

Cherry-pick notes:
- Octopus cls_bucket_list_unordered doesn't take DoutPrefixProvider as first arg

ceph-volume: fix lvm activate --all --no-systemd

On a system without systemd, the `lvm activate --all --no-systemd`
subcommand still calls systemd.
We already allow users to activate a single OSD without systemd, so there's
no reason not to do the same with --all (because activate_all calls activate).
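
A minimal sketch of the idea, assuming the fix amounts to forwarding the flag from the bulk path to the single-OSD path. The function signatures and step strings below are illustrative and are not the actual ceph-volume code.

```python
def activate(osd_id, osd_fsid, no_systemd=False):
    """Return the steps a single-OSD activation would perform (illustrative)."""
    steps = [f"mount osd.{osd_id} ({osd_fsid})"]
    if not no_systemd:
        # Only touch systemd when the caller did not opt out.
        steps.append(f"systemctl enable --now ceph-osd@{osd_id}")
    return steps


def activate_all(osds, no_systemd=False):
    """Activate every (osd_id, osd_fsid) pair, honouring no_systemd."""
    steps = []
    for osd_id, osd_fsid in osds:
        # Forward the flag so --all behaves like a single-OSD activation.
        steps.extend(activate(osd_id, osd_fsid, no_systemd=no_systemd))
    return steps
```

Calling `activate_all([(0, "a"), (1, "b")], no_systemd=True)` then yields only the mount steps.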

Fixes: https://tracker.ceph.com/issues/25070
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8e402e112a6383555e2df31ba3321e5956f1841a)

mon: return -EINVAL when handling unknown option in 'ceph osd pool get'

Signed-off-by: Zhao Cuicui <brucen1030@163.com>
(cherry picked from commit 7ed494076e2390f8e6a386278346632d00ee718a)

cmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/

Pass the Python 3 executable when creating the ceph-volume build venv.
Fixup for 5fc657b40dc7.

Fixes: https://tracker.ceph.com/issues/52304
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 7db830598507d90d1c9e1f4468f818bebce58037)

cephadm: quay.io for non-ceph images too

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit dbc1d6303f4c2a22f5fa59218aa032fc92073906)

mgr/cephadm: Put together default container images references

Placed all in the same location to make downstream modifications
and future changes easier

Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
(cherry picked from commit ce246479443a64b292c7cff2a662161c8a598e09)

Merge pull request #43189 from rhcs-dashboard/wip-51275-octopus

octopus: mgr/dashboard: deprecated variable usage in Grafana dashboards

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>

ceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE

In /etc/sysconfig/ceph we allow operators to define whether ceph daemons
should be restarted on upgrade: CEPH_AUTO_RESTART_ON_UPGRADE.

But the post selinux scripts will stop ceph.target even if this
is set to `no`, leading operators to add various hacks to prevent
these unexpected or inconvenient daemon restarts. By now, if users
are using rpms directly, they are likely orchestrating their own
daemon restarts and so should not rely on the rpm itself to do this.
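
The intended behaviour can be sketched as below. The real change lives in the spec file's shell scriptlets; this Python version only models the decision, and the helper names, the step strings, and the assumed default of `no` when the variable is absent are all illustrative.

```python
def auto_restart_enabled(sysconfig_text):
    """Return True only if CEPH_AUTO_RESTART_ON_UPGRADE is set to yes."""
    value = "no"  # assumption: treat an absent variable as "no"
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == "CEPH_AUTO_RESTART_ON_UPGRADE":
            value = val.strip().strip("\"'")
    return value == "yes"


def selinux_post_steps(sysconfig_text):
    """List the actions a post-selinux scriptlet would take (illustrative)."""
    steps = ["relabel files"]
    if auto_restart_enabled(sysconfig_text):
        # Daemons are only stopped and started when the operator opted in.
        steps = ["stop ceph.target"] + steps + ["start ceph.target"]
    return steps
```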

Fixes: https://tracker.ceph.com/issues/21672
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 092a6e3e83e9ef8e37cb6f1033c345dcb5224cfc)

mgr/dashboard: deprecated variable usage in Grafana dashboards

Fixes: https://tracker.ceph.com/issues/50059
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit a709abf8bf5a6b25c21db100e87af3a6c2cf382d)

mgr/dashboard: Incorrect MTU mismatch warning

The MTU mismatch warning was also being fired for NICs that are in the down state. This PR fixes that.
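
As a hedged illustration of the filtering idea (the actual fix is in the dashboard's alerting rules, not in Python), the check can be modelled as follows. The `mtu_mismatches` helper and the dict layout with `device`, `mtu`, and `up` keys are assumptions made for this sketch.

```python
from collections import Counter


def mtu_mismatches(nics):
    """Return devices whose MTU differs from the most common MTU.

    Interfaces that are down are ignored entirely, so they can neither
    trigger the warning nor influence what counts as the common MTU.
    """
    up = [n for n in nics if n["up"]]
    if not up:
        return []
    common_mtu, _ = Counter(n["mtu"] for n in up).most_common(1)[0]
    return [n["device"] for n in up if n["mtu"] != common_mtu]
```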

Fixes: https://tracker.ceph.com/issues/52028
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 58d635455d1f59921d5ad821168f31b6f937588a)

Merge pull request #42533 from liewegas/use-quay-octopus

octopus: cephadm: default to quay.io, not docker.io

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>