git.apps.os.sepia.ceph.com Git

pybind/mgr/balancer: define Plan.{dump,show}()

as they are called by the commands

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 0d48b03)

Conflicts:
src/pybind/mgr/balancer/module.py

Cherry-pick notes:
- Conflicts due to missing type annotations in Octopus

Merge pull request #43967 from cfsnyder/wip-53224-octopus

octopus: qa/rgw: bump tempest version to resolve dependency issue

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43963 from cfsnyder/wip-53099-octopus

octopus: qa/rgw: Fix vault token file access.case

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43953 from guits/wip-53279-octopus

octopus: cephadm/ceph-volume: do not use lvm binary in containers

Merge pull request #44177 from cfsnyder/wip-52451-octopus

octopus: rpm, debian: move smartmontools and nvme-cli to ceph-base

Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>

Merge pull request #43947 from guits/wip-53277-octopus

octopus: ceph-volume: fix bug with miscalculation of required db/wal slot size for VGs with multiple PVs

ceph-volume: remove --root param from nsenter cmd

This is redundant and makes nsenter emit messages like the following:
```
  Failed to find sysfs mount point
  dev/block/11:0/holders/: opendir failed: Not a directory
  dev/block/252:0/holders/: opendir failed: Not a directory
  dev/block/253:0/holders/: opendir failed: Not a directory
  dev/block/252:1/holders/: opendir failed: Not a directory
  dev/block/253:1/holders/: opendir failed: Not a directory
  dev/block/252:2/holders/: opendir failed: Not a directory
  dev/block/253:2/holders/: opendir failed: Not a directory
  dev/block/252:3/holders/: opendir failed: Not a directory
  dev/block/253:3/holders/: opendir failed: Not a directory
  dev/block/252:16/holders/: opendir failed: Not a directory
  dev/block/252:32/holders/: opendir failed: Not a directory
  dev/block/252:48/holders/: opendir failed: Not a directory
  dev/block/252:64/holders/: opendir failed: Not a directory
```

Fixes: https://tracker.ceph.com/issues/52926
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit e4667e81301295f4c81328505e4376d2aef66fb2)

cephadm: mount rootfs in osd containers

See ceph-volume tracker for details [1]

[1] https://tracker.ceph.com/issues/52926

Fixes: https://tracker.ceph.com/issues/51592
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 48b369e2caf3222bf594dc09f87b5969a53dfbe7)

ceph-volume: implement lvm wrapper

ceph-volume should run pv/vg/lv commands in the host namespace rather than
running them inside the container in order to avoid lvm metadata corruption.

Fixes: https://tracker.ceph.com/issues/52926
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 4d33630deeaee51578868fb29337da802e9cb231)

Merge pull request #44210 from guits/wip-53372-octopus

octopus: ceph-volume: human_readable_size() refactor

Merge pull request #44174 from cfsnyder/wip-51149-octopus

octopus: osd: set r only if succeed in FillInVerifyExtent

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #44176 from cfsnyder/wip-51171-octopus

octopus: common/PriorityCache: low perf counters priorities for submodules.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Merge pull request #44165 from cfsnyder/wip-52710-octopus

octopus: osd: fix partial recovery become whole object recovery after restart osd

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #44097 from cfsnyder/wip-53389-octopus

octopus: osd/OSDMap.cc: clean up pg_temp for nonexistent pgs

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

Merge pull request #44254 from neha-ojha/wip-perf-octopus

octopus: qa: miscellaneous perf suite fixes

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #43962 from cfsnyder/wip-53200-octopus

octopus: osd: fix 'ceph osd stop <osd.nnn>' doesn't take effect

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43861 from pponnuvel/wip-53198-octopus

octopus: mon/MgrStatMonitor: ignore MMgrReport from non-active mgr

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43887 from ifed01/wip-ifed-more-errors-shared-blob-repair-oct

octopus: os/bluestore: fix additional errors during missed shared blob repair.

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43885 from ifed01/wip-ifed-fix-invalid-offset-repair-oct

octopus: os/bluestore: fix writing to invalid offset when repairing

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43883 from ifed01/wip-ifed-fix-53011-oct

octopus: os/bluestore: use proper prefix when removing undecodable Share Blob.

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43757 from ifed01/wip-ifed-fix-write-small-head-pad-oct

octopus: os/bluestore: _do_write_small fix head_pad

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #44172 from cfsnyder/wip-52074-octopus

octopus: rgw: user stats showing 0 value for "size_utilized" and "size_kb_utilized" fields

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #44170 from cfsnyder/wip-53133-octopus

octopus: rgw: disable prefetch in rgw_file to fix 3x read amplification

Reviewed-by: Casey Bodley <cbodley@redhat.com>

qa/suites/rados/perf/ceph.yaml: remove rgw

This is no longer required because we removed cosbench workloads in
fd350fd0150a2d4072f055658c20314a435a19ba. This is also required to prevent
failures like the following or any other changes that break the rgw task:

```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
    vars.append(enter())
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
    wait_for_radosgw(url, remote)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
    assert exit_status == 0
AssertionError
```

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 119544bb29e253322af33e593ffd09e325c2af8a)

qa: remove cosbench workloads from perf suites

Due to https://tracker.ceph.com/issues/49139

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit fd350fd0150a2d4072f055658c20314a435a19ba)

qa: use ubuntu_latest for perf suites

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 5957d1797a4f67b4545c2554dff240463af87359)

Merge pull request #44227 from ceph/octopus-mistune

doc: Use older mistune

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #43863 from ivancich/wip-multipart-purge-fix-octopus

octopus: rgw: fix bucket purge incomplete multipart uploads

Reviewed-by: Adam Emerson <aemerson@redhat.com>

Merge pull request #43961 from cfsnyder/wip-53272-octopus

octopus: rgw/beast: optimizations for request timeout

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43810 from cbodley/wip-qa-rgw-java-octopus

qa/rgw: octopus branch targets ceph-octopus branch of java_s3tests

Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Ali Maredia <amaredia@redhat.com>

Merge pull request #43696 from cfsnyder/wip-52959-octopus

octopus: rgw/rgw_rados: make RGW request IDs non-deterministic

Reviewed-by: Casey Bodley <cbodley@redhat.com>

doc: Use older mistune

https://github.com/miyakogi/m2r/issues/66

Signed-off-by: David Galloway <dgallowa@redhat.com>
(cherry picked from commit ed2ad24a4ba3ad3f8103926bfea2466b9eb61222)

ceph-volume: human_readable_size() refactor

This commit refactors the `human_readable_size()` function.

The current implementation has a couple of issues:

In a 'human readable' mindset, I would expect `human_readable_size(1024)` to
return '1.00 KB' instead of '1024.00 B'.

```
In [1]: from ceph_volume.util.disk import human_readable_size

In [2]: human_readable_size(1024)
Out[2]: '1024.00 B'

In [3]: human_readable_size(1024*1024)
Out[3]: '1024.00 KB'

```

Also, it doesn't support the PB unit:

```
In [4]: human_readable_size(1024*1024*1024*1024*1024)
Out[4]: '1024.00 TB'

In [5]: human_readable_size(1024*1024*1024*1024*1024*1024)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-31-0859861661dc> in <module>
----> 1 human_readable_size(1024*1024*1024*1024*1024*1024)

~/GIT/ceph/src/ceph-volume/ceph_volume/util/disk.py in human_readable_size(size)
    640     return "{size:.2f} {suffix}".format(
    641         size=size,
--> 642         suffix=suffixes[suffix_index])
    643
    644

IndexError: list index out of range
```

This commit fixes this.
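
The behaviour described above can be illustrated with a minimal sketch. This is not the upstream implementation: the name `human_readable_size_sketch` and the exact loop are assumptions, and only the expected results ('1.00 KB' for 1024 bytes, PB support, no IndexError) come from the commit message.

```python
def human_readable_size_sketch(size):
    """Format a byte count with a binary-scaled suffix (illustrative only)."""
    suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    value = float(size)
    index = 0
    # Divide while the value still fits a larger unit; stop at the last
    # suffix so very large inputs never index past the end of the list.
    while value >= 1024 and index < len(suffixes) - 1:
        value /= 1024
        index += 1
    return "{:.2f} {}".format(value, suffixes[index])


one_kb = human_readable_size_sketch(1024)
one_pb = human_readable_size_sketch(1024 ** 5)
beyond_pb = human_readable_size_sketch(1024 ** 6)
```

Capping the index at the last suffix means sizes beyond PB are reported as a large PB value instead of raising.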

Fixes: https://tracker.ceph.com/issues/48492
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 6940856f233f4d365a119eed90ff88fd918f6916)

rpm, debian: move smartmontools and nvme-cli to ceph-base

We wish to be able to scrape SMART and NVMe metrics from OSD and MON
nodes. For this we require / recommend smartmontools and nvme-cli
dependencies for both the ceph-osd and ceph-mon packages. However, the
sudoers file (which is required for invoking `smartctl` by user 'ceph')
was installed only in the ceph-osd package. Since different packages
cannot own the same file, and because we want to be able to scrape from
every daemon, we move the dependencies and the sudoers installation to
ceph-base. For generalization, we rename:
sudoers.d/ceph-osd-smartctl -> sudoers.d/ceph-smartctl

Fixes: https://tracker.ceph.com/issues/50657
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 7ca39fa92b47427af2f1c6000c653bb4dffc47fe)

Conflicts:
ceph.spec.in
debian/rules

Cherry-pick notes:
- conflict due to octopus not having jaeger dep
- conflict due to octopus not installing rbd-nbd_quiesce on debian

common/PriorityCache: low perf counters priorities for submodules.

Having too many perf counters with nickname priorities >= PRIO_INTERESTING clutters the daemonperf output and causes the "osd" section to be omitted, presumably because there are too many columns.

Fixes: https://tracker.ceph.com/issues/51002
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 35238d41360a22e22fae7d8ceddf3a2a047e5464)

osd: set r only if succeed in FillInVerifyExtent

When a read fails, the return value can be mistaken for the data length in FillInVerifyExtent, which should be avoided.
This may cause errors in crc repair or read retry because of the wrong data length. In my case, we use FillInVerifyExtent for EC reads;
when we hit -EIO, we try crc repair, which needs to read data from other shards according to the data length.
I then hit an assert in ECBackend.cc (line 2288: ceph_assert(range.first != range.second)), although the master branch does not seem to support EC crc repair.
In short, reusing the read op may cause unpredictable errors.

Fixes: https://tracker.ceph.com/issues/51115
Signed-off-by: yanqiang-ux <yanqiang_ux@163.com>
(cherry picked from commit 127745161fbcdee06b2dfa8464270c3934bcd06a)

rgw: user stats showing 0 value for "size_utilized" and "size_kb_utilized" fields

When accumulating user stats, the "utilized" fields are not looked
at. Updates RGWStorageStats::dump so it only outputs the "utilized"
data if they're updated.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 248fbce6b54f6c91e63b05861f8631ca64c8df81)

Conflicts:
src/rgw/rgw_admin.cc

Cherry-pick notes:
- Conflicts due to change of interface for rgw user and rados store

rgw: disable prefetch in rgw_file

Each call to rgw_read (rgw_file.cc) invokes three calls to RGWRados::get_obj_state with s->prefetch_data=true. It results in great read amplification. If length argument in rgw_read call is smaller than rgw_max_chunk_size, then the amplification is threefold.

Signed-off-by: Kajetan Janiak <kjaniak@cloudferro.com>
(cherry picked from commit f915e21e5a1baf6030c1407b3058d4f58c638df9)

Conflicts:
src/rgw/rgw_op.cc

Cherry-pick notes:
- Octopus sets prefetch data flag through Rados method vs. method on object

osd: fix partial recovery become whole object recovery after restart osd

support SERVER_OCTOPUS feature for pg_missing_item::encode()

Fixes: https://tracker.ceph.com/issues/52583
Signed-off-by: Jianwei Zhang <jianwei1216@qq.com>
(cherry picked from commit dcdb188b6f577551fb377ba34145419f81322b03)

Merge pull request #42958 from ifed01/wip-ifed-fix-huge-omap-rename-oct

octopus: os/bluestore: cap omap naming scheme upgrade transaction.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

osd/OSDMap.cc: clean up pg_temp for nonexistent pgs

Fixes an issue where the OSDMap does not clear pg-temp entries for PGs that no longer exist.

Fixes: https://tracker.ceph.com/issues/53308
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 86367ea008281cf4398073466f3ece5ea18e82af)

Merge pull request #43959 from guits/wip-53283-octopus

octopus: ceph-volume: `get_first_lv()` refactor

os/bluestore: Fix omap upgrade to per-pg scheme

This fixes a regression introduced by the omap upgrade fix: https://github.com/ceph/ceph/pull/43687
The problem was that we always skipped the first omap entry.
This worked fine for objects that have an omap header key.
For objects without a header key, we skipped the first actual omap key.

Fixes: https://tracker.ceph.com/issues/53260
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 65a3f374aa1c57c5bb9401e57dab98a643b4360a)

ceph-volume: `get_first_*()` refactor

As indicated by commit 17957d9beb42a04b8f180ccb7ba07d43179a41d3 those
functions were meant to avoid writing something like the following:

```
lvs = get_lvs()
if len(lvs) >= 1:
    lv = lvs[0]
```

Those functions should return `None` if 0 or more than 1 item is returned.
The current names of these functions are confusing and can suggest that
they simply return the first item even when more than 1 item is found.
Let's rename them to `get_single_pv()`, `get_single_vg()` and
`get_single_lv()`.
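
The "exactly one or `None`" contract described here can be sketched as follows. The helper name `get_single` and its list argument are hypothetical, chosen only for illustration; the real `get_single_pv()`, `get_single_vg()` and `get_single_lv()` query LVM and may differ in detail.

```python
def get_single(items):
    """Return the only element of items, or None for zero or several."""
    # Hypothetical helper: a match is only meaningful when it is unambiguous.
    if len(items) == 1:
        return items[0]
    return None


no_match = get_single([])
one_match = get_single(['osd-block-lv'])
ambiguous = get_single(['lv-a', 'lv-b'])
```

Returning `None` for an ambiguous result forces the caller to handle that case explicitly instead of silently acting on whichever item happened to be listed first.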

Closes: https://tracker.ceph.com/issues/49643
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a5e4216b49704783c55fb83b3ae6dde35b0082ad)

Merge pull request #43952 from guits/wip-52597-octopus

octopus: ceph-volume: util/prepare fix osd_id_available()

Merge pull request #43950 from guits/wip-53189-octopus

octopus: ceph-volume: fix a typo causing AttributeError

ceph-volume/tests: update setup_mixed_type playbook

we need to create a file with a larger size.
see https://github.com/ceph/ceph/pull/43300#issuecomment-951961243

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 8af00e25aa4ab60d0309e31f6c20edd6cd5be1ee)

qa/rgw: bump tempest version to resolve dependency issue

Fixes: https://tracker.ceph.com/issues/53095
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 0bb60469d1c8439eaabd1b89dc494e49e7863b33)

Fix vault token file access.

Put the vault token file in a location that ceph can read.
Make it readable only by ceph.

On rhel8 (and indeed, any vanilla rhel machine), $HOME is liable to be
mode 700. This means the ceph user can't read things in that user's
directory. This causes radosgw to emit the confusing message "ERROR:
Vault token file ... not found" even though the teuthology log will
plainly show it was created and made readable by ceph.

Fixes: http://tracker.ceph.com/issues/51539
Signed-off-by: Marcus Watts <mwatts@redhat.com>
(cherry picked from commit 454cc8a18c4c3851de5976d3e36e42644dbb1a70)

Conflicts:
qa/tasks/rgw.py

Cherry-pick notes:
- Conflict due to ctx.rgw.vault_role not set in Octopus test

osd: fix 'ceph osd stop <osd.nnn>' doesn't take effect

when the osd state is in the non-active state, the osd daemon can be stopped.

Fixes: https://tracker.ceph.com/issues/53039
Signed-off-by: tan changzhi <544463199@qq.com>
(cherry picked from commit d595c95ef6c3dc34b8389ff4270639ff1550d269)

rgw/beast: stream timer with duration 0 disables timeout

fixes all S3 operations failing with:
`2021-11-15T15:46:05.992+0000 7ffee17fa700 20 failed to read header: Bad file descriptor`
when `--rgw_frontends="beast port=8000 request_timeout_ms=0"`

Signed-off-by: Mark Kogan <mkogan@redhat.com>

rgw/beast: reference count Connections for timeout_handler

resolves a use-after-free in the timeout_handler, where a timeout fires
and schedules the timeout_handler for execution, but the coroutine exits
and destroys the socket before asio executes the timeout_handler

timeout_handler now holds a reference on the Connection to extend its
lifetime

now that the Connection is allocated on the heap, we can include the
parse_buffer in this memory instead of allocating it separately

Signed-off-by: Casey Bodley <cbodley@redhat.com>

rgw/beast: replace beast::tcp_stream with manual timeouts

remove the beast::tcp_stream wrapper from the socket, and track timeouts
manually with a timeout_timer. this timer uses ceph's coarse_mono_clock
which is cheaper to sample than std::chrono::steady_clock

Signed-off-by: Casey Bodley <cbodley@redhat.com>

rgw/beast: use explicit executor type for tcp socket and stream

Signed-off-by: Casey Bodley <cbodley@redhat.com>

spawn: use explicit strand executor

the default spawn::yield_context uses the polymorphic boost::asio::executor
to support any executor type

rgw's beast frontend always uses the same executor type for these
coroutines, so we can use that type directly to avoid the overhead of
type erasure and virtual function calls

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 9d9258e06b78bb47fd0156d9bd7bb00b52a726b0)

Conflicts:
src/common/async/yield_context.h
src/rgw/rgw_d3n_cacherequest.h
src/rgw/rgw_notify.cc
src/rgw/rgw_sync_checkpoint.cc

Cherry-pick notes:
- src/rgw/rgw_d3n_cacherequest.h doesn't exist in Octopus
- src/rgw/rgw_sync_checkpoint.cc doesn't exist in Octopus
- conflicts due to rename of structs after Octopus
- conflicts due to macro for conditional inclusion of beast context in Octopus

rgw: clean up WITH_RADOSGW_BEAST_OPENSSL

the #ifdef was covering more includes than it should have

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 44f4b083dea933f62cd76279593e3f1b3cd21f77)

ceph-volume: util/prepare fix osd_id_available()

The current check only allows requesting an OSD id that exists but is
marked as 'destroyed'.
With this small fix, we can now use `--osd-id` with an id that doesn't
exist at all.
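
A rough sketch of the relaxed check, under stated assumptions: the function name `osd_id_available_sketch`, the `nodes` argument, and the `id`/`status` dictionary keys are all illustrative and do not reproduce the actual ceph-volume code, which obtains the OSD tree from the cluster.

```python
def osd_id_available_sketch(osd_id, nodes):
    """Decide whether osd_id may be requested (illustrative only).

    nodes is an assumed list of dicts, each with 'id' and 'status' keys.
    """
    if osd_id is None:
        return False
    for node in nodes:
        if str(node.get('id')) == str(osd_id):
            # The id exists: it is reusable only if marked 'destroyed'.
            return node.get('status') == 'destroyed'
    # The id does not exist at all: after the fix this is also acceptable.
    return True


tree = [{'id': 0, 'status': 'up'}, {'id': 1, 'status': 'destroyed'}]
in_use = osd_id_available_sketch(0, tree)
destroyed = osd_id_available_sketch(1, tree)
unknown = osd_id_available_sketch(7, tree)
```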

Fixes: https://tracker.ceph.com/issues/50880
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 73bfa5d2b0157f92721d8bf36619fd35ee265cdd)

ceph-volume: fix a typo causing AttributeError

Signed-off-by: Taha Jahangir <mtjahangir@gmail.com>
(cherry picked from commit 4cdbba3344fe26b6351e88ce00a8655890a02115)

ceph-volume: fix bug with miscalculation of required db/wal slot size for VGs with multiple PVs

Previous logic for calculating db/wal slot sizes made the assumption that there would only be
a single PV backing each db/wal VG. This wasn't the case for OSDs deployed prior to v15.2.8,
since ceph-volume previously deployed multiple SSDs in the same VG. This fix removes the
assumption and does the correct calculation in either case.

Fixes: https://tracker.ceph.com/issues/52730
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit cd6aa1329f70f89338757ba295e279ecfdbc2d07)

rgw: fix bucket purge incomplete multipart uploads

The marker was not working correctly as segments of the bucket index
were listed to shut down any incomplete multipart uploads. This fixes
the marker, so it's maintained properly across iterations.
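
The general pattern of carrying a marker across listing iterations can be sketched as below. This is a generic illustration, not the RGW code: `make_lister`, `list_all`, and the in-memory key list are hypothetical stand-ins for a segmented bucket index listing.

```python
def make_lister(keys):
    """Build a fake paged lister over an in-memory, sorted key list."""
    ordered = sorted(keys)

    def list_page(marker, max_entries):
        # Return entries strictly after the marker, plus a truncation flag.
        after = [k for k in ordered if k > marker]
        return after[:max_entries], len(after) > max_entries

    return list_page


def list_all(list_page, max_entries=2):
    """Walk every page, advancing the marker to the last entry seen."""
    marker = ""
    results = []
    while True:
        entries, truncated = list_page(marker, max_entries)
        results.extend(entries)
        if not truncated or not entries:
            break
        # Maintaining the marker here is what keeps each iteration from
        # re-listing (or skipping) part of the index.
        marker = entries[-1]
    return results


uploads = list_all(make_lister(['a', 'b', 'c', 'd', 'e']))
```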

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 9f2c2d901dff0acc16f80cb6ad32bb8c39c9ac6e)

Conflicts:
src/rgw/rgw_multi.cc
src/rgw/rgw_multi.h
- dpp changes

os/bluestore: fix additional errors during missed shared blob repair.

Fixes: https://tracker.ceph.com/issues/51762
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit d92ebd3ebea9a153c22a711bb2aae0ce17f5b304)

test/objectstore/store_test: reveal incomplete fix for missed shared
blob repair.

Related-to: https://tracker.ceph.com/issues/51762
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 9893710328cef0942c86676e2dee72ee1fbecffd)

os/bluestore: fix improper offset calculation when repairing.

While repairing misreferenced blobs, BlueStore could improperly calculate
an offset within the blob being fixed. This could happen when a single
physical extent has been replaced by multiple ones: the following
pextent (if any in the current blob) would be treated with an improper
offset within the blob. The offset calculation accounted only for the last
of the new pextents rather than for each of them.

Fixes: https://tracker.ceph.com/issues/51682
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit ca4b6675fc3fd2f4cadad58044c97c5bb23d5938)

test/objectstore/bluestore_types: add map_bl test case

Along with the basic bluestore_blob_t::map_any functionality
verification this UT shows how invalid offset might appear in
https://tracker.ceph.com/issues/51682

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 81f8e063c6f15d7763f51247babe2db7bf4c2aae)

os/bluestore: use proper prefix when removing undecodable Share Blob.

Fixes: https://tracker.ceph.com/issues/53011
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit aaa0a172080a5a9ecba76be364af9a5277bc2187)

Merge pull request #43747 from mfoliveira/wip-53100-octopus

octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

os/bluestore: fix invalid omap name conversion when upgrading to per-pg.

Fixes: https://tracker.ceph.com/issues/53062
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit cbc97018d883333f81ab9a3cfa99d2f68a9874cd)
(cherry picked from commit dc0a7e49434f76d97016934feed9a8ec806d1e42)

Merge pull request #43658 from badone/wip-octopus-ceph-ansible-systemd-bug

octopus: qa/ceph-ansible: Bump OS version for centos

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_bytes

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly fragmented system, iterating over the whole tree could significantly
hurt the performance of a fast storage system.

the idea comes from openzfs's metaslab allocator.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5a26875049d13130ffe5954428da0e1b9750359f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/common/options/global.yaml.in:
- Moved new option into src/common/options.cc

os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_count

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly fragmented system, iterating over the whole tree could significantly
hurt the performance of a fast storage system.

the idea comes from openzfs's metaslab allocator.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 40f05b971f5a8064cf9819f80fc3bbf21d5206da)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/common/options/global.yaml.in
- Moved new option into src/common/options.cc

os/bluestore/AvlAllocator: use cbit for counting the order of alignment

no need to calculate the alignment first; cbits() suffices, as it
counts the first set bit and the following 0's in a number. the result
is identical to cbits(alignment of that number).

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 573cbb796e8ba2f433caa308925735101a8161a6)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore/AvlAllocator: use delegated ctor

less repeating this way

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9b52ba1dd0a5e199833d7ab2561a7b388d85afc1)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Conflicts:
src/os/bluestore/AvlAllocator.cc
- Replace `std::string_view name` w/ `const std::string& name`.

os/bluestore/AvlAllocator: specialize _block_picker()

before this change AvlAllocator::_block_picker() is used by both the
best-fit mode and first-fit mode. but since we cannot achieve the
locality by searching in the area pointed to by the cursor in best-fit mode,
we just pass a dummy cursor to AvlAllocator::_block_picker() when
searching in the best-fit mode.

but since the range_size_tree is already sorted by the size of ranges,
if _block_picker() fails to find one by the size, we should just give
up right away, and instead try again using a smaller size.

after this change, instead of sharing AvlAllocator::_block_picker()
across both the first-fit mode and the best-fit mode, this method
is specialized into two different variants: one for first-fit, and the
simplified one for best-fit.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4837166f9e7a659742d4184f021ad12260247888)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore: Improve _block_picker function

Make _block_picker function scan (*cursor, end) + (begin, *cursor) instead of (*cursor, end) + (begin, end).
The second run over range (*cursor, end) could never yield any results.
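
The revised scan order can be sketched in simplified form. The real `_block_picker()` operates on an AVL tree keyed by offset and also handles alignment; the list-of-tuples representation and the name `pick_block` below are assumptions made for illustration.

```python
def pick_block(ranges, cursor, size):
    """Find a free range of at least size, scanning from cursor and wrapping.

    ranges is an assumed offset-sorted list of (start, length) tuples and
    cursor is an index into it.
    """
    # Scan [cursor, end) first, then [begin, cursor). Re-scanning
    # [cursor, end) a second time could never find anything new.
    order = list(range(cursor, len(ranges))) + list(range(0, cursor))
    for index in order:
        start, length = ranges[index]
        if length >= size:
            return index, start
    return None


free = [(0, 4), (10, 16), (40, 8)]
wrapped = pick_block(free, 2, 16)
direct = pick_block(free, 1, 8)
missing = pick_block(free, 0, 32)
```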

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit c732060d3e3ef96c6da06c9dde3ed8c064a50965)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

os/bluestore: do not call _block_picker() again if already searched from start()

Fixes: https://tracker.ceph.com/issues/48272
Signed-off-by: Xue Yantao <jhonxue@tencent.com>
(cherry picked from commit fd5ca26e4a23d6c8992ab5927ce85ade958e251f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

octopus: qa/ceph-ansible: Bump OS version for centos

The systemd version in the 8.3 image is buggy so use 8.4 instead.

Fixes: https://tracker.ceph.com/issues/52923
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

mon/MgrStatMonitor: ignore MMgrReport from non-active mgr

If it's not the active mgr, we should ignore it.

Since the mgr instance is best identified by the gid, add that to the
message. (We can't use the source_addrs for the message since that is
the MgrStandby monc addr, not the active mgr addrs in the MgrMap.)

This fixes a problem where a just-demoted mgr report gets processed and a
new mgr gets a ServiceMap with an epoch >= its pending map. (At least,
that is my theory!)

Fixes: https://tracker.ceph.com/issues/48022
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 4d447092c3542bf57dfb4942db766adf2923c069)

Conflicts:
src/messages/MMonMgrReport.h
src/mon/MgrStatMonitor.cc

mgr: tell monc when we get new servicemap, fsmap

Otherwise, when we re-subscribe we'll request an old map again. In the
case of the servicemap, that can lead to a failed assertion.

Fixes: https://tracker.ceph.com/issues/48022
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 3dbc1f0578d944217ad2bacb58ef561e678abb6c)

os/bluestore: cap omap naming scheme upgrade transaction.

We shouldn't use a single per-onode transaction for such an upgrade when the onode's omap list is huge. This results in similarly sized WAL/SST files, which is inefficient, might cause high memory usage, and is sometimes error-prone.

Fixes: https://tracker.ceph.com/issues/49170
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit e897fa243c1dd38329733b452872616023f14ac8)

Conflicts (caused by lack of per-pg omap scheme):
src/os/bluestore/BlueStore.cc
src/os/bluestore/BlueStore.h
src/os/bluestore/bluestore_types.h

qa/rgw: octopus branch targets ceph-octopus branch of java_s3tests

this commit is applied directly to the octopus branch instead of a
backport from master, because it targets the ceph-octopus branch instead
of the ceph-master branch on master

Signed-off-by: Casey Bodley <cbodley@redhat.com>

rgw/rgw_rados: make RGW request IDs non-deterministic

Use a random number vs. incremental counter for first component of request ID.

Fixes: https://tracker.ceph.com/issues/52818
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit bce34dd68634d241b451111dcf2e931837eb4bfd)

Merge pull request #43758 from tchaikov/octopus-pr-43748

octopus: admin/doc-requirements.txt: pin Sphinx at 3.5.4

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>

test/rgw: use spawn library for test_rgw_dmclock_scheduler

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a8e3589a2c875b6fadc853c75f20cb9256f294ca)

Conflicts:
src/test/rgw/CMakeLists.txt
src/test/rgw/test_rgw_dmclock_scheduler.cc: trivial resolution

test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler

the AsyncScheduler uses an asio timer to dispatch work to its executor
with an optional delay. when no delay is requested, it waits on the
timer with an expiration time in the past (crimson::dmclock::TimeZero)

tests are failing here because poll() is returning without executing the
handlers of those expired timers

asio implements these timers with timerfd and epoll. debugging with
strace, i see that these timers armed with timerfd_settime() are not
always immediately ready according to epoll_wait():

  eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = 3
  epoll_create1(EPOLL_CLOEXEC)            = 4
  timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
  epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=14164052, u64=14164052}}) = 0
  epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=14164064, u64=14164064}}) = 0
  timerfd_settime(5, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}}) = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164052, u64=14164052}}], 128, 0) = 1
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164064, u64=14164064}}], 128, 0) = 1

in this example, it took 6 calls to context.poll() before it was ready
to execute the timer's handler

to work around this, replace calls to context.poll() with calls to
context.run_for() with a very short duration

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 21baed999e31c5e69c75f0cbb8757ef91585d917)

include/rados/librados.h: avoid redefinition of rados_object_list_item

doxygen complains when it sees rados_object_list_item defined twice,
so let's fix it.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4c53a02ac56e4e87631262fffc711e2e009561d7)

mgr/dashboard: pin a version for autopep8 and pyfakefs

Fixes: https://tracker.ceph.com/issues/53024
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 946dab4f608ec47e0a3cfefdf8e7d1afda69117f)

Conflicts:
src/pybind/mgr/dashboard/requirements-lint.txt: trivial
resolution

admin/doc-requirements.txt: pin Sphinx at 3.5.4

* pin Sphinx at 3.5.4
* pin docutils at 0.18

at least the combination of these two versions
is known to compile.

this addresses the bug reported at
https://sourceforge.net/p/docutils/bugs/431/

the backtrace looks like:

/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/sphinx/util/docutils.py:285: RemovedInSphinx30Warning: function based directive support is now deprecated. Use class based directive instead.
  warnings.warn('function based directive support is now deprecated. '

Exception occurred:
  File "/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/docutils/writers/html5_polyglot/__init__.py", line 445, in section_title_tags
    if (ids and self.settings.section_self_link
AttributeError: 'Values' object has no attribute 'section_self_link'

please note this change is not cherry-picked from
master, because master already bumped Sphinx to 3.5.4
in 4968baa2523bd2a5ca6be147b26bc28906a864c9.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 6362b554e2e57227c9fda51c7703f14674da8358)

Conflicts:
admin/doc-requirements.txt: trivial resolution

admin/doc-requirements.txt: require breathe >= 4.20

to be compatible with Sphinx 3.2
see https://github.com/michaeljones/breathe/tree/v4.20.0#change-log

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 2f1f16ed865114faedc3bdb3049da722284fab61)

doc/scripts/gen_state_diagram.py: Fix literal comparison syntax warnings

In Python 3.8, comparing against string literals using 'is' and 'is not'
raises a SyntaxWarning. Use the equality operators instead.
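
As a small illustration of why equality is the right test: 'is' checks object identity, while '==' checks the string's value. The helper name same_text and the sample strings are hypothetical and are not taken from gen_state_diagram.py.

```python
def same_text(a, b):
    """Compare two strings by value, which is what '==' does."""
    return a == b


# Built at runtime, so it need not be the same object as the literal,
# even though the text is identical.
state = "".join(["ac", "tive"])

if __name__ == "__main__":
    print(same_text(state, "active"))
```

Whether two equal strings happen to be the same object depends on interpreter internals such as interning, which is why identity checks against literals are flagged.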

Signed-off-by: Varsha Rao <varao@redhat.com>
(cherry picked from commit 61e7bcded852e90e6249ab0f3c37ec2688537c83)

doc/conf.py: define CEPH_RADOS_API for breathe

otherwise we could have the following errors:

Invalid C declaration: Expected identifier in nested name, got keyword: int [error at 18]
CEPH_RADOS_API int rados_aio_append (rados_ioctx_t io, const char *oid, rados_completion_t completion, const char *buf, size_t len)
------------------^
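
A hedged sketch of two ways a Sphinx conf.py can deal with such a macro. Both settings exist (c_id_attributes in Sphinx 3.0 and later, breathe_doxygen_config_options in Breathe), but which one this commit uses is not stated here, so the values below are assumptions.

```python
# Hypothetical conf.py excerpt, not the actual Ceph configuration.

# Option 1: tell the Sphinx C domain parser to accept the macro as an
# attribute in front of a declaration.
c_id_attributes = ["CEPH_RADOS_API"]

# Option 2: when Breathe runs Doxygen itself, predefine the macro as empty
# so it is expanded away before the declaration reaches the parser.
breathe_doxygen_config_options = {
    "MACRO_EXPANSION": "YES",
    "EXPAND_ONLY_PREDEF": "YES",
    "PREDEFINED": "CEPH_RADOS_API=",
}
```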

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit cf357e17ab5cbd9af4f67c51117b5f368cb9913f)

Merge pull request #43441 from rhcs-dashboard/wip-52836-octopus

octopus: qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

os/bluestore: _do_write_small fix head_pad

Signed-off-by: dheart <dheart_joe@163.com>
(cherry picked from commit ed8dd300a88173b1e5efafb6bb061a15ea296c29)

Merge pull request #43381 from nkshirsagar/octopus

octopus: rgw: clear buckets before calling list_buckets()

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Prashant D <pdhange@redhat.com>
Reviewed-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>

Merge pull request #43325 from gerald-yang/octopus-50482

octopus: msgr/async: fix unsafe access in unregister_conn()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Merge pull request #43310 from gerald-yang/octopus-51199

octopus: msg/async: allow connection reaping to be tuned; fix cephfs test

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #43557 from badone/wip-octopus-ceph-ansible-max-usable-version

octopus: qa/ceph-ansible: Pin to last compatible stable release

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #43607 from rhcs-dashboard/wip-52986-octopus

octopus: mgr/dashboard/api: set a UTF-8 locale when running pip

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

mgr/dashboard/api: set a UTF-8 locale when running pip

ansible-core started to include files whose names contain non-ASCII
characters, so we have to use a more capable encoding for the locale
in order to install this package. otherwise we'd have the following
error:

Collecting ansible-core<2.12,>=2.11.3
  Using cached ansible-core-2.11.4.tar.gz (6.8 MB)
ERROR: Exception:

Traceback (most recent call last):
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
...
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 137-140: ordinal not in range(256)
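
The failure mode can be reproduced in isolation. This is a minimal sketch: the file name is hypothetical, and the locale name "C.UTF-8" passed via LC_ALL is an assumption rather than the value the commit sets.

```python
import os


def encodable(name, encoding):
    """Return True if `name` can be encoded with `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def utf8_env(base=None, locale_name="C.UTF-8"):
    """Copy an environment and force a UTF-8 locale (locale_name is an assumption)."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = locale_name
    return env


# Hypothetical file name containing characters outside Latin-1.
name = "docs/\u6d4b\u8bd5.txt"

if __name__ == "__main__":
    print(encodable(name, "latin-1"), encodable(name, "utf-8"))
```

An environment built this way could be passed to a subprocess that runs pip, so that file names are encoded as UTF-8 when the archive is unpacked.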

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 05e4145856bb5ed19ecc879f2e50b5a88cb2045e)

15.2.15

octopus: qa/ceph-ansible: Pin to last compatible stable release

https://github.com/ceph/ceph-ansible/pull/6892 introduces a breaking
change, so pin the ceph-ansible version to stable-6.0 and ansible to 2.9.

Fixes: https://tracker.ceph.com/issues/52943
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>