ceph-volume unit tests shouldn't actually create contents on the
filesystem they run from (even though they are written in a tmp
dir); let's use pyfakefs.
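A minimal sketch of what such a test can look like, assuming pyfakefs's pytest plugin (which injects the `fs` fixture):
```
# Illustrative test using pyfakefs's pytest plugin: the `fs` fixture
# swaps in an in-memory filesystem, so nothing touches the real disk.
import os


def test_config_is_written_to_fake_fs(fs):
    # created in the fake filesystem only
    fs.create_file('/etc/ceph/ceph.conf', contents='[global]\n')
    assert os.path.isfile('/etc/ceph/ceph.conf')
    # once the test returns, the fake filesystem is discarded; the
    # real one was never modified
```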
ceph-volume shouldn't report devices `/dev/sdy` and `/dev/sdz`; they will never be
available in such a scenario, and on a host with a bunch of mpath devices
they pollute the inventory output.
When the class `Device` is instantiated with a path instead of a
block device, it fails as follows.
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 130, in __init__
    self._parse()
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 233, in _parse
    self.ceph_device = disk.has_bluestore_label(self.path)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/disk.py", line 906, in has_bluestore_label
    with open(device_path, "rb") as fd:
IsADirectoryError: [Errno 21] Is a directory: '/var/lib/ceph/osd/ceph-0/'
```
Passing a path instead of a block device is valid; `simple scan` needs it.
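One possible shape for the guard, as a hedged sketch (the 22-byte signature check mirrors what `disk.py` reads; treating non-block, non-regular paths as unlabeled is the assumption here):
```
import os
import stat


def has_bluestore_label(device_path):
    # a directory such as /var/lib/ceph/osd/ceph-0/ can't carry a
    # bluestore signature; don't even try to open() it
    mode = os.stat(device_path).st_mode
    if not (stat.S_ISBLK(mode) or stat.S_ISREG(mode)):
        return False
    with open(device_path, 'rb') as fd:
        # the bluestore on-disk signature is 22 bytes long
        return fd.read(22) == b'bluestore block device'
```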
Zack Cerza [Tue, 21 Jun 2022 17:28:30 +0000 (11:28 -0600)]
ceph-volume: Rename env var; add warning
So that we can have a nice big warning that fires once per invocation, I
think using a callable class with a class attribute seems like a decent
approach. A closure could work too.
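A sketch of that approach; the env var name and log message follow the description above and may differ in detail from the actual change:
```
import logging
import os

logger = logging.getLogger(__name__)


class AllowLoopDevices:
    # class attribute: shared across calls, so the warning fires at
    # most once per ceph-volume invocation
    warned = False

    def __call__(self):
        allow = os.environ.get('CEPH_VOLUME_ALLOW_LOOP_DEVICES', 'false')
        if allow.lower() not in ('1', 'true', 'yes'):
            return False
        if not AllowLoopDevices.warned:
            logger.warning('loop devices will be consumed: this is '
                           'unsupported and intended for testing only')
            AllowLoopDevices.warned = True
        return True


allow_loop_devices = AllowLoopDevices()
```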
Zack Cerza [Tue, 17 May 2022 17:29:02 +0000 (11:29 -0600)]
ceph-volume: Optionally consume loop devices
A similar proposal was rejected in #24765; I understand the logic
behind the rejection, but this will allow us to run Ceph clusters on
machines that lack disk resources, for testing purposes. We just need to
make it impossible to enable accidentally, and make it clear it is
unsupported.
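As a hedged illustration of the gating (names are made up for the sketch), device listing might simply hide loop devices unless explicitly opted in:
```
import os


def list_block_devices(allow_loop_devices=False):
    devices = sorted(os.listdir('/sys/block'))
    if not allow_loop_devices:
        # loop devices stay invisible unless explicitly opted in
        devices = [d for d in devices if not d.startswith('loop')]
    return ['/dev/{}'.format(d) for d in devices]
```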
ceph-volume: do not call get_device_vgs() per devices
let's call `ceph_volume.api.lvm.get_all_devices_vgs` only once instead,
so we avoid a bunch of subprocess calls that slow down the
`ceph-volume inventory` command.
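Roughly, the change has this shape; matching on `pv_name` is an assumption for the sketch, the point being the single call hoisted out of the per-device loop:
```
from ceph_volume.api import lvm


def build_inventory(device_paths):
    # one subprocess call for the whole host, instead of one per device
    all_vgs = lvm.get_all_devices_vgs()
    inventory = {}
    for path in device_paths:
        # match this device against the cached VG list (illustrative)
        inventory[path] = [vg for vg in all_vgs
                           if getattr(vg, 'pv_name', None) == path]
    return inventory
```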
ceph-volume: fix fast device alloc size on multiple devices
The size computed by get_physical_fast_allocs() was wrong when the
function had multiple devices to handle.
For instance, take 4 OSDs and 2 fast devices of 10G each, while
allocating 2 slots per fast device. The behavior before was that each
slot would be 2.5G, meaning that both fast devices would be half full.
The behavior now is that each slot takes 5G, so that both fast devices
are full.
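Spelling the arithmetic out with the numbers from the example above:
```
osds = 4                 # OSDs sharing the fast devices
fast_devices = 2         # e.g. two 10G devices
slots_per_device = 2     # slots allocated per fast device
device_size_gb = 10

# old behavior: slot size divided the per-device capacity by the total
# number of new OSDs, as if every OSD landed on the same device
old_slot_gb = device_size_gb / osds               # 2.5G -> devices half full

# fixed behavior: each device only carries its own slots
new_slot_gb = device_size_gb / slots_per_device   # 5G   -> devices full

assert osds == fast_devices * slots_per_device
print(old_slot_gb, new_slot_gb)  # 2.5 5.0
```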
Fixes: https://tracker.ceph.com/issues/56031 Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit d0f9e93914e2b7feac41a634311d74c146c8868b)
`lsblk_all()` should return an empty dict `{}` if nothing was found.
If we raise `RuntimeError()` then the loop in `scan.Scan.main` will stop
and make ceph-volume fail, because we don't try to catch this exception.
`scan.Scan.main()` has its own logic to detect whether the given path
is a ceph-disk created OSD anyway.
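The intended contract, as a simplified sketch (the `lsblk -P` parsing here is reduced to the bare minimum):
```
import subprocess


def lsblk_all():
    out = subprocess.run(['lsblk', '-P', '-o', 'NAME,TYPE'],
                         capture_output=True, text=True).stdout
    if not out.strip():
        return {}  # nothing found is a valid, empty answer -- not an error
    # one 'KEY="value" ...' line per device; kept trivial here
    return {line.split('"')[1]: line for line in out.splitlines()}
```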
We don't backport PRs merged into doc/releases. Therefore, when one browses to an older Ceph release version on docs.ceph.com (e.g., https://docs.ceph.com/en/pacific/), the information is out of date at best.
The doc/releases page is only accurate if browsing https://docs.ceph.com/en/latest/, for example.
So this post_checkout command will make sure we've checked out doc/releases from main before building and publishing.
When the option --gpg-url is specified, the name used for the gpg filename is missing, which throws an exception.
This adds the string "manual" to the gpg key filename: /etc/apt/trusted.gpg.d/ceph.manual.gpg
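A sketch of the fallback; the function and parameter names are hypothetical:
```
def gpg_key_filename(version_name=None):
    # when only --gpg-url is given there is no release name to build
    # the filename from; fall back to the literal "manual"
    name = version_name if version_name else 'manual'
    return '/etc/apt/trusted.gpg.d/ceph.{}.gpg'.format(name)


assert gpg_key_filename() == '/etc/apt/trusted.gpg.d/ceph.manual.gpg'
```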
Adam King [Tue, 26 Jul 2022 13:55:05 +0000 (09:55 -0400)]
mgr/cephadm: clear error message when resuming upgrade
the message field in the output of "ceph orch upgrade status"
will first take the value of the error field of the UpgradeState,
and only if it is blank/None, display an info string we periodically
update throughout the upgrade with useful info such as that
we're upgrading a daemon of a particular type or pulling an image
on a certain host. When an upgrade fails, we set the error field
of the UpgradeState, pause the upgrade and raise a health warning.
Sometimes, the user is able to resolve the issue and simply resume
the upgrade. The issue here is, in that case, the error field of
the UpgradeState is still set, so instead of seeing the useful info
messages, it will continue to display an error message that may
no longer be relevant. By emptying the error field of the UpgradeState
when upgrades are resumed, we return to normal behavior of
displaying the info string, and will only show another error message
if another error actually occurs.
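In sketch form, with attribute names assumed from the description above:
```
class UpgradeState:
    # sketch: only the fields relevant here
    def __init__(self):
        self.error = ''     # set when an upgrade fails
        self.paused = False


def upgrade_resume(state: UpgradeState) -> str:
    # clear the stale error so "ceph orch upgrade status" goes back to
    # showing the periodically updated info string
    state.error = ''
    state.paused = False
    return 'Resumed upgrade'
```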
Marcus Watts [Thu, 7 Jul 2022 07:33:31 +0000 (03:33 -0400)]
rgw: better tenant id from the uri on anonymous access
When an anonymous user tries to access a public bucket, they get a 404
because rgw doesn't check the tenant correctly.
A previous fix for this broke legacy implicit tenants,
because it didn't check for anonymous access. This version
restricts its behavior to the anonymous user.
Fixes: https://tracker.ceph.com/issues/48001 https://tracker.ceph.com/issues/48382
Original fix by
Author: Rafał Wądołowski <rafal@rafal.net.pl> Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
This fix Signed-off-by: Marcus Watts <mwatts@redhat.com>
(cherry picked from commit b5880caa505e2df7ff035537c19a8be3a3a8bb8f)
test/encoding: verify that e.what() starts with expected str
boost changed the way it prints boost::system::system_error in
boost 1.79 -- it appends the stringified error_category to the end of
exception::what(), and our buffer::malformed_input is a subclass
of boost::system::system_error.
So we cannot just compare the return value of what() with the
expected string; to be more future-proof, let's check if it
starts with the expected string instead.
Deferred writes can sometimes update regions that are no longer mapped to any object.
This cannot happen while BlueStore is running, as blobs are being held,
and allocations are not released until the deferred op is executed.
However, in case of a restart, the allocations that deferred ops target may already have been freed.
Deferred replay is done on BlueStore bootup, before any new object can be allocated,
so no collision with objects is possible.
But BlueFS can allocate space from a block with deferred ops still pending.
Fixes: https://tracker.ceph.com/issues/54547 Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 2afb951a490aac9ca4d01614867396cbd06b793d)
Conflicts: src/os/bluestore/BlueStore.cc
Modified non-critical BlueFS call foreach_block_extents to get_block_extents.
Kefu Chai [Mon, 28 Feb 2022 13:46:39 +0000 (21:46 +0800)]
include/buffer: include <memory>
to address following FTBFS:
```
/usr/bin/ccache /usr/bin/clang++-13 -DBOOST_ALL_NO_LIB -DBOOST_ASIO_DISABLE_CONCEPTS -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_T$
In file included from /var/ssd/ceph/src/crimson/os/seastore/seastore_types.cc:4:
In file included from /var/ssd/ceph/src/crimson/os/seastore/seastore_types.h:14:
In file included from /var/ssd/ceph/src/include/denc.h:47:
/var/ssd/ceph/src/include/buffer.h:98:37: error: no template named 'unique_ptr' in namespace 'std'; did you mean 'boost::movelib::unique_ptr'?
struct unique_leakable_ptr : public std::unique_ptr<T, ceph::nop_delete<T>> {
                                    ^~~~~~~~~~~~~~~
                                    boost::movelib::unique_ptr
/opt/ceph/include/boost/move/unique_ptr.hpp:354:7: note: 'boost::movelib::unique_ptr' declared here
class unique_ptr
      ^
```
libcephsqlite: ceph-mgr crashes when compiled with gcc12
The regex in libcephsqlite, when compiled with GCC 12, treats '-' as a range
operator, resulting in the following error:
"Invalid start of '[x-x]' range in regular expression"
Kotresh HR [Fri, 4 Feb 2022 09:58:39 +0000 (15:28 +0530)]
qa: validate subvolume discover on upgrade
Validate subvolume discover on upgrade from
a legacy subvolume to v1. The handcrafted
`.meta` file at the legacy subvolume root should
not be used for any subvolume APIs such as getpath
and authorize.
Kotresh HR [Fri, 4 Feb 2022 09:25:03 +0000 (14:55 +0530)]
mgr/volumes: Fix subvolume discover during upgrade
Fixes the subvolume discover to use the correct
metadata file after an upgrade from a legacy subvolume
to v1. The fix makes sure it doesn't use the
handcrafted metadata file placed in the root of a
legacy subvolume.
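A heavily simplified sketch of the rule, with layout and names assumed:
```
import os


def trusted_metadata_path(subvol_dir, version):
    # v1+: the .meta under the subvolume directory is written and
    # owned by mgr/volumes, so it can be trusted
    if version >= 1:
        return os.path.join(subvol_dir, '.meta')
    # legacy (v0): any .meta at the subvolume root is handcrafted and
    # must not drive getpath, authorize, etc.
    return None
```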
Co-authored-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch> Co-authored-by: Dan van der Ster <daniel.vanderster@cern.ch> Co-authored-by: Ramana Raja <rraja@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 7eba9cab6cfb9a13a84062177d7a0fa228311e13)
(cherry picked from commit f8c04135150a7fb3c43607b43a8214e0d57547bc)
Adam C. Emerson [Tue, 8 Feb 2022 18:47:49 +0000 (13:47 -0500)]
rgw: Fix data race in ChangeStatus
Fixes: https://tracker.ceph.com/issues/54208 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 27f5ba9e5f649d8767c8ab44d56404e0186f6fc1) Fixes: https://tracker.ceph.com/issues/54491 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Fri, 8 Jul 2022 18:58:16 +0000 (14:58 -0400)]
rgw: Guard against malformed bucket URLs
Misplaced colons can result in radosgw thinking it has a bucket URL
but with no bucket name, leading to a crash later on.
Fixes: https://tracker.ceph.com/issues/55765 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 3ee9a3b41a289a926fed8b8927ca2a93b4f120a6) Fixes: https://tracker.ceph.com/issues/56586 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Laura Flores [Mon, 4 Oct 2021 04:41:10 +0000 (04:41 +0000)]
os/bluestore: update priorities and nicks of bluestore perf counters
These perf counters do not show up in telemetry unless they are set to a "useful" priority or higher. Fetching these counters in telemetry may help to diagnose problems with RocksDB, BlueFS prefetching, or insufficient cache sizes.
Note: This backport (and the original PR) exposed an extra 34 perf counters per OSD to Prometheus. Given that Pacific is a stable release, and so as not to add that much extra load, we are adapting this backport to expose only the 2 required perf counters.
librbd: bail from schedule_request_lock() if already lock owner
A race condition may be hit if there are multiple pending locks for the
same image and pending callbacks. Abort the exclusive lock process if
we are already the exclusive lock owner.
Fixes: https://tracker.ceph.com/issues/56549 Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit 3527d2c764626c09c5ede80ae844551fd8845756)
Ilya Dryomov [Fri, 20 May 2022 12:05:03 +0000 (14:05 +0200)]
qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache
The I/O workload in this test is xfstests (qa/run_xfstests_qemu.sh)
which isn't subjected to any timeout other than global max_job_time
limit in any other subsuite (e.g. qemu/workloads/qemu_xfstests.yaml).
But here, there is a parallel "op" workload defined as a workunit.
The workunit task has a default timeout of 3 hours which is effectively
imposed on the entire job. In the "rbd cache = false" configuration,
it's sometimes exceeded.
Tatjana Dehler [Thu, 7 Jul 2022 15:21:14 +0000 (17:21 +0200)]
mgr/dashboard: prevent alert redirect
Prevent Alertmanager alerts from being redirected to the active mgr
dashboard instance. There are two reasons for it:
1. It doesn't bring any additional benefit. The Alertmanager config
includes all available mgr instances - active and passive ones. In
case of an alert, it will be sent to all of them. It ensures that
the active mgr dashboard will receive the alert in any case.
2. The redirect URL includes the mgr IP and NOT the FQDN. This leads
to issues in environments where an SSL certificate is configured and
matches only the FQDNs.
Fixes: https://tracker.ceph.com/issues/56401 Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit 965005e0789e566ccadce7a326b0e197ab8d7f5f)
rbd: don't default empty pool name unless namespace is specified
Commit 96f05a7956b3 ("rbd: delay determination of default pool name")
broke "rbd perf image iostat" and "rbd perf image iotop" GLOBAL_POOL_KEY
support (the ability to blend all rbd pools together into a single
view).
It doesn't really thrash anything, just repeatedly restarts the
workload on top of a dirty cache file. rbd_pwl_cache_recovery is
more on point and gets covered by existing CODEOWNERS.
Yin Congmin [Fri, 7 Jan 2022 07:03:44 +0000 (15:03 +0800)]
qa/tasks: add thrash test for persistent write log cache
Add a thrash test for the persistent write log cache: run rbd bench
on the persistent write log cache, thrash the rbd bench process, and
test the recovery function of the persistent write log cache.
Python 3.10 doesn't include the _Py_fopen() function. Boost
1.75.0 includes a patch which switches to using fopen() for
python versions >= 3.1, but Pacific is using boost 1.73.0,
which still uses _Py_fopen(). This commit adds the boost
1.75.0 patch to `make-dist`, so it's applied to our copy of
the boost source which is then used when building RPM packages.
Fixes: https://tracker.ceph.com/issues/56466
Please note, this change is not cherry-picked from the
"main" branch, as we already use boost 1.75 in that branch;
but to minimize the risk of switching boost from 1.73 to
1.75 in an LTS branch like pacific, we just add a fix to
address this particular issue in boost 1.73.
Signed-off-by: Tim Serong <tserong@suse.com> Signed-off-by: Kefu Chai <tchaikov@gmail.com>