git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
3 years agorpm, debian: move smartmontools and nvme-cli to ceph-base 44177/head
Yaarit Hatuka [Wed, 25 Aug 2021 02:12:08 +0000 (02:12 +0000)]
rpm, debian: move smartmontools and nvme-cli to ceph-base

We wish to be able to scrape SMART and NVMe metrics from OSD and MON
nodes. For this we require / recommend smartmontools and nvme-cli
dependencies for both the ceph-osd and ceph-mon packages.  However, the
sudoers file (which is required for invoking `smartctl` by user 'ceph')
was installed only in the ceph-osd package.  Since different packages
cannot own the same file, and because we want to be able to scrape from
every daemon, we move the dependencies and the sudoers installation to
ceph-base. For generalization, we rename:
sudoers.d/ceph-osd-smartctl -> sudoers.d/ceph-smartctl

Fixes: https://tracker.ceph.com/issues/50657
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 7ca39fa92b47427af2f1c6000c653bb4dffc47fe)

Conflicts:
ceph.spec.in
debian/rules

Cherry-pick notes:
- conflict due to octopus not having jaeger dep
- conflict due to octopus not installing rbd-nbd_quiesce on debian

3 years agoMerge pull request #42958 from ifed01/wip-ifed-fix-huge-omap-rename-oct
Igor Fedotov [Thu, 25 Nov 2021 10:23:17 +0000 (13:23 +0300)]
Merge pull request #42958 from ifed01/wip-ifed-fix-huge-omap-rename-oct

octopus: os/bluestore: cap omap naming scheme upgrade transactoin.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #43959 from guits/wip-53283-octopus
Guillaume Abrioux [Tue, 23 Nov 2021 16:43:56 +0000 (17:43 +0100)]
Merge pull request #43959 from guits/wip-53283-octopus

octopus: ceph-volume: `get_first_lv()` refactor

3 years agoos/bluestore: Fix omap upgrade to per-pg scheme 42958/head
Adam Kupczyk [Sat, 13 Nov 2021 10:28:18 +0000 (11:28 +0100)]
os/bluestore: Fix omap upgrade to per-pg scheme

This fixes a regression introduced by the fix to the omap upgrade: https://github.com/ceph/ceph/pull/43687
The problem was that we always skipped first omap entry.
This worked fine with objects having omap header key.
For objects without header key we skipped first actual omap key.

Fixes: https://tracker.ceph.com/issues/53260
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 65a3f374aa1c57c5bb9401e57dab98a643b4360a)
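
The skipping rule described above can be illustrated with a minimal sketch. It is not the BlueStore code: `entries_to_convert` and the `is_header` predicate are hypothetical names, and the only assumption is that a header, when present, is the first entry.

```python
def entries_to_convert(entries, is_header):
    """Yield the omap entries that need converting.

    Only a leading header entry is skipped. When the object has no
    header, the first entry is a real key and must be kept.
    (Hypothetical helper; is_header is supplied by the caller.)
    """
    iterator = iter(entries)
    first = next(iterator, None)
    if first is None:
        return  # empty omap, nothing to convert
    if not is_header(first):
        yield first
    yield from iterator


def looks_like_header(entry):
    # Hypothetical marker used only for this illustration.
    return entry == "HEADER"


print(list(entries_to_convert(["HEADER", "k1", "k2"], looks_like_header)))
print(list(entries_to_convert(["k1", "k2"], looks_like_header)))
```

Both calls yield `k1` and `k2`; the regression corresponds to dropping `k1` in the second case.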

3 years agoceph-volume: `get_first_*()` refactor 43959/head
Guillaume Abrioux [Mon, 8 Mar 2021 08:59:26 +0000 (09:59 +0100)]
ceph-volume: `get_first_*()` refactor

As indicated by commit 17957d9beb42a04b8f180ccb7ba07d43179a41d3, those
functions were meant to avoid writing something like the following:

```
lvs = get_lvs()
if len(lvs) >= 1:
    lv = lvs[0]  # keep only the first match
```

Those functions should return `None` if zero items or more than one item
is found. The current names of these functions are confusing and suggest
that we just want the first item, even when more than one item is
returned. Let's rename them to `get_single_pv()`, `get_single_vg()` and
`get_single_lv()`.

Closes: https://tracker.ceph.com/issues/49643
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a5e4216b49704783c55fb83b3ae6dde35b0082ad)
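
The "single item or nothing" contract described in this message can be sketched as follows. `get_single` is a hypothetical generic helper written for illustration; it follows the wording of the commit message, and the real `get_single_*()` functions in ceph-volume may take filters and handle the multiple-match case differently.

```python
def get_single(items):
    """Return the only element of items, or None otherwise.

    None is returned both when nothing matched and when the match
    is ambiguous (more than one element).
    """
    items = list(items)
    if len(items) == 1:
        return items[0]
    return None


print(get_single(["lv-a"]))
print(get_single([]))
print(get_single(["lv-a", "lv-b"]))
```

Only the first call returns a value; the other two return `None`, which is what distinguishes this contract from "give me the first item".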

3 years agoMerge pull request #43952 from guits/wip-52597-octopus
Guillaume Abrioux [Thu, 18 Nov 2021 08:36:35 +0000 (09:36 +0100)]
Merge pull request #43952 from guits/wip-52597-octopus

octopus: ceph-volume: util/prepare fix osd_id_available()

3 years agoMerge pull request #43950 from guits/wip-53189-octopus
Guillaume Abrioux [Thu, 18 Nov 2021 08:35:32 +0000 (09:35 +0100)]
Merge pull request #43950 from guits/wip-53189-octopus

octopus: ceph-volume: fix a typo causing AttributeError

3 years agoceph-volume: util/prepare fix osd_id_available() 43952/head
Guillaume Abrioux [Thu, 9 Sep 2021 08:23:43 +0000 (10:23 +0200)]
ceph-volume: util/prepare fix osd_id_available()

The current check only allows requesting an OSD id that exists and is
marked as 'destroyed'.
With this small fix, `--osd-id` can also be used with an id that doesn't
exist at all.

Fixes: https://tracker.ceph.com/issues/50880
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit 73bfa5d2b0157f92721d8bf36619fd35ee265cdd)
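
A minimal sketch of the availability rule after this fix, assuming the OSD list is already parsed into dictionaries. The function signature and the `{"id": ..., "status": ...}` shape are assumptions made for illustration, not the ceph-volume API.

```python
def osd_id_available(osd_id, osd_nodes):
    """Decide whether osd_id may be requested with --osd-id.

    osd_nodes is a list of dicts such as {"id": 3, "status": "destroyed"}
    (hypothetical shape). An id is available when it is unknown to the
    cluster, or when it exists and is marked as destroyed.
    """
    for node in osd_nodes:
        if node["id"] == osd_id:
            return node.get("status") == "destroyed"
    return True  # id does not exist at all, so it is free to use


nodes = [{"id": 0, "status": "up"}, {"id": 1, "status": "destroyed"}]
print(osd_id_available(0, nodes), osd_id_available(1, nodes), osd_id_available(7, nodes))
```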

3 years agoceph-volume: fix a typo causing AttributeError 43950/head
Taha Jahangir [Sun, 17 Oct 2021 05:00:27 +0000 (08:30 +0330)]
ceph-volume: fix a typo causing AttributeError

Signed-off-by: Taha Jahangir <mtjahangir@gmail.com>
(cherry picked from commit 4cdbba3344fe26b6351e88ce00a8655890a02115)

3 years agoMerge pull request #43747 from mfoliveira/wip-53100-octopus
Yuri Weinstein [Wed, 10 Nov 2021 23:35:16 +0000 (15:35 -0800)]
Merge pull request #43747 from mfoliveira/wip-53100-octopus

octopus: os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years agoos/bluestore: fix invalid omap name conversion when upgrading to per-pg.
Igor Fedotov [Wed, 27 Oct 2021 10:59:34 +0000 (13:59 +0300)]
os/bluestore: fix invalid omap name conversion when upgrading to per-pg.

Fixes: https://tracker.ceph.com/issues/53062
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
(cherry picked from commit cbc97018d883333f81ab9a3cfa99d2f68a9874cd)
(cherry picked from commit dc0a7e49434f76d97016934feed9a8ec806d1e42)

3 years agoMerge pull request #43658 from badone/wip-octopus-ceph-ansible-systemd-bug
Yuri Weinstein [Wed, 10 Nov 2021 16:28:25 +0000 (08:28 -0800)]
Merge pull request #43658 from badone/wip-octopus-ceph-ansible-systemd-bug

octopus: qa/ceph-ansible: Bump OS version for centos

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years agoos/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_bytes 43747/head
Kefu Chai [Tue, 1 Jun 2021 11:14:33 +0000 (19:14 +0800)]
os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_bytes

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly-fragmented system, iterating the whole tree could hurt the
performance of a fast storage system a lot.

the idea comes from openzfs's metaslab allocator.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5a26875049d13130ffe5954428da0e1b9750359f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
 Conflicts:
src/common/options/global.yaml.in:
- Moved new option into src/common/options.cc

3 years agoos/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_count
Kefu Chai [Wed, 2 Jun 2021 07:57:04 +0000 (15:57 +0800)]
os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_search_count

so AvlAllocator can switch from the first-fit mode to best-fit mode
without walking through the whole space map tree. in a
highly-fragmented system, iterating the whole tree could hurt the
performance of a fast storage system a lot.

the idea comes from openzfs's metaslab allocator.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 40f05b971f5a8064cf9819f80fc3bbf21d5206da)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
 Conflicts:
src/common/options/global.yaml.in
- Moved new option into src/common/options.cc
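
The idea behind the two limits can be sketched in a few lines. This is an illustration, not the AvlAllocator code: free space is modeled as a list of `(offset, length)` tuples sorted by offset, `pick_block` is a hypothetical name, and treating a limit of 0 as "no limit" is an assumption of the sketch.

```python
def pick_block(free_ranges, start_index, size, max_search_count, max_search_bytes):
    """First-fit with a bounded search, falling back to best-fit.

    free_ranges: list of (offset, length) sorted by offset.
    A limit of 0 means "no limit" in this sketch.
    Returns the offset of the chosen range, or None if nothing fits.
    """
    searched = 0
    first_offset = None
    for offset, length in free_ranges[start_index:]:
        if first_offset is None:
            first_offset = offset
        if length >= size:
            return offset  # first-fit hit
        searched += 1
        if max_search_count and searched >= max_search_count:
            break  # looked at too many ranges
        if max_search_bytes and offset + length - first_offset > max_search_bytes:
            break  # walked too far across the device
    # Best-fit fallback: the smallest range that is still large enough.
    candidates = [r for r in free_ranges if r[1] >= size]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r[1])[0]


ranges = [(0, 4), (10, 4), (20, 4), (30, 16), (50, 8)]
print(pick_block(ranges, 0, 8, 2, 0))  # search capped after two ranges
print(pick_block(ranges, 0, 8, 0, 0))  # unbounded first-fit
```

With the cap of two ranges, the first-fit pass gives up early and best-fit picks the 8-unit range at offset 50; without a cap, first-fit returns the first range that fits, at offset 30.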

3 years agoos/bluestore/AvlAllocator: use cbit for counting the order of alignment
Kefu Chai [Wed, 2 Jun 2021 08:31:18 +0000 (16:31 +0800)]
os/bluestore/AvlAllocator: use cbit for counting the order of alignment

no need to calculate the alignment first, cbits() would suffice, as it
counts the first set bit and the following 0's in a number. the result
is identical to cbits(alignment of that number).

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 573cbb796e8ba2f433caa308925735101a8161a6)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
3 years agoos/bluestore/AvlAllocator: use delegated ctor
Kefu Chai [Tue, 1 Jun 2021 10:52:11 +0000 (18:52 +0800)]
os/bluestore/AvlAllocator: use delegated ctor

less repetition this way

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9b52ba1dd0a5e199833d7ab2561a7b388d85afc1)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
 Conflicts:
src/os/bluestore/AvlAllocator.cc
        - Replace `std::string_view name` w/ `const std::string& name`.

3 years agoos/bluestore/AvlAllocator: specialize _block_picker()
Kefu Chai [Wed, 2 Jun 2021 07:11:30 +0000 (15:11 +0800)]
os/bluestore/AvlAllocator: specialize _block_picker()

before this change AvlAllocator::_block_picker() is used by both the
best-fit mode and first-fit mode. but since we cannot achieve the
locality by searching in the area pointed to by the cursor in best-fit mode,
we just pass a dummy cursor to AvlAllocator::_block_picker() when
searching in the best-fit mode.

but since the range_size_tree is already sorted by the size of ranges,
if _block_picker() fails to find one by the size, we should just give
up right away, and instead try again using a smaller size.

after this change, instead of sharing AvlAllocator::_block_picker()
across both the first-fit mode and the best-fit mode, this method
is specialized into two different variants: one for first-fit, and the
simplified one for best-fit.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4837166f9e7a659742d4184f021ad12260247888)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
3 years agoos/bluestore: Improve _block_picker function
Adam Kupczyk [Wed, 19 May 2021 10:49:37 +0000 (12:49 +0200)]
os/bluestore: Improve _block_picker function

Make _block_picker function scan (*cursor, end) + (begin, *cursor) instead of (*cursor, end) + (begin, end).
The second run over range (*cursor, end) could never yield any results.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit c732060d3e3ef96c6da06c9dde3ed8c064a50965)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
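
The difference between the two scan orders can be shown on plain indices. This sketch is only an illustration: positions in the tree are modeled as integers `0..n-1`, and both function names are hypothetical.

```python
def old_scan_order(n, cursor):
    """(cursor, end) followed by (begin, end): the tail is visited twice."""
    return list(range(cursor, n)) + list(range(0, n))


def new_scan_order(n, cursor):
    """(cursor, end) followed by (begin, cursor): every position once."""
    return list(range(cursor, n)) + list(range(0, cursor))


print(old_scan_order(5, 3))
print(new_scan_order(5, 3))
```

For `n = 5` and `cursor = 3`, the old order is `[3, 4, 0, 1, 2, 3, 4]` and the new order is `[3, 4, 0, 1, 2]`; the repeated visit to positions 3 and 4 could never find anything the first pass had missed.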
3 years agoos/bluestore: do not call _block_picker() again if already searched from start()
jhonxue [Wed, 18 Nov 2020 09:41:57 +0000 (17:41 +0800)]
os/bluestore: do not call _block_picker() again if already searched from start()

Fixes: https://tracker.ceph.com/issues/48272
Signed-off-by: Xue Yantao <jhonxue@tencent.com>
(cherry picked from commit fd5ca26e4a23d6c8992ab5927ce85ade958e251f)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
3 years agooctopus: qa/ceph-ansible: Bump OS version for centos 43658/head
Brad Hubbard [Mon, 25 Oct 2021 21:18:26 +0000 (07:18 +1000)]
octopus: qa/ceph-ansible: Bump OS version for centos

The systemd version in the 8.3 image is buggy so use 8.4 instead.

Fixes: https://tracker.ceph.com/issues/52923
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
3 years agoos/bluestore: cap omap naming scheme upgrade transactoin.
Igor Fedotov [Tue, 9 Feb 2021 15:29:01 +0000 (18:29 +0300)]
os/bluestore: cap omap naming scheme upgrade transactoin.

We shouldn't use a single per-onode transaction for such an upgrade when the onode's omap list is huge. This results in similarly sized WAL/SST files, which are inefficient, might cause high memory usage, and are sometimes error-prone.

Fixes: https://tracker.ceph.com/issues/49170
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit e897fa243c1dd38329733b452872616023f14ac8)

 Conflicts (caused by lack of per-pg omap scheme):
src/os/bluestore/BlueStore.cc
src/os/bluestore/BlueStore.h
src/os/bluestore/bluestore_types.h
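
Capping the transaction size amounts to splitting the key list into bounded batches and committing each batch separately. A minimal sketch, assuming a hypothetical `split_into_transactions` helper and an arbitrary per-batch limit; the actual limit and commit logic in BlueStore are not shown here.

```python
def split_into_transactions(keys, max_keys_per_txn):
    """Yield lists of at most max_keys_per_txn keys, preserving order."""
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) >= max_keys_per_txn:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


for txn in split_into_transactions(["k1", "k2", "k3", "k4", "k5"], 2):
    print(txn)
```

Each yielded batch would correspond to one transaction, so memory use and WAL size are bounded by the batch limit rather than by the size of the omap list.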

3 years agoMerge pull request #43758 from tchaikov/octopus-pr-43748
Sebastian Wagner [Tue, 2 Nov 2021 08:14:55 +0000 (09:14 +0100)]
Merge pull request #43758 from tchaikov/octopus-pr-43748

octopus: admin/doc-requirements.txt: pin Sphinx at 3.5.4

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agotest/rgw: use spawn library for test_rgw_dmclock_scheduler 43758/head
Casey Bodley [Tue, 20 Jul 2021 16:50:25 +0000 (12:50 -0400)]
test/rgw: use spawn library for test_rgw_dmclock_scheduler

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a8e3589a2c875b6fadc853c75f20cb9256f294ca)

Conflicts:
src/test/rgw/CMakeLists.txt
src/test/rgw/test_rgw_dmclock_scheduler.cc: trivial resolution

3 years agotest/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler
Casey Bodley [Mon, 19 Jul 2021 22:07:47 +0000 (18:07 -0400)]
test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler

the AsyncScheduler uses an asio timer to dispatch work to its executor
with an optional delay. when no delay is requested, it waits on the
timer with an expiration time in the past (crimson::dmclock::TimeZero)

tests are failing here because poll() is returning without executing the
handlers of those expired timers

asio implements these timers with timerfd and epoll. debugging with
strace, i see that these timers armed with timerfd_settime() are not
always immediately ready according to epoll_wait():

  eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = 3
  epoll_create1(EPOLL_CLOEXEC)            = 4
  timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
  epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=14164052, u64=14164052}}) = 0
  epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=14164064, u64=14164064}}) = 0
  timerfd_settime(5, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}}) = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164052, u64=14164052}}], 128, 0) = 1
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164064, u64=14164064}}], 128, 0) = 1

in this example, it took 6 calls to context.poll() before it was ready
to execute the timer's handler

to work around this, replace calls to context.poll() with calls to
context.run_for() with a very short duration

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 21baed999e31c5e69c75f0cbb8757ef91585d917)

3 years agoinclude/rados/librados.h: avoid redefinition of rados_object_list_item
Kefu Chai [Fri, 28 Aug 2020 11:25:51 +0000 (19:25 +0800)]
include/rados/librados.h: avoid redefinition of rados_object_list_item

doxygen complains when it sees rados_object_list_item defined twice,
so let's fix it.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 4c53a02ac56e4e87631262fffc711e2e009561d7)

3 years agomgr/dashboard: pin a version for autopep8 and pyfakefs
Nizamudeen A [Mon, 25 Oct 2021 08:42:57 +0000 (14:12 +0530)]
mgr/dashboard: pin a version for autopep8 and pyfakefs

Fixes: https://tracker.ceph.com/issues/53024
Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 946dab4f608ec47e0a3cfefdf8e7d1afda69117f)

Conflicts:
src/pybind/mgr/dashboard/requirements-lint.txt: trivial
resolution

3 years agoadmin/doc-requirements.txt: pin Sphinx at 3.5.4
Kefu Chai [Sat, 30 Oct 2021 03:18:17 +0000 (11:18 +0800)]
admin/doc-requirements.txt: pin Sphinx at 3.5.4

* pin Sphinx at 3.5.4
* pin docutils at 0.18

at least the combination of these two versions
is known to compile.

to address the bug reported at
https://sourceforge.net/p/docutils/bugs/431/

the backtrace looks like:

/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/sphinx/util/docutils.py:285:
RemovedInSphinx30Warning: function based directive support is now
deprecated. Use class based directive instead.
  warnings.warn('function based directive support is now deprecated. '

Exception occurred:
  File
"/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/docutils/writers/html5_polyglot/__init__.py",
line 445, in section_title_tags
    if (ids and self.settings.section_self_link
AttributeError: 'Values' object has no attribute 'section_self_link'

please note this change is not cherry-picked from
master, because master already bumped Sphinx to 3.5.4
in 4968baa2523bd2a5ca6be147b26bc28906a864c9.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 6362b554e2e57227c9fda51c7703f14674da8358)

Conflicts:
admin/doc-requirements.txt: trivial resolution

3 years agoadmin/doc-requirements.txt: require breathe >= 4.20
Kefu Chai [Mon, 14 Dec 2020 08:25:07 +0000 (16:25 +0800)]
admin/doc-requirements.txt: require breathe >= 4.20

to be compatible with Sphinx 3.2
see https://github.com/michaeljones/breathe/tree/v4.20.0#change-log

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 2f1f16ed865114faedc3bdb3049da722284fab61)

3 years agodoc/scripts/gen_state_diagram.py: Fix literal comparison syntax warnings
Varsha Rao [Thu, 11 Jun 2020 06:36:38 +0000 (12:06 +0530)]
doc/scripts/gen_state_diagram.py: Fix literal comparison syntax warnings

In Python 3.8, comparing strings with 'is' and 'is not' raises a SyntaxWarning.
Use the equality operators instead.

Signed-off-by: Varsha Rao <varao@redhat.com>
(cherry picked from commit 61e7bcded852e90e6249ab0f3c37ec2688537c83)
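
A short illustration of why equality is the right comparison for string values. The function name and the `"Initial"` literal are hypothetical and are not taken from gen_state_diagram.py.

```python
def is_initial_state(name):
    """Compare by value with ==.

    `is` tests object identity, which only happens to work for some
    strings because of interning; == tests the contents.
    """
    return name == "Initial"


# Build the string at runtime so it is not the same object as the literal.
runtime_name = "".join(["Ini", "tial"])
print(is_initial_state(runtime_name))
print(is_initial_state("Reset"))
```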

3 years agodoc/conf.py: define CEPH_RADOS_API for breathe
Kefu Chai [Fri, 28 Aug 2020 10:20:13 +0000 (18:20 +0800)]
doc/conf.py: define CEPH_RADOS_API for breathe

otherwise we could get errors like the following:

Invalid C declaration: Expected identifier in nested name, got keyword: int [error at 18]
      CEPH_RADOS_API int rados_aio_append (rados_ioctx_t io, const char *oid, rados_completion_t completion, const char *buf, size_t len)
      ------------------^

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit cf357e17ab5cbd9af4f67c51117b5f368cb9913f)

3 years agoMerge pull request #43441 from rhcs-dashboard/wip-52836-octopus
Yuri Weinstein [Mon, 1 Nov 2021 15:46:22 +0000 (08:46 -0700)]
Merge pull request #43441 from rhcs-dashboard/wip-52836-octopus

octopus: qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #43381 from nkshirsagar/octopus
Yuri Weinstein [Thu, 28 Oct 2021 21:53:01 +0000 (14:53 -0700)]
Merge pull request #43381 from nkshirsagar/octopus

octopus: rgw: clear buckets before calling list_buckets()

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Prashant D <pdhange@redhat.com>
Reviewed-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>
3 years agoMerge pull request #43325 from gerald-yang/octopus-50482
Yuri Weinstein [Tue, 26 Oct 2021 20:33:15 +0000 (13:33 -0700)]
Merge pull request #43325 from gerald-yang/octopus-50482

octopus: msgr/async: fix unsafe access in unregister_conn()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #43310 from gerald-yang/octopus-51199
Yuri Weinstein [Tue, 26 Oct 2021 20:31:35 +0000 (13:31 -0700)]
Merge pull request #43310 from gerald-yang/octopus-51199

octopus: msg/async: allow connection reaping to be tuned; fix cephfs test

Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #43557 from badone/wip-octopus-ceph-ansible-max-usable-version
Yuri Weinstein [Thu, 21 Oct 2021 20:27:47 +0000 (13:27 -0700)]
Merge pull request #43557 from badone/wip-octopus-ceph-ansible-max-usable-version

octopus: qa/ceph-ansible: Pin to last compatible stable release

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
3 years agoMerge pull request #43607 from rhcs-dashboard/wip-52986-octopus
Ernesto Puerta [Thu, 21 Oct 2021 08:55:25 +0000 (10:55 +0200)]
Merge pull request #43607 from rhcs-dashboard/wip-52986-octopus

octopus: mgr/dashboard/api: set a UTF-8 locale when running pip

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agomgr/dashboard/api: set a UTF-8 locale when running pip 43607/head
Kefu Chai [Tue, 17 Aug 2021 07:53:51 +0000 (15:53 +0800)]
mgr/dashboard/api: set a UTF-8 locale when running pip

ansible-core started to include files whose filenames contain
non-ASCII characters, so we have to use a more capable encoding for the
locale in order to install this package. otherwise we'd have the following
error:

Collecting ansible-core<2.12,>=2.11.3
  Using cached ansible-core-2.11.4.tar.gz (6.8 MB)
ERROR: Exception:

Traceback (most recent call last):
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
...
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 137-140: ordinal not in range(256)

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 05e4145856bb5ed19ecc879f2e50b5a88cb2045e)
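
The failure mode in the traceback can be reproduced without pip: a file name containing characters outside latin-1 cannot be encoded under that codec, while UTF-8 handles it. The sketch below is illustrative only. The file name is made up, and `C.UTF-8` as the locale value is an assumption, since the locale names available differ between systems.

```python
import os


def can_encode(text, encoding):
    """Return True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def pip_environment(base_env=None, locale_name="C.UTF-8"):
    """Copy an environment and force a UTF-8 locale.

    The locale name is an assumption; pass the one your system provides.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = locale_name
    return env


# Hypothetical file name with characters outside latin-1.
filename = "docs/\u4e2d\u6587.txt"
print(can_encode(filename, "latin-1"), can_encode(filename, "utf-8"))
```

The environment returned by `pip_environment()` could then be passed as `env=` to `subprocess.run()` when invoking pip.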

3 years ago15.2.15 v15.2.15
Jenkins Build Slave User [Wed, 20 Oct 2021 14:19:57 +0000 (14:19 +0000)]
15.2.15

3 years agooctopus: qa/ceph-ansible: Pin to last compatible stable release 43557/head
Brad Hubbard [Mon, 11 Oct 2021 21:38:32 +0000 (07:38 +1000)]
octopus: qa/ceph-ansible: Pin to last compatible stable release

https://github.com/ceph/ceph-ansible/pull/6892 introduces a breaking
change so pin the CA version to stable-6.0 and ansible 2.9.

Fixes: https://tracker.ceph.com/issues/52943
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
3 years agorgw: clear buckets before calling list_buckets() 43381/head
Nikhil Kshirsagar [Fri, 1 Oct 2021 04:49:00 +0000 (10:19 +0530)]
rgw: clear buckets before calling list_buckets()

The radosgw-admin bucket limit check command has a bug in
octopus. Since we do not clear the bucket list before
list_buckets() returns the next max_entries, they are appended
to the existing list and we end up counting the first ones again.

This bug is triggered if bucket count exceeds max_entries and
causes duplicates in the output of radosgw-admin bucket limit check.

The fix clears the buckets structure before list_buckets()
populates it again with the next batch of buckets to iterate through.

partial manual cherry-pick of 99f7c4aa1286edfea6961b92bb44bb8fe22bd599

Signed-off-by: Nikhil Kshirsagar <nkshirsagar@gmail.com>
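
The double counting can be reproduced with a small model of paginated listing. This is a sketch of the pattern, not the rgw code: `count_buckets` is a hypothetical function and each page stands for one `list_buckets()` call returning up to `max_entries` names.

```python
def count_buckets(pages, clear_between_pages):
    """Count buckets across pages, reusing one list as the page buffer."""
    buckets = []
    counted = 0
    for page in pages:
        if clear_between_pages:
            buckets.clear()  # the fix: start each page with an empty buffer
        buckets.extend(page)  # the listing call appends to the buffer
        counted += len(buckets)  # the caller iterates over the whole buffer
    return counted


pages = [["a", "b"], ["c", "d"], ["e"]]
print(count_buckets(pages, clear_between_pages=False))
print(count_buckets(pages, clear_between_pages=True))
```

Without clearing, the five buckets are counted as 11 (2 + 4 + 5); with clearing, the count is 5. With a single page the two variants agree, which matches the note that the bug only appears once the bucket count exceeds max_entries.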
3 years agoMerge pull request #43418 from trociny/wip-51645-octopus
Yuri Weinstein [Thu, 7 Oct 2021 14:27:05 +0000 (07:27 -0700)]
Merge pull request #43418 from trociny/wip-51645-octopus

octopus: osd/OSD: mkfs need wait for transcation completely finish

Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #43407 from trociny/wip-52809-octopus
Yuri Weinstein [Thu, 7 Oct 2021 14:22:11 +0000 (07:22 -0700)]
Merge pull request #43407 from trociny/wip-52809-octopus

octopus: tools/erasure-code: new tool to encode/decode files

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43268 from cfsnyder/wip-52587-octopus
Guillaume Abrioux [Thu, 7 Oct 2021 08:31:08 +0000 (10:31 +0200)]
Merge pull request #43268 from cfsnyder/wip-52587-octopus

octopus: ceph-volume: fix lvm activate --all --no-systemd

3 years agoqa/mgr/dashboard/test_pool: don't check HEALTH_OK 43441/head
Ernesto Puerta [Wed, 22 Sep 2021 12:25:44 +0000 (14:25 +0200)]
qa/mgr/dashboard/test_pool: don't check HEALTH_OK

Fixes: https://tracker.ceph.com/issues/48845
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2283cb068b82033b14587c7bac6a28440221dcd8)

3 years agoMerge pull request #43424 from cfsnyder/wip-51695-octopus
Yuri Weinstein [Wed, 6 Oct 2021 16:59:10 +0000 (09:59 -0700)]
Merge pull request #43424 from cfsnyder/wip-51695-octopus

octopus: rgw: fail as expected when set/delete-bucket-website attempted on a non-exis…

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #43314 from idryomov/wip-rbd-mirror-snapshot-rx-only-octopus
Yuri Weinstein [Tue, 5 Oct 2021 14:57:36 +0000 (07:57 -0700)]
Merge pull request #43314 from idryomov/wip-rbd-mirror-snapshot-rx-only-octopus

octopus: rbd-mirror: unbreak one-way snapshot-based mirroring

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years agoMerge pull request #43312 from idryomov/wip-keyring-resolve-error-octopus
Yuri Weinstein [Tue, 5 Oct 2021 14:57:07 +0000 (07:57 -0700)]
Merge pull request #43312 from idryomov/wip-keyring-resolve-error-octopus

octopus: auth,mon: don't log "unable to find a keyring" error when key is given

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43263 from cfsnyder/wip-51552-octopus
Yuri Weinstein [Tue, 5 Oct 2021 14:56:28 +0000 (07:56 -0700)]
Merge pull request #43263 from cfsnyder/wip-51552-octopus

octopus: ceph-monstore-tool: use a large enough paxos/{first,last}_committed

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #42986 from MrFreezeex/wip-52460-octopus
Yuri Weinstein [Tue, 5 Oct 2021 14:55:24 +0000 (07:55 -0700)]
Merge pull request #42986 from MrFreezeex/wip-52460-octopus

octopus: rbd-mirror: add perf counters to snapshot replayed

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
3 years agoMerge pull request #42837 from steveftaylor/48212
Yuri Weinstein [Tue, 5 Oct 2021 14:46:28 +0000 (07:46 -0700)]
Merge pull request #42837 from steveftaylor/48212

octopus: mon/OSDMonitor: account for PG merging in epoch_by_pg accounting

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agorgw: fail as expected when set/delete-bucket-website attempted on a non-existent... 43424/head
mengxiangrui [Tue, 6 Jul 2021 11:58:55 +0000 (19:58 +0800)]
rgw: fail as expected when set/delete-bucket-website attempted on a non-existent bucket, rgw should return HTTP 404 and NoSuchBucket.

Fixes: https://tracker.ceph.com/issues/51536
Signed-off-by: xiangrui meng <mengxr@chinatelecom.cn>
(cherry picked from commit c623aa45d35b269c6701a57e44ac05bb29a79dc8)

Conflicts:
- src/rgw/rgw_op.cc

Cherry-pick notes:
- rgw_op.cc forward_request_to_master takes different arguments in Octopus vs. Quincy

3 years agoosd/OSD: mkfs need wait for transcation completely finish 43418/head
Chen Fan [Wed, 9 Jun 2021 05:29:03 +0000 (13:29 +0800)]
osd/OSD: mkfs need wait for transcation completely finish

when running ceph-osd mkfs, the block data is sometimes written
incompletely by the time the ceph-osd process exits. we need to
wait for the transaction to finish completely.

Signed-off-by: Chen Fan <fan.chen@easystack.cn>
(cherry picked from commit 0ffadad3a83b3ca634d7d58a80c84d1d8761e2ea)

3 years agoMerge pull request #43349 from cfsnyder/wip-52351-octopus
Yuri Weinstein [Mon, 4 Oct 2021 19:32:06 +0000 (12:32 -0700)]
Merge pull request #43349 from cfsnyder/wip-52351-octopus

octopus: rgw: fix sts memory leak

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #43270 from cfsnyder/wip-51778-octopus
Daniel Gryniewicz [Mon, 4 Oct 2021 16:59:23 +0000 (12:59 -0400)]
Merge pull request #43270 from cfsnyder/wip-51778-octopus

octopus: rgw : add check for tenant provided in RGWCreateRole

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years agoMerge pull request #43265 from cfsnyder/wip-52331-octopus
Yuri Weinstein [Mon, 4 Oct 2021 15:26:05 +0000 (08:26 -0700)]
Merge pull request #43265 from cfsnyder/wip-52331-octopus

octopus: cmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43234 from MrFreezeex/wip-51838-octopus
Yuri Weinstein [Mon, 4 Oct 2021 15:19:57 +0000 (08:19 -0700)]
Merge pull request #43234 from MrFreezeex/wip-51838-octopus

octopus: ceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE

Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #43369 from tchaikov/octopus-pr-39602
Yuri Weinstein [Mon, 4 Oct 2021 15:14:10 +0000 (08:14 -0700)]
Merge pull request #43369 from tchaikov/octopus-pr-39602

octopus: mgr/influx: use "N/A" for unknown hostname

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43352 from rhcs-dashboard/wip-52773-octopus
Yuri Weinstein [Mon, 4 Oct 2021 15:13:06 +0000 (08:13 -0700)]
Merge pull request #43352 from rhcs-dashboard/wip-52773-octopus

octopus: qa/mgr/dashboard: add extra wait to test

Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agotest/erasure-code: remove ceph_erasure_code 43407/head
Mykola Golub [Wed, 25 Mar 2020 08:44:18 +0000 (08:44 +0000)]
test/erasure-code: remove ceph_erasure_code

Its functionality is moved to ceph-erasure-code-tool.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 56aaaf8a97574a0284bad37cc0ba5e1c262f33e0)

3 years agorpm,deb: add ceph-erasure-code-tool to ceph-osd package
Mykola Golub [Wed, 25 Mar 2020 08:37:00 +0000 (08:37 +0000)]
rpm,deb: add ceph-erasure-code-tool to ceph-osd package

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 53c75eebbdc9d85b07dd90608b45ad794a54f848)

3 years agoceph-erasure-code-tool: new tool to encode/decode files
Mykola Golub [Fri, 20 Mar 2020 16:19:25 +0000 (16:19 +0000)]
ceph-erasure-code-tool: new tool to encode/decode files

E.g. it may be useful as a last resort when recovering an object from
a damaged PG: extract the encoded object chunks from the PG shards
with ceph-objectstore-tool and then decode with ceph-erasure-code-tool.

It also has functionality similar to what ceph_erasure_code test provides.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit b274357d29fb25a29f75c08283fec185902e7970)

3 years agomgr/influx: use "N/A" for unknown hostname 43369/head
Kefu Chai [Mon, 22 Feb 2021 05:53:42 +0000 (13:53 +0800)]
mgr/influx: use "N/A" for unknown hostname

in theory, there is a chance that get_metadata() returns None, so let's use
"N/A" in this case.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e457ca50011f70cf01a62323998af233a484f338)
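
The fallback can be sketched as a small helper. `hostname_for` is a hypothetical name, and falling back to "N/A" when the `hostname` key is missing is an extra assumption of this sketch; the commit only mentions the case where the metadata lookup returns None.

```python
def hostname_for(metadata):
    """Return the hostname from a metadata dict, or "N/A" when unknown."""
    if metadata is None:
        return "N/A"
    # Assumption: a missing key is treated like missing metadata.
    return metadata.get("hostname", "N/A")


print(hostname_for({"hostname": "node1"}))
print(hostname_for(None))
```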

3 years agoqa/mgr/dashboard: add extra wait to test 43352/head
Ernesto Puerta [Wed, 22 Sep 2021 12:10:28 +0000 (14:10 +0200)]
qa/mgr/dashboard: add extra wait to test

Fixes: https://tracker.ceph.com/issues/49344
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 9ff778cdaa1ef40fcfa04f221a1da786a0e19655)

3 years agorgw: fix sts memory leak 43349/head
yuliyang_yewu [Tue, 17 Aug 2021 03:04:02 +0000 (11:04 +0800)]
rgw: fix sts memory leak

Fixes: https://tracker.ceph.com/issues/52290

Signed-off-by: yuliyang_yewu <yuliyang_yewu@cmss.chinamobile.com>
(cherry picked from commit ef921bcdaa78d33ed0611a60ec58826d8e6ccb45)

3 years agorgw : add check for tenant provided in RGWCreateRole 43270/head
cao.leilc [Thu, 17 Jun 2021 12:04:23 +0000 (20:04 +0800)]
rgw : add check for tenant provided in RGWCreateRole

Fixes: https://tracker.ceph.com/issues/51206
Signed-off-by: caolei <halei15848934852@163.com>
(cherry picked from commit 3c99ac14080c9f5b1611c9bbe4a223a9fd2927a0)

Conflicts:
src/rgw/rgw_rest_role.cc

- Octopus constructs role explicitly vs. using store->get_role(), and does not wrap in a unique_ptr

3 years agotasks/ceph_manager: ignore EACCES when waiting for quorum 43263/head
Kefu Chai [Thu, 10 Jun 2021 12:19:09 +0000 (20:19 +0800)]
tasks/ceph_manager: ignore EACCES when waiting for quorum

mon_tick_interval is 5 seconds by default. monitors update their
rotating keys every mon_tick_interval. before the monitors form a
quorum, the auth requests from clients are put into the wait list.
these requests are re-enqueued once the monitors form a quorum. but
there is a small window of mon_tick_interval before they are able
to serve the auth requests, even after they claim to be able to
serve requests. if these re-enqueued requests happen to be served
in this window, and if authx is enabled, they will be greeted with
errors like

handle_auth_bad_method server allowed_methods [2] but i only support [2]

in the case of ceph cli, the error would look like:

[errno 13] RADOS permission denied (error connecting to the cluster)

so, to address this issue, the EACCES error is ignored when waiting
for a quorum.
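
The idea can be sketched roughly as follows. This is not the code from the
commit: a plain bounded loop stands in for teuthology's safe_while(), and
`get_quorum_size` is a hypothetical callable that raises OSError with
errno EACCES while the monitors are still refreshing their rotating keys.

```python
import errno
import time


def wait_for_quorum(get_quorum_size, expected, tries=10, sleep=0.0):
    """Poll until get_quorum_size() returns `expected`.

    EACCES is treated as "not ready yet" and retried; any other OSError
    is re-raised. Returns the number of attempts that were needed.
    """
    for attempt in range(1, tries + 1):
        try:
            if get_quorum_size() == expected:
                return attempt
        except OSError as e:
            # Only the transient permission error is ignored.
            if e.errno != errno.EACCES:
                raise
        time.sleep(sleep)
    raise TimeoutError(f"no quorum of size {expected} after {tries} tries")
```

Keeping the retry bounded means a persistent permission problem still
surfaces as a timeout instead of hanging the test.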

Signed-off-by: Kefu Chai <kchai@redhat.com>
3 years agotasks/ceph_manager: use safe_while() to refactor the wait for quorum
Kefu Chai [Thu, 10 Jun 2021 12:10:06 +0000 (20:10 +0800)]
tasks/ceph_manager: use safe_while() to refactor the wait for quorum

for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
3 years agoceph-monstore-tool: use a large enough paxos/{first,last}_committed
Kefu Chai [Tue, 9 Apr 2019 14:07:02 +0000 (22:07 +0800)]
ceph-monstore-tool: use a large enough paxos/{first,last}_committed

so the rebuild paxos transaction won't be overwritten by the ones
created before recovery completes.

When the quorum is recovering, the leader collects the paxos
transactions from the peons. If the quorum accepts the proposal for setting
the fingerprint, the peon will update the monitor with the paxos
transaction with a newer "last_committed" than the one created using
update_paxos() in ceph_monstore_tool.cc. The latter "last_committed" is
always 0.

so, to avoid this extra paxos proposal obsoleting the "rebuilding" paxos
transaction, we use a large enough number for {first,last}_committed.

Fixes: http://tracker.ceph.com/issues/38219
Signed-off-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #43094 from mgfritch/use-quay-octopus
Sebastian Wagner [Mon, 27 Sep 2021 20:22:21 +0000 (22:22 +0200)]
Merge pull request #43094 from mgfritch/use-quay-octopus

octopus: cephadm: use quay, not docker

Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agoMerge pull request #43266 from cfsnyder/wip-51555-octopus
Yuri Weinstein [Mon, 27 Sep 2021 20:11:57 +0000 (13:11 -0700)]
Merge pull request #43266 from cfsnyder/wip-51555-octopus

octopus: mon: return -EINVAL when handling unknown option in 'ceph osd pool get'

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #42616 from neha-ojha/wip-51967-octopus
Yuri Weinstein [Mon, 27 Sep 2021 20:09:21 +0000 (13:09 -0700)]
Merge pull request #42616 from neha-ojha/wip-51967-octopus

octopus: common/options: Set osd_client_message_cap to 256.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
3 years agoqa/suites/rbd: test case for one-way snapshot-based mirroring 43314/head
Ilya Dryomov [Fri, 24 Sep 2021 10:29:34 +0000 (12:29 +0200)]
qa/suites/rbd: test case for one-way snapshot-based mirroring

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 366e9c51a8d83a7c185a00d9fb9e4cde290145e4)

3 years agorbd-mirror: fix a couple of brainos in log messages
Ilya Dryomov [Mon, 20 Sep 2021 20:36:10 +0000 (22:36 +0200)]
rbd-mirror: fix a couple of brainos in log messages

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fdcdeae2a26927b51ab8a480d48a52b896b5532b)

3 years agorbd-mirror: unbreak one-way snapshot-based mirroring
Ilya Dryomov [Mon, 20 Sep 2021 19:52:57 +0000 (21:52 +0200)]
rbd-mirror: unbreak one-way snapshot-based mirroring

Snapshot replayer needs the remote's mirror peer uuid to find its
snapshots in the remote image.  It is obtained by listing remote's
mirror peers but RemotePoolPoller::handle_mirror_peer_list() skips
tx-only (MIRROR_PEER_DIRECTION_TX) peers.  In effect only rx-tx
(MIRROR_PEER_DIRECTION_RX_TX) peers are considered for matching
and snapshot replayer always fails with "failed to retrieve mirror
peer uuid from remote pool" error.

Instead, skip rx-only (MIRROR_PEER_DIRECTION_RX) peers as we are
definitely not interested in anything having to do with mirroring
_to_ the remote cluster.
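
The change in filtering can be illustrated with a small sketch. The
`Direction` enum and the dict-based peer records are stand-ins chosen for
illustration; only the three direction names come from the description above.

```python
from enum import Enum


class Direction(Enum):
    # Mirrors MIRROR_PEER_DIRECTION_RX / _TX / _RX_TX by name only.
    RX = "rx"
    TX = "tx"
    RX_TX = "rx-tx"


def candidate_peers(peers):
    """Keep every peer except rx-only ones.

    Before the fix, tx-only peers were skipped, so one-way setups never
    found a matching peer uuid. Skipping rx-only peers instead keeps both
    tx-only and rx-tx peers as candidates.
    """
    return [p for p in peers if p["direction"] is not Direction.RX]
```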

Fixes: https://tracker.ceph.com/issues/52675
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit b02d3b0c5aa59aa294de43f94c793f5abf71ac03)

3 years agoauth,mon: don't log "unable to find a keyring" error when key is given 43312/head
Ilya Dryomov [Sun, 19 Sep 2021 10:28:37 +0000 (12:28 +0200)]
auth,mon: don't log "unable to find a keyring" error when key is given

This error is logged even if --key or --keyring are specified and
confuses users because the command actually does its job and exits
with success.  This primarily affects "rbd mirror pool peer bootstrap
import" command and rbd-mirror and cephfs-mirror daemons which connect
to the remote cluster with just mon_host and key:

  $ rbd mirror pool peer bootstrap import mypool tokenfile
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory

Local cluster commands are affected too:

  $ rados --no-config-file --mon-host $MON_HOST --key $KEY lspools
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  ... -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
  device_health_metrics
  rbd

This was introduced in commit 98a2e5c59daa ("rados: translate errno to
str in CLI").

Fixes: https://tracker.ceph.com/issues/51628
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 70aa026b097d919b41b2a1221d73b326557f75e3)

3 years agoqa/tasks/cephfs/test_sessionmap: reap connections immediately 43310/head
Sage Weil [Wed, 19 May 2021 19:27:56 +0000 (15:27 -0400)]
qa/tasks/cephfs/test_sessionmap: reap connections immediately

We have to reap connections promptly for this test to work.

This test was broken indirectly by d51d80b3234e17690061f65dc7e1515f4244a5a3,
which moved the counter decrement to reap time instead of mark_down/stop
time.

The reaping is asynchronous, so allow for a delay in the count change.
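
One way to allow for such a delay is to poll until the count matches or a
deadline passes. The sketch below is an assumption about the general
approach, not the test's actual code; `get_count` is a hypothetical callable.

```python
import time


def wait_for_count(get_count, expected, timeout=30.0, interval=1.0):
    """Return True once get_count() equals `expected`, False on timeout.

    The count is always checked at least once, even with timeout=0.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_count() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```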

Fixes: https://tracker.ceph.com/issues/50622
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit c8c5071dcd4b0b788f5e924a678095ce5dc1d7f8)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>
3 years agomsg/async: configurable threshold for reaping dead connections
Sage Weil [Wed, 19 May 2021 19:23:26 +0000 (15:23 -0400)]
msg/async: configurable threshold for reaping dead connections

It is helpful to set this to 1 for tests.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 8129d6bb953015cc05db458afa6aa9b8f5f62614)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>
3 years agomsgr/async: fix unsafe access in unregister_conn() 43325/head
Sage Weil [Mon, 19 Apr 2021 14:26:30 +0000 (09:26 -0500)]
msgr/async: fix unsafe access in unregister_conn()

We were looking at anon_conns and accepting_conns without holding
the lock (deleted_lock is not sufficient).

Drop this test, and move the decrements:

- inc when we add to conns or anon_conns (no changes there)
- dec when we remove from deleted_conns (several different paths!)

Fixes: https://tracker.ceph.com/issues/49237
Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit d51d80b3234e17690061f65dc7e1515f4244a5a3)
Signed-off-by: Gerald Yang <gerald.yang.tw@gmail.com>
3 years agoMerge pull request #43024 from ifed01/wip-ifed-fix-bluefs-replay-crc-oct
Yuri Weinstein [Fri, 24 Sep 2021 15:24:30 +0000 (08:24 -0700)]
Merge pull request #43024 from ifed01/wip-ifed-fix-bluefs-replay-crc-oct

octopus: os/bluestore: accept undecodable multi-block bluefs transactions on log

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43133 from mgfritch/octopus-backport-43010
Yuri Weinstein [Fri, 24 Sep 2021 15:23:20 +0000 (08:23 -0700)]
Merge pull request #43133 from mgfritch/octopus-backport-43010

octopus: cephadm: add thread ident to log messages

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years agoMerge pull request #43186 from rhcs-dashboard/wip-52616-octopus
Yuri Weinstein [Fri, 24 Sep 2021 15:19:23 +0000 (08:19 -0700)]
Merge pull request #43186 from rhcs-dashboard/wip-52616-octopus

octopus: mgr/dashboard: Incorrect MTU mismatch warning

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #43140 from ifed01/wip-ifed-fix-migrate-oct
Yuri Weinstein [Fri, 24 Sep 2021 15:18:10 +0000 (08:18 -0700)]
Merge pull request #43140 from ifed01/wip-ifed-fix-migrate-oct

octopus: os/bluestore: fix bluefs migrate command

Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years agoMerge pull request #43008 from ifed01/wip-ifed-fix-52311-oct
Yuri Weinstein [Fri, 24 Sep 2021 15:17:08 +0000 (08:17 -0700)]
Merge pull request #43008 from ifed01/wip-ifed-fix-52311-oct

octopus: os/bluestore: fix using incomplete bluefs log when dumping it.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years agoMerge pull request #42975 from idryomov/wip-51419-octopus
Yuri Weinstein [Fri, 24 Sep 2021 15:16:35 +0000 (08:16 -0700)]
Merge pull request #42975 from idryomov/wip-51419-octopus

octopus: common/buffer: fix SIGABRT in rebuild_aligned_size_and_memory

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #43273 from cfsnyder/wip-52052-octopus
Yuri Weinstein [Fri, 24 Sep 2021 15:15:13 +0000 (08:15 -0700)]
Merge pull request #43273 from cfsnyder/wip-52052-octopus

octopus: rgw: when deleted obj removed in versioned bucket, extra del-marker added

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years agoMerge pull request #43272 from cfsnyder/wip-51330-octopus
Yuri Weinstein [Fri, 24 Sep 2021 15:13:57 +0000 (08:13 -0700)]
Merge pull request #43272 from cfsnyder/wip-51330-octopus

octopus: rgw: avoid infinite loop when deleting a bucket

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years agoMerge pull request #43271 from cfsnyder/wip-51012-octopus
Yuri Weinstein [Fri, 24 Sep 2021 15:13:30 +0000 (08:13 -0700)]
Merge pull request #43271 from cfsnyder/wip-51012-octopus

octopus: rgw: remove quota soft threshold

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
3 years agorgw: remove quota soft threshold 43271/head
Zulai Wang [Sat, 22 May 2021 13:21:10 +0000 (21:21 +0800)]
rgw: remove quota soft threshold

Remove quota soft threshold, which causes expensive checks for sharded buckets

Fixes: 14eabd4aa7b8a2e2c0c43fe7f877ed2171277526
Signed-off-by: Zulai Wang <wangzl31@outlook.com>
(cherry picked from commit 32a39705765af0f87bec9101e5d337b797e05fea)

Conflicts:
src/common/options/rgw.yaml.in
src/rgw/rgw_quota.cc

Cherry-pick notes:
- Options defined in src/common/options.cc in Octopus vs src/common/options/rgw.yaml.in
- RGWQuotaCache::get_stats does not take optional_yield or DoutPrefixProvider arguments in Octopus

3 years agorgw: when deleted obj removed in versioned bucket, extra del-marker added 43273/head
J. Eric Ivancich [Tue, 15 Jun 2021 19:20:33 +0000 (15:20 -0400)]
rgw: when deleted obj removed in versioned bucket, extra del-marker added

After initial checks are complete, this will read the OLH earlier than
previously to check the delete-marker flag and under the bug's
conditions will return -ENOENT rather than create a spurious delete
marker.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 69d7589fb1305b7d202ffd126c3c835e7cd0dda3)

Conflicts:
src/cls/rgw/cls_rgw_types.h
src/rgw/rgw_rados.cc

Cherry-pick notes:
- RGWRados::apply_olh_log does not take DoutPrefixProvider in Octopus
- change to use some namespace-qualified names in cls_rgw_types

3 years agorgw: avoid infinite loop when deleting a bucket 43272/head
Jeegn Chen [Wed, 25 Nov 2020 09:15:25 +0000 (17:15 +0800)]
rgw: avoid infinite loop when deleting a bucket

When deleting a bucket with an incomplete multipart upload that
has about 2000 parts uploaded, we noticed an infinite loop that
prevented s3cmd from ever deleting the bucket.
On inspection, when the bucket index was sharded (for example, 128
shards), the original logic in
RGWRados::cls_bucket_list_unordered() did not calculate
the bucket shard ID correctly when the index key of a data
part was taken as the marker.

The issue does not necessarily reproduce every time; it depends
on the key of the object. To reproduce it in a 128-shard bucket,
we use 334 as the key for the incomplete multipart upload,
which will be located in shard 127 (found by experiment). In this
setup, the original logic will usually compute a shard ID smaller
than 127 (since 127 is the largest one) from the marker, and
thus a cycle is formed, which results in an infinite loop.

PS: Sometimes the bucket ID calculation may incorrectly go forward
instead of backward. Thus, the check logic may skip some shards,
which may have regular keys. In such scenarios, some non-empty buckets may
be deleted by accident.

Fixes: http://tracker.ceph.com/issues/49206
Signed-off-by: Jeegn Chen <jeegnchen@tencent.com>
(cherry picked from commit 3cafe5774a5a453d58a3a6bed1f02d3200c4bb1d)

Conflicts:
src/rgw/rgw_rados.cc

Cherry-pick notes:
- Octopus cls_bucket_list_unordered doesn't take DoutPrefixProvider as first arg

3 years agoceph-volume: fix lvm activate --all --no-systemd 43268/head
Dimitri Savineau [Tue, 24 Aug 2021 21:17:45 +0000 (17:17 -0400)]
ceph-volume: fix lvm activate --all --no-systemd

On a system without systemd, the `lvm activate --all --no-systemd`
subcommand still calls systemd.
We already allow users to activate a single OSD without systemd, so there's
no reason not to do the same with --all (because activate_all calls activate).
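
As a rough sketch of the intended behaviour, every systemd interaction in
the "all" path can be guarded by the flag. The function signature and the
`is_active` / `activate` callables below are hypothetical stand-ins and do
not reproduce the ceph-volume code.

```python
def activate_all(osds, no_systemd, is_active, activate):
    """Activate each (osd_id, osd_fsid) pair, consulting systemd only if allowed.

    `is_active` is a hypothetical systemd query and `activate` a hypothetical
    single-OSD activation routine that accepts a no_systemd keyword.
    Returns the list of OSD ids that were activated.
    """
    activated = []
    for osd_id, osd_fsid in osds:
        # With --no-systemd the systemd query is skipped entirely.
        if not no_systemd and is_active(osd_id):
            continue
        activate(osd_id, osd_fsid, no_systemd=no_systemd)
        activated.append(osd_id)
    return activated
```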

Fixes: https://tracker.ceph.com/issues/25070
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 8e402e112a6383555e2df31ba3321e5956f1841a)

3 years agomon: return -EINVAL when handling unknown option in 'ceph osd pool get' 43266/head
Zhao Cuicui [Mon, 5 Jul 2021 08:53:17 +0000 (16:53 +0800)]
mon: return -EINVAL when handling unknown option in 'ceph osd pool get'

Signed-off-by: Zhao Cuicui <brucen1030@163.com>
(cherry picked from commit 7ed494076e2390f8e6a386278346632d00ee718a)

3 years agocmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/ 43265/head
Michael Fritch [Tue, 17 Aug 2021 21:36:50 +0000 (15:36 -0600)]
cmake: s/Python_EXECUTABLE/Python3_EXECUTABLE/

Pass the python3 executable when creating the ceph-volume build venv.
Fixup for 5fc657b40dc7.

Fixes: https://tracker.ceph.com/issues/52304
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 7db830598507d90d1c9e1f4468f818bebce58037)

3 years agocephadm: quay.io for non-ceph images too 43094/head
Sage Weil [Wed, 11 Aug 2021 16:21:32 +0000 (12:21 -0400)]
cephadm: quay.io for non-ceph images too

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit dbc1d6303f4c2a22f5fa59218aa032fc92073906)

3 years agomgr/cephadm: Put together default container images references
Juan Miguel Olmo Martínez [Fri, 12 Feb 2021 13:09:17 +0000 (14:09 +0100)]
mgr/cephadm: Put together default container images references

Placed them all in the same location to make downstream modifications
and future changes easier

Signed-off-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
(cherry picked from commit ce246479443a64b292c7cff2a662161c8a598e09)

3 years agoMerge pull request #43189 from rhcs-dashboard/wip-51275-octopus
Ernesto Puerta [Tue, 21 Sep 2021 10:00:13 +0000 (12:00 +0200)]
Merge pull request #43189 from rhcs-dashboard/wip-51275-octopus

octopus: mgr/dashboard: deprecated variable usage in Grafana dashboards

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Yuri Weinstein <yweins@redhat.com>
3 years agoceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE 43234/head
Dan van der Ster [Mon, 12 Jul 2021 13:35:39 +0000 (15:35 +0200)]
ceph.spec: selinux scripts respect CEPH_AUTO_RESTART_ON_UPGRADE

In /etc/sysconfig/ceph we allow operators to define if ceph daemons
should be restarted on upgrade: CEPH_AUTO_RESTART_ON_UPGRADE.

But the post selinux scripts will stop ceph.target even if this
is set to `no`, leading operators to add various hacks to prevent
these unexpected or inconvenient daemon restarts. By now, if users
are using rpms directly, they are likely orchestrating their own
daemon restarts, so they should not rely on the rpm itself to do this.
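
The decision the scripts need to make can be sketched as below. The real
scriptlets are shell code in the spec file; this Python version only
illustrates the check, and treating an unset variable as "no" is an
assumption made for the sketch.

```python
import shlex


def auto_restart_enabled(sysconfig_text):
    """Return True only if CEPH_AUTO_RESTART_ON_UPGRADE is set to "yes".

    `sysconfig_text` is the content of a shell-style KEY=VALUE file such
    as /etc/sysconfig/ceph. The last assignment wins, as it would when
    the file is sourced by a shell.
    """
    value = "no"  # assumed default when the variable is not set
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, raw = line.partition("=")
        if sep and name.strip() == "CEPH_AUTO_RESTART_ON_UPGRADE":
            # shlex handles optional quoting and trailing comments.
            tokens = shlex.split(raw, comments=True)
            value = tokens[0] if tokens else ""
    return value == "yes"
```

A scriptlet following this logic would stop and restart ceph.target only
when the function returns True.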

Fixes: https://tracker.ceph.com/issues/21672
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 092a6e3e83e9ef8e37cb6f1033c345dcb5224cfc)

3 years agomgr/dashboard: deprecated variable usage in Grafana dashboards 43189/head
Patrick Seidensal [Tue, 30 Mar 2021 18:20:49 +0000 (20:20 +0200)]
mgr/dashboard: deprecated variable usage in Grafana dashboards

Fixes: https://tracker.ceph.com/issues/50059
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit a709abf8bf5a6b25c21db100e87af3a6c2cf382d)

3 years agomgr/dashboard: Incorrect MTU mismatch warning 43186/head
Aashish Sharma [Thu, 2 Sep 2021 06:27:57 +0000 (11:57 +0530)]
mgr/dashboard: Incorrect MTU mismatch warning

The MTU mismatch warning was also being fired for NICs that are in the down state. This PR fixes the issue.

Fixes: https://tracker.ceph.com/issues/52028
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 58d635455d1f59921d5ad821168f31b6f937588a)

3 years agoMerge pull request #42533 from liewegas/use-quay-octopus
Yuri Weinstein [Wed, 15 Sep 2021 16:28:58 +0000 (09:28 -0700)]
Merge pull request #42533 from liewegas/use-quay-octopus

octopus: cephadm: default to quay.io, not docker.io

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>