git.apps.os.sepia.ceph.com Git

Merge pull request #43491 from badone/wip-nautilus-52891-LibRGW-tests-macro-fail

nautilus: test/ceph_test_librgw_file*: Remove duplicate names

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #43431 from badone/wip-51950-nautilus

nautilus: Don't persist report data

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43365 from ifed01/wip-ifed-fix-missing-shared-blob-nau

nautilus: os/bluestore: fix erroneous SharedBlob record removal during repair.

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #43608 from rhcs-dashboard/wip-52987-nautilus

nautilus: mgr/dashboard/api: set a UTF-8 locale when running pip

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

mgr/dashboard/api: set a UTF-8 locale when running pip

ansible-core started to include files whose filenames contain
non-ASCII characters, so we have to use a more capable encoding for the
locale in order to install this package. otherwise we'd get the following
error:

Collecting ansible-core<2.12,>=2.11.3
  Using cached ansible-core-2.11.4.tar.gz (6.8 MB)
ERROR: Exception:

Traceback (most recent call last):
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
...
  File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 137-140: ordinal not in range(256)

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 05e4145856bb5ed19ecc879f2e50b5a88cb2045e)
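
The encoding failure in the traceback can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical non-ASCII file name (the commit message does not show the real file names shipped by ansible-core): a latin-1 codec cannot encode the name, while UTF-8 can.

```python
# Hypothetical file name with characters outside the latin-1 range.
name = "tests/\u4e2d\u6587.txt"


def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


latin1_ok = can_encode(name, "latin-1")
utf8_ok = can_encode(name, "utf-8")
print(latin1_ok, utf8_ok)
```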

nautilus: test/ceph_test_librgw_file*: Remove duplicate names

These multiple identical test signatures conflict and cause macro
expansion, and therefore the build, to fail.

Fixes: https://tracker.ceph.com/issues/52891
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

qa/tasks/mgr/test_insights: Remove test for persistent checks

This test makes no sense if we are no longer persisting the store.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 32d1cca2d9b606915c590f52d61856ee401fb4fc)

pybind/mgr/insights: Don't persist report data

Don't store health reports in rocksdb.

Fixes: https://tracker.ceph.com/issues/48269
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit de66522517edd6f7baf19cc0660478502d3c25e8)

os/bluestore: fix erroneous SharedBlob record removal during repair.

Fixes: https://tracker.ceph.com/issues/51619
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 7090930d4a2e6f2efdecaff23f9a2f795e7819fb)

Merge pull request #42617 from neha-ojha/wip-51966-nautilus

nautilus: common/options: Set osd_client_message_cap to 256.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #42240 from trociny/wip-51583-nautilus

nautilus: osd: move down peers out from peer_purged

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

Merge pull request #43135 from aclamk/wip-aclamk-safer-flush-nau

nautilus: os/bluestore: Remove possibility of replay log and file inconsistency

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #42441 from smithfarm/wip-51770-nautilus

nautilus: rpm: three spec file cleanups

Reviewed-by: Kefu Chai <tchaikov@gmail.com>

os/bluestore/bluefs: Remove possibility of bluefs replay log containing files without data

It had been possible for the bluefs replay log to serialize file metadata (size, allocations)
while the actual data stored in these allocations was not yet synced to disk.

This could happen if _flush_range(h1) allocated space for file h1 on a device (like SLOW) that will not
be used when flushing a future replay log. Such a thing can happen when we have h2 that wrote to WAL and
our replay log is on DB. After fsync(h2) we write to the replay log and wait for fdatasync on WAL and DB.
There is no waiting on SLOW, but h1 was dirty and has been serialized to the replay log.

The solution is to delay notifying the replay log that it has to include h1 until after fdatasync has finished.

Cherry-picked from: 03ac53f7d4c83e56f664ad371ffe3bc2d40e1837
Fixes: https://tracker.ceph.com/issues/51129
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
Conflicts:
src/os/bluestore/BlueFS.cc
src/os/bluestore/BlueRocksEnv.cc

os/bluestore/bluefs: Add test that detects bluefs inconsistency

Add a test that detects a possible scenario that will cause BlueFS to have a file
that contains data that has never been written. This is done by tricking the
replay log into accepting file metadata (size, allocations) while the actual data
stored in these allocations is not yet synced to disk.

Scenario:
1) write to file h1 on SLOW device
2) flush h1 (and trigger h1 mark to be added to bluefs replay log)
3) write to file h2
4) fsync h2 (forces replay log to be written)

The result is:
- bluefs log now has stable state of h1
- SLOW device is not yet flushed (no fdatasync())

Test detects this condition and fails.

Cherry-picked from: c591a6e14e2c956d268adcaa9aa3e9c8a1fdea2a
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
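
The ordering problem in the scenario above can be illustrated with a toy model. This is only a sketch: ToyBlueFS and all of its methods are hypothetical names, not BlueFS code, and it models nothing beyond which devices have been synced when a file's metadata becomes stable in the log.

```python
class ToyBlueFS:
    """Toy model: files live on devices; the replay log records file metadata."""

    def __init__(self, delay_log_until_synced):
        self.delay = delay_log_until_synced
        self.unsynced = set()    # devices holding written-but-unsynced data
        self.pending_log = []    # (file, device) queued for the next log write
        self.deferred = []       # (file, device) held back until device sync
        self.stable_log = []     # (file, device) durable in the replay log

    def flush(self, name, device):
        """Write file data to `device` without syncing the device."""
        self.unsynced.add(device)
        target = self.deferred if self.delay else self.pending_log
        target.append((name, device))

    def fsync(self, name, device, log_device="DB"):
        """Flush `name`, sync its device and the log device, write the log."""
        self.flush(name, device)
        for dev in (device, log_device):
            self.unsynced.discard(dev)
        # Deferred files join the log only once their device has been synced.
        ready = [(f, d) for f, d in self.deferred if d not in self.unsynced]
        self.deferred = [(f, d) for f, d in self.deferred if d in self.unsynced]
        self.stable_log += self.pending_log + ready
        self.pending_log = []

    def inconsistent_files(self):
        """Files the log declares stable although their data is not on disk."""
        return [f for f, d in self.stable_log if d in self.unsynced]


def scenario(delay):
    fs = ToyBlueFS(delay)
    fs.flush("h1", "SLOW")   # steps 1 and 2
    fs.fsync("h2", "WAL")    # steps 3 and 4
    return fs.inconsistent_files()


buggy = scenario(delay=False)
fixed = scenario(delay=True)
print(buggy, fixed)
```

With the delay disabled, h1 reaches the stable log while SLOW is still unsynced. With the delay enabled, h1 stays out of the log until a later sync covers SLOW.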

Merge pull request #42954 from tchaikov/nautilus-make-dist

nautilus: make-dist: bump node to 10.16.0

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

make-dist: bump node to 10.16.0

otherwise "npm ci" segfaults, like

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f77f89099ed in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
[Current thread is 1 (Thread 0x7f77f8496740 (LWP 4046307))]
(gdb) bt
#0  0x00007f77f89099ed in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00000000008c3127 in node::Environment::Environment(node::IsolateData*, v8::Local<v8::Context>, node::tracing::AgentWriterHandle*) ()
#2  0x00000000008e4d4b in node::Start(v8::Isolate*, node::IsolateData*, std::vector<std::string, std::allocator<std::string> > const&, std::vector<std::string, std::allocator<std::string> > const&) ()
#3  0x00000000008e34a2 in node::Start(int, char**) ()
#4  0x00007f77f84c00b3 in __libc_start_main (main=0x89dc10 <main>, argc=3, argv=0x7ffd1dc8e8a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd1dc8e898)
     at ../csu/libc-start.c:308
#5  0x000000000089dd45 in _start ()
```

this change is not cherry-picked from master, because the change that
introduced 10.16.0 there,
7f7f8a443c820f3c77a6f267939c33891342a561, is way too large and touches
lots of places in dashboard, while we just need to get the dashboard
frontend npm packages ready with minimal change.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>

Merge pull request #42695 from tchaikov/nautilus-pr-41215

nautilus: cmake: Replace boost download url

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>

cmake: Replace boost download url

Boost has moved downloads to JFrog Artifactory
https://www.boost.org/users/news/boost_has_moved_downloads_to_jfr.html

Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
(cherry picked from commit c2c6678e488f41277022eaf7a929f7ef845abd5f)

Conflicts:
make-dist: trivial resolution

common/options: Set osd_client_message_cap to 256.

This seems like a reasonable default value based on testing results here:
https://docs.google.com/spreadsheets/d/1dwKcxFKpAOWzDPekgojrJhfiCtPgiIf8CGGMG1rboRU/edit?usp=sharing

Eventually we may want to rethink how the throttles and even how flow control
works, but this at least gives us some basic limits now (a little higher than
the old value of 100 that we used for many years).

Signed-off-by: Mark Nelson <mnelson@redhat.com>
(cherry picked from commit ac8cf275a6d191d71c104f6822b62ba67a0a4fcd)

Conflicts:
src/common/options/osd.yaml.in - file does not exist in nautilus

Merge pull request #41973 from trociny/wip-51315-nautilus

nautilus: osd: fix scrub reschedule bug

Reviewed-by: Neha Ojha <nojha@redhat.com>

rpm: cleanup: drop useless conditional block in %postun base

The "meat" of this conditional was ripped out by
328807f80bb6b5d1aa40631e88d755a194d5d2c2, leaving only an empty shell
behind.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 3b53003f011cfbe51d3471ab9b6cdb9a24ecd4f7)

rpm: cleanup: drop %service_del_postun_without_restart

SUSE needs %service_del_postun (with or without restart) *only* if there
is a possibility that the RPM containing the unit file will be upgraded
from a version that packaged SysVinit scripts instead of systemd unit
files. (Which is not the case here.)

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit f69aa5abfb2279919026144aa51e3c72f593e935)

Conflicts:
ceph.spec.in

rpm: cleanup: drop use of DISABLE_RESTART_ON_UPDATE

This SUSE-specific variable is deprecated and use of
%service_del_postun_without_restart macro should be preferred these
days.

Signed-off-by: Franck Bui <fbui@suse.com>
(cherry picked from commit 7d99e786df9654d896c43339c684519de4a9afa2)

Conflicts:
ceph.spec.in

osd: move down peers out from peer_purged

f7c5b01e18 tried to fix this, but adding peer_purged.erase() into
the peer_info loop had no effect, because when purge_strays()
inserts an osd into peer_purged it simultaneously removes it from
peer_info.

So it should be a separate loop through the peer_purged list.

Fixes: https://tracker.ceph.com/issues/38931
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 64dc3c846ab9b1491459799ed249502599878834)

Conflicts:
src/osd/PeeringState.cc (does not exist, the code is in PG.cc)
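
Why the earlier attempt had no effect can be seen in a toy model. This is a sketch only: the dict/set containers, the function names and the is_down predicate are assumptions for illustration, not the OSD code.

```python
def purge_strays(peer_info, peer_purged, strays):
    """Purging a stray inserts it into peer_purged and drops it from peer_info."""
    for osd in strays:
        peer_purged.add(osd)
        peer_info.pop(osd, None)


def clear_down_in_peer_info_loop(peer_info, peer_purged, is_down):
    """Earlier attempt: erase from peer_purged while walking peer_info."""
    for osd in list(peer_info):
        if is_down(osd):
            peer_purged.discard(osd)


def clear_down_in_separate_loop(peer_purged, is_down):
    """Approach described above: walk peer_purged itself."""
    for osd in list(peer_purged):
        if is_down(osd):
            peer_purged.discard(osd)


peer_info = {1: "info", 2: "info", 3: "info"}
peer_purged = set()
purge_strays(peer_info, peer_purged, strays=[3])
down = {3}

clear_down_in_peer_info_loop(peer_info, peer_purged, lambda o: o in down)
after_old = set(peer_purged)   # osd 3 is never visited, so it stays purged

clear_down_in_separate_loop(peer_purged, lambda o: o in down)
after_new = set(peer_purged)
print(after_old, after_new)
```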

Merge pull request #42162 from batrick/i51493

nautilus: pacific: pybind/ceph_volume_client: stat on empty string

Reviewed-by: Ramana Raja <rraja@redhat.com>

pybind/ceph_volume_client: use cephfs mkdirs api

This _mkdir_p should never have worked as the first directory it tries
to stat/mkdir is "", the empty string. This causes an assertion in the
client. I'm not sure how this code ever functioned without causing
faults. They look like:

2021-07-01 02:15:04.449 7f7612b5ab80 3 client.178735 statx enter (relpath want 2047)

The assertion is caused by a C++ exception:

/usr/include/c++/8/string_view:172: constexpr const _CharT& std::basic_string_view<_CharT, _Traits>::operator[](std::basic_string_view<_CharT, _Traits>::size_type) const [with _CharT = char$_Traits = std::char_traits<char>; std::basic_string_view<_CharT, _Traits>::size_type = long unsigned int]: Assertion '__pos < this->_M_len' failed.
Aborted (core dumped)

Where relpath is just the path passed to Client::stat.

This commit only applies to Pacific and older because master no longer
has this library.

Fixes: https://tracker.ceph.com/issues/51492
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 0fb05aea8a6e12c37a9b54641715a9a94ae1366f)
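
One way an empty first path component can arise is sketched here, assuming the helper built directory prefixes by splitting an absolute path on "/" (the commit message does not show the old code, and the path is hypothetical):

```python
import posixpath


def prefixes(path):
    """Return each successive prefix of `path`, built by splitting on '/'."""
    parts = path.split("/")
    return [posixpath.join(*parts[:i]) for i in range(1, len(parts) + 1)]


result = prefixes("/volumes/grp/vol")
print(result)
```

The leading "/" makes the first element of the split an empty string, so the first prefix passed to stat/mkdir would be "".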

14.2.22

osd: fix scrub reschedule bug

not all elements may be visited during the reschedule traversal

Fixes: https://tracker.ceph.com/issues/49487
Signed-off-by: wencong wan <wanwc@chinatelecom.cn>
(cherry picked from commit d7561a6e58fc8043b77648a2cdd5d12bb637f92b)

Conflicts:
src/osd/OSD.cc (scrub vs scrub_job variable name, pg->scrubber vs pg->m_planned_scrub)
src/osd/OSD.h (trivial: set vs std::set)

Merge pull request #41792 from trociny/wip-45275-nautilus

nautilus: rbd-mirror: image replayer stop might race with instance replayer shut down

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

rbd_mirror: properly handle image replay canceled when starting replay

This fixes a bug where handle_start_replay, on detecting the cancel,
called on_replay_interrupted and returned without completing the
m_on_start_finish context.

This is a direct commit to nautilus. The bug was accidentally
fixed in newer versions during refactoring.

Signed-off-by: Mykola Golub <mgolub@suse.com>

Merge pull request #41874 from tchaikov/nautilus-pr-27465

nautilus: ceph-monstore-tool: use a large enough paxos/{first,last}_committed

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #41788 from trociny/wip-48565-nautilus

nautilus: librbd: fix sporadic failures in TestMigration.StressLive

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #41787 from trociny/wip-46149-nautilus

nautilus: librbd: race when disabling object map with overlapping in-flight writes

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

tasks/ceph_manager: ignore EACCES when waiting for quorum

mon_tick_interval is 5 seconds by default. monitors update their
rotating keys every mon_tick_interval. before the monitors form a
quorum, the auth requests from clients are put into the wait list.
these requests are re-enqueued once the monitors form a quorum. but
there is a small window of mon_tick_interval in which the monitors are
not yet able to serve the auth requests, even after they claim to be
able to serve requests. if these re-enqueued requests happen to be served
in this window, and if authx is enabled, they will be greeted with
errors like

handle_auth_bad_method server allowed_methods [2] but i only support [2]

in the case of ceph cli, the error would look like:

[errno 13] RADOS permission denied (error connecting to the cluster)

so, to address this issue, the EACCES error is ignored when waiting
for a quorum.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7afd38f846894f11a61f697a2522cd0c30a35dc7)
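
A minimal sketch of the retry behaviour described above. CommandFailed, wait_for_quorum and fake_get_quorum are hypothetical names for illustration, not the qa/tasks code; the only fact relied on is that EACCES is errno 13, matching the CLI error quoted in the message.

```python
import errno
import time


class CommandFailed(Exception):
    """Hypothetical error carrying the exit status of a failed CLI call."""

    def __init__(self, status):
        super().__init__("command failed with status %d" % status)
        self.status = status


def wait_for_quorum(get_quorum, expected, tries=10, interval=3):
    """Poll get_quorum() until it matches `expected`, ignoring EACCES."""
    for _ in range(tries):
        try:
            if sorted(get_quorum()) == sorted(expected):
                return True
        except CommandFailed as err:
            if err.status != errno.EACCES:
                raise
            # the monitors may still be refreshing rotating keys; retry
        time.sleep(interval)
    return False


# Simulated replies: one EACCES, a partial quorum, then the full quorum.
replies = iter([CommandFailed(errno.EACCES), ["a"], ["a", "b", "c"]])


def fake_get_quorum():
    reply = next(replies)
    if isinstance(reply, Exception):
        raise reply
    return reply


ok = wait_for_quorum(fake_get_quorum, ["a", "b", "c"], interval=0)
print(ok)
```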

tasks/ceph_manager: use safe_while() to refactor the wait for quorum

for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 3908c1f4cd0ebbfdcaae2d9e6de5c1609523cc55)

ceph-monstore-tool: use a large enough paxos/{first,last}_committed

so the rebuild paxos transaction won't be overwritten by the ones
created before recovery completes.

when the quorum is recovering, the leader will collect the paxos
transactions from peons. if the quorum accepts the proposal for setting
the fingerprint, the peon will update the monitor with the paxos
transaction with a newer "last_committed" than the one created using
update_paxos() in ceph_monstore_tool.cc. the latter "last_committed" is
always 0.

so, to avoid this extra paxos proposal obsoleting the "rebuilding" paxos
transaction, we use a large enough number for {first,last}_committed.

Fixes: http://tracker.ceph.com/issues/38219
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5475ef7843ab4021eddee60c2789b81d616383e9)

Merge pull request #41839 from yaarith/telemetry-leaderboard

nautilus: mgr/telemetry: pass leaderboard flag even w/o ident

Reviewed-by: Sage Weil <sage@redhat.com>

mgr/telemetry: pass leaderboard flag even w/o ident

Allow non-identified clusters to appear in the leaderboard.
The leaderboard option still defaults to false, so the change here
is that if they opt in to leaderboard but not ident we'll see
that on the backend.

Note that a leaderboard still does not exist (yet), so this doesn't
have any immediate impact. But if/when we do create one, it will
allow us to show big clusters (that opt in) on the leaderboard
as 'unidentified' or similar.

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit d4a6c3d0099a1f005f41a2cbcfbdbfeddd468db6)
Fixes: https://tracker.ceph.com/issues/51189

Merge pull request #41762 from dvanders/dvanders_50795

nautilus: mon: load stashed map before mkfs monmap

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #41776 from trociny/wip-51144-nautilus

nautilus: cls/rgw: look for plain entries in non-ascii plain namespace too

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

rbd-mirror: wait for in-flight start/stop/restart

when stopping instance replayer on shut down.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit e55b64eaecb750e4ad6db89a741c8d0d3f03a670)

Conflicts:
src/tools/rbd_mirror/InstanceReplayer.cc (no on_finish arg for stop())

rbd-mirror: make stop properly cancel restart

Previously, if stop was issued when restart was at "stopping"
stage, the stop was just ignored.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 0a3794e56256be33a71e363da34ee84ffc34fef7)

Conflicts:
src/tools/rbd_mirror/ImageReplayer.cc (FunctionContext vs LambdaContext,
update stop's args in handle_remote_journal_metadata_updated)
src/tools/rbd_mirror/ImageReplayer.h (Mutex vs ceph::mutex)

Merge pull request #41750 from ifed01/wip-ifed-fix-alloc-init-add-free-0-len-nau

nautilus: os/bluestore: tolerate zero length for allocators' init_[add/rm]_free()

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #41749 from ifed01/wip-ifed-fix-repair-multithreading-nau

nautilus: os/bluestore: introduce multithreading sync for bluestore's repairer

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #41738 from s0nea/wip-51054-nautilus

nautilus: mgr/dashboard: show partially deleted RBDs

Reviewed-by: Avan Thakkar <athakkar@redhat.com>

rbd-mirror: track in-flight start/stop/restart in instance replayer

The shut down waits for in-flight ops to complete but the
start/stop/restart operations were previously not tracked. This
could cause a potential race and crash between an image replayer
operation and the instance replayer shutting down.

Fixes: https://tracker.ceph.com/issues/45072
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 31140a940ea1909c4b5d68ef4593cb582a527354)

Conflicts:
src/tools/rbd_mirror/InstanceReplayer.cc:
                Mutex::Locker vs std::lock_guard,
                m_local_rados->cct() vs m_local_io_ctx.cct(),
                no stop(Context *on_finish) function.

Merge pull request #41682 from neha-ojha/wip-50704-nautilus

nautilus: osd/PG.cc: handle removal of pgmeta object

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

common: add helper C_TrackerOp context class

This wraps the functionality of starting and finishing a tracked op
into the standard context interface.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 4bd9d1501f3832206ef12155cec3f008e3160822)

librbd/deep_copy: added new migrating flag to object copy

The migration operation and the copyup state machine will set
this flag when attempting to perform a deep-copy due to a
live-migration.

This flag will prevent a possible race condition between the
start of the object deep-copy when migration was enabled and
the writing portion of the deep-copy when migration might
have completed via external means.

Fixes: https://tracker.ceph.com/issues/45694
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 1baba64e213cb808804796575d3f7969cf37a3c6)

Conflicts:
src/librbd/deep_copy/ObjectCopyRequest.cc (trivial)

librbd/deep_copy: added bitwise flag parameter to object copy

This initial version subsumes the original "flatten" boolean flag.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit e79f6b1c157e042f57b577bc510debb21e004ea7)

Conflicts:
src/librbd/deep_copy/ImageCopyRequest.cc (FunctionContext vs LambdaContext, no handler param for ObjectCopyRequest)
src/librbd/deep_copy/ObjectCopyRequest.cc
src/librbd/deep_copy/ObjectCopyRequest.h
src/librbd/io/CopyupRequest.cc
src/librbd/operation/MigrateRequest.cc
src/test/librbd/deep_copy/test_mock_ImageCopyRequest.cc
src/test/librbd/deep_copy/test_mock_ObjectCopyRequest.cc
src/test/librbd/io/test_mock_CopyupRequest.cc
(no handler param for ObjectCopyRequest)

librbd: race when disabling object map with overlapping in-flight writes

The block guard that protects against overlapping updates to the object
map needs to be flushed prior to closing the object map instance.

Fixes: https://tracker.ceph.com/issues/46083
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ee69323cd27263cb7f9dd97dcbfb1c36f1cc0837)

Conflicts:
src/librbd/ObjectMap.cc (FunctionContext vs LambdaContext, on_finish vs ctx)

test/cls_rgw: make bi_list test not rely on osd_max_omap_entries_per_request

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit d02d91f6f20a3431fd758a67a0bf77ea4bd4d883)

Conflicts:
src/test/cls_rgw/test_cls_rgw.cc (trivial: indentation)

test/cls_rgw: test bi_list for objects with non-ascii names

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 878d9510b4c9c0cc944740642e3342fdcb341936)

cls/rgw: look for plain entries in non-ascii plain namespace too

Fixes: https://tracker.ceph.com/issues/50415
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 7cf30e943276ff66f0eff9f0c088c597b1f9e066)

Conflicts:
src/cls/rgw/cls_rgw.cc (trivial: indentation, 'start_after_key' vs 'start_key', iterator declaration)

mon: load stashed map before mkfs monmap

After mkfs the store may not yet contain monmap:last_committed but
might be respawning after setting mon_sync:temp_newer_monmap.
Load that stashed map before falling back to the mkfs:monmap.

Fixes: https://tracker.ceph.com/issues/50230
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit cc0b4c77753962717da8a280a585990f7eec3c7b)

os/bluestore: tolerate zero length for allocators' init_[add/rm]_free()

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 6548e5d991810e89fc1ac14eb4fcf1a37a2b129f)

os/bluestore: introduce multithreading sync for bluestore's repairer

In quick-fix mode bluestore uses 2 threads by default to perform the
repair. Due to lacking synchronization they might corrupt repair
transaction batch.

Fixes: https://tracker.ceph.com/issues/50017
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 38c5b04235402a7908bc4713f617d767ca9fdc56)

Conflicts:
src/os/bluestore/BlueStore.cc - future stuff attempted to sneak
in
src/os/bluestore/BlueStore.h - the same as above

test/bluestore: add test case to reproduce #50017

This issue is caused by the lack of multithreading sync when doing
bluestore's quick-fix.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 339a4257a1bfb7dc5d47b019a8a6492affa05b7c)

mgr/dashboard: show partially deleted RBDs

An RBD might be partially deleted if the deletion
process has been started but was interrupted. In
this case return the RBD as part of the RBD list
and mark it as partially deleted.

Fixes: https://tracker.ceph.com/issues/48603
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit d83c277ac1861df31d2a39d16e20c7bebbea676e)

Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-details/rbd-details.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.ts
src/pybind/mgr/dashboard/services/rbd.py
src/pybind/mgr/dashboard/tests/test_rbd_service.py
Resolved various conflicts because nautilus and
master diverged a lot.

Merge PR #41485 into nautilus

* refs/pull/41485/head:
qa: avoid TypeError in cleanup

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #41716 from k0ste/wip-51107-nautilus

nautilus: ceph-volume: fix batch report and respect ceph.conf config values

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

Merge pull request #41713 from k0ste/wip-51104-nautilus

nautilus: ceph-volume: fix batch report and respect ceph.conf config values

Merge pull request #41676 from ifed01/wip-ifed-migrate-nau

nautilus: ceph-volume: implement bluefs volume migration.

ceph-volume: respect the value of bluestore_block_db_size from ceph.conf

If --block-db-size is not given, args.block_db_size is set to None,
so we should check for its value in ceph.conf

Resolves: RHBZ#1962744
Fixes: https://tracker.ceph.com/issues/50958
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit cd70a6f583a651e71b5e1b4cf381467cb85039f6)
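
The fallback described above could look roughly like this. It is a sketch under assumptions: resolve_block_db_size is a hypothetical helper, and ceph.conf is represented by a plain dict rather than ceph-volume's real configuration object.

```python
def resolve_block_db_size(cli_value, conf):
    """Prefer the CLI value; fall back to ceph.conf when it is None."""
    if cli_value is not None:
        return cli_value
    value = conf.get("bluestore_block_db_size")
    return int(value) if value is not None else None


from_conf = resolve_block_db_size(None, {"bluestore_block_db_size": "1073741824"})
from_cli = resolve_block_db_size(2147483648, {"bluestore_block_db_size": "1073741824"})
unset = resolve_block_db_size(None, {})
print(from_conf, from_cli, unset)
```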

ceph-volume: calculate % of device correctly in lvm batch --report

If using --block-db-size, the % of device calculation is incorrect
and always reads 100%.

Resolves: RHBZ#1946478
Fixes: https://tracker.ceph.com/issues/50957
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ed5ab92dc3e67a670b33f7c36c651571682bf8e2)

Merge pull request #41650 from rhcs-dashboard/wip-50426-nautilus

nautilus: mgr/Dashboard: Remove erroneous elements in hosts-overview Grafana dashboard

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

Merge pull request #41662 from idryomov/wip-rbd-trash-purge-nautilus

nautilus: librbd: don't stop at the first unremovable image when purging

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

Merge pull request #41641 from idryomov/wip-rbd-qemu-precise-repos-nautilus

nautilus: qa/tasks/qemu: precise repos have been archived

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #41673 from ifed01/wip-ifed-fix-avl-enospc2-nau

nautilus: os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #41648 from rhcs-dashboard/wip-51064-nautilus

nautilus: mgr/dashboard: fix bucket objects and size calculations

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>

Merge pull request #41114 from k0ste/wip-48650-nautilus

nautilus: ceph-volume: disable cache for blkid calls

Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>

Merge pull request #40827 from ivancich/wip-50300-nautilus

nautilus: rgw: radoslist incomplete multipart parts marker

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #39771 from ivancich/wip-49187-nautilus

nautilus: rgw: tooling to locate rgw objects with missing rados components

Reviewed-by: Michael Kidd <linuxkidd@gmail.com>

Merge pull request #41611 from dvanders/dvanders_40572_nautilus

nautilus: osd/PeeringState: fix acting_set_writeable min_size check

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>

Merge pull request #41088 from smithfarm/wip-50356-nautilus

nautilus: make-dist: refuse to run if script path contains a colon

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #41246 from idryomov/wip-posix-memalign-fix-nautilus

nautilus: common/buffer: adjust align before calling posix_memalign()

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

Merge pull request #40698 from smithfarm/wip-49729-nautilus

nautilus: debian/ceph-common.postinst: do not chown cephadm log dirs

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

osd/PG.cc: handle removal of pgmeta object

In 7f04700, we made the pg removal code
much more efficient. But it started marking the pgmeta object as an unexpected
onode, which in reality is expected to be removed after all the other objects.

This behavior is very easily reproducible in a vstart cluster:

ceph osd pool create test 1 1
rados -p test bench 10 write --no-cleanup
ceph osd pool delete test test --yes-i-really-really-mean-it

Before this patch:

"do_delete_work additional unexpected onode list (new onodes has appeared
since PG removal started[#2:00000000::::head#]" seen in the OSD logs.

After this patch:

"do_delete_work removing pgmeta object #2:00000000::::head#" is seen.

Related to: https://tracker.ceph.com/issues/50466
Signed-off-by: Neha Ojha <nojha@redhat.com>
Manually applied 0e917f1b1e18ca9e48b3f91110d3a46b086f7d83, because
nautilus does not have do_delete_work.

Signed-off-by: Neha Ojha <nojha@redhat.com>

ceph-volume: disable cache for blkid calls

Due to bugs in cache management in blkid, it is possible to have
nonexistent entries. These entries break ceph-volume operations by
producing two or more outputs instead of one (e.g. /dev/sdk2).

Fixes: https://tracker.ceph.com/issues/48464
Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
(cherry picked from commit 90ed2e03198edec4a61dd9d6010e8d7b306b5f3a)
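
A sketch of a cache-free blkid invocation, assuming the cache is bypassed by pointing blkid's -c (cache file) option at /dev/null. The commit message does not say how the cache is disabled, and blkid_command is a hypothetical helper that only builds the argument list.

```python
def blkid_command(device, tag):
    """Build a blkid argv that bypasses the cache file via `-c /dev/null`."""
    return ["blkid", "-c", "/dev/null", "-s", tag, "-o", "value", device]


cmd = blkid_command("/dev/sdk2", "PARTUUID")
print(" ".join(cmd))
```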

Merge pull request #41593 from lxbsz/wip-47020-open-fds

nautilus: libcephfs: ignore restoring the open files limit

Reviewed-by: Ramana Raja <rraja@redhat.com>

os/bluestore/bluestore_tool: compare retval stat() with -1

before this change, stat() is always called to check if the
file specified by --dev-target exists even if this option is not
specified. also, we compare the retval of stat() with ENOENT, while
stat() returns -1 on error.

after this change, stat() is called only if --dev-target is specified,
and we compare the retval of stat() with -1 and 0 only, so if the
--dev-target option is not specified, the tool still behaves.

this change addresses a regression introduced by
94a91f54fe30a4dd113fbc1b02bc3f3d52c82a92

Fixes: https://tracker.ceph.com/issues/50891
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d4c65a368c9cf35e01604fc3321f867cbe3e4109)

tests/ceph_volume: add UT for bluefs migration stuff

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit f8def0443db59e7df31132953fff708b76417236)

Conflicts:
src/ceph-volume/ceph_volume/tests/devices/lvm/test_migrate.py -
get_single_lv is the new name for get_first_lv

ceph-volume: implement bluefs volume migration.

This is a wrapper over ceph-bluestore-tool's bluefs-bdev-migrate command.
Primarily intended to introduce LVM tags manipulation which
ceph-bluestore-tool is lacking.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 58efeb915198d4fbb40b6fa080312d8bee3141bf)

Conflicts:
doc/man/8/ceph-volume.rst - a bit different formatting is in use
src/ceph-volume/ceph_volume/api/lvm.py - get_single_lv is the
new name for get_first_lv

tools/ceph-bluestore-tool: be more legible before requesting additional params

Request DB/WAL size specification when relevant devices are created
only.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 94a91f54fe30a4dd113fbc1b02bc3f3d52c82a92)

os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.

Avl allocator was returning unexpected ENOSPC in first-fit mode if all
size-matching available extents were unaligned and applying the alignment made
all of them shorter than required. Since there was no lookup retry with a
smaller size, ENOSPC was returned.
Additionally we should proceed with a lookup in best-fit mode even when the
original size has been truncated to match the avail size
(force_range_size_alloc==true).

Fixes: https://tracker.ceph.com/issues/50656
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 0eed13a4969d02eeb23681519f2a23130e51ac59)

Conflicts:
src/test/objectstore/Allocator_test.cc - legacy INSTANTIATE_TEST_CASE_P clause is still used in Nautilus

Merge pull request #41158 from smithfarm/wip-50430-nautilus

nautilus: rgw: Added caching for S3 credentials retrieved from keystone

Reviewed-by: Friedmann <ofriedma@redhat.com>

librbd: don't stop at the first unremovable image when purging

As there is no inherent ordering, there may be multiple removable
images past the unremovable image.  On top of that, removing a clone
may make its parent removable so perform an additional pass if any
image gets removed.
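
A rough Python sketch of the multi-pass idea, not the librbd implementation.
The names purge, parents and busy are hypothetical: the trash is modelled as
a list of image names, parents maps a clone to its parent, and busy holds
images that can never be removed:

```python
def is_removable(image, remaining, parents, busy):
    """An image can go if it is not busy and no remaining image is
    still a clone of it."""
    if image in busy:
        return False
    return not any(parents.get(other) == image for other in remaining)


def purge(trash, parents, busy):
    """Remove everything removable, skipping images that are not.

    Another pass runs whenever the previous pass removed something,
    because removing a clone may have made its parent removable.
    """
    remaining = list(trash)
    removed = []
    removed_any = True
    while removed_any:
        removed_any = False
        for image in list(remaining):
            if is_removable(image, remaining, parents, busy):
                remaining.remove(image)
                removed.append(image)
                removed_any = True
            # Otherwise skip this image and keep scanning.
    return removed, remaining
```

For the trash ["parent", "busy", "clone"] with clone cloned from parent, the
first pass removes only clone and the second pass removes parent, leaving
busy behind.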

Fixes: https://tracker.ceph.com/issues/51021
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 16d9a68a3e863b5a819860abf0696fb76fc9341a)

Conflicts:
qa/workunits/rbd/cli_generic.sh [ commit 6e1434eefc3d
  ("librbd: optionally move parent image to trash on remove")
  not in nautilus ]

rbd: combined error message for expected Trash::purge() errors

Output to stderr instead of the log where regular users wouldn't see
it given the elevated log level.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 0bcb9102174e5d1279fbc507acb161160a366dff)

rbd: propagate Trash::purge() result

Exit with respective status like other commands do.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d0dd4b75d3efdb7de1e865f09434e8d7392ef158)

qa/tasks/qemu: precise repos have been archived

Fixes: https://tracker.ceph.com/issues/51033
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit dcd193c35eba7583613b805ab3941ff3ac5df745)

monitoring/grafana: Remove erroneous elements in hosts-overview Grafana dashboard

The hosts-overview Grafana dashboard json file contains a repeated element, making
it invalid JSON. Some JSON parsers handle this. However, this prevents Jsonnet
from parsing the dashboard, which prevents the deployment of this dashboard via
Jsonnet.
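
As an illustration of why lenient parsers hide this kind of problem, the
following Python sketch assumes the repeated element is a duplicated object
key (the commit does not say which element was repeated). The helper names
reject_duplicates and load_strict are hypothetical; object_pairs_hook is a
standard parameter of json.loads:

```python
import json


def reject_duplicates(pairs):
    """Build a dict from (key, value) pairs, failing on a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def load_strict(text):
    """Parse JSON, but refuse objects that repeat a key."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


# The default parser silently keeps the last value of a repeated key.
lenient = json.loads('{"title": "a", "title": "b"}')
```

Running load_strict over a dashboard file before handing it to a stricter
consumer would surface such a repetition early.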

Fixes: https://tracker.ceph.com/issues/50410
Signed-off-by: Malcolm Holmes <mdh@odoko.co.uk>
(cherry picked from commit 382e293656cff4a0e7d84cc4d3dbfc005e82e10f)

mgr/dashboard: fix bucket objects and size calculations

Fixes: https://tracker.ceph.com/issues/51035
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit 9f5ef98d9c88a91b80e622f16f7061eddff79b2c)

Merge pull request #41386 from rhcs-dashboard/wip-50841-nautilus

nautilus: mgr/dashboard: grafana panels for rgw multisite sync performance

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #41513 from ideepika/wip-49592-upgrade-nautilus

nautilus: qa/upgrade: disable update_features test_notify with older client as lockowner

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Merge pull request #41531 from rhcs-dashboard/wip-50885-nautilus

nautilus: mgr/dashboard: fix OSDs Host details/overview grafana graphs

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: wornet-mwo <NOT@FOUND>

nautilus: qa/upgrade: disable update_features test_notify with older client as lockowner

* with the recent support for async rbd operations in pacific and later, when
an older client (without async support) is being upgraded and simultaneously
interacts with a newer client which expects the requests to be async, a hang
occurs: the return code for request completion is taken to be the
acknowledgement of the async request, and the requester then keeps waiting for
another acknowledgement of request completion.

if this happens, it should be rare, occurring only when the lockowner is an old
client, and should be deferred if compatibility issues arise.

* amend upgrade test workunits to use respective stable branches

Signed-off-by: Deepika Upadhyay <dupadhya@redhat.com>

nautilus: osd/PeeringState: fix acting_set_writeable min_size check

This is a nautilus only manual backport of
https://github.com/ceph/ceph/pull/40572

which is itself composed of commits
7b2e0f4fd1c9071495dae9189428aa1cb8774c30
642a1c165499bcbd4cfdf907af313ac7ffe44ff4

The backport did not apply cleanly because these calls have
been factored out into PeeringState.cc in octopus and newer.

The original callers have been fixed in PG.cc.

Fixes: https://tracker.ceph.com/issues/50153
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>