Merge pull request #45587 from idryomov/wip-pool-reverse-lookup-osdmap-octopus

octopus: librados: check latest osdmap on ENOENT in pool_reverse_lookup()

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #45479 from cfsnyder/wip-53640-octopus

octopus: rgw/amqp: add default case to silence compiler warning

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #44260 from sseshasa/wip-53550-octopus

octopus: osd/OSDMap: Add health warning if 'require-osd-release' != current release

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #46230 from ceph/octopus-gtest2

octopus: Fixes for make check

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45177 from pponnuvel/wip-54377-octopus

octopus: rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

include/denc: use pair<const K,V> in range-based for loop

map<K,V>::value_type is pair<const K, V>, so when we iterate through a
map with a range-based for loop, we should use pair<const K,V> instead
of pair<K,V>. The latter also compiles, but it might create a temporary
object of pair<K,V> from pair<const K,V>. GCC-11 complains when it sees
this:

../src/include/denc.h:1002:21: warning: loop variable ‘e’ of type ‘const T&’ {aka ‘const std::pair<OSDPerfMetricQuery, OSDPerfMetricReport>&’} binds to a temporary constructed from type ‘const std::pair<const OSDPerfMetricQuery, OSDPerfMetricReport>’ [-Wrange-loop-construct]
1002 |       for (const T& e : s) {
      |                     ^

This change:

* uses the value_type of the container in `maplike_details<Container>`,
  so we can avoid the overhead of creating temporary objects when
  encoding a map
* defines denc_traits for std::pair<const A, B> as well, so the elements
  of a map can be encoded using the denc facility

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit c828ce29400a1eea4b223229b36b3a092eda6139)

test/test_rbd_replay: move operator<<(..rbd_loc& name) to rbd_replay

so gtest can print out rbd_loc when printing out diagnostic information
when test fails. after moving operator<<(ostream&, const rbd_loc&) to
the `rbd_replay` namespace, ADL is able to find it. for more details on
the lookup rules, see https://en.cppreference.com/w/cpp/language/adl

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit fe696638515dc7a51214930049ebd1a6f951047f)

common/ceph_time: Don't define public things in time_detail

Defining things in a detail section and then using them outside turned
out to not be the best idea.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 01f706ca0ffd39680dbfacf348750c9c0f851578)

Conflicts:
src/rgw/rgw_sync_error_repo.h not present in octopus

common/ceph_time: add operator<< for signedspan

* templatize operator<<(ostream&, duration<>), so it works for more
  duration<> classes with minimal effort -- we just need to explicitly
  instantiate these template operators
* explicitly instantiate operator<< for timespan, signedspan, seconds
  and milliseconds. they are most likely to be used in Ceph. we can add
  more of them when necessary.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a64b96dba14df1e61ee6eb449535a6ff4a9d64b3)

common/ceph_time: move operator<<(ostream&, timespan&) into std namespace

otherwise the compiler is not able to find it, as the "timespan" here is
actually a class defined in the std namespace, even though it has an alias
defined in the ceph namespace like:

typedef std::chrono::duration<rep, std::nano> timespan;

but this does not make it a member of "ceph" namespace. for more details
on the lookup rules, see https://en.cppreference.com/w/cpp/language/adl

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 75aafcba888a5753d2a4a8378637b4bb9fad5dd0)

Merge pull request #45287 from yaarith/wip-52327-octopus

octopus: mgr/devicehealth: fix missing timezone from time delta calculation

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #45324 from ljflores/wip-54468-octopus

octopus: osd: require osd_pg_max_concurrent_snap_trims > 0

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch>

Merge pull request #45593 from vumrao/wip-vumrao-55019

octopus: osd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #45655 from ljflores/wip-55074-octopus

octopus: osd/OSD: osd_fast_shutdown_notify_mon not quite right

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>

Merge pull request #45772 from ljflores/wip-53606-octopus

octopus: mgr/telemetry: fix waiting for mgr to warm up

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>

Merge pull request #45182 from ideepika/wip-54323-octopus

octopus: tools/rbd: expand where option rbd_default_map_options can be set

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #45180 from ideepika/wip-54380-octopus

octopus: common: replace BitVector::NoInitAllocator with wrapper struct

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #45441 from cfsnyder/wip-52072-octopus

octopus: rgw: add the condition of lock mode conversion to PutObjRetention

Reviewed-by: Casey Bodley <cbodley@redhat.com>

test/rbd_mirror: grab timer lock before calling add_event_after()

add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").

Fixes: https://tracker.ceph.com/issues/55317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 60e16106837e0d23366709f70f39c4f1ae7a2a45)

rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()

Complete on_finish right away only if the replayer is stopped (meaning
that it is eligible to be restarted immediately, possibly from on_finish
itself). This is the behaviour pretty much anyone would assume and
also what ImageReplayer::restart() relies on.

Fixes: https://tracker.ceph.com/issues/54344
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8965a0f2a6f7bdbe732be94b1ee269cab5be0a2a)

rbd-mirror: turn m_on_stop_finish into a list of Contexts

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 4ad31cd0583ebb695a9d84a35b9fc20ad9ec8585)

rbd-mirror: manual stop should take precedence over regular stop

Somewhat similar to commit 0a3794e56256 ("rbd-mirror: make stop
properly cancel restart"), make it so that a) if a manual stop is
joined to a regular stop, the stop becomes manual and b) if a regular
stop is joined to a manual stop, the stop stays manual.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit c5b5787349e91a0fd23cd6d5e73b2a383ddd8687)
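
The precedence rule described in this commit can be modelled in a few lines of Python. This is only an illustrative sketch: rbd-mirror itself is C++, and `StopRequest` and `join_stop` are hypothetical names that do not exist in the code base.

```python
from dataclasses import dataclass


@dataclass
class StopRequest:
    """Hypothetical model of an in-flight stop; not an rbd-mirror type."""
    manual: bool


def join_stop(pending: StopRequest, incoming_manual: bool) -> StopRequest:
    # If either the pending stop or the joining stop is manual,
    # the merged stop is manual; it never reverts to a regular stop.
    return StopRequest(manual=pending.manual or incoming_manual)
```

Under this model, joining a manual stop to a regular one upgrades it, and joining a regular stop to a manual one leaves it manual.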

rbd-mirror: straighten ImageReplayer::stop() a bit

- don't default on_finish parameter
- m_restart_requested is set in ImageReplayer::restart() which is the
only restart=true call site, so setting m_restart_requested here is
redundant
- is_stopped_() can't be true in is_running_() branch
- on_finish->complete(0) in the end is unreachable

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 219c500977bbfbcfe4ccd24beb294edbe0562d35)

Merge pull request #45458 from cfsnyder/wip-53037-octopus

octopus: rgw: cls_bucket_list_unordered() might return one redundant entry every time is_truncated is true

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45481 from cfsnyder/wip-53653-octopus

octopus: rgw: init bucket index only if putting bucket instance info succeeds

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45492 from cfsnyder/wip-54075-octopus

octopus: rgw: bucket chown bad memory usage

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45496 from cfsnyder/wip-54085-octopus

octopus: librgw: make rgw file handle versioned

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45503 from cfsnyder/wip-54149-octopus

octopus: rgw: RGWPostObj::execute() may lose data.

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45902 from ivancich/wip-55045-octopus

octopus: cls/rgw: rgw_dir_suggest_changes detects race with completion

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45449 from cfsnyder/wip-52349-octopus

octopus: rgw: change order of xml elements in ListRoles response

Reviewed-by: Casey Bodley <cbodley@redhat.com>

googletest submodule: pick up change to silence error=maybe-uninitialized warning

to include the fix of https://github.com/google/googletest/pull/3024

otherwise GCC-11 fails to compile the tests with the following error:

In file included from ../src/googletest/googletest/src/gtest-all.cc:42:
../src/googletest/googletest/src/gtest-death-test.cc: In function ‘bool testing::internal::StackGrowsDown()’:
../src/googletest/googletest/src/gtest-death-test.cc:1301:24: error: ‘dummy’ may be used uninitialized [-Werror=maybe-uninitialized]
1301 |   StackLowerThanAddress(&dummy, &result);
      |   ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
../src/googletest/googletest/src/gtest-death-test.cc:1290:13: note: by argument 1 of type ‘const void*’ to ‘void testing::internal::StackLowerThanAddress(const void*, bool*)’ declared here
1290 | static void StackLowerThanAddress(const void* ptr, bool* result) {
      |             ^~~~~~~~~~~~~~~~~~~~~
../src/googletest/googletest/src/gtest-death-test.cc:1299:7: note: ‘dummy’ declared here
1299 |   int dummy;
      |       ^~~~~
cc1plus: all warnings being treated as errors

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit abbdf338d6725febd686ccdbe395b207283c04ab)

Merge pull request #46219 from zdover23/wip-doc-45512-backport-to-octopus

octopus: ceph/admin: s/master/main

Reviewed-by: Laura Flores <lflores@redhat.com>

ceph/admin: s/master/main

This PR changes the name "master" to "main" so
that builds (and, I assume, a great many other
things) will not fail.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 6a1dd3a8a2f3dc9fe8615d402c9041273516ff89)

Merge pull request #45446 from cfsnyder/wip-52114-octopus

octopus: qa/rgw: update apache-maven mirror for rgw/hadoop-s3a

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45443 from cfsnyder/wip-52108-octopus

octopus: radosgw-admin: 'sync status' is not behind if there are no mdlog entries

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45431 from cfsnyder/wip-51700-octopus

octopus: rgw: url_decode before parsing copysource in copyobject

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45412 from cfsnyder/wip-54495-octopus

octopus: rgw: fix segfault in UserAsyncRefreshHandler::init_fetch

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #46042 from cbodley/wip-55459

octopus: rgw: RGWCoroutine::set_sleeping() checks for null stack

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45523 from cbodley/wip-54622

octopus: rgw: parse tenant name out of rgwx-bucket-instance

Reviewed-by: Shilpa Jagannath <smanjara@redhat.com>

Merge pull request #45488 from cfsnyder/wip-53867-octopus

octopus: rgw: return OK on consecutive complete-multipart reqs

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45485 from cfsnyder/wip-53836-octopus

octopus: rgw: document S3 bucket replication support

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45462 from cfsnyder/wip-53157-octopus

octopus: rgw:When KMS encryption is used and the key does not exist, we should…

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45460 from cfsnyder/wip-53078-octopus

octopus: src/rgw: Fix for malformed url

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45454 from cfsnyder/wip-52989-octopus

octopus: rgw: document rgw_lc_debug_interval configuration option

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45452 from cfsnyder/wip-52957-octopus

octopus: radosgw-admin: 'reshard list' doesn't log ENOENT errors

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #45283 from cbodley/wip-54482

octopus: rgw: fix leak of RGWBucketList memory (octopus only)

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #45088 from dvanders/wip-52076-octopus

octopus: rgw: resolve empty ordered bucket listing results w/ CLS filtering *and* bucket index list produces incorrect result when non-ascii entries

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #45972 from ljflores/wip-55077-octopus

octopus: admin/doc-requirements: bump sphinx to 4.4.0

rgw: RGWCoroutine::set_sleeping() checks for null stack

users of the RGWOmapAppend coroutine don't manage the lifetime of its
underlying coroutine stack, so end up making calls on RGWOmapAppend
after its stack goes away. this null check is a band-aid, and there are
still several other calls in RGWCoroutine that don't check for null
stack

Fixes: https://tracker.ceph.com/issues/49302
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3f0f831d66c7d43c9872f5de2aceb68aef4004d8)

doc/conf.py: s/add_javascript/add_js_file/

to address following warning:

jenkins-build/build/workspace/ceph-pr-docs/doc/conf.py:102: RemovedInSphinx40Warning: The app.add_javascript() is deprecated. Please use app.add_js_file() instead.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 1704216628729666bc4e2127d613360bb0f7b33a)

mgr/cephadm: use block quote for "typical use"

otherwise sphinx takes "Typical use" and the following line as a
field. see also

https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#field-lists

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05798f0cae9afda598f5a154c62fdd24bab9ca30)

mgr/cephadm: improve the formatting of docstring

adding an empty line before a doctest block helps
sphinx tell where the session starts.

see also https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#doctest-blocks

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 8685fffdf20eeb4e2068c421e351aa02c48ff860)
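
A minimal sketch of the docstring layout this commit aims for is shown below. The function `double` and the helper `count_examples` are made up for illustration and are not taken from mgr/cephadm; the point is only the blank line that separates the prose from the `>>>` session.

```python
import doctest


def double(x):
    """Return twice the value of x.

    >>> double(21)
    42
    """
    return 2 * x


def count_examples(func):
    # Parse the docstring and count the interactive examples it contains
    parser = doctest.DocTestParser()
    return len(parser.get_examples(func.__doc__))
```

Here `count_examples(double)` finds one example. The blank line matters to the reStructuredText renderer rather than to the `doctest` parser, which locates `>>>` prompts either way.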

mgr/cephadm: document notes using "note::" directive

so it can be rendered by sphinx in a better way.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit ba3ccee01b31ef9e39a5016a0ffda18628ec3bc2)

doc/conf.py: silence warnings from breathe

breathe calls doxygen to extract and generate docs from code, and
doxygen complains when it sees undocumented fields and functions. these
warnings could fail the sphinx-build command if it treats warnings
as errors.

this change silences these warnings.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 8891d653198c30f9578499126e1ee9ee67eca04a)

mgr/cephadm: add empty line after param list in docstring

this helps to silence the warning from sphinx, like

src/pybind/mgr/orchestrator/_interface.py:docstring of orchestrator._interface.Orchestrator.remove_osds:9: WARNING: Field list ends without a blank line; unexpected unindent.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d9b8e38e3dfe8e6eec6d56ee934c4632de46fc68)

Conflicts:
src/pybind/mgr/orchestrator/_interface.py
- `:param zap:` did not exist in Octopus; removed
this from the param list.

mgr/cephadm: set docstring for shim() methods

this allows the "rpc"ized methods of OrchestratorClientMixin to
have the docstring defined by the original methods.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d0db2ae4f946e1a985402640ef8f1733b40e91ef)

Conflicts:
src/pybind/mgr/orchestrator/_interface.py
- Removed some typing imports that were not present
in Octopus
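
The idea of carrying a docstring over to a generated forwarding method can be sketched as follows. `Backend`, `Client` and `make_shim` are hypothetical names for this illustration and do not reflect the actual structure of OrchestratorClientMixin.

```python
class Backend:
    def remove_osds(self, osd_ids):
        """Remove the given OSDs."""
        return ['removed %s' % i for i in osd_ids]


def make_shim(method_name, original):
    # Hypothetical stand-in for a generated forwarding ("rpc"ized) method
    def shim(self, *args, **kwargs):
        return getattr(self.backend, method_name)(*args, **kwargs)

    # Copy the docstring so documentation tools see the original text
    shim.__doc__ = original.__doc__
    shim.__name__ = method_name
    return shim


class Client:
    def __init__(self, backend):
        self.backend = backend


# Attach the forwarding method to the client class
Client.remove_osds = make_shim('remove_osds', Backend.remove_osds)
```

With this in place, `Client.remove_osds.__doc__` is the same string as the docstring on `Backend.remove_osds`, so autodoc-style tools have something to render.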

rgw: return OK on consecutive complete-multipart reqs

Fixes: https://tracker.ceph.com/issues/50141
Signed-off-by: Mark Kogan <mkogan@redhat.com>
fixup! rgw: return OK on consecutive complete-multipart reqs

(cherry picked from commit 324c377849a5d246f689f6e7a2862f42f1504d2c)

Conflicts: src/rgw/rgw_op.h src/rgw/rgw_op.cc

Cherry-pick notes:
- Conflicts in rgw_op.h due to execute method adjacent to change not having optional_yield arg
- Conflicts in rgw_op.cc due to lack of rgw::sal::Object encapsulation in Octopus

admin/doc-requirements: bump sphinx to 4.4.0

bump sphinx to latest stable, to address the following build failure:

ERROR: sphinx-autodoc-typehints 1.17.0 has requirement Sphinx>=4, but you'll have sphinx 3.5.4 which is incompatible.
ERROR: sphinx-substitution-extensions 2022.2.16 has requirement sphinx>=4.0.0, but you'll have sphinx 3.5.4 which is incompatible.

also bump sphinx-rtd-theme, otherwise we'd have the following
build failure:

ERROR: sphinx-rtd-theme 0.5.2 has requirement docutils<0.17, but you'll have docutils 0.17.1 which is incompatible.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 0a5fab53b3804be5ef1377a2f35006b8df857d39)

Conflicts:
admin/doc-requirements.txt
- `sphinx_rtd_theme` was not present in Octopus

test/cls/rgw: test dir_suggest after successful completion

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a350888bf9d812db36b3bdf6d1e4ee7469964fad)

cls/rgw: rgw_dir_suggest_changes detects race with completion

if bucket listing races with a pending index transaction, its suggested
removal may be mistakenly applied if that index transaction completes
before the osd receives this suggestion

in `rgw_dir_suggest_changes()`, the sole condition for applying a
suggested change is that the `cur_disk.pending_map` is empty. this is
true after rgw_bucket_complete_op()

on index completion, `rgw_bucket_dir_entry::index_ver` is updated to match
the new value of `rgw_bucket_dir_header::ver`. because most of `struct
rgw_bucket_dir_entry` makes the round trip through bucket listing ->
dir_suggest, we have access to the index_ver of the suggested entry. by
comparing this against the stored entry, we can ignore any suggestions
that were sent before the most recent completion

Fixes: https://tracker.ceph.com/issues/54528
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit aa381b6765b0fb316976c4af7a45f32a157a4f75)
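
The check described above can be approximated by the Python model below. It is a sketch under assumptions: the real code is C++ in cls_rgw, `DirEntry` is a heavily simplified stand-in for `rgw_bucket_dir_entry`, and the exact comparison operator is not quoted in the commit message, so treating an older `index_ver` as stale is a guess at its form.

```python
from dataclasses import dataclass, field


@dataclass
class DirEntry:
    """Simplified stand-in for rgw_bucket_dir_entry (most fields omitted)."""
    index_ver: int
    pending_map: dict = field(default_factory=dict)


def should_apply_suggestion(stored: DirEntry, suggested: DirEntry) -> bool:
    # Pre-existing condition: never apply while a transaction is pending
    if stored.pending_map:
        return False
    # Assumed form of the new check: a suggestion built from a listing taken
    # before the most recent completion carries an older index_ver
    return suggested.index_ver >= stored.index_ver
```

In this model, a suggestion whose `index_ver` predates the stored entry is ignored even though `pending_map` is already empty.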

mgr/telemetry: fix waiting for mgr to warm up

1. The implementation of config_notify() in telemetry module sets the
flag for event, which is supposed to wake up the 'serve' thread whenever
a config option is changed. The problem is that we call config_notify()
at the beginning of serve(), before we enter its 'run' loop. This call
sets the event which cancels the 10 seconds wait for the mgr to warm up.
To fix this, we extract the logic of updating the config options to a
separate function (config_update_module_option()), and call it on
__init__, instead of calling config_notify() in serve().

2. We should always wait for the mgr to warm up here (10 seconds). In
case of a sporadic event (e.g. a config option change via CLI) the event
will be set, and wait will return immediately. We enforce this wait by
using time.sleep(10) instead of event.wait(10).

Fixes: https://tracker.ceph.com/issues/53204
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit fa5cc0ca081ca3cce552e0cb21a1e17273cf3482)

Conflicts:
src/pybind/mgr/telemetry/module.py

- Several options under __init__ not present in Octopus
- No type checking in Octopus
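
The difference between the two waiting strategies in point 2 can be seen in a small standalone sketch. The helper names below are invented for this illustration and are not functions of the telemetry module.

```python
import threading
import time


def warm_up_with_event(event, seconds):
    # Returns as soon as the event is set, so an early set() skips the wait
    start = time.monotonic()
    event.wait(seconds)
    return time.monotonic() - start


def warm_up_with_sleep(seconds):
    # Always blocks for the full duration, regardless of any event
    start = time.monotonic()
    time.sleep(seconds)
    return time.monotonic() - start
```

If the event has already been set, for example by an earlier config notification, `warm_up_with_event` returns almost immediately, while `warm_up_with_sleep` still waits for the whole interval.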

osd/OSD: osd_fast_shutdown_notify_mon not quite right

When both osd_fast_shutdown and osd_fast_shutdown_notify_mon are set to true,
the OSD is marked as Down, but it should be marked as Dead.

Fixes: https://tracker.ceph.com/issues/53327

Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>

(cherry picked from commit 07302d5e41c49c885c9398c1c478638023e3f264)

Conflicts:
src/mon/OSDMonitor.cc
- In Octopus, an arrow operator was used instead of
the dot operator for calling monitor clog info.

osd: make osd_fast_shutdown_notify_mon option true by default

osd_fast_shutdown_notify_mon option is false by default. So users suffer
from error log flood, slow ops, and the long I/O timeouts on voluntary OS
shutdown before they are aware of the existence of this option. Let's
make this option true by default.

Fixes: https://tracker.ceph.com/issues/53328
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
(cherry picked from commit 729a5b85a6586b47d16acbba2cf8e765e498cd65)

Conflicts:
src/common/options/global.yaml.in
- global.yaml.in does not exist in Octopus; rather,
these configs were handled in options.cc.

Merge pull request #44960 from BenoitKnecht/wip-54233-octopus

octopus: mon: Abort device health when device not found

Reviewed-by: Yaarit Hatuka <yaarit@redhat.com>

Merge pull request #44546 from cfsnyder/wip-53719-octopus

octopus: osd/OSDMapMapping: fix spurious threadpool timeout errors

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

Merge pull request #43224 from kotreshhr/wip-52629-octopus

octopus: mgr/volumes: Fix permission during subvol creation with mode

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #45613 from rhcs-dashboard/octopus-null-injection-fix

octopus: mgr/dashboard: fix "NullInjectorError: No provider for I18n

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

mgr/dashboard: fix "NullInjectorError: No provider for I18n

I am not sure what the root cause is, but this seems to fix the test
failure. I don't know whether it is caused by the difference in Angular
versions between master and octopus, and I still don't understand why it
wasn't caught in the recent PR to this file (https://github.com/ceph/ceph/pull/44763)

Fixes: https://tracker.ceph.com/issues/55011
Signed-off-by: Nizamudeen A <nia@redhat.com>

osd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty

We should mark_omap_dirty() for all omap write ops, just like we did
in cb927925af1f3df4b9c31df85cf31f982aae1988.

Currently, for CEPH_OSD_OP_OMAPRMKEYRANGE ops, clean_omap gets set to true,
which results in incomplete recovery of objects and results in
inconsistent PGs after a scrub.

Fixes: https://tracker.ceph.com/issues/54592
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit f7fd5895fd3d7d7c4691be91434868d90f7a4e0f)

librados: check latest osdmap on ENOENT in pool_reverse_lookup()

Avoid spurious ENOENT errors from rados_pool_reverse_lookup() and
Rados::pool_reverse_lookup().

This makes lookup by id consistent with lookup by name: the latter
has been checking latest osdmap since commit 7e5669b11b14 ("rados: we
need to get the latest osdmap when pool does not exists").

Fixes: https://tracker.ceph.com/issues/54593
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1f837e233af32c8a66f88508cde534c361ecfcbc)

rgw: parse tenant name out of rgwx-bucket-instance

used by multisite bucket full sync to request the listing of a specific
bucket instance. if the bucket lives under a tenant, we need to get that
out of the rgwx-bucket-instance header, because the http request path
only names the bucket

Fixes: https://tracker.ceph.com/issues/50785
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 291342425e4b49de9b6985c718f6cb9210f5554d)

rgw: RGWPostObj::execute() may lose data.

Signed-off-by: Lei Zhang <1091517373@qq.com>
(cherry picked from commit f241a330dcb5968f9ec1de1a382572258cb6daac)

librgw: move RGWFileHandle::encode/decode to the private section

To prevent the RGWFileHandle::encode/decode methods from being invoked
directly by other modules

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
(cherry picked from commit 068c5e7ff1286ac4d5624f6e6bd7dedc21b34095)

librgw: make rgw file handle versioned

We need this because the following scenario is possible:

1. rgw_setattr sets the file attr;
2. rgw_write writes some new data, and encodes its attr to store into rados;
3. before the file's attr bl is actually persisted, rgw_lookup loads the file's
   previous attr and modifies the current file handle's metadata;
4. rgw_write's result is persisted to rados;
5. rgw_setattr writes the current file handle's metadata, which is actually stale, to rados

In this case, the attr in rados would be out of date, which means loss of data.

Fixes: https://tracker.ceph.com/issues/50194
Signed-off-by: Xuehan Xu <xuxuehan@qianxin.com>
(cherry picked from commit 49a35d72e0982c03781d4845c800332bded1c658)

rgw: fix bad memory usage of bucket chown method

In RGWBucketCtl::chown we have one RGWObjectCtx for all objects of a bucket.
RGWObjectCtx has a cache (std::map) of object states that grows
continuously. For buckets with millions of objects this leads to huge memory usage.

The chown process does not need this cache, so we can create a new RGWObjectCtx
for every 1000 objects to limit memory usage.

Fixes: https://tracker.ceph.com/issues/53599
Signed-off-by: Mohammad Fatemipour <mohammad.fatemipour@sotoon.ir>
(cherry picked from commit cf2d83ef81458524715c23e802977dc0760c847f)

Conflicts:
src/rgw/rgw_bucket.cc

Cherry-pick notes:
- Conflicts due to Octopus implementation differences in RGWBucketCtl::chown
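
The batching idea can be illustrated with a small Python model. `ObjectCtx` is a hypothetical stand-in for RGWObjectCtx and `chown_all` is an invented helper; the real change lives in C++ in src/rgw/rgw_bucket.cc and may be structured differently.

```python
class ObjectCtx:
    """Hypothetical stand-in for RGWObjectCtx: caches per-object state."""

    def __init__(self):
        self.state_cache = {}

    def get_state(self, name):
        return self.state_cache.setdefault(name, {'name': name})


def chown_all(object_names, new_owner, batch_size=1000):
    """Change the owner of every object and return the peak cache size."""
    peak_cache = 0
    ctx = ObjectCtx()
    for count, name in enumerate(object_names, start=1):
        ctx.get_state(name)['owner'] = new_owner
        peak_cache = max(peak_cache, len(ctx.state_cache))
        if count % batch_size == 0:
            # Start a fresh context so the old cache can be released
            ctx = ObjectCtx()
    return peak_cache
```

With the context replaced every `batch_size` objects, the cache in this model never holds more than `batch_size` entries, however many objects the bucket contains.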

doc: rgw: document S3 bucket replication support

Support was added in Octopus.

Fixes: https://tracker.ceph.com/issues/48755
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 774a247b2b854538b679490581e6950372142797)

rgw: init bucket index only if putting bucket instance info succeeds

Signed-off-by: Huber-ming <zhangsm01@inspur.com>
(cherry picked from commit 6e97f2a32df80f00d44ed3daceac381c46c17026)

Conflicts:
src/rgw/rgw_reshard.cc

Cherry-pick notes:
- put_bucket_instance_info and init_index don't take prefix provider arg on Octopus

rgw/amqp: add default case to silence compiler warning

Fixes: https://tracker.ceph.com/issues/53252
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a32ae4b61ac4af244533ce53388411ca50fc1148)

rgw: When KMS encryption is used and the key does not exist, we should not throw the ERR_INVALID_ACCESS_KEY error code.

When KMS encryption is used and the key_id is null or the actual_key size is wrong, we should not throw the "ERR_INVALID_ACCESS_KEY" error code; the "EINVAL" error code is used instead to indicate a parameter error.

Signed-off-by: wangyingbin <wangyingbin@inspur.com>
(cherry picked from commit 40dbc29984d67a3f4946a0b30d53f3db19952bf0)

src/rgw: Fix for malformed url

This PR solves: https://tracker.ceph.com/issues/52738
It is solved by making changes to rgw_url.cc
A test is also added to check it's working.

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
(cherry picked from commit 2916f2439eb2f62bc08c3e283b13391302b3e497)

rgw: update last_added_entry when count == num_entries

RGWRados::cls_bucket_list_unordered() will produce one redundant entry
every time is_truncated is true. The issue can be easily reproduced
when a bucket is filled with a large number of incomplete multipart
uploads. To be more specific, the number of incomplete multipart upload
objects should be greater than 1100.

Signed-off-by: Peng Zhang <zhangpeng@vclusters.com>
(cherry picked from commit 7511f9f675ea4e43992605dc03109bc5f356a5e1)

rgw: document rgw_lc_debug_interval configuration option

Updates the yaml describing this config option with a "desc" and a
"long_desc".

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 9171d3626b5a0181456a68555d5742109abaabbc)

Conflicts:
src/common/options/rgw.yaml.in

Cherry-pick notes:
- Octopus options are not defined in yaml

radosgw-admin: 'reshard list' doesn't log ENOENT errors

ENOENT errors are expected, especially in fresh clusters, before we've
written any entries to the reshard list shards. avoid logging these
non-fatal ERROR messages:

> -1 ERROR: failed to list reshard log entries, oid=reshard.0000000000 marker= (2) No such file or directory

Fixes: https://tracker.ceph.com/issues/52873
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 952c7c844acee5fe73e3f70737606b700b67238c)

Conflicts:
src/rgw/rgw_reshard.cc

Cherry-pick notes:
- Octopus using lderr vs ldpp_dout

rgw: change order of xml elements in ListRoles response

one or more AWS sdks fail to parse our response because they expect
ListRolesResult to come first. not really an rgw bug, but it's easy
enough to fix

Fixes: https://tracker.ceph.com/issues/52027
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 5f1f4212a81a7b124d657d4653de39ee85051964)

qa/rgw: update apache-maven mirror for rgw/hadoop-s3a

Fixes: https://tracker.ceph.com/issues/52069
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 9253733d0883d01988b163ee22cfc3481c01a52d)

radosgw-admin: 'sync status' is not behind if there are no mdlog entries

if remote mdlogs are trimmed prematurely, sync status will report
that it's behind the remote's max-marker even if there are no mdlog
entries to sync

for each behind shard, we fetch the next mdlog entry from the remote. if
we get an empty listing, remove that shard from behind_shards. this
logic now has to run before we print "behind shards:" so that empty
shards aren't listed

Fixes: https://tracker.ceph.com/issues/52091
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 14d43f740d101c8d41a2ced4525bf8efd8c9d943)
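
The filtering step can be sketched in Python as below. `prune_behind_shards` and the `fetch_next_entries` callable are hypothetical names for this illustration; radosgw-admin implements the logic in C++ against the remote mdlog API.

```python
def prune_behind_shards(behind_shards, fetch_next_entries):
    """Drop shards whose remote mdlog has nothing left to sync.

    fetch_next_entries is a hypothetical callable that returns the next
    mdlog entries for a shard (an empty list if there are none).
    """
    return [shard for shard in behind_shards if fetch_next_entries(shard)]
```

Running this before printing "behind shards:" means that only shards with at least one pending entry are reported.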

rgw: add the condition of lock mode conversion to PutObjRetention

Signed-off-by: wangzhong <wangzhong@cmss.chinamobile.com>
(cherry picked from commit 585eae46a2bb7ed39ca2f3213801e0f77642c9d4)

Amend b7621625ed69f21a5bf701b3385ddee281ff3715 to not call url_decode excessively

Fixes: #43259
Signed-off-by: Paul Reece <paul@servercloud.com>
(cherry picked from commit c83afb4359b9f8b6d8b6942e74a52f303a474d54)

Conflicts:
src/rgw/rgw_op.cc

rgw: url_decode before parsing copysource in copyobject

If the copysource on a copyobject call was URL-encoded, it would fail, as it would not parse the '/' separating the bucket and key name

URL encoding may be necessary for certain characters in a copysource, and several public examples show URL encoding the copysource

Fixes: #43259
Signed-off-by: Paul Reece <paul@servercloud.com>
(cherry picked from commit b7621625ed69f21a5bf701b3385ddee281ff3715)
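
The order of operations matters here: decode first, then split. A minimal Python sketch of that ordering follows. `parse_copy_source` is a hypothetical helper, and details such as version ids or tenant prefixes that the real rgw code handles are left out.

```python
from urllib.parse import unquote


def parse_copy_source(copy_source):
    # Decode exactly once, before looking for the bucket/key separator
    decoded = unquote(copy_source)
    # Assumption for this sketch: one optional leading '/' is ignored
    if decoded.startswith('/'):
        decoded = decoded[1:]
    bucket, sep, key = decoded.partition('/')
    if not sep or not bucket or not key:
        raise ValueError('copy source must look like bucket/key')
    return bucket, key
```

For instance, `parse_copy_source('mybucket%2Fdir%2Ffile.txt')` yields the bucket `mybucket` and the key `dir/file.txt`, whereas splitting before decoding would find no separator at all.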

rgw: fix segfault in UserAsyncRefreshHandler::init_fetch

Fixes a segfault that was occurring in error handling code of UserAsyncRefreshHandler::init_fetch.
When ruser->read_stats_async returned an error code, the instance of UserAsyncRefreshHandler had
already been deallocated in RGWSI_User_RADOS::read_stats_async and a segmentation fault occurs
when attempting to print a member variable in error logs. This commit removes the extra ref count
drop since the ref is properly dropped upstream in RGWQuotaCache::async_refresh error handling
logic.

Fixes: https://tracker.ceph.com/issues/54112
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 71ef3af870e5789e71480682f11a883ff3a673e7)

Merge pull request #45334 from idryomov/wip-client-upgrade-octopus-pacific-cleanup

qa/suites: clean up client-upgrade-octopus-pacific test

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #44763 from votdev/wip-53928-octopus

octopus: mgr/dashboard: Notification banners at the top of the UI have fixed height

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>

Merge pull request #44924 from p-se/wip-53883-octopus

octopus: mgr/dashboard: fix Grafana OSD/host panels

Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

qa/suites: clean up client-upgrade-octopus-pacific test

- fix .qa symlinks
- rename nautilus-client-x.yaml to octopus-client-x.yaml
- fix typos and remove stale comment
- remove 2-features permutation (it doesn't do anything useful as the
  workunit is run with RBD_FEATURES environment variable set and those
  features are explicitly passed to RBD.create and RBD.clone calls;
  the net effect is that the exact same job is run twice)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Merge pull request #45282 from ceph/wip-yuri-octopus-clients

qa/tests: added upgrade-clients/client-upgrade-octopus-quincy tests

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

rgw: fix bucket index listing count bug

Fix bugs surrounding calculation of number of entries returned and
whether the end of a listing range has been reached.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>

osd: require osd_pg_max_concurrent_snap_trims > 0

If osd_pg_max_concurrent_snap_trims is zero, we mistakenly clear
the snaptrim queue. Require it to be > 0.

Fixes: https://tracker.ceph.com/issues/54396
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 29545b617b3b0324f9b0b20e032e3e38557115eb)

Conflicts:
src/common/options/global.yaml.in
- This file does not exist in Octopus; rather, global options are defined in src/common/options.cc.

qa/tests: added upgrade-clients/client-upgrade-octopus-quincy tests

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

mgr/devicehealth: fix missing timezone from time delta calculation

An error occurs when subtracting a datetime object that is offset-naive
(i.e. unaware of timezone) from a datetime object which is offset-aware.

datetime.utcnow() is missing timezone info, e.g.:
'2021-09-22 13:18:45.021712',
while life_expectancy_max is in the format of:
'2021-09-28 00:00:00.000000+00:00',
hence we need to add timezone info to the former when calculating
their time delta.

Please note that we calculate time delta using `datetime.utcnow()` in
`serve()` in this module, but there we refer to the delta in seconds,
which works fine.

Fixes: https://tracker.ceph.com/issues/52327
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit 05902d943bba4a64abbd943270b56cbdd1650e62)

Conflicts:
src/pybind/mgr/devicehealth/module.py
The import typing line needed to be removed
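
The failure mode and one possible remedy can be reproduced with the standard library alone. The timestamps below are the ones quoted in the commit message; `time_left` is a hypothetical helper, and attaching UTC with `replace()` is just one way to make the naive value offset-aware, not necessarily what the module does.

```python
from datetime import datetime, timezone

# Values taken from the commit message above
naive_now = datetime(2021, 9, 22, 13, 18, 45, 21712)             # no tzinfo
life_expectancy_max = datetime(2021, 9, 28, tzinfo=timezone.utc)  # offset-aware


def time_left(now, expiry):
    # Attach UTC to a naive timestamp so both operands are offset-aware
    if now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    return expiry - now
```

Subtracting `naive_now` from `life_expectancy_max` directly raises `TypeError`, while `time_left(naive_now, life_expectancy_max)` returns a `timedelta` of a little over five days.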

rgw: fix leak of RGWBucketList memory (octopus only)

this updates an earlier octopus-only fix,
0de02a88be0972c89ed2bb10dc438d080137bd18, to also free the RGWBucket*
in each map entry

this issue only exists on octopus, so this fix targets octopus directly
instead of cherry-picking from master

Fixes: https://tracker.ceph.com/issues/54482
Signed-off-by: Casey Bodley <cbodley@redhat.com>