test/rbd_mirror: grab timer lock before calling add_event_after()
add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").
Ilya Dryomov [Sun, 20 Feb 2022 16:33:08 +0000 (17:33 +0100)]
rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()
Complete on_finish right away only if the replayer is stopped (meaning
that it is legible to be restarted immediately, possibly from on_finish
itself). This is the behaviour pretty much anyone would assume and
also what ImageReplayer::restart() relies on.
Ilya Dryomov [Sun, 20 Feb 2022 12:11:02 +0000 (13:11 +0100)]
rbd-mirror: manual stop should take precedence over regular stop
Somewhat similar to commit 0a3794e56256 ("rbd-mirror: make stop
properly cancel restart"), make it so that a) if a manual stop is
joined to regular stop, the stop becomes manual and b) if a regular
stop is joined to a manual stop, the stop stays manual.
Ilya Dryomov [Sat, 19 Feb 2022 15:43:04 +0000 (16:43 +0100)]
rbd-mirror: straighten ImageReplayer::stop() a bit
- don't default on_finish parameter
- m_restart_requested is set in ImageReplayer::restart() which is the
only restart=true call site, so setting m_restart_requested here is
redundant
- is_stopped_() can't be true in is_running_() branch
- on_finish->complete(0) in the end is unreachable
Or Friedmann [Tue, 19 Apr 2022 12:00:28 +0000 (12:00 +0000)]
rgw: RGWCoroutine::set_sleeping() checks for null stack
users of the RGWOmapAppend coroutine don't manage the lifetime of its
underlying coroutine stack, so end up making calls on RGWOmapAppend
after its stack goes away. this null check is a band-aid, and there are
still several other calls in RGWCoroutine that don't check for null
stack
Fixes: https://tracker.ceph.com/issues/49302 Signed-off-by: Or Friedmann <ofriedma@redhat.com> Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3f0f831d66c7d43c9872f5de2aceb68aef4004d8)
Kefu Chai [Mon, 22 Jun 2020 01:34:53 +0000 (09:34 +0800)]
doc/conf.py: s/add_javascript/add_js_file/
to address following warning:
jenkins-build/build/workspace/ceph-pr-docs/doc/conf.py:102: RemovedInSphinx40Warning: The app.add_javascript() is deprecated. Please use app.add_js_file() instead.
Kefu Chai [Sun, 6 Mar 2022 06:27:50 +0000 (14:27 +0800)]
doc/conf.py: silence warnings from breathe
breathe calls doxygen for extracting/generating docs from code.
while doxygen complains at seeing undocumented fields/func. these
warnings could fail the sphinx-build command, if it takes warnings
as errors.
Kefu Chai [Sun, 6 Mar 2022 06:23:42 +0000 (14:23 +0800)]
mgr/cephadm: add empty line after param list in docstring
this helps to silence the warning from sphinx, like
src/pybind/mgr/orchestrator/_interface.py:docstring of orchestrator._interface.Orchestrator.remove_osds:9: WARNING: Field list ends without a blank line; unexpected unindent.
Mark Kogan [Mon, 5 Apr 2021 12:49:42 +0000 (15:49 +0300)]
rgw: return OK on consecutive complete-multipart reqs
Fixes: https://tracker.ceph.com/issues/50141 Signed-off-by: Mark Kogan <mkogan@redhat.com>
fixup! rgw: return OK on consecutive complete-multipart reqs
Cherry-pick notes:
- Conflicts due in rgw_op.h due to execute method adjacent to change not having optional_yield arg
- Conflicts in rgw_op.cc due to lack of rgw::sal::Object encapsulation in Octopus
Kefu Chai [Sat, 5 Mar 2022 17:44:30 +0000 (01:44 +0800)]
admin/doc-requirements: bump sphinx to 4.4.0
bump sphinx to latest stable. to address following build failure
ERROR: sphinx-autodoc-typehints 1.17.0 has requirement Sphinx>=4, but you'll have sphinx 3.5.4 which is incompatible.
ERROR: sphinx-substitution-extensions 2022.2.16 has requirement sphinx>=4.0.0, but you'll have sphinx 3.5.4 which is incompatible.
also bump bump sphinx-rtd-theme, otherwise we'd have following
build failure:
ERROR: sphinx-rtd-theme 0.5.2 has requirement docutils<0.17, but you'll have docutils 0.17.1 which is incompatible.
Casey Bodley [Thu, 10 Mar 2022 20:32:48 +0000 (15:32 -0500)]
cls/rgw: rgw_dir_suggest_changes detects race with completion
if bucket listing races with a pending index transaction, its suggested
removal may be mistakenly applied if that index transaction completes
before the osd receives this suggestion
in `rgw_dir_suggest_changes()`, the sole condition for applying a
suggested change is that the `cur_disk.pending_map` is empty. this is
true after rgw_bucket_complete_op()
on index completion, `rgw_bucket_dir_entry::index_ver` is updated to match
the new value of `rgw_bucket_dir_header::ver`. because most of `struct
rgw_bucket_dir_entry` makes the round trip through bucket listing ->
dir_suggest, we have access to the index_ver of the suggested entry. by
comparing this against the stored entry, we can ignore any suggestions
that were sent before the most recent completion
Nizamudeen A [Thu, 24 Mar 2022 08:01:18 +0000 (13:31 +0530)]
mgr/dashboard: fix "NullInjectorError: No provider for I18n
Although I am not sure what's the root cause of this but this seems to
fix the test failure. I don't know if this is caused by the differnce in
angular versions between master and octopus but I still don't understand
why it didn't catch in the recent PR to this file (https://github.com/ceph/ceph/pull/44763)
Fixes: https://tracker.ceph.com/issues/55011 Signed-off-by: Nizamudeen A <nia@redhat.com>
Casey Bodley [Wed, 12 May 2021 18:13:13 +0000 (14:13 -0400)]
rgw: parse tenant name out of rgwx-bucket-instance
used by multisite bucket full sync to request the listing of a specific
bucket instance. if the bucket lives under a tenant, we need to get that
out of the rgwx-bucket-instance header, because the http request path
only names the bucket
Xuehan Xu [Sat, 2 Jan 2021 14:50:23 +0000 (22:50 +0800)]
librgw: make rgw file handle versioned
The reason that we need this is that there could be the following scenario:
1. rgw_setattr sets the file attr;
2. rgw_write writes some new data, and encodes its attr to store into rados;
3. before the actual persistence of the file's attr bl, rgw_lookup loads the file's
previous attr and modifies the current file handle's metadata;
4. rgw_write's result persisted to rados;
5. rgw_setattr set the current file handle's metadata which is actually an old one to rados
In this case, the attr in rados would be out of date which means loss of data
In RGWBucketCtl::chown we have one RGWObjectCtx for all objects of a bucket.
In RGWObjectCtx there is a cache mechanism (std::map) for states of objects that will grows
continuously. for buckets with millions of objects this mechanism leads to huge memory usage.
in chown process we really do not need this caching mechanism so we could create one RGWObjectCtx
for every 1000 objects to limit usage of memory.
Fixes: https://tracker.ceph.com/issues/53599 Signed-off-by: Mohammad Fatemipour <mohammad.fatemipour@sotoon.ir>
(cherry picked from commit cf2d83ef81458524715c23e802977dc0760c847f)
Conflicts:
src/rgw/rgw_bucket.cc
Cherry-pick notes:
- Conflicts due to Octopus implementation differences in RGWBucketCtl::chown
Matt Benjamin [Tue, 5 Jan 2021 20:30:23 +0000 (15:30 -0500)]
doc: rgw: document S3 bucket replication support
Support was added at Octopus.
Fixes: https://tracker.ceph.com/issues/48755 Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 774a247b2b854538b679490581e6950372142797)
rgw:When KMS encryption is used and the key does not exist, we should not throw ERR_ INVALID_ ACCESS_ Key error code.
When kms encryption is used, the key_id is null or the actual_key size is wrong, we should not throw "ERR_INVALID_ACCESS_KEY " error code, instead of "EINVAL"error code, is used to indicate parameter error.
rgw: update last_added_entry when count == num_entries
RGWRados::cls_bucket_list_unordered() will produce one redundent entry
every time is_truncated is true.The issue could be easily reproduced
when a bucket is filled with amounts of incomplete multipart upload.
To be more specific, the number of incomplete multipart upload objects
should be greater than 1100.
ENOENT errors are expected, especially in fresh clusters, before we've
written any entries to the reshard list shards. avoid logging these
non-fatal ERROR messages:
> -1 ERROR: failed to list reshard log entries, oid=reshard.0000000000 marker= (2) No such file or directory
Casey Bodley [Fri, 6 Aug 2021 19:14:26 +0000 (15:14 -0400)]
radosgw-admin: 'sync status' is not behind if there are no mdlog entries
if remote mdlogs are trimmed prematurely, sync status will report
that it's behind the remote's max-marker even if there are no mdlog
entries to sync
for each behind shard, we fetch the next mdlog entry from the remote. if
we get an empty listing, remove that shard from behind_shards. this
logic now has to run before we print "behind shards:" so that empty
shards aren't listed
Cory Snyder [Wed, 2 Feb 2022 09:46:59 +0000 (04:46 -0500)]
rgw: fix segfault in UserAsyncRefreshHandler::init_fetch
Fixes a segfault that was occuring in error handling code of UserAsyncRefreshHandler::init_fetch.
When ruser->read_stats_async returned an error code, the instance of UserAsyncRefreshHandler had
already been deallocated in RGWSI_User_RADOS::read_stats_async and a segmentation fault occurs
when attempting to print a member variable in error logs. This commit removes the extra ref count
drop since the ref is properly dropped upstream in RGWQuotaCache::async_refresh error handling
logic.
Ilya Dryomov [Thu, 10 Mar 2022 11:40:34 +0000 (12:40 +0100)]
qa/suites: clean up client-upgrade-octopus-pacific test
- fix .qa symlinks
- rename nautilus-client-x.yaml to octopus-client-x.yaml
- fix typos and remove stale comment
- remove 2-features permutation (it doesn't do anything useful as the
workunit is run with RBD_FEATURES environment variable set and those
features are explicitly passed to RBD.create and RBD.clone calls;
the net effect is that the exact same job is run twice)
Kefu Chai [Sat, 5 Mar 2022 04:49:57 +0000 (12:49 +0800)]
cmake: pass RTE_DEVEL_BUILD=n when building dpdk
ceph is still using the Makefile based building system for building
DPDK. and DPDK enables -Werror if RTE_DEVEL_BUILD is 'y' which is
enabled by default when the dpdk is built from a git repo.
but newer GCC is more picky than the older versions, to prevent
the possible FTBFS when we switch to newer GCC for building old
branches whose dpdk submodule might be include the changes addressing
those warnings. let's just disable this option.
the only effect of this option is to add -Werror to CFLAGS. but
the building warnings from DPDK is not our focus when developing
Ceph in the most cases. so it should be fine.
Kotresh HR [Thu, 10 Feb 2022 05:34:41 +0000 (11:04 +0530)]
mgr/volumes: Fix clone uid/gid mismatch
This is the regression caused by commit 18b85c53a.
The 'set_attrs' function sets the uid/gid of the
group to the subvolume if uid/gid is not passed.
The attrs of the clone should match the source
snapshot. Hence, don't use the 'set_attrs'
function to set only the quota attrs for the
clone.