Matt Benjamin [Thu, 25 Feb 2021 22:39:08 +0000 (17:39 -0500)]
rgw: objectlock: improve client error messages
A bucket object lock configuration can only be set on buckets
created with the object-lock option enabled. Likewise, an
object lock or object retention hold can only be set on objects
in buckets with object lock enabled. Object lock and related
policy and policy violations are also potentially confusing
to client users.
Raise the debug level to 4, but add a human-readable client error
message, when object lock constraints are violated.
Fixes: https://tracker.ceph.com/issues/49541
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 7583374e5294b1c1c16068999123fef98827e9dc)
Casey Bodley [Mon, 15 Jun 2020 15:45:11 +0000 (11:45 -0400)]
test/rgw: test_datalog_autotrim filters out new entries
if other sync activity is racing with test_datalog_autotrim, it can
create new datalog entries after the 'datalog autotrim' command runs.
instead of asserting that the datalog is empty after trim, assert that
any remaining entries have a marker larger than the max-marker reported
by 'datalog status' before the trim.
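The relaxed assertion can be sketched as follows. This is a minimal illustration, not the test's actual code: the entry layout (a dict with a `marker` key), the helper name `stale_entries`, the sample markers, and the plain string comparison of markers are all assumptions.

```python
def stale_entries(entries, max_marker):
    """Return entries that should have been trimmed.

    Assumption: markers are strings that compare in log order, so any
    entry at or before max_marker predates the trim.
    """
    return [e for e in entries if e["marker"] <= max_marker]


# Hypothetical data: max-marker recorded before the trim, and two
# entries written by racing sync activity after the trim.
max_marker = "1_0005"
after_trim = [{"marker": "1_0007"}, {"marker": "1_0009"}]

# Newer entries are tolerated; only older ones would fail the test.
assert stale_entries(after_trim, max_marker) == []
```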
Kefu Chai [Tue, 30 Mar 2021 18:32:38 +0000 (02:32 +0800)]
mgr/PyModule: put mgr_module_path before Py_GetPath()
pip comes with _vendor/progress, so there is a chance of importing the
vendored version of the "progress" module instead of the "progress" mgr
module, and of failing to import the latter.
in this change, the order of paths is rearranged so the configured
`mgr_module_path` is put before the return value of `Py_GetPath()`.
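The effect of the reordering can be sketched in Python. This is only an illustration of the ordering, not the C++ code in PyModule: the helper name `build_module_search_path` and the example directories are hypothetical, and `interpreter_path` stands in for the delimiter-separated string returned by `Py_GetPath()`.

```python
import os


def build_module_search_path(mgr_module_path, interpreter_path):
    """Put the configured mgr module directory ahead of the defaults.

    Python imports the first match on the search path, so a directory
    that comes first wins over a same-named package further down.
    """
    return os.pathsep.join([mgr_module_path, interpreter_path])


# Hypothetical directories, for illustration only.
defaults = os.pathsep.join(["/usr/lib/python3", "/usr/lib/python3/site-packages"])
search_path = build_module_search_path("/usr/share/ceph/mgr", defaults)
print(search_path.split(os.pathsep)[0])
```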
Kefu Chai [Thu, 25 Feb 2021 09:56:02 +0000 (17:56 +0800)]
pybind/mgr/dashboard: bump up requests to 2.25.1
requests 2.20 is not compatible with urllib3 v1.25.2 and up. this causes
incompatibilities with other python modules. for instance, we
now have the following error:
ERROR: pip's dependency resolver does not currently take into account
all the packages that are installed. This behaviour is the source of the
following dependency conflicts.
botocore 1.20.14 requires urllib3<1.27,>=1.25.4, but you have urllib3
1.24.3 which is incompatible.
see also https://github.com/psf/requests/pull/5092
Kefu Chai [Sat, 12 Dec 2020 07:19:40 +0000 (15:19 +0800)]
admin/build-doc: stop passing --use-feature=2020-resolver to pip
to silence the warning of
WARNING: --use-feature=2020-resolver no longer has any effect, since it is now the default dependency resolver in pip. This will become an error in pip 21.0.
Kefu Chai [Fri, 19 Mar 2021 04:05:45 +0000 (12:05 +0800)]
pybind/mgr/dashboard: bump flake8 to 3.9.0
to address the failure of
ERROR: Cannot install -r requirements-lint.txt (line 2) and -r requirements-lint.txt (line 8) because these package versions have conflicting dependencies.
The conflict is caused by:
flake8 3.8.4 depends on pycodestyle<2.7.0 and >=2.6.0a1
autopep8 1.5.6 depends on pycodestyle>=2.7.0
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
also, loosen the version of pytest:
The conflict is caused by:
The user requested pytest<4
The user requested pytest<4
pytest-cov 2.11.1 depends on pytest>=4.6
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency
conflict
Kefu Chai [Sun, 20 Dec 2020 05:10:16 +0000 (13:10 +0800)]
pybind/ceph_argparse.py: use a safe value for timeout
we have reports that on arm32 machines it timed out immediately. to
prevent the int overflow, use a safer value instead of
(1 << (32 - 1)) - 1.
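Why that value leaves no headroom can be shown with a small sketch. The commit message does not say where the overflow happens, so this only simulates two's-complement truncation to a signed 32-bit integer; the helper `add_as_int32` is hypothetical and not part of ceph_argparse.py.

```python
import ctypes

# The timeout value mentioned above: the largest signed 32-bit integer.
INT32_MAX = (1 << (32 - 1)) - 1


def add_as_int32(a, b):
    """Add two integers and truncate the result to signed 32 bits.

    ctypes integer types do no overflow checking, so the value wraps.
    """
    return ctypes.c_int32(a + b).value


# Adding anything positive to INT32_MAX wraps to a negative number,
# which a deadline computation would treat as already expired.
print(add_as_int32(INT32_MAX, 1))
```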
Adam Kupczyk [Mon, 22 Mar 2021 10:20:11 +0000 (11:20 +0100)]
os/bluestore: Make Onode::put/get resilient to split_cache
In
OnodeCacheShard* ocs = c->get_onode_cache();
std::lock_guard l(ocs->lock);
while waiting for lock, split_cache might have changed OnodeCacheShard.
This will result in adding Onode to improper OnodeCacheShard.
Such action is obviously bad, as we will operate in future (at least once) on a
different OnodeCacheShard than the one we got the lock for. Particularly sensitive
to this are the _trim and split_cache functions, as they iterate over elements.
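A common way to guard against this kind of race is to re-check, once the lock is held, that the shard is still the current one, and to retry otherwise. The sketch below shows that pattern in Python. It is not the BlueStore C++ code, the commit message does not state that this is the exact fix, and the names `Shard`, `Collection` and `add_to_current_shard` are hypothetical.

```python
import threading


class Shard:
    """Stand-in for a cache shard: a lock plus the items it protects."""

    def __init__(self):
        self.lock = threading.Lock()
        self.items = []


class Collection:
    """Stand-in for a collection whose shard a split may replace."""

    def __init__(self, shard):
        self.shard = shard


def add_to_current_shard(coll, item):
    """Add item to the shard the collection points at when the lock is held."""
    while True:
        shard = coll.shard
        with shard.lock:
            # The shard may have been swapped while we waited for the lock.
            if coll.shard is shard:
                shard.items.append(item)
                return shard
        # Shard changed under us: drop the stale lock and try again.
```

If the collection is moved to another shard while a caller is blocked on the old shard's lock, the caller notices the mismatch and adds the item to the new shard instead.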
Kefu Chai [Sat, 16 Jan 2021 06:33:17 +0000 (14:33 +0800)]
mgr: update mon metadata when monmap is updated
there is a chance that some monitor(s) are updated / upgraded in a single
monmap update without being removed from cluster state's metadata first.
without this change, we will not update the metadata associated with
that monitor, hence the mgr modules which consume the metadata are not
updated accordingly and keep reporting stale information.
in this change, we always update the metadata associated with all monitors
included in the latest monmap. multiple "mon metadata" commands are sent
to the monitors for retrieving their updated metadata, instead of sending a
single one, so that we can reuse "MetadataUpdate" to update the metadata
of a given daemon. as the number of monitors in a typical cluster is
relatively small, and the frequency of monmap updates is low, this
overhead should be fine.
unlike other places where we ask mon for metadata in Mgr class, the code
sending the mon command for updated monitor metadata is located outside of
the `cluster_state.with_monmap()` block. the reason is that `with_monmap()`
is guarded by the monc_lock under the hood, while `start_mon_command()`
also needs to acquire the monc_lock, which is not a recursive lock. so we
have to do this outside of the `with_monmap()` block.
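The locking constraint can be sketched in Python with a non-recursive lock. This is an illustration only, not the Mgr C++ code: the `ClusterState` class below is a hypothetical stand-in that folds both functions onto one lock, and the command dict is illustrative.

```python
import threading


class ClusterState:
    """Hypothetical stand-in: both methods take the same plain lock."""

    def __init__(self, mon_names):
        self._lock = threading.Lock()  # not recursive, like monc_lock
        self._mon_names = list(mon_names)
        self.sent = []

    def with_monmap(self, fn):
        with self._lock:
            return fn(self._mon_names)

    def start_mon_command(self, cmd):
        # Calling this from inside with_monmap() would deadlock,
        # because the lock is already held and cannot be re-acquired.
        with self._lock:
            self.sent.append(cmd)


def refresh_mon_metadata(state):
    """Copy the monitor names under the lock, send commands after releasing it."""
    names = state.with_monmap(lambda mons: list(mons))
    for name in names:
        state.start_mon_command({"prefix": "mon metadata", "id": name})
    return names
```

Copying the names out of the locked section keeps the critical section short and sends one command per monitor, matching the approach described above.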
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: https://tracker.ceph.com/issues/49938
(cherry picked from commit 6147c0917157efd2d35610e759685656a4989abb)
Dan van der Ster [Tue, 23 Mar 2021 10:28:37 +0000 (11:28 +0100)]
test_ipaddr: check that we correctly skip loopback
We should skip devices named 'lo' or of the form 'lo:0' regardless
of their IP address.
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/49938
(cherry picked from commit 780125d1ed93cd7b17172752b3e76186a524103b)
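The name-based rule checked by test_ipaddr can be sketched as a small predicate. This is a Python illustration of the rule as stated in the commit message, not the C++ implementation; the function name `is_loopback_name` is hypothetical.

```python
def is_loopback_name(ifname):
    """True for 'lo' and for aliases of the form 'lo:0', whatever their address."""
    return ifname == "lo" or ifname.startswith("lo:")


# Devices that merely start with "lo" are not treated as loopback.
for name in ["lo", "lo:0", "eth0", "lop"]:
    print(name, is_loopback_name(name))
```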
Kefu Chai [Thu, 25 Mar 2021 09:08:48 +0000 (17:08 +0800)]
run-make-check.sh: let ctest generate XML output
to enable XUnit plugin of jenkins to consume the ctest output and
publish it in the dashboard, we need to
* let ctest generate XML output instead of plain text output
* do not fail the test if any test case fails. this allows the publisher
to do its job by checking the XML output.
* prevent ctest from compressing the output. see
https://issues.jenkins.io/browse/JENKINS-21737
Dan van der Ster [Thu, 12 Nov 2020 16:14:37 +0000 (17:14 +0100)]
common/options: bluefs_buffered_io=true by default
Enable bluefs_buffered_io again because it makes a huge user-visible
improvement in metadata intensive scenarios, such as but not limited to
PG deletion.
In our environment, deleting PGs from 4 hybrid OSDs (sharing one SATA SSD block.db) saturates
the block.db at 350MB/s reads and causes slow reqs and flapping on the OSDs.
Those OSDs have 3GB osd_target_memory.
Enabling bluefs_buffered_io drops the SSD IO down to <1MBps and the OSDs
are performant again. (The underlying PG deletion inefficiency is being
solved separately, but the page cache is so much more effective than
the bluestore cache in this scenario).
Lastly, remove the comment about swap. We should separately advise
operators to disable swap on OSD machines, as it is much better in
our experience to OOM and restart than to chug along swapping.
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/45765
Related-to: https://tracker.ceph.com/issues/47044
(cherry picked from commit 5ec8e8e63d409860c35e24a192090ac2b70af8f6)
Kamoltat [Mon, 8 Feb 2021 15:45:06 +0000 (15:45 +0000)]
qa/tasks/mgr/test_progress: fix wait_until_equal
Octopus ceph_test_case doesn't have a period arg,
so remove that in wait_until_equal. Also increase the
time to wait for complete events by using RECOVERY_PERIOD
instead of EVENT_CREATION_PERIOD.
Not needed in master because only octopus and nautilus
lack a period argument in the qa/tasks/mgr/test_progress.py
wait_until_equal() function.
The osd_fast_shutdown option may cause the cluster log to receive
too many entries of 'osd.X reported immediately failed by osd.Y',
depending on cluster scale.
This might be an issue for LMA stacks/tools that check ceph logs
for failed lines. They would then require additional logic to filter
out an intended OSD (fast) shutdown, which might not be possible
and might require an admin to analyze.
So, add osd_fast_shutdown_notify_mon option for OSD to also tell
the monitor it is shutting down (done in slow/non-fast shutdown)
under osd_fast_shutdown.
This introduces minimal delay (the ack from the mon is required
to prevent the messages), and addresses the cluster log issue.
Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time waiting for the monitor ack to arrive.
Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit c75734729764868c5c501722fc8de08dac9ebd4a)
Kefu Chai [Sat, 20 Mar 2021 05:00:01 +0000 (13:00 +0800)]
install-deps.sh: remove existing ceph-libboost of different version
we install different versions of precompiled ceph-libboost packages
for different branches when building and testing them on ubuntu test
nodes. for instance,
- nautilus, octopus: v1.72
- pacific: v1.73
they share the same set of test nodes. and these ceph-libboost packages
conflict with each other, because they install files to the same places.
in order to avoid the conflict, we should uninstall existing packages
before installing a different version of ceph-libboost packages.
ceph-libboost${version}-dev is a package providing the shared headers of
boost library, so, in this change we check if it is installed before
returning or removing the existing packages.
Sage Weil [Wed, 24 Feb 2021 20:59:57 +0000 (14:59 -0600)]
mon/OSDMonitor: fix safety/idempotency of {set,rm}-device-class
If the command is resent (e.g., due to network reconnect), the second
instance may find that the pending crush map already has the changes
and not wait for it to commit.
Note that the stderr message will be misleading in this case; that is a
problem with most of our mon commands. :(
Sage Weil [Sat, 13 Mar 2021 16:34:43 +0000 (11:34 -0500)]
osd: propagate base pool application_metadata to tiers
If there is application metadata on the base pool, it should be mirrored
to any other tiers in the set. This aligns with the fact that the
'ceph osd pool application ...' commands refuse to operate on a non-base
pool.
This fixes problems with accessing tiers (e.g., cache tiers) when the
cephx cap is written in terms of application metadata.
delete the part where _osd_in_out_completed_events_count()
was called in test_osd_cannot_recover() and revert to initial
state of the function since we don't need to use this function
in octopus. Also delete a duplicate of _osd_in_out_events_count().
This must have been added by mistake in #39289 as well.
No need to fix the backport in Nautilus (#38173),
since the bugs were introduced by adding additional code to
the cherry-pick specifically for Octopus.
Ilya Dryomov [Wed, 17 Mar 2021 10:00:33 +0000 (11:00 +0100)]
qa: krbd_blkroset.t: update for separate hw and user read-only flags
Since kernel 5.12, hardware read-only state and user read-only
policy (BLKROGET/SET ioctls) are tracked separately in the block
layer. As the purpose of our ->set_read_only() method was exactly
that, it was removed.
As a side effect, BLKROSET no longer returns EROFS on an attempt
to make a read-only mapping read-write with "blockdev --setrw".
The policy gets updated, but the device remains read-only as before
because the hardware (== mapping) state is controlled by the driver.
Neha Ojha [Tue, 9 Mar 2021 00:48:58 +0000 (00:48 +0000)]
pybind/mgr/balancer/module.py: assign weight-sets to all buckets before balancing
Add an additional check to make sure that the choose_args section has the same
number of buckets as the crushmap. If not, ensure that
get_compat_weight_set_weights assigns weight-sets to all buckets.
Without this change, if we end up with an orig_ws, which has fewer buckets
than the crushmap, the mgr will crash due to a KeyError in do_crush_compat().
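The kind of check described above can be sketched as follows. This is a minimal illustration, not the balancer module's code: the data shapes (a list of crush bucket ids and a dict mapping bucket id to weights), the helper name `missing_buckets`, and the sample values are all assumptions.

```python
def missing_buckets(crush_bucket_ids, weight_set):
    """Return the crush bucket ids that have no weight-set entry."""
    return sorted(set(crush_bucket_ids) - set(weight_set))


# Hypothetical data: three buckets in the crushmap, two in the weight-set.
crush_bucket_ids = [-1, -2, -3]
orig_ws = {-1: [1.0, 1.0], -2: [2.0]}

missing = missing_buckets(crush_bucket_ids, orig_ws)
if missing:
    # Looking up orig_ws[-3] here would raise KeyError, so the weight-set
    # has to be completed for every bucket before balancing continues.
    print("buckets without a weight-set:", missing)
```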
Yuval Lifshitz [Tue, 17 Nov 2020 11:31:59 +0000 (13:31 +0200)]
rgw/notification: trigger notifications on changes from any user
any user authorized to make changes to a bucket may trigger
notifications defined on that bucket.
manual test procedure of the fix is described here:
https://gist.github.com/yuvalif/39c183aa0f74d286ecef7844268817df