Casey Bodley [Mon, 15 Jun 2020 15:45:11 +0000 (11:45 -0400)]
test/rgw: test_datalog_autotrim filters out new entries
If other sync activity is racing with test_datalog_autotrim, it can
create new datalog entries after the 'datalog autotrim' command runs.
Instead of asserting that the datalog is empty after trim, assert that
any remaining entries have a marker larger than the max-marker reported by
'datalog status' before the trim.
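The relaxed assertion described above could be sketched roughly as follows. This is a minimal illustration, not the test's actual code: the helper name `stale_entries`, the `marker` dictionary key, and the plain string comparison of markers are all assumptions made for the example.

```python
def stale_entries(entries, max_marker_before_trim):
    """Return the entries that the trim should have removed.

    Hypothetical helper: an entry counts as stale when its marker is not
    strictly greater than the max-marker recorded before the trim.
    Markers are compared as plain strings here, which is an assumption.
    """
    return [e for e in entries if e["marker"] <= max_marker_before_trim]


# Entries created by racing sync activity after the trim are tolerated,
# because their markers are newer than the recorded max-marker.
max_marker = "1_0005"
after_trim = [{"marker": "1_0006"}, {"marker": "1_0007"}]
assert stale_entries(after_trim, max_marker) == []
```

Checking for stale entries instead of an empty log keeps the test meaningful while tolerating concurrent writers.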
rgw: return ERR_NO_SUCH_BUCKET early while evaluating bucket policy
Right now we create an ERR_NO_SUCH_BUCKET ret code but continue further
processing. Since this ret code isn't returned at any stage, we end up creating a
bucket instance anyway, which shouldn't happen, and then the client call
succeeds in cases like put bucket versioning. Return an error code early in these
cases.
Kamoltat [Mon, 8 Feb 2021 15:45:06 +0000 (15:45 +0000)]
qa/tasks/mgr/test_progress: fix wait_until_equal
Octopus ceph_test_case doesn't have a period arg,
so remove it from the wait_until_equal call. Also increase
the time to wait for complete events by using RECOVERY_PERIOD
instead of EVENT_CREATION_PERIOD.
Not needed in master because only octopus and nautilus
lack a period argument for the wait_until_equal() function
used in qa/tasks/mgr/test_progress.py.
luo rixin [Thu, 20 Feb 2020 12:07:39 +0000 (20:07 +0800)]
test/TestOSDScrub: fix mktime() error
The variable tm isn't initialized. When tm.tm_isdst happens to be a
positive value, mktime(&tm) returns -1, which makes the test fail
on Ubuntu 19.10 for aarch64 with glibc 2.30.
Kefu Chai [Fri, 30 Aug 2019 08:21:03 +0000 (16:21 +0800)]
cmake: SKIP_RPATH if RPATH is not necessary
Before this change, `CMAKE_INSTALL_RPATH` is set globally, so cmake is
asked to rewrite the RPATH for all installed targets. But this is not
needed: some executables, like ceph_test_mon_memory_target, do not link
against libraries built from the source tree, like librados and
libceph-common, so cmake does not set RPATH for them. Hence, cmake
complains when installing them, like:
CMake Error at src/test/mon/cmake_install.cmake:90 (file):
file RPATH_CHANGE could not write new RPATH:
/usr/lib64/ceph
to the file:
/home/abuild/rpmbuild/BUILDROOT/ceph-15.0.0-4347.g85a07b9.x86_64/usr/bin/ceph_test_log_rss_usage
No valid ELF RPATH or RUNPATH entry exists in the file;
After this change, `SKIP_RPATH` is set for those executables which do
not link against any libraries created from the ceph source tree, so we can
avoid setting the RPATH for these executables during `make install`.
Nizamudeen A [Tue, 23 Mar 2021 07:10:46 +0000 (12:40 +0530)]
mgr/dashboard: Fix for alert notification message being undefined
The Prometheus alert notification message in the dashboard always comes up
as undefined. That's because we were showing alert.summary instead of
alert.description when displaying the message. I couldn't find the
summary field in the ceph_default_alerts.yml file, so I removed all the
summary fields from the dashboard code.
Kefu Chai [Thu, 19 Dec 2019 03:36:59 +0000 (11:36 +0800)]
test/pybind: s/nosetests/python3/
Different distros package python3-nose in different ways, adding
different suffixes to "/usr/bin/nosetests" to differentiate it from
its python2 counterpart:
* on bionic, python3-nose offers "nosetests3"
* on el8, python3-nose offers "nosetests-3" and "nosetests-3.6"
Kefu Chai [Wed, 31 Mar 2021 11:00:59 +0000 (19:00 +0800)]
mgr/dashboard: encode non-ascii string before passing it to exec_cmd()
Because on Python3 tempfile.TemporaryFile() is opened in binary mode by
default, we need to encode a non-ascii string before writing it to the file.
Otherwise, we have the following failure:
    def test_unicode_password(self):
        self.test_create_user()
        password = '\u7ae0\u9c7c\u4e0d\u662f\u5bc6\u7801'
        with tempfile.TemporaryFile(mode='w+') as pwd_file:
>           pwd_file.write(password)
E           UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-5: ordinal not in range(256)
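A minimal sketch of the encoding approach, assuming UTF-8 is an acceptable encoding for the password. The helper name `roundtrip_password` is hypothetical and only exists to show the write and read-back; it does not reproduce the dashboard test or exec_cmd().

```python
import tempfile


def roundtrip_password(password):
    """Write a (possibly non-ascii) string to a binary temp file and read it back.

    Hypothetical helper. tempfile.TemporaryFile() defaults to mode 'w+b',
    so the string is encoded explicitly instead of relying on the locale's
    default text encoding.
    """
    with tempfile.TemporaryFile() as pwd_file:
        pwd_file.write(password.encode('utf-8'))
        pwd_file.seek(0)
        return pwd_file.read().decode('utf-8')


password = '\u7ae0\u9c7c\u4e0d\u662f\u5bc6\u7801'
assert roundtrip_password(password) == password
```

Encoding explicitly makes the result independent of the locale that the test runner happens to use.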
Kefu Chai [Wed, 31 Mar 2021 04:15:17 +0000 (12:15 +0800)]
cmake: allow use libzstd in system
since we are moving the test nodes from bionic to focal, we are able to
use the prebuilt libzstd libraries when running "make check". to speed
up the build and test, in this change:
* add FindZstd.cmake which allows us to use the libzstd in system
* extract BuildZstd.cmake for better readability
* add an option named "WITH_SYSTEM_ZSTD", which defaults to "OFF",
so user can enable it on demand.
Since the v1.4.0 release there have been a few improvements to Zstandard
including improved compression ratios, faster compression, and faster
decompression.
Kefu Chai [Tue, 30 Mar 2021 02:38:43 +0000 (10:38 +0800)]
debian/control: install python3-* packages for "make check"
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts:
debian/control: this change is not cherry-picked from master,
the corresponding commit in master is 50162091461e42939375475f70ecfd0817f2551c, but that commit also includes
the changes to update the runtime dependencies to python3. but we only
need to update the dependencies for running "make check". so instead
of cherry-picking from master, a separate change is made here.
Conflicts:
debian/control: we still need python 2.7 at runtime
so ignore that change, but we need to use tox instead of
python-tox for running "make check" on focal, so we
need the tox change in this commit.
Dan van der Ster [Tue, 23 Mar 2021 10:28:37 +0000 (11:28 +0100)]
test_ipaddr: check that we correctly skip loopback
We should skip devices named 'lo' or of the form 'lo:0' regardless
of their IP address.
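The name-based rule can be sketched as below. This is an illustration in Python rather than the C++ code under test, and the function name `is_loopback_name` is an assumption introduced for the example.

```python
def is_loopback_name(ifname):
    """Return True for 'lo' and for aliases of the form 'lo:<label>'.

    Hypothetical helper: the decision is made on the device name only,
    so the interface's IP address plays no role.
    """
    return ifname == 'lo' or ifname.startswith('lo:')


assert is_loopback_name('lo')
assert is_loopback_name('lo:0')
assert not is_loopback_name('eth0')
```

Matching on the `lo:` prefix rather than on `lo` alone avoids skipping an unrelated device whose name merely starts with those two letters.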
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/49938
(cherry picked from commit 780125d1ed93cd7b17172752b3e76186a524103b)
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Fixes: https://tracker.ceph.com/issues/49938
(cherry picked from commit 6147c0917157efd2d35610e759685656a4989abb)
Kefu Chai [Thu, 25 Mar 2021 09:08:48 +0000 (17:08 +0800)]
run-make-check.sh: let ctest generate XML output
to enable XUnit plugin of jenkins to consume the ctest output and
publish it in the dashboard, we need to
* let ctest generate XML output instead of plain text output
* do not fail the test if any test case fails. this allows the publisher
to do its job by checking the XML output.
* prevent ctest from compressing the output. see
https://issues.jenkins.io/browse/JENKINS-21737
Dan van der Ster [Thu, 12 Nov 2020 16:14:37 +0000 (17:14 +0100)]
common/options: bluefs_buffered_io=true by default
Enable bluefs_buffered_io again because it makes a huge user-visible
improvement in metadata intensive scenarios, such as but not limited to
PG deletion.
In our environment, deleting PGs from 4 hybrid OSDs (sharing one SATA SSD block.db) saturates
the block.db at 350MB/s reads and causes slow requests and flapping on the OSDs.
Those OSDs have a 3GB osd_memory_target.
Enabling bluefs_buffered_io drops the SSD IO down to <1MB/s and the OSDs
are performant again. (The underlying PG deletion inefficiency is being
solved separately, but the page cache is so much more effective than
the bluestore cache in this scenario.)
Lastly, remove the comment about swap. We should separately advise
operators to disable swap on OSD machines, as it is much better in
our experience to OOM and restart than to chug along swapping.
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Related-to: https://tracker.ceph.com/issues/45765
Related-to: https://tracker.ceph.com/issues/47044
(cherry picked from commit 5ec8e8e63d409860c35e24a192090ac2b70af8f6)
Kefu Chai [Tue, 23 Mar 2021 08:06:45 +0000 (16:06 +0800)]
pybind/ceph_daemon: do not fail if prettytable is not available
Ubuntu focal does not package python-prettytable, but we need to run
"make check" on focal. It turns out that prettytable is not a must-have
for running "make check", so just skip it if it is not around.
This change is not cherry-picked from master, as we have dropped python2
support in master, and python3-prettytable is packaged for python3 on
ubuntu focal. Also, nautilus is the latest release which has python2
support.
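A common way to make such a dependency optional is a guarded import, sketched below. The `format_rows` helper and its tab-separated fallback are assumptions made for illustration; they are not taken from ceph_daemon.py, which may handle the missing module differently.

```python
try:
    from prettytable import PrettyTable
except ImportError:
    # prettytable is optional; fall back to plain text when it is missing.
    PrettyTable = None


def format_rows(header, rows):
    """Render rows as a table if prettytable is available, else as TSV.

    Hypothetical helper used only to show the optional-import pattern.
    """
    if PrettyTable is None:
        lines = ['\t'.join(header)]
        lines += ['\t'.join(str(cell) for cell in row) for row in rows]
        return '\n'.join(lines)
    table = PrettyTable(header)
    for row in rows:
        table.add_row(row)
    return table.get_string()


print(format_rows(['name', 'state'], [['osd.0', 'up']]))
```

Either branch produces output containing the same cell values, so callers do not need to know which one ran.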
Conflicts:
cmake/modules/CephChecks.cmake
src/test/fio/CMakeLists.txt: check gettid() in /CMakeLists.txt
instead, as nautilus does not have cmake/modules/CephChecks.cmake by
then.
The osd_fast_shutdown option may cause the cluster log to receive
too many entries of 'osd.X reported immediately failed by osd.Y',
depending on cluster scale.
This might be an issue for LMA stacks/tools that check ceph logs
for failed lines and then require additional logic to filter on
an intended OSD (fast) shutdown; such filtering might not be an
option or even possible, and may require an admin to analyze.
So, add osd_fast_shutdown_notify_mon option for OSD to also tell
the monitor it is shutting down (done in slow/non-fast shutdown)
under osd_fast_shutdown.
This introduces minimal delay (the ack from the mon is required
to prevent the messages), and addresses the cluster log issue.
Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time waiting for the monitor ack to arrive.
Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
(cherry picked from commit c75734729764868c5c501722fc8de08dac9ebd4a)
Kefu Chai [Sat, 20 Mar 2021 05:00:01 +0000 (13:00 +0800)]
install-deps.sh: remove existing ceph-libboost of different version
we install different versions of precompiled ceph-libboost packages
for different branches when building and testing them on ubuntu test
nodes. for instance,
- nautilus, octopus: v1.72
- pacific: v1.73
They share the same set of test nodes, and these ceph-libboost packages
conflict with each other because they install files to the same places.
To avoid the conflict, we should uninstall the existing packages
before installing a different version of the ceph-libboost packages.
ceph-libboost${version}-dev is a package providing the shared headers of
boost library, so, in this change we check if it is installed before
returning or removing the existing packages.
Ilya Dryomov [Wed, 17 Mar 2021 10:00:33 +0000 (11:00 +0100)]
qa: krbd_blkroset.t: update for separate hw and user read-only flags
Since kernel 5.12, hardware read-only state and user read-only
policy (BLKROGET/SET ioctls) are tracked separately in the block
layer. As the purpose of our ->set_read_only() method was exactly
that, it was removed.
As a side effect, BLKROSET no longer returns EROFS on an attempt
to make a read-only mapping read-write with "blockdev --setrw".
The policy gets updated, but the device remains read-only as before
because the hardware (== mapping) state is controlled by the driver.
Neha Ojha [Tue, 9 Mar 2021 00:48:58 +0000 (00:48 +0000)]
pybind/mgr/balancer/module.py: assign weight-sets to all buckets before balancing
Add an additional check to make sure that the choose_args section has the same
number of buckets as the crushmap. If not, ensure that
get_compat_weight_set_weights assigns weight-sets to all buckets.
Without this change, if we end up with an orig_ws which has fewer buckets
than the crushmap, the mgr will crash due to a KeyError in do_crush_compat().
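The idea of the check can be sketched as follows. This is a simplified model, not the balancer module's code: `fill_weight_sets`, the dictionary shapes, and the choice of crush weights as defaults are all assumptions made for the example.

```python
def fill_weight_sets(crush_buckets, orig_ws):
    """Return a weight-set map that covers every bucket in the crushmap.

    Hypothetical helper. crush_buckets maps bucket id -> list of item
    weights; orig_ws maps bucket id -> existing weight-set. Buckets that
    are missing from orig_ws get a copy of their crush weights, so a later
    lookup by bucket id cannot raise KeyError.
    """
    ws = dict(orig_ws)
    if len(ws) != len(crush_buckets):
        for bucket_id, weights in crush_buckets.items():
            ws.setdefault(bucket_id, list(weights))
    return ws


crush_buckets = {-1: [1.0, 2.0], -2: [3.0]}
orig_ws = {-1: [0.5, 0.5]}
ws = fill_weight_sets(crush_buckets, orig_ws)
assert set(ws) == set(crush_buckets)
```

Existing weight-sets are left untouched; only the missing buckets are filled in.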
- document the guideline for locking when working with python GIL
- add primitives to extract the patterns for acquiring/releasing
GIL. so they can be reused.
Conflicts:
src/mgr/ActivePyModules.cc
src/mgr/DaemonState.cc
- convert std::pair to DaemonKey::type-name in a few places
- Removed "mds_metadata" which doesn't exist in latest Nautilus