Sage Weil [Mon, 2 Dec 2019 13:43:54 +0000 (07:43 -0600)]
mon: cap keys in mon_sync messages
The previous cap was set at 1 MB. However, a user was experiencing mon
timeouts while syncing the purged_snap_epoch * keys, which are ~20 bytes
each. Reducing the max payload to 64K resolved the problem, which maps
to (very!) roughly 1500 keys per message. Set our limit a bit higher than
that since we just made this quite a bit more efficient. Most of the time
the keys are larger than 20 bytes and we wouldn't hit the key limit, but
having one ensures that we won't burn too much CPU in one go when we do
have lots of these little keys.
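As a rough illustration of the dual limit described above, the sketch below splits a stream of key/value pairs into messages bounded by both a byte budget and a key count. The name chunk_keys and both default limits are hypothetical; they are not taken from the Ceph source.

```python
from typing import Iterable, Iterator, List, Tuple

def chunk_keys(
    items: Iterable[Tuple[str, bytes]],
    max_bytes: int = 64 * 1024,  # hypothetical byte budget per message
    max_keys: int = 2000,        # hypothetical key cap per message
) -> Iterator[List[Tuple[str, bytes]]]:
    """Yield lists of (key, value) pairs that respect both limits."""
    chunk: List[Tuple[str, bytes]] = []
    size = 0
    for key, value in items:
        item_size = len(key) + len(value)
        # Close the current chunk if this item would exceed either limit.
        if chunk and (size + item_size > max_bytes or len(chunk) >= max_keys):
            yield chunk
            chunk, size = [], 0
        chunk.append((key, value))
        size += item_size
    if chunk:
        yield chunk
```

With many tiny keys the key cap is what ends each chunk; with large values the byte budget does.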
Sage Weil [Tue, 12 Nov 2019 20:51:41 +0000 (14:51 -0600)]
mon/MonitorDBStore: improve get_chunk_tx limits
The old version was horribly inefficient in that it would reencode the
transaction on every iteration.
Instead, estimate the size if we add an item and stop if it looks like it
will go over. This isn't super precise, but it's close enough, since the
limits are approximate.
Drop the single-use helper since it only makes the code harder to
follow.
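The difference between re-encoding on every iteration and keeping a running estimate can be sketched as follows. The names encode, PER_ENTRY_OVERHEAD and take_until_full are hypothetical and this is not the MonitorDBStore code; the point is only that the estimate costs constant time per item.

```python
PER_ENTRY_OVERHEAD = 8  # hypothetical framing cost: two 4-byte length prefixes

def encode(entries):
    """Toy encoder: a length-prefixed key and value for each entry."""
    out = bytearray()
    for key, value in entries:
        out += len(key).to_bytes(4, "little") + key
        out += len(value).to_bytes(4, "little") + value
    return bytes(out)

def take_until_full(entries, max_bytes):
    """Take entries while the estimated encoded size stays within max_bytes."""
    taken = []
    estimate = 0
    for key, value in entries:
        cost = len(key) + len(value) + PER_ENTRY_OVERHEAD
        # Stop before adding an item that looks like it would go over.
        # The first item is always taken so that progress is possible.
        if taken and estimate + cost > max_bytes:
            break
        taken.append((key, value))
        estimate += cost
    return taken, estimate
```

In this toy encoding the estimate happens to be exact; in a real encoder it would only be approximate, which is acceptable when the limits themselves are approximate.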
Brad Hubbard [Tue, 3 Mar 2020 05:58:35 +0000 (15:58 +1000)]
mgr/run-tox-tests: Fix issue with PYTHONPATH
Something changed recently on Bionic which caused tox to fail when
PYTHONPATH is a relative path. For some reason the path is mangled by
the time it gets to pytest so we need to ensure we are using an absolute
path. This seems to be nautilus specific, at least ATM.
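The actual fix lives in a shell script; the Python sketch below only illustrates the idea of resolving every PYTHONPATH entry to an absolute path before handing it to a child process. The helper name absolutize_pythonpath is hypothetical.

```python
import os

def absolutize_pythonpath(value: str, base: str) -> str:
    """Return value with every relative entry resolved against base."""
    entries = [p for p in value.split(os.pathsep) if p]
    # os.path.join leaves entries that are already absolute unchanged.
    resolved = [os.path.normpath(os.path.join(base, p)) for p in entries]
    return os.pathsep.join(resolved)
```

A caller could apply it as env["PYTHONPATH"] = absolutize_pythonpath(env["PYTHONPATH"], os.getcwd()) before launching tox.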
Alex Zhang [Sun, 29 Sep 2019 09:33:58 +0000 (02:33 -0700)]
common: Fix multiple logical errors in get_device_id.
0. If blkdev.serial exists, the serial should be used. The original impl seems wrong (if serial does not exist, then use the value from the uninitialized buffer, or even worse, use the value from the last call (model))
1. When using fallback methods, device id should only be returned when both model and serial are present. The original impl looks like a logical error.
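The second rule above can be sketched in a few lines. This is an illustration in Python rather than the C++ from the commit, and both the function name and the "model_serial" output format are assumptions.

```python
from typing import Optional

def fallback_device_id(model: Optional[str], serial: Optional[str]) -> Optional[str]:
    """Build a device id only when both parts were actually read."""
    if not model or not serial:
        # A missing part means no id, never a stale or uninitialized value.
        return None
    # Hypothetical format: join the parts and replace spaces.
    return f"{model}_{serial}".replace(" ", "_")
```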
Kefu Chai [Wed, 29 May 2019 09:45:35 +0000 (17:45 +0800)]
common/blkdev.c: check retval of snprintf()
As the snprintf()'ed string could be truncated, we need to check the
return value to use this function properly.
This silences warnings like:
../src/common/blkdev.cc: In member function ‘int64_t
BlkDev::get_string_property(blkdev_prop_t, char*, size_t) const’:
../src/common/blkdev.cc:165:15: warning: ‘%s’ directive output may be
truncated writing up to 4095 bytes into a region of size between 4085
and 4089 [-Wformat-truncation=]
165 | "%s/block/%s/%s", sysfsdir(), dev, propstr);
| ^~
In file included from /usr/include/stdio.h:873,
from /usr/include/c++/9/cstdio:42,
from /usr/include/c++/9/ext/string_conversions.h:43,
from /usr/include/c++/9/bits/basic_string.h:6493,
from /usr/include/c++/9/string:55,
from /usr/include/c++/9/bits/locale_classes.h:40,
from /usr/include/c++/9/bits/ios_base.h:41,
from /usr/include/c++/9/ios:42,
from /usr/include/c++/9/ostream:38,
from /usr/include/c++/9/iterator:64,
from
/opt/ceph/include/boost/iterator/iterator_traits.hpp:10,
from
/opt/ceph/include/boost/range/iterator_range_core.hpp:26,
from
/opt/ceph/include/boost/algorithm/string/replace.hpp:16,
from ../src/common/blkdev.cc:31:
/usr/include/x86_64-linux-gnu/bits/stdio2.h:67:35: note:
‘__builtin___snprintf_chk’ output 9 or more bytes (assuming 4108) into a
destination of size 4096
67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL
- 1,
|
~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
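The source is C++, so the following Python is only an emulation of the snprintf() contract: the call reports the length the full string would have had, and a result greater than or equal to the buffer size means the output was truncated. The helpers bounded_format and sysfs_property_path are hypothetical.

```python
def bounded_format(size: int, fmt: str, *args) -> tuple:
    """Emulate snprintf: return (stored string, length that was needed)."""
    full = fmt % args
    stored = full[: max(size - 1, 0)]  # C reserves one byte for the NUL
    return stored, len(full)

def sysfs_property_path(sysfsdir: str, dev: str, prop: str, size: int = 4096) -> str:
    """Build the property path, refusing to return a truncated result."""
    path, needed = bounded_format(size, "%s/block/%s/%s", sysfsdir, dev, prop)
    if needed >= size:
        # Truncated output must not be used as a path.
        raise OSError(f"path needs {needed + 1} bytes, buffer holds {size}")
    return path
```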
Matt Benjamin [Fri, 19 Jul 2019 20:32:20 +0000 (16:32 -0400)]
RGWLC: fix expiration header tag match
Need to match key->value
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit af327f21aa377a7abd0290814bfa7333db5443c3)
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
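One reading of "match key->value" is that a rule tag only matches when the object carries the same key with the same value. A minimal sketch under that assumption, with tag sets modeled as plain dicts and a hypothetical function name:

```python
def tags_match(rule_tags: dict, obj_tags: dict) -> bool:
    """True when every rule tag is present on the object with an equal value."""
    return all(
        key in obj_tags and obj_tags[key] == value
        for key, value in rule_tags.items()
    )
```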
Matt Benjamin [Fri, 3 May 2019 17:48:31 +0000 (13:48 -0400)]
rgw: fix header timestamp
The AWS example of this header intends to be RFC822-compliant.
Found by Tyler Brekke <tbrekke@redhat.com>.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 6da5be5aba0820dc91aa44d4b63cd490b39371db)
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
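For reference, this is what a date in the RFC 822 family looks like when produced with the Python standard library. It is only an illustration of the format; the rgw code is C++ and the helper name is hypothetical.

```python
from email.utils import formatdate

def http_style_date(epoch_seconds: float) -> str:
    """Format a UTC timestamp as an RFC 822 style date ending in GMT."""
    return formatdate(epoch_seconds, usegmt=True)
```

For the Unix epoch this yields "Thu, 01 Jan 1970 00:00:00 GMT".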
Matt Benjamin [Tue, 19 Feb 2019 16:17:45 +0000 (11:17 -0500)]
rgw: complete expiration header (object tags)
The expiration header tag processing is complete, but the
passed RGWObjTags argument was never initialized. Now it is
initialized in RGWGetObj and RGWPutObj, as required.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 8981c5e9f688b13a00345e069c2ce1e62fb0a588)
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
Matt Benjamin [Tue, 19 Feb 2019 14:30:18 +0000 (09:30 -0500)]
RGWLC: debug tags in rgwlc_s3_expiration_header
Dump object tags to log at debug level 16.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 45f463fec55aebe53fa91aebf891dd9cb6cc1e19)
Signed-off-by: Yuval Lifshitz <yuvalif@yahoo.com>
Tiago Melo [Tue, 12 Nov 2019 11:08:40 +0000 (10:08 -0100)]
mgr/dashboard: Use default language when running "npm run build"
This allows us to simply run "npm run build" and it will compile the frontend
with "en-US" as the default language and in the correct "dist/en-US" folder.
Venky Shankar [Wed, 26 Feb 2020 04:52:37 +0000 (23:52 -0500)]
mgr/volumes: unregister job upon async threads exception
If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.
Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.
Conflicts:
src/tools/rbd_mirror/InstanceReplayer.h (ceph::mutex vs Mutex)
src/tools/rbd_mirror/NamespaceReplayer.cc (does not exist)
src/tools/rbd_mirror/PoolReplayer.cc (code from NamespaceReplayer is here)
src/test/rbd_mirror/test_mock_PoolReplayer.cc (accordingly to PoolReplayer.cc changes)
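The failure mode in the mgr/volumes commit above, where a job stays registered after its worker hits an exception, can be sketched with a try/finally. JobRegistry and run_job are hypothetical names and do not mirror the mgr/volumes classes.

```python
import threading

class JobRegistry:
    """Hypothetical tracker of entries that a worker is currently handling."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def register(self, name) -> bool:
        with self._lock:
            if name in self._active:
                return False  # another worker already owns this entry
            self._active.add(name)
            return True

    def unregister(self, name) -> None:
        with self._lock:
            self._active.discard(name)

    def is_active(self, name) -> bool:
        with self._lock:
            return name in self._active

def run_job(registry: JobRegistry, name, work) -> bool:
    """Run work(name); always unregister so a later scan can retry."""
    if not registry.register(name):
        return False
    try:
        work(name)
    except Exception:
        # Temporary failure: report it, but do not leave the job registered.
        return False
    finally:
        registry.unregister(name)
    return True
```

Without the finally clause, a failed entry would remain in the active set and every subsequent scan would skip it.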
Patrick Donnelly [Tue, 25 Feb 2020 04:26:30 +0000 (20:26 -0800)]
Merge PR #33526 into nautilus
* refs/pull/33526/head:
test: verify purge queue w/ large number of subvolumes
test: pass timeout argument to mount::wait_for_dir_empty()
mgr/volumes: access volume in lockless mode when fetching async job
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Venky Shankar [Wed, 19 Feb 2020 12:31:40 +0000 (07:31 -0500)]
mgr/volumes: access volume in lockless mode when fetching async job
Saw a deadlock when deleting a lot of subvolumes -- purge threads were
stuck in accessing global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) and a purge thread is just about to scan the trash
directory for entries.
For the fix, purge threads fetch entries by accessing the volume
in lockless mode. This is safe from a functionality point of view as
the rename and directory scan is correctly handled by the filesystem.
Worst case the purge thread would pick up the trash entry on next
scan, never leaving a stale trash entry.
Yuval Lifshitz [Sun, 2 Feb 2020 14:14:15 +0000 (16:14 +0200)]
cmake: disable kafka when not found
This should fix the issue of an incompatible librdkafka in ubuntu-xenial.
This does not apply to octopus (or newer) since ubuntu-xenial is
not supported for them.
Note that this change automatically changes the user's build directives,
which we would not like to carry forward to newer versions.
Yuval Lifshitz [Wed, 29 Jan 2020 11:41:44 +0000 (13:41 +0200)]
rgw/kafka: update release notes
Since ubuntu-xenial does not have the correct librdkafka version,
the nautilus release notes must give a warning that a custom
librdkafka should be used if kafka support is needed.
This does not apply to octopus (or newer) since ubuntu-xenial is
not supported for them.
Tatjana Dehler [Fri, 24 Jan 2020 16:02:22 +0000 (17:02 +0100)]
mgr/dashboard: show checkboxes for booleans
The frontend showed textboxes for the dashboard settings because
the actual type information was missing here. The REST API then
returned the default type 'str'.
Edit the e2e test case in order to update a different setting as
the 'editMgrModule' method can't handle checkboxes.
Fixes: https://tracker.ceph.com/issues/43769
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit 7e7cac98116e76bc9e4c52ac47796e3fcd880667)
Yaarit Hatuka [Mon, 27 Jan 2020 13:57:55 +0000 (08:57 -0500)]
mgr/devicehealth: fix telemetry stops sending device reports after 48 hours
Telemetry module fetches device metrics which were scraped in the last
"telemetry interval"*2 (=48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample. But because it fetched in ascending order (from oldest to
newest) it was breaking on the first one it received, if it was older
than the interval above. We need to pass min_sample to get_omap_vals()
so it will start fetching from that value.
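The bug and the fix can be reproduced on a plain sorted list. This sketch stands in for the omap iteration; the function names and the timestamp key format are assumptions, and bisect plays the role of starting the fetch at min_sample.

```python
from bisect import bisect_left

def samples_since_buggy(sorted_keys, min_sample):
    """Old behaviour: ascending scan that stops at the first old key."""
    out = []
    for key in sorted_keys:
        if key < min_sample:
            break  # the oldest key comes first, so this fires immediately
        out.append(key)
    return out

def samples_since_fixed(sorted_keys, min_sample):
    """Fixed behaviour: start the scan at min_sample instead of the oldest key."""
    start = bisect_left(sorted_keys, min_sample)
    return sorted_keys[start:]
```

As soon as any stored sample is older than the interval, the old scan returns nothing, while the fixed scan returns every sample at or after min_sample.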
Sage Weil [Fri, 4 Oct 2019 20:03:02 +0000 (15:03 -0500)]
mgr/devicehealth: factor _get_device_metrics out of show_device_metrics
Add the min_sample lower-bound argument too
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7be5c1323b3814e2634d5cd66d45cab5a77df680)
Conflicts: had to be backported to enable backporting of
https://github.com/ceph/ceph/pull/32903
Backport tracker: https://tracker.ceph.com/issues/43873
Tiago Pasqualini [Fri, 31 Jan 2020 18:22:19 +0000 (15:22 -0300)]
rgw: make max_connections configurable in beast
Beast frontend currently accepts a hardcoded number of connections
that is defined by boost::asio::socket_base::max_connections. This
commit makes it configurable via a 'max_connections' config option
on rgw frontend.