Kotresh HR [Thu, 26 Mar 2020 05:00:39 +0000 (10:30 +0530)]
nautilus: mgr/volumes: Add interface to get subvolume metadata
The following interface is added
"ceph fs subvolume info <vol_name> <sub_name> [<group_name>]"
The output is in json format with following fields
1. atime: access time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
2. mtime: modification time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
3. ctime: change time of subvolume path in the format "YYYY-MM-DD HH:MM:SS"
4. uid: uid of subvolume path
5. gid: gid of subvolume path
6. mode: mode of subvolume path
7. mon_addrs: list of monitor addresses
8. bytes_pcent: quota used in percentage if quota is set, else displays "undefined"
9. bytes_quota: quota size in bytes if quota is set, else displays "infinite"
10. bytes_used: current used size of the subvolume in bytes
11. created_at: time of creation of subvolume in the format "YYYY-MM-DD HH:MM:SS"
12. data_pool: data pool the subvolume belongs to
13. path: absolute path of a subvolume
14. type: subvolume type indicating whether it's clone or subvolume
Jan Fajerski [Wed, 4 Mar 2020 10:39:40 +0000 (11:39 +0100)]
ceph-volume: available_lvm: vg space takes precedence
This changes available_lvm to check for generic reasons only if no VGs
were found. A VG can contain a (mounted) lv, which triggers the
ro/locked test, despite the VG having space available.
Sage Weil [Wed, 11 Sep 2019 22:26:52 +0000 (17:26 -0500)]
mon: disable min pg per osd warning
Now that the pg_autoscaler is on by default, it is "normal" (and okay) to
have a small number of PGs in the cluster if the overall cluster usage is
also low. This setting just results in a health warning out of the box
when you create a pool and haven't written any data yet.
J. Eric Ivancich [Fri, 31 Jan 2020 20:01:40 +0000 (15:01 -0500)]
rgw: fix bug with (un)ordered bucket listing and marker w/ namespace
When listing without specifying a namespace, the returned entries
could be in one or more namespaces. The marker used to continue the
listing may therefore contain a namespace, and that needs to be
preserved. This fixes a bug in both ordered and unordered listings
where it was not preserved.
Shilpa Jagannath [Tue, 26 Nov 2019 08:03:52 +0000 (13:33 +0530)]
rgw: when a period lookup for oldest_realm_epoch returns an ENOENT,
find the oldest one and update RGWMetadataLogHistory. This is to avoid an
empty cursor being passed in to ceph_assert() in PurgePeriodLogsCR::operate()
in case of incomplete period history.
mgr/dashboard: fix COVERAGE_PATH in run-backend-api-tests.sh
As we cannot backport directly https://github.com/ceph/ceph/pull/33407
to nautilus and COVERAGE_PATH is invalid for CentOS7 + py2,
this fix is applied directly to nautilus.
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Yan, Zheng [Tue, 9 Oct 2018 03:46:56 +0000 (11:46 +0800)]
mds: handle bad purge queue item encoding
The bad encoding was introduced by commit a88f8d5eb4
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit ba06fcbe3e8345dc2c4c4c3dc3bcc18acf5ab076)
Conflicts:
src/mds/PurgeQueue.cc: advance changed to += Fixes: https://tracker.ceph.com/issues/36635
Note: This commit from v13.2.3 fixes a bad backport in v13.2.2. It is
also required in Octopus/Nautilus to handle upgrades.
(cherry picked from commit b73d1989bcbea227017607f8dd6e79633ec11f8f)
Conflicts:
src/mds/PurgeQueue.cc: += changed to advance
Conflicts:
src/pybind/rbd/rbd.pyx
- no "snap_exists", "snap_get_name", "snap_get_id",
"mirror_image_create_snapshot", "mirror_image_get_mode", "config_set",
"config_get", "config_remove", "snap_get_mirror_namespace", in nautilus
- nautilus "mirror_image_enable" does not take any argument
zhangdaolong [Tue, 24 Mar 2020 00:51:44 +0000 (08:51 +0800)]
pybind/rbd: fix no lockers are obtained, ImageNotFound exception will be output
No lockers are obtained, ImageNotFound exception will be output,
but tht image is always exist.when lockers number is zero,
Should not output any exceptions。
Fixes: https://tracker.ceph.com/issues/44613 Signed-off-by: zhangdaolong <zhangdaolong@fiberhome.com>
(cherry picked from commit a183aac978dac69f996250324975073a78cb476b)
Tatjana Dehler [Fri, 8 Nov 2019 12:51:40 +0000 (13:51 +0100)]
mgr/dashboard: fix tests in order to match pg num conventions
Update the tests test_ganesha and test_rbd_mirroring in order
to match the PG num conventions. It prevents the health warning
'POOL_PG_NUM_NOT_POWER_OF_TWO' from being shown.
This is an alternative (and straightforward) way to specify barred space not allowed for 'use
some extra' BlueFS volume selector policy.
Disabled by default since it should depend on RocksDB settings and actual volume size.
Robin H. Johnson [Fri, 27 Mar 2020 19:48:13 +0000 (20:48 +0100)]
rgw: reject control characters in response-header actions
S3 GetObject permits overriding response header values, but those inputs
need to be validated to insure only characters that are valid in an HTTP
header value are present.
Credit: Initial vulnerability discovery by William Bowling (@wcbowling)
Credit: Further vulnerability discovery by Robin H. Johnson <rjohnson@digitalocean.com> Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit d8dd5e513c0c62bbd7d3044d7e2eddcd897bd400)
Ilya Dryomov [Fri, 6 Mar 2020 19:16:45 +0000 (20:16 +0100)]
msg/async/crypto_onwire: fix endianness of nonce_t
As a AES-GCM IV, nonce_t is implicitly shared between server and
client. Currently, if their endianness doesn't match, they are unable
to communicate in secure mode because each gets its own idea of what
the next nonce should be after the counter is incremented.
Several RFCs state that the nonce counter should be BE, but since we
use LE for everything on-disk and on-wire, make it LE.
The secure mode uses AES-128-GCM with 96-bit nonces consisting of a
32-bit counter followed by a 64-bit salt. The counter is incremented
after processing each frame, the salt is fixed for the duration of
the session. Both are initialized from the session key generated
during session negotiation, so the counter starts with essentially
a random value. It is allowed to wrap, and, after 2**32 frames, it
repeats, resulting in nonce reuse (the actual sequence numbers that
the messenger works with are 64-bit, so the session continues on).
Because of how GCM works, this completely breaks both confidentiality
and integrity aspects of the secure mode. A single nonce reuse reveals
the XOR of two plaintexts and almost completely reveals the subkey
used for producing authentication tags. After a few nonces get used
twice, all confidentiality and integrity goes out the window and the
attacker can potentially encrypt-authenticate plaintext of their
choice.
We can't easily change the nonce format to extend the counter to
64 bits (and possibly XOR it with a longer salt). Instead, just
remember the initial nonce and cut the session before it repeats,
forcing renegotiation.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
Conflicts:
src/msg/async/ProtocolV2.cc [ context: commit 697aafa2aad2
("msg/async/ProtocolV2: remove unused parameter") not in
nautilus ]
src/msg/async/ProtocolV2.h [ context: commit ed3ec4c01d17
("msg: Build target 'common' without using namespace in
headers") not in nautilus ]
Jan Fajerski [Wed, 8 Apr 2020 09:55:57 +0000 (11:55 +0200)]
ceph-volume/batch: return success when all devices are filtered
batch should only return an error if some (but not all) devices are
filtered. When only some devices are filtered the resulting osd layout
could look very different from what a user expects. If all devies are
filtered just return success.
Fixes: https://tracker.ceph.com/issues/44994 Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit cb3eade3d00fdcdb34c4156f3c488bbb8747cd6f)
dashboard: Resolve FQDN / hostname mismatch in hosts overview panel
In the AVG Disk Utilization panel, the result is calculated
by combining the output of node_disk_io_time_seconds_total
with the output of ceph_disk_occupation. However, the
first vector encodes the instance label with the full FQDN
while the ceph label only contains the hostname:port. In
order for these to match correctly, the domain name and port
has to be stripped from the labels.
When moving to LVM-based ceph-volume setups, several
grafana dashboards stopped working. The problem is that
(device, instance) no longer results in unique labels
which causes errors like:
"many-to-many matching not allowed: matching labels must be unique on one side"
The references to `$osd_hosts` etc. were encoded as
`[[osd_hosts]]` in the PromQL expression divisor, and
the panel always displayed N/A as the result of the
query.
Replacing the `[[...]]` with `$...` makes the expression
work again.
qa/suites: use "nautilus" branch for cbt based testing
cbt is using features like the f string introduced by python3.6, so we
need either stop testing on ubuntu 16.04, which came with python3.5, or
specify the a cbt branch which support python3.5. to address this issue
once and for all, it's simpler just pin the branch to "nautilus" when
performing cbt based tests.
this change is not cherry-picked from master, as in master, we only
tests on distros which offers python3.6 and up.
mgr/dashboard: show alert panel if prometheus/alertmanager is unconfigured
If the tabs under the "Monitoring" page aren't properly configured, a
notification is shown which explains the user which setting needs to be
enabled and also provides a link to the corresponding documentation.
Fixes: https://tracker.ceph.com/issues/42877 Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 460f7bb3272c6536c9a5fc0919071d7c17e9aa5a)
by adding the previously added monitoring related features as well as
the newest feature addition. Extends the documentation where necessary
to describe the Prometheus' alert configuration.
Fixes: https://tracker.ceph.com/issues/42877 Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 36421284c315baf7e79a8c0586ca98ac0126037e)
mgr/dashboard: move monitoring tabs to a single page
with a tab for 'active alerts', 'all alerts' and 'silences'. Due to
ambiguity with existing names, `AlertListComponent` has been renamed to
`ActiveAlertListComponent`. Introduces `MonitoringListComponent` as
first page for monitoring concerns, using path `/monitoring`.
Keeps the activated tab open, independent of the way that's used to go
back to the previous page, be it the cancel button or submit button or
the link on the breadcrumb. Also keeps the active tab open even when the
page is reloaded.
Fixes: https://tracker.ceph.com/issues/42877 Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 855f214b29c8ed935c8f4ba0b8a8396692f946a1)
mgr/dashboard: refactor test of Prometheus alert service
Mocking the test the way it was removed the asynchronous nature of the
test. By using an Observable the test can stay asynchronous and be
tested as well.