Robin H. Johnson [Fri, 27 Mar 2020 19:48:13 +0000 (20:48 +0100)]
rgw: reject control characters in response-header actions
S3 GetObject permits overriding response header values, but those inputs
need to be validated to insure only characters that are valid in an HTTP
header value are present.
Credit: Initial vulnerability discovery by William Bowling (@wcbowling)
Credit: Further vulnerability discovery by Robin H. Johnson <rjohnson@digitalocean.com> Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit d8dd5e513c0c62bbd7d3044d7e2eddcd897bd400)
Ilya Dryomov [Fri, 6 Mar 2020 19:16:45 +0000 (20:16 +0100)]
msg/async/crypto_onwire: fix endianness of nonce_t
As a AES-GCM IV, nonce_t is implicitly shared between server and
client. Currently, if their endianness doesn't match, they are unable
to communicate in secure mode because each gets its own idea of what
the next nonce should be after the counter is incremented.
Several RFCs state that the nonce counter should be BE, but since we
use LE for everything on-disk and on-wire, make it LE.
The secure mode uses AES-128-GCM with 96-bit nonces consisting of a
32-bit counter followed by a 64-bit salt. The counter is incremented
after processing each frame, the salt is fixed for the duration of
the session. Both are initialized from the session key generated
during session negotiation, so the counter starts with essentially
a random value. It is allowed to wrap, and, after 2**32 frames, it
repeats, resulting in nonce reuse (the actual sequence numbers that
the messenger works with are 64-bit, so the session continues on).
Because of how GCM works, this completely breaks both confidentiality
and integrity aspects of the secure mode. A single nonce reuse reveals
the XOR of two plaintexts and almost completely reveals the subkey
used for producing authentication tags. After a few nonces get used
twice, all confidentiality and integrity goes out the window and the
attacker can potentially encrypt-authenticate plaintext of their
choice.
We can't easily change the nonce format to extend the counter to
64 bits (and possibly XOR it with a longer salt). Instead, just
remember the initial nonce and cut the session before it repeats,
forcing renegotiation.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
Conflicts:
src/msg/async/ProtocolV2.cc [ context: commit 697aafa2aad2
("msg/async/ProtocolV2: remove unused parameter") not in
nautilus ]
src/msg/async/ProtocolV2.h [ context: commit ed3ec4c01d17
("msg: Build target 'common' without using namespace in
headers") not in nautilus ]
Venky Shankar [Wed, 26 Feb 2020 04:52:37 +0000 (23:52 -0500)]
mgr/volumes: unregister job upon async threads exception
If the async threads hit a temporary exception the job is
never unregistered and therefore gets skipped by the async
threads on subsequent scans.
Patrick hit this in nautilus when one of the purge threads
hit an exception when trying to log a message. The trash
entry was never picked up again by the purge threads.
Patrick Donnelly [Tue, 25 Feb 2020 04:26:30 +0000 (20:26 -0800)]
Merge PR #33526 into nautilus
* refs/pull/33526/head:
test: verify purge queue w/ large number of subvolumes
test: pass timeout argument to mount::wait_for_dir_empty()
mgr/volumes: access volume in lockless mode when fetching async job
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Venky Shankar [Wed, 19 Feb 2020 12:31:40 +0000 (07:31 -0500)]
mgr/volumes: access volume in lockless mode when fetching async job
Saw a deadlock when deleting lot of subvolumes -- purge threads were
stuck in accessing global lock for volume access. This can happen
when there is a concurrent remove (which renames and signals the
purge threads) and a purge thread is just about to scan the trash
directory for entries.
For the fix, purge threads fetches entries by accessing the volume
in lockless mode. This is safe from functionality point-of-view as
the rename and directory scan is correctly handled by the filesystem.
Worst case the purge thread would pick up the trash entry on next
scan, never leaving a stale trash entry.
Yaarit Hatuka [Mon, 27 Jan 2020 13:57:55 +0000 (08:57 -0500)]
mgr/devicehealth: fix telemetry stops sending device reports after 48 hours
Telemetry module fetches device metrics which were scraped in the last
"telemetry interval"*2 (=48 hours by default) by calling
_get_device_metrics() with min_sample. _get_device_metrics() fetches the
metrics from omap and breaks on the first one that is older than
min_sample. But because it fetched in ascending order (from oldest to
newest) it was breaking on the first one it received, if it was older
than the interval above. We need to pass min_sample to get_omap_vals()
so it will start fetching from that value.
Sage Weil [Fri, 4 Oct 2019 20:03:02 +0000 (15:03 -0500)]
mgr/devicehealth: factor _get_device_metrics out of show_device_metrics
Add the min_sample lower-bound argument too
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7be5c1323b3814e2634d5cd66d45cab5a77df680)
Conflicts: had to be backported to enable backporting of
https://github.com/ceph/ceph/pull/32903
Backport tracker: https://tracker.ceph.com/issues/43873
Kefu Chai [Mon, 10 Feb 2020 08:27:22 +0000 (16:27 +0800)]
ceph-monstore-tool: rename mon-ids in initial monmap
when ceph-mon starts, it checks to see if it's listed in the monmap, if
not it complains
```
no public_addr or public_network specified, and mon.a not present in
monmap or ceph.conf.
```
then bails out. normally, the monitor will try to rename its name in
monmap when performing "mkfs", but in our case, we are merely using the
"mkfs" monmap for passing the monmap built by ceph-monstore-tools, and
we don't actually go through the "mkfs" process. so, ceph-mon won't
rename when booting up.
in this change, user is allowed to specify the mon-ids in command line
when rebuilding mondb, the default mon-ids would be a,b,c,... if not
specified.
The lvm batch command fails to prepare the OSDs on the created LV.
When using lvm batch, the LV/VG are created prior the OSD prepare.
During that creation, multiple tags are set with null value.
ceph-volume: add unit test test_safe_prepare_osd_already_created
This commit adds a new unit test
`test_safe_prepare_osd_already_created()` in order to test when
`is_ceph_device()` returns `True` `RuntimeError` is well raised.
Jan Fajerski [Mon, 6 Jan 2020 17:02:57 +0000 (18:02 +0100)]
ceph-volume: add available property in target specific flavors
This adds two properties available_[lvm,raw] to device (and thus inventory).
The goal is to have different notions of availability based on the
intended use case. For example finding LVM structures make a drive
unavailable for the raw mode, but might be available for the lvm mode.
Fixes: https://tracker.ceph.com/issues/43400 Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 233ccff24006082766b52a94b7c46cdf5b7cd929)
When using vg/lv, this function throws an error like following:
```
stderr: unable to read label for test_group/data-lv2: (2) No such file or directory
stderr: 2020-02-04T21:03:32.153+0000 7fe091af4200 -1 bluestore(test_group/data-lv2) _read_bdev_label failed to open test_group/data-lv2: (2) No such file or directory
```
When passing a vg/lv path for generating a single report, it fails
because the filter used in the `lvs` command isn't right. It uses the lv
name instead of the vg name because `os.path.basename(device)` is used
while it should be `os.path.dirname(device)`
Ramana Raja [Tue, 11 Feb 2020 10:49:09 +0000 (05:49 -0500)]
mgr/volumes: fix py2 compat issue
Fix the following issue seen while upstream teuthology testing,
File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 98, in load_config
self.metadata_mgr = MetadataManager(self.fs, self.legacy_config_path, 0o640)
File "/usr/share/ceph/mgr/volumes/fs/operations/versions/subvolume_base.py", line 73, in legacy_config_path
meta_config = "{0}.meta".format(m.digest().hex())
AttributeError: 'str' object has no attribute 'hex'
This issue is not observed in master/octopus, as it only supports
py3.