git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
4 years ago msg/msg_types: entity_addrvec_t: fix decode on big-endian hosts 36814/head
Ulrich Weigand [Fri, 3 Jul 2020 13:47:00 +0000 (15:47 +0200)]
msg/msg_types: entity_addrvec_t: fix decode on big-endian hosts

When decoding an entity_addrvec_t with marker 1, we just have
a single (non-legacy) entity_addr_t.  This should be decoded
exactly the same as done by entity_addr_t::decode, but it
currently is not.  Specifically, the sa_family member of
the sockaddr is not converted from the on-wire little-endian
format to host byte order (as done by entity_addr_t::decode).

Fixed by using the same code as in entity_addr_t::decode.
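
As a minimal sketch of the byte-order issue described above (this is not the Ceph code itself), the following Python snippet shows what a big-endian host sees when a 16-bit field that is little-endian on the wire is used without conversion. The sample value 2 and the variable names are assumptions chosen for illustration.

```python
import struct

# Sample address-family value; 2 is used here purely as illustrative data.
FAMILY = 2

# On the wire the 16-bit field is little-endian: bytes 0x02 0x00.
wire = struct.pack("<H", FAMILY)

# Correct decode: always interpret the wire bytes as little-endian.
decoded = struct.unpack("<H", wire)[0]

# What a big-endian host sees if the raw bytes are used without conversion.
unconverted_on_big_endian = struct.unpack(">H", wire)[0]

print(decoded)                    # 2
print(unconverted_on_big_endian)  # 512 (0x0200)
```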

Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
(cherry picked from commit 31da17378b712542e915adbf4084e0212b8bb615)

4 years ago messages,mds: Fix decoding of enum types on big-endian systems
Ulrich Weigand [Tue, 18 Aug 2020 07:51:22 +0000 (09:51 +0200)]
messages,mds: Fix decoding of enum types on big-endian systems

When a struct member that has enum type needs to be encoded or
decoded, we need to use an explicit integer type, since there
are no encode routines for the enum type.  (This is probably
to avoid introducing dependencies on implementation-defined
choices by the compiler to use a particular underlying type.)

This leads to code sequences along the lines of:
  encode((int32_t)state, bl);
and
  decode((int32_t&)(state), bl);

The encode line is actually fine, but the decode line is
incorrect on big-endian systems if the underlying type of
the enum differs from the explicitly chosen integer type.

This is because this performs in effect a pointer cast,
and will write the decoded int32_t value into the memory
backing the "state" member variable.  If the sizes differ,
the value is written into the wrong bytes of "state" on
big-endian systems.

This patch fixes the problem by decoding into an intermediate
variable of the integer type first, and then casting the result
while assigning to the struct member of enum type.
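
The effect of the reference cast can be emulated in Python as a rough sketch. It assumes an enum whose underlying type is one byte wide (a hypothetical choice) and models the memory behind "state" as the first byte of the four bytes written by the int32_t decode. The function names are illustrative and do not exist in Ceph.

```python
import struct

def decode_int32(wire):
    """Decode a little-endian int32 from the wire into an integer value."""
    return struct.unpack("<i", wire)[0]

def buggy_decode(wire, host_is_big_endian, enum_size=1):
    """Emulate decode((int32_t&)(state), bl) for an enum of enum_size bytes.

    The decoded value is stored as four host-order bytes starting at the
    address of "state"; the enum itself only covers the first enum_size bytes.
    """
    order = "big" if host_is_big_endian else "little"
    memory = decode_int32(wire).to_bytes(4, order, signed=True)
    return int.from_bytes(memory[:enum_size], order)

def fixed_decode(wire):
    """Decode into an intermediate integer, then convert by value."""
    tmp = decode_int32(wire)
    return tmp  # stands in for: state = static_cast<EnumType>(tmp)

wire = struct.pack("<i", 3)
print(buggy_decode(wire, host_is_big_endian=False))  # 3
print(buggy_decode(wire, host_is_big_endian=True))   # 0
print(fixed_decode(wire))                            # 3
```

On the emulated little-endian host the low-order byte happens to land first, which is why the pattern goes unnoticed there.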

This bug showed up initially as invalid health-status values
causing Ceph daemon aborts on s390x.  I've tried to find and
fix all other instances of the same enum decode pattern as well.

Fixes: https://tracker.ceph.com/issues/47015
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
(cherry picked from commit 7ed716823fd02d84ea53cb61350bf14f248ebb8b)

Conflicts:
src/mds/PurgeQueue.cc
- nautilus has "p.advance(pad_size)", instead of "p += pad_size",
  in the line immediately preceding the first change

4 years ago Merge pull request #37254 from bstillwell/wip-47425-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 22:25:55 +0000 (15:25 -0700)]
Merge pull request #37254 from bstillwell/wip-47425-nautilus

nautilus: compressor: Add a config option to specify Zstd compression level

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago Merge pull request #37269 from badone/wip-nautilus-enable-mgr-client-debug
Yuri Weinstein [Thu, 24 Sep 2020 22:25:32 +0000 (15:25 -0700)]
Merge pull request #37269 from badone/wip-nautilus-enable-mgr-client-debug

nautilus: tests/qa: Enable debug_client for mgr tests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
4 years ago Merge pull request #37318 from yaarith/nautilus-fix-dev-id-split
Yuri Weinstein [Thu, 24 Sep 2020 22:14:45 +0000 (15:14 -0700)]
Merge pull request #37318 from yaarith/nautilus-fix-dev-id-split

nautilus: mgr/telemetry: fix device id splitting when anonymizing serial

Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
4 years ago Merge pull request #37288 from yaarith/nautilus-add-smartctl-nvme-dependencies
Yuri Weinstein [Thu, 24 Sep 2020 22:12:37 +0000 (15:12 -0700)]
Merge pull request #37288 from yaarith/nautilus-add-smartctl-nvme-dependencies

nautilus: ceph.spec.in, debian/control: add smartmontools and nvme-cli dependen…

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #37324 from aaSharma14/wip-47570-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 20:45:29 +0000 (13:45 -0700)]
Merge pull request #37324 from aaSharma14/wip-47570-nautilus

nautilus: mgr/dashboard: table detail rows overflow

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
4 years ago Merge pull request #37309 from rhcs-dashboard/wip-47579-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 20:44:49 +0000 (13:44 -0700)]
Merge pull request #37309 from rhcs-dashboard/wip-47579-nautilus

nautilus: mgr/dashboard: fix pool usage calculation

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
4 years ago Merge pull request #37226 from smithfarm/wip-47532-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 20:44:10 +0000 (13:44 -0700)]
Merge pull request #37226 from smithfarm/wip-47532-nautilus

nautilus: ceph.in: ignore failures to flush stdout

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years ago Merge pull request #36984 from k0ste/wip-47281-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 20:43:36 +0000 (13:43 -0700)]
Merge pull request #36984 from k0ste/wip-47281-nautilus

nautilus: prometheus: Properly split the port off IPv6 addresses

Reviewed-by: Jan Fajerski <jfajerski@suse.com>
4 years ago Merge pull request #37209 from smithfarm/wip-46983-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 15:03:20 +0000 (08:03 -0700)]
Merge pull request #37209 from smithfarm/wip-46983-nautilus

nautilus: test/rbd-mirror: pool watcher registration error might result in race

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #37165 from dillaman/wip-47100-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 15:01:59 +0000 (08:01 -0700)]
Merge pull request #37165 from dillaman/wip-47100-nautilus

nautilus:  librbd: using migration abort can result in the loss of data

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
4 years ago Merge pull request #37157 from smithfarm/wip-46519-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 15:00:46 +0000 (08:00 -0700)]
Merge pull request #37157 from smithfarm/wip-46519-nautilus

nautilus: rgw: fix boost::asio::async_write() does not return error...

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years ago Merge pull request #37040 from trociny/wip-46720-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 14:59:52 +0000 (07:59 -0700)]
Merge pull request #37040 from trociny/wip-46720-nautilus

nautilus: librbd: don't resend async_complete if watcher is unregistered

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago Merge pull request #36880 from smithfarm/wip-47186-nautilus
Yuri Weinstein [Thu, 24 Sep 2020 14:59:10 +0000 (07:59 -0700)]
Merge pull request #36880 from smithfarm/wip-47186-nautilus

nautilus: rgw: dump transitions in RGWLifecycleConfiguration::dump()

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years ago Merge pull request #36843 from Huber-ming/nautilus-loglevel
Yuri Weinstein [Thu, 24 Sep 2020 14:58:23 +0000 (07:58 -0700)]
Merge pull request #36843 from Huber-ming/nautilus-loglevel

nautilus: rgw: log resharding events at level 1 (formerly 20)

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years ago Merge pull request #37195 from guits/wip-47504-nautilus
Jan Fajerski [Thu, 24 Sep 2020 12:17:29 +0000 (14:17 +0200)]
Merge pull request #37195 from guits/wip-47504-nautilus

nautilus: ceph-volume: fix simple activate when legacy osd

4 years ago Merge pull request #37295 from rhcs-dashboard/wip-47573-nautilus
Lenz Grimmer [Thu, 24 Sep 2020 08:31:32 +0000 (10:31 +0200)]
Merge pull request #37295 from rhcs-dashboard/wip-47573-nautilus

nautilus: mgr/dashboard: cpu stats incorrectly displayed

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Patrick Seidensal <pnawracay@suse.com>
4 years ago Merge pull request #37033 from smithfarm/wip-47350-nautilus
Yuri Weinstein [Wed, 23 Sep 2020 19:21:28 +0000 (12:21 -0700)]
Merge pull request #37033 from smithfarm/wip-47350-nautilus

nautilus: core: include/encoding: Fix encode/decode of float types on big-endian systems

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
4 years ago Merge pull request #37169 from ivancich/nautilus-47487
Yuri Weinstein [Wed, 23 Sep 2020 15:27:05 +0000 (08:27 -0700)]
Merge pull request #37169 from ivancich/nautilus-47487

nautilus: rgw: ordered bucket listing code clean-up

Reviewed-by: Casey Bodley <cbodley@redhat.com>
4 years ago mgr/dashboard: table detail rows overflow 37324/head
Aashish Sharma [Wed, 16 Sep 2020 06:49:10 +0000 (12:19 +0530)]
mgr/dashboard: table detail rows overflow

Added word-wrap to the rgw-bucket-details table rows to fix overflow of values.

Fixes: https://tracker.ceph.com/issues/47434
Signed-off-by: Aashish Sharma <aasharma@redhat.com>
(cherry picked from commit 7196e03449d585a6b7e1e4065120a571fe2ab894)

4 years ago rgw: dump transitions in RGWLifecycleConfiguration::dump() 36880/head
Shengming Zhang [Sat, 18 Jul 2020 06:16:34 +0000 (14:16 +0800)]
rgw: dump transitions in RGWLifecycleConfiguration::dump()

Signed-off-by: Shengming Zhang <zhangsm01@inspur.com>
(cherry picked from commit c843b6f08799c1fef39a4a6fd24c27207d1e951e)

Conflicts:
src/rgw/rgw_json_enc.cc
- trivial whitespace difference
- dump_object() accepts string_view in master branch, but accepts char*
  in nautilus branch

4 years ago Merge pull request #36863 from batrick/i47178
Yuri Weinstein [Tue, 22 Sep 2020 19:49:18 +0000 (12:49 -0700)]
Merge pull request #36863 from batrick/i47178

nautilus: qa/tasks/vstart_runner: use parent's umount methods

Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
4 years ago mgr/telemetry: fix device id splitting when anonymizing serial 37318/head
Yaarit Hatuka [Thu, 27 Aug 2020 03:04:34 +0000 (23:04 -0400)]
mgr/telemetry: fix device id splitting when anonymizing serial

Anonymizing the serial number in the device id string fails in rare
cases where 'vendor' and 'model' are missing from the device id
string. Ideally, device id is generated (in blkdev.cc) as
'vendor_model_serial', in case all fields were successfully retrieved
from the device. In cases where they were not, device id can also be
generated as 'model_serial' or 'serial'. Splitting by '_' fails in the
latter case (since 'serial' is the only element in the string).

In order to anonymize serial numbers in smartctl reports we now rely
on the serial number value as retrieved from the raw smartctl report
itself (as opposed to the one in device id). That's in order to avoid
possible inconsistencies between the serial retrieved from device id and
the one in the report.

Fixes: https://tracker.ceph.com/issues/46977
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit e5099a7b58bcf39d80beb908c192c3bf639db1a4)

Conflicts:
src/pybind/mgr/telemetry/module.py

In master we use Python 3's f-string formatting to create 'anon_devid':
anon_devid = f"{devid.rsplit('_', 1)[0]}_{uuid.uuid1()}"

The conflict happened since Nautilus still uses Python 2, and 'anon_devid'
is created via string concatenation:
anon_devid = devid[:devid.rfind('_')] + '_' + str(uuid.uuid1())
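
As an illustrative sketch only, the snippet below runs the two expressions quoted above on a full 'vendor_model_serial' id and on an id that contains only a serial, which helps show why relying on the device id string is fragile. The helper names and the sample ids are hypothetical and not taken from the telemetry module.

```python
import uuid

def anon_devid_rsplit(devid):
    # Same expression as the master-branch f-string, written with format().
    return "{}_{}".format(devid.rsplit('_', 1)[0], uuid.uuid1())

def anon_devid_rfind(devid):
    # Same expression as the Nautilus string concatenation.
    return devid[:devid.rfind('_')] + '_' + str(uuid.uuid1())

full = "vendor_model_SERIAL123"   # hypothetical device id
bare = "SERIAL123"                # hypothetical id with only a serial

print(anon_devid_rsplit(full).startswith("vendor_model_"))  # True
print(anon_devid_rfind(full).startswith("vendor_model_"))   # True
print(anon_devid_rsplit(bare).startswith("SERIAL123_"))     # True: serial is kept
print(anon_devid_rfind(bare).startswith("SERIAL12_"))       # True: one character dropped
```

With only a serial in the id, neither expression removes the serial: rsplit() returns the whole string, and rfind() returns -1, so the slice merely drops the last character.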

4 years ago Merge pull request #37283 from yuriw/wip-yuriw-47561-nautilus
Yuri Weinstein [Tue, 22 Sep 2020 14:19:38 +0000 (07:19 -0700)]
Merge pull request #37283 from yuriw/wip-yuriw-47561-nautilus

nautilus: qa/tests: removed ../stress-split/7-final-workload/rbd-python.yaml

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago Merge pull request #37278 from tspmelo/wip-47558-nautilus
Laura Paduano [Tue, 22 Sep 2020 14:08:38 +0000 (16:08 +0200)]
Merge pull request #37278 from tspmelo/wip-47558-nautilus

nautilus: mgr/dashboard: Allow editing iSCSI targets with initiators logged-in

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
4 years ago mgr/dashboard: fix pool usage calculation 37309/head
Ernesto Puerta [Thu, 25 Jun 2020 09:17:22 +0000 (11:17 +0200)]
mgr/dashboard: fix pool usage calculation

Currently, the Dashboard pool usage calculation does not match the output
of the 'ceph df' command.

Fixes: https://tracker.ceph.com/issues/45185
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit b4a9dc17a3de90379964443d26b29f1759824f28)

Conflicts:
qa/tasks/mgr/dashboard/helper.py
src/mon/PGMap.cc
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-list/pool-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/components/usage-bar/usage-bar.component.ts:
        - Keep UsageBar component totalBytes/usedBytes names
        - Bring new UsageBar option decimal
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
4 years ago Merge pull request #37284 from idryomov/wip-krbd-noudev-nautilus
Ilya Dryomov [Tue, 22 Sep 2020 07:52:48 +0000 (09:52 +0200)]
Merge pull request #37284 from idryomov/wip-krbd-noudev-nautilus

nautilus: krbd: optionally skip waiting for udev events

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
4 years ago mgr/dashboard: cpu stats incorrectly displayed 37295/head
Avan Thakkar [Thu, 23 Jul 2020 06:27:32 +0000 (11:57 +0530)]
mgr/dashboard: cpu stats incorrectly displayed

Fixes: https://tracker.ceph.com/issues/46683
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
(cherry picked from commit f039e5585d552c553e37a5f68713cfe2b109b97f)

4 years ago ceph.spec.in, debian/control: add smartmontools and nvme-cli dependencies 37288/head
Yaarit Hatuka [Fri, 18 Sep 2020 03:25:56 +0000 (03:25 +0000)]
ceph.spec.in, debian/control: add smartmontools and nvme-cli dependencies

These packages are needed in order to scrape device health metrics from
devices used by OSD and MON daemons.

smartmontools' smartctl is what we use in order to scrape devices' SMART
attributes and general health metrics.
In addition, we use nvme-cli tool on NVMe devices, which fetches
vendor specific NVMe related health metrics.

Ceph relies on these tools for proper functioning of the underlying layers
of the devicehealth mgr module, and of other mgr modules which use
devicehealth functionality (such as diskprediction_local, telemetry,
dashboard).

Essentially, most devicehealth commands rely on proper functioning of
smartctl; without it they lack the device health metrics.

For example, in case smartctl is missing, the commands:
    ceph device scrape-daemon-health-metrics <who>
    ceph device scrape-health-metrics [<devid>]
will not be able to scrape health metrics, and the command:
    ceph device predict-life-expectancy <devid>
will not provide any meaningful output (since there are no metrics).

In short, when we scrape a device by its daemon (be it an OSD or a MON):
  ceph device scrape-daemon-health-metrics <who>
The devicehealth module command eventually invokes a
block_device_get_metrics() call in either osd/OSD.cc or mon/Monitor.cc,
which wraps calls to both
    block_device_run_smartctl()       (spawns smartctl)
    block_device_run_vendor_nvme()    (spawns nvme)
in common/blkdev.cc.

Minimum version requirements:
'smartmontools' is the package name, which contains two utility
programs: 'smartd' and 'smartctl'. Ceph uses the latter.

Version 6.7 of smartctl first introduced the --json option (beta), which
allows the metrics to be output in JSON format. Since then a few
adjustments were made, and the feature officially launched in smartctl
version 7.0.
Since we rely on the JSON format to process the metrics, we must have
smartmontools' smartctl version >= 7.
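
A version check along these lines could be sketched as follows. This is not part of Ceph: the function name is hypothetical, and it assumes the version banner starts with "smartctl <major>.<minor>", as in the sample strings used below.

```python
import re

def smartctl_supports_json(version_line):
    """Return True if the reported smartctl version is at least 7.0."""
    match = re.match(r"smartctl (\d+)\.(\d+)", version_line)
    if match is None:
        # Unrecognized banner: treat as unsupported rather than guess.
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (7, 0)

# Sample banner lines, used as illustrative input only.
print(smartctl_supports_json("smartctl 7.1 2019-12-30 r5022"))  # True
print(smartctl_supports_json("smartctl 6.6 2017-11-05 r4594"))  # False
```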

That said, we choose not to specify smartmontools version here on
purpose, since there might be a scenario where:
- We specified smartmontools version to be >= 7.
- smartmontools 7 is not available yet in rhel 8 / centos 8.
- A user installs via rpm ceph-osd, for example.
- smartmontools will not be installed (since version >= 7 is not available
  in this repo yet).
- Then the user upgrades to 8.3 (which should have smartmontools >= 7),
  but smartmontools will not get upgraded (since it's not installed).

In the scenario where we do not specify a version, smartmontools 6.6
will be installed, but it will be upgraded to >= 7 when a user upgrades
(and if it's a fresh installation - version >= 7 would be installed
anyway).

nvme-cli does not have a minimum version.

We use 'Recommends' for both rpm and deb packages since we do not want
the installation to fail in case of conflicts. 'Recommends' makes this a
weak dependency: the package is installed when possible, but is skipped
if it conflicts with other dependencies.

It's worth mentioning that smartmontools and nvme-cli dependencies exist
in ceph-container builds.
We add them here for the cases of bare metal installations.

In the future we will add a separate package (with smartmontools and
nvme-cli dependencies) that can be installed on any node (running
rbd-mirror, rgw, mds, mgr, etc.), in order to be able to collect the
health metrics of its devices and offer their life expectancy
prediction.

Fixes: https://tracker.ceph.com/issues/47479
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit d5528a7e8e3b8289288d5a0a55d57d9935a3966c)

Conflicts:
    ceph.spec.in

Had to remove the line:
    Requires: python%{python3_pkgversion}-ceph-common = %{_epoch_prefix}%{version}-%{release}
which slipped in between
    Requires: libstoragemgmt
and
    %if 0%{?weak_deps}

Also removed the dependencies for the mon package from the cherry-picked
commit, in both ceph.spec.in and debian/control.
That's because in Nautilus we do not scrape the health metrics of mon devices
(please see commit d592e56e74d94c6a05b9240fcb0031868acefbab).

4 years ago Merge pull request #37051 from trociny/wip-47362-nautilus
Yuri Weinstein [Mon, 21 Sep 2020 19:50:20 +0000 (12:50 -0700)]
Merge pull request #37051 from trociny/wip-47362-nautilus

nautilus: os/bluestore: fix collection_list ordering

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago qa/tests: removed ../stress-split/7-final-workload/rbd-python.yaml 37283/head
Yuri Weinstein [Mon, 21 Sep 2020 17:26:26 +0000 (10:26 -0700)]
qa/tests: removed ../stress-split/7-final-workload/rbd-python.yaml

Fixes: https://tracker.ceph.com/issues/47561
Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
4 years ago Merge pull request #36828 from batrick/i47152
Yuri Weinstein [Mon, 21 Sep 2020 15:21:47 +0000 (08:21 -0700)]
Merge pull request #36828 from batrick/i47152

nautilus: pybind/mgr/volumes: add global lock debug

Reviewed-by: Venky Shankar <vshankar@redhat.com>
4 years ago Merge pull request #36714 from kotreshhr/wip-46948-nautilus
Yuri Weinstein [Mon, 21 Sep 2020 15:20:44 +0000 (08:20 -0700)]
Merge pull request #36714 from kotreshhr/wip-46948-nautilus

nautilus: qa: Fix traceback during fs cleanup between tests

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago Merge pull request #36181 from smithfarm/wip-46592-nautilus
Yuri Weinstein [Mon, 21 Sep 2020 15:20:06 +0000 (08:20 -0700)]
Merge pull request #36181 from smithfarm/wip-46592-nautilus

nautilus: common:  ignore SIGHUP prior to fork

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
4 years ago qa: add test for mapping and unmapping from a network namespace 37284/head
Ilya Dryomov [Wed, 16 Sep 2020 14:38:10 +0000 (16:38 +0200)]
qa: add test for mapping and unmapping from a network namespace

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d2884adb1542de7a43f82eb899056aa74de95052)

4 years ago ceph-volume: fix wrong type passed in terminal.warning() 37195/head
Guillaume Abrioux [Fri, 18 Sep 2020 11:51:51 +0000 (13:51 +0200)]
ceph-volume: fix wrong type passed in terminal.warning()

`terminal.warning()` expects a `str`.
Passing `e` means we pass an object of type `exceptions.RuntimeError`.

Changing to `terminal.warning(e.message)` fixes the issue.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877672
Resolves: rhbz#1877672

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a1f42c8d7b3fe08da82c528038d8db9ccdd5c98a)

4 years ago ceph-volume: fix simple activate when legacy osd
Guillaume Abrioux [Thu, 10 Sep 2020 23:13:06 +0000 (01:13 +0200)]
ceph-volume: fix simple activate when legacy osd

`ceph-volume simple activate --all` relies on the presence of json files
in `/etc/ceph/osd` that were created with the `ceph-volume simple scan`
command.

In a cluster lifecycle, it is very likely that an OSD which was deployed
with ceph-disk at some point gets removed or replaced. This means the
corresponding json file in `/etc/ceph/osd` becomes irrelevant. This makes
`ceph-volume simple activate --all` fail because it tries to mount
nonexistent partitions.
The idea here is to simply warn the user that the osd described in the
json file doesn't exist anymore and exit properly instead of throwing an
error.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1877672
Closes: https://tracker.ceph.com/issues/47493
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a3e9e215bda110b3224e165bee6565943b3f3c14)

4 years ago mgr/dashboard: Allow editing iSCSI targets with initiators logged-in 37278/head
Tiago Melo [Mon, 7 Sep 2020 09:47:19 +0000 (09:47 +0000)]
mgr/dashboard: Allow editing iSCSI targets with initiators logged-in

Fixes: https://tracker.ceph.com/issues/47393
Signed-off-by: Tiago Melo <tmelo@suse.com>
(cherry picked from commit 6de09f131074294b71e47ab0e168036a1fcc35fe)

4 years ago qa: Enable debug_client for mgr tests 37269/head
Brad Hubbard [Wed, 16 Sep 2020 02:16:23 +0000 (12:16 +1000)]
qa: Enable debug_client for mgr tests

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 926e06caf5c1262ab1268126d1d775897ff87471)

4 years ago compressor: Set the Zstd default compression level to 1 37254/head
Bryan Stillwell [Tue, 24 Mar 2020 21:15:41 +0000 (15:15 -0600)]
compressor: Set the Zstd default compression level to 1

The default compression level of 5 for Zstandard is too high for the majority
of use cases since it requires too many CPU cycles.  This patch switches the
default to 1.

Fixes: https://tracker.ceph.com/issues/44724
Signed-off-by: Bryan Stillwell <bstillwell@godaddy.com>
(cherry picked from commit caf74d533b0c6c9e6fc5b1463ae2c3be1103d7f3)

4 years ago compressor: Add a config option to specify Zstd compression level
Bryan Stillwell [Fri, 6 Mar 2020 17:58:50 +0000 (10:58 -0700)]
compressor: Add a config option to specify Zstd compression level

Add a new configuration item called 'compressor_zstd_level' so that the
Zstandard compression level can be tuned to the workload on a cluster.

Fixes: https://tracker.ceph.com/issues/43377
Signed-off-by: Bryan Stillwell <bstillwell@godaddy.com>
(cherry picked from commit 82699067b89eab01744f1b7f10490ec0975bb1a6)

4 years ago Merge pull request #37161 from rhcs-dashboard/alert-loading
Lenz Grimmer [Fri, 18 Sep 2020 10:56:59 +0000 (12:56 +0200)]
Merge pull request #37161 from rhcs-dashboard/alert-loading

nautilus: mgr/dashboard: Monitoring: Fix for the infinite loading bar action

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
4 years ago ceph.in: ignore failures to flush stdout 37226/head
Dan van der Ster [Mon, 14 Sep 2020 14:23:53 +0000 (16:23 +0200)]
ceph.in: ignore failures to flush stdout

Catch an IOError exception when flushing ceph stdout.
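
The pattern can be sketched roughly as below. This is a generic illustration rather than the actual ceph.in change; the helper name and the BrokenStream class are hypothetical.

```python
import sys

def flush_quietly(stream=None):
    """Flush a stream, ignoring IOError; return True if the flush succeeded."""
    if stream is None:
        stream = sys.stdout
    try:
        stream.flush()
    except IOError:
        # For example, the reading end of a pipe has already gone away.
        return False
    return True

class BrokenStream(object):
    """Hypothetical stand-in for a stdout whose reader has exited."""
    def flush(self):
        raise IOError("broken pipe")

print(flush_quietly(BrokenStream()))  # False
```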

Fixes: https://tracker.ceph.com/issues/47442
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 48503413a28fbea32f8ef3d48cb765771216f165)

4 years ago mgr/dashboard: Monitoring: Fix for the infinite loading bar action 37161/head
nizamial09 [Tue, 15 Sep 2020 13:01:52 +0000 (18:31 +0530)]
mgr/dashboard: Monitoring: Fix for the infinite loading bar action

Only seen in nautilus.
Intended to fix the unusual behaviour in the All Alerts tab, where the
loading bar progresses continuously until one of the alerts is selected.

To reproduce:
Navigate to cluster -> Monitoring -> All Alerts tab. You can see the
progress bar at the bottom of the table.

Fixes: https://tracker.ceph.com/issues/47435
Signed-off-by: Nizamudeen A <nia@redhat.com>
4 years ago test/rbd-mirror: pool watcher registration error might result in race 37209/head
Jason Dillaman [Wed, 5 Aug 2020 16:36:26 +0000 (12:36 -0400)]
test/rbd-mirror: pool watcher registration error might result in race

The init finish context should be swapped out before it attempts to
re-register the watcher. This affects the test case which mocks the
timer to fire immediately instead of after 30 seconds.

Fixes: https://tracker.ceph.com/issues/46669
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c89d31ebf6c412d609123979c63ebc600b70e179)

Conflicts:
src/tools/rbd_mirror/PoolWatcher.cc
- nautilus uses Mutex::Locker where master has std::lock_guard

4 years ago Merge pull request #37204 from tchaikov/nautilus-doc-rtd
Kefu Chai [Thu, 17 Sep 2020 02:54:31 +0000 (10:54 +0800)]
Merge pull request #37204 from tchaikov/nautilus-doc-rtd

nautilus: doc: enable Read the Docs

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years ago doc: add in-doc search from read the docs 37204/head
Kefu Chai [Thu, 9 Apr 2020 15:14:42 +0000 (23:14 +0800)]
doc: add in-doc search from read the docs

readthedocs-sphinx-search offers a better user experience than the
built-in search offered by Sphinx.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 8bd8a8badbf992347a0883a537cce414432c867e)

4 years ago doc: use plantweb as fallback of sphinx-ditaa
Kefu Chai [Thu, 9 Apr 2020 13:25:39 +0000 (21:25 +0800)]
doc: use plantweb as fallback of sphinx-ditaa

RTD does not support installing system packages; the only ways to install
dependencies are setuptools and pip. ditaa, however, is a tool written in
Java, so we need to find a native Python tool that allows us to render
ditaa images. plantweb is able to use a web service for rendering the
ditaa diagram, so let's use it as a fallback if "ditaa" is not around.

Also start a new line after the directive, otherwise the plantweb server
will return 500 on seeing the diagram.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 0cb56e0f13dc57167271ec7f20f11421416196a2)

Conflicts:
doc/cephfs/cephfs-io-path.rst
doc/dev/deduplication.rst
doc/install/ceph-deploy/quick-cephfs.rst
doc/radosgw/vault.rst
doc/rbd/rbd-kubernetes.rst
doc/rbd/rbd-persistent-cache.rst: these files do not exist in
          nautilus, so drop related changes

4 years ago doc/conf.py: exclude pybindings docs from build for RTD
Kefu Chai [Thu, 9 Apr 2020 08:51:06 +0000 (16:51 +0800)]
doc/conf.py: exclude pybindings docs from build for RTD

because it'd be difficult to prepare (dummy) librados, libcephfs and librbd for
their python bindings in the building environment offered by Read the Docs.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 847e4ef941401e1b580e93d7058e8413bd131e21)

Conflicts:
doc/conf.py: trivial resolution

4 years ago readthedocs: add .readthedocs.yml
Kefu Chai [Thu, 9 Apr 2020 07:35:15 +0000 (15:35 +0800)]
readthedocs: add .readthedocs.yml

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 99b75c14d2c84bad8a6b491e4981eeb481751c40)

4 years ago Merge pull request #37171 from neha-ojha/wip-47092-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 21:06:12 +0000 (14:06 -0700)]
Merge pull request #37171 from neha-ojha/wip-47092-nautilus

nautilus: mon: mark pgtemp messages as no_reply more consistently in preprocess_…

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
4 years ago librbd: migration abort should revert data back to the original image 37165/head
Jason Dillaman [Wed, 5 Aug 2020 13:12:41 +0000 (09:12 -0400)]
librbd: migration abort should revert data back to the original image

If the migration destination image was modified and then the migration
was aborted, we need to copy the data back to the source image to avoid
losing data. For simplicity we will only revert the HEAD revision state
and will not attempt to copy new snapshots on the destination image
back to the source.

Fixes: https://tracker.ceph.com/issues/41394
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5bd15da8be09a4e7644d411a0b0c132e5b795393)

Conflicts:
src/librbd/api/Migration.cc: trivial resolution
(cherry picked from commit f94c276accb8c9f4096b3bffc95362a921f2a2a5)

Conflicts:
        src/librbd/api/Migration.cc: snapshot-based mirroring changes

4 years ago librbd: track in-progress migration aborting operation
Jason Dillaman [Tue, 4 Aug 2020 21:09:57 +0000 (17:09 -0400)]
librbd: track in-progress migration aborting operation

We want to prevent the destination image from being used while an
abort is in-progress. Test that the image has no watchers prior to
permitting the abort, switch the migration state to ABORTING, and
treat the image as read-only if the migration state is ABORTING.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit d848b4f1c083bd9f3e6eac3e186c9e7df2e22004)
(cherry picked from commit 2bc5229717326298ca964333fc07a38c5e48701e)

Conflicts:
src/librbd/image/RefreshRequest.cc: lock and read-only flags refactor

4 years ago librbd: improved debug logs on list watcher state machine
Jason Dillaman [Tue, 4 Aug 2020 21:19:21 +0000 (17:19 -0400)]
librbd: improved debug logs on list watcher state machine

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit aee199f1daea6fd8eb7671464e45154ed0e9fd65)
(cherry picked from commit dc0b7a0006c956ada842118744ad2768ff9984a5)

Conflicts:
src/librbd/image/ListWatchersRequest.cc: dropped mirroring filter

4 years ago librbd: deep-copy objects from specified start position
Jason Dillaman [Wed, 5 Feb 2020 20:54:55 +0000 (15:54 -0500)]
librbd: deep-copy objects from specified start position

Only read data from after the specified start position and copy it
to the specified starting write position in the destination.

Fixes: https://tracker.ceph.com/issues/43933
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit d445ac503a9b8aa7a8f7d235023325cca44d58ce)

4 years ago librbd: ensure deep-copy snapshot map includes all destination snap ids
Jason Dillaman [Wed, 5 Feb 2020 20:27:39 +0000 (15:27 -0500)]
librbd: ensure deep-copy snapshot map includes all destination snap ids

When deep-copying from an arbitrary start snapshot id, the snap sequence
will be missing all older snapshots. Additionally, snapshot types that
are not deep-copied still need to be included in the destination snap
map.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5ddc15a34b3bc2726ea15cfbfbe2bd28a329db99)

Conflicts:
src/librbd/deep_copy/ImageCopyRequest.cc: RefCountedObject changes

4 years ago librbd: deep-copy snapshots from a specified start/end position
Jason Dillaman [Wed, 5 Feb 2020 19:23:53 +0000 (14:23 -0500)]
librbd: deep-copy snapshots from a specified start/end position

Allow the snapshots to be arbitrarily copied from any source image
start/end snapshot ids. If the end snapshot is not a user-snapshot,
it will be associated with the destination image HEAD revision.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 1fac1f72c2d2000175f854f9bb14ec95ccb68b08)

Conflicts:
src/librbd/deep_copy/SnapshotCopyRequest.cc: different lock types
src/test/librbd/deep_copy/test_mock_SnapshotCopyRequest.cc: no mirror snapshot namespaces

4 years ago librbd: deep-copy should accept a lower-bound for the destination snap_id
Jason Dillaman [Wed, 5 Feb 2020 15:42:27 +0000 (10:42 -0500)]
librbd: deep-copy should accept a lower-bound for the destination snap_id

For snapshot-based mirroring, we will want to prevent the modification of
snapshots below the last sync snapshot and to prevent the copying of data
below that lower-bound as well. This commit just adds the new parameter and
future commits will update the snapshot and object copy behavior.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit fccbc70fe15e4c92c22467b204362881b12468f1)

Conflicts:
src/librbd/DeepCopyRequest.cc: different lock types
src/tools/rbd_mirror/ImageSync.cc: trivial resolution

4 years agoMerge pull request #37091 from neha-ojha/wip-47194-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 15:52:38 +0000 (08:52 -0700)]
Merge pull request #37091 from neha-ojha/wip-47194-nautilus

nautilus: os/bluestore: enable more flexible bluefs space management by default.

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
4 years agoMerge pull request #36982 from neha-ojha/wip-47296-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 15:44:27 +0000 (08:44 -0700)]
Merge pull request #36982 from neha-ojha/wip-47296-nautilus

nautilus: mon/OSDMonitor: only take in osd into consideration when trimming osd…

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agoMerge pull request #36920 from p-se/wip-47228-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 15:43:53 +0000 (08:43 -0700)]
Merge pull request #36920 from p-se/wip-47228-nautilus

nautilus: mgr/dashboard: document Prometheus' security model

Reviewed-by: Laura Paduano <lpaduano@suse.com>
4 years agoMerge pull request #36726 from dillaman/wip-compile-fixes-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 15:43:24 +0000 (08:43 -0700)]
Merge pull request #36726 from dillaman/wip-compile-fixes-nautilus

nautilus: minor tweaks to fix compile issues under latest Fedora

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
4 years agoMerge pull request #36412 from shyukri/wip-46088-nautilus
Yuri Weinstein [Wed, 16 Sep 2020 15:42:49 +0000 (08:42 -0700)]
Merge pull request #36412 from shyukri/wip-46088-nautilus

nautilus: mgr/prometheus: automatically discover RBD pools for stats gathering

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@mirantis.com>
4 years agomon: mark pgtemp messages as no_reply more consistently in preprocess_pgtemp 37171/head
Greg Farnum [Wed, 12 Aug 2020 23:44:11 +0000 (23:44 +0000)]
mon: mark pgtemp messages as no_reply more consistently in preprocess_pgtemp

If a message is forwarded, it's conceivable the leader's and peon's evaluation
will disagree about whether the message is useful or not, which could result
in the leader ignoring it and the peon having a dangling forwarded message.
Fix this by marking the op as no_reply whenever ignoring it.

Fixes: https://tracker.ceph.com/issues/46914
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 73a014fc2ca928eb72def31c9e4177063cda421a)

4 years agoMerge pull request #36944 from vumrao/wip-vumrao-47257
Yuri Weinstein [Tue, 15 Sep 2020 20:39:31 +0000 (13:39 -0700)]
Merge pull request #36944 from vumrao/wip-vumrao-47257

nautilus: mon/PGMap: add pg count for pools in the ceph df command

Reviewed-by: Kefu Chai <kchai@redhat.com>
4 years agorgw: advance pseudo-folders properly in delimited ordered listing 37169/head
J. Eric Ivancich [Tue, 15 Sep 2020 18:20:04 +0000 (14:20 -0400)]
rgw: advance pseudo-folders properly in delimited ordered listing

The code mistakenly uses the current marker to figure out how to skip
past a pseudo-directory. This could allow for some entries in a bucket
to be skipped. The code should have used the current pseudo-directory
to determine what to skip past.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
4 years agoRevert "rgw: fix list bucket with delimiter wrongly skip some special keys"
J. Eric Ivancich [Mon, 14 Sep 2020 23:33:51 +0000 (19:33 -0400)]
Revert "rgw: fix list bucket with delimiter wrongly skip some special keys"

This reverts commit 04b15cef88c5d50ce18911f63c63fa094101ced0.

While this did fix https://tracker.ceph.com/issues/40905, it did so in
an unnecessarily complex manner. So we're reverting it to more easily
apply a cleaner solution.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
4 years agoMerge pull request #36952 from ceph/wip-nautilus-bz1872983
Jan Fajerski [Tue, 15 Sep 2020 13:25:34 +0000 (15:25 +0200)]
Merge pull request #36952 from ceph/wip-nautilus-bz1872983

nautilus: ceph-volume: simple scan should ignore tmpfs

4 years agoMerge pull request #36453 from jan--f/wip-46113-nautilus
Jan Fajerski [Tue, 15 Sep 2020 13:22:23 +0000 (15:22 +0200)]
Merge pull request #36453 from jan--f/wip-46113-nautilus

nautilus: ceph-volume: report correct rejected reason in inventory if device type is invalid

4 years agorgw: fix boost::asio::async_write() does not return error... 37157/head
Mark Kogan [Thu, 2 Jul 2020 16:37:43 +0000 (19:37 +0300)]
rgw: fix boost::asio::async_write() does not return error...

although remote has closed the connection

Fixes: https://tracker.ceph.com/issues/46332
Signed-off-by: Mark Kogan <mkogan@redhat.com>
(cherry picked from commit c997eb6ad77deebd8e903fe84da7af6fcf50d528)

4 years agoMerge pull request #37055 from ifed01/wip-ifed-fix-rocksdb-opts-nautilus
Yuri Weinstein [Mon, 14 Sep 2020 15:31:29 +0000 (08:31 -0700)]
Merge pull request #37055 from ifed01/wip-ifed-fix-rocksdb-opts-nautilus

nautilus: kv/RocksDBStore: make options compaction_threads/disableWAL/flusher_t…

Reviewed-by: Jianpeng Ma <jianpeng.ma@intel.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #37081 from idryomov/wip-relax-preauth-asserts-nautilus
Ilya Dryomov [Sun, 13 Sep 2020 18:56:37 +0000 (20:56 +0200)]
Merge pull request #37081 from idryomov/wip-relax-preauth-asserts-nautilus

nautilus: msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
4 years agoMerge pull request #36188 from smithfarm/wip-45954-nautilus
Yuri Weinstein [Fri, 11 Sep 2020 16:50:46 +0000 (09:50 -0700)]
Merge pull request #36188 from smithfarm/wip-45954-nautilus

nautilus: rgw: fail when get/set-bucket-versioning attempted on a non-existent …

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
4 years agoos/bluestore: enable more flexible bluefs space management by default. 37091/head
Igor Fedotov [Fri, 21 Aug 2020 11:51:00 +0000 (14:51 +0300)]
os/bluestore: enable more flexible bluefs space management by default.

It turned out that we didn't enable it when we introduced this feature in
https://github.com/ceph/ceph/pull/29687.

Fixes: https://tracker.ceph.com/issues/47053
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 41823db09c609155db31afbd03b817c2a578fa9d)

4 years agomsg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing 37081/head
Ilya Dryomov [Sat, 29 Aug 2020 10:02:30 +0000 (12:02 +0200)]
msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing

We have a kernel client test case that constructs huge auth tickets
to exercise the three related code paths in the kernel.  One of the
tickets is bigger than 1000000 bytes, as required for triggering the
third code path.

We haven't bumped into this assert earlier because the kernel client
is still on msgr v1.  However, "rbd map" and "rbd unmap" commands
started connecting to the cluster in commit 96f05a7956b3 ("rbd: delay
determination of default pool name") and that happens via msgr v2.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 94953dd9398a937d43026f73efaf437597071ca7)

4 years agoceph-volume: show correct rejected reason in inventory if device type is not acceptable 36453/head
Satoru Takeuchi [Fri, 22 May 2020 01:45:32 +0000 (01:45 +0000)]
ceph-volume: show correct rejected reason in inventory if device type is not acceptable

If the device type is not acceptable in `c-v inventory`, its rejected reason
is mistakenly reported as "Insufficient space (<5GB)". This is because sys_api
is empty due to skipping devices that are neither `disk` nor `device`. We
should report that the target device is not acceptable in this case.

Fixes: https://tracker.ceph.com/issues/46102
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
(cherry picked from commit 3e5d91d41f7275d4656019c1ca3fc80927d214c9)

4 years agoceph-volume: cleanup code
Satoru Takeuchi [Fri, 22 May 2020 01:07:17 +0000 (01:07 +0000)]
ceph-volume: cleanup code

Simplify the logic and fix a typo.

Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
(cherry picked from commit 0169b72fff48134ef01802ade38c55281b9f4510)

4 years agomon/PGMap: add pg count for pools in the ceph df command 36944/head
Vikhyat Umrao [Wed, 26 Aug 2020 10:17:05 +0000 (03:17 -0700)]
mon/PGMap: add pg count for pools in the ceph df command
Fixes: https://tracker.ceph.com/issues/46663
Signed-off-by: Vikhyat Umrao <vikhyat@redhat.com>
(cherry picked from commit 9ffbffbe6cd1d8dafb1dd88cbc1ce644afc7a915)

 Conflicts:
PendingReleaseNotes
        - Taking only release notes line for this commit

Signed-off-by: Vikhyat Umrao <vikhyat@redhat.com>
4 years agotest/objectstore: make store_test also run collection_list_legacy 37051/head
Mykola Golub [Thu, 20 Aug 2020 11:24:42 +0000 (12:24 +0100)]
test/objectstore: make store_test also run collection_list_legacy

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 08fab7a8a9103f87935c685c0a66d28e361bc9f5)

Conflicts:
src/test/objectstore/store_test.cc:
different collection_list arguments in
SyntheticWorkloadState::shutdown

4 years agoos/kstore: fix collection_list properly set next if end reached
Mykola Golub [Wed, 19 Aug 2020 10:16:12 +0000 (11:16 +0100)]
os/kstore: fix collection_list properly set next if end reached

Previously it was setting it to GMAX (happened when one had end
set to not GMAX and max set to INT_MAX).

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 2a67fc5e4cf37912b568ad3046f290023d06eb90)

4 years agoos/kstore: fix collection_list ordering
Mykola Golub [Wed, 19 Aug 2020 08:56:57 +0000 (09:56 +0100)]
os/kstore: fix collection_list ordering

It has the same key escaping bug as bluestore has, but we
don't need to work around it here because kstore is not in
production use.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit c1eff9f7812b131c10df245ae92450d70623de2b)

4 years agoos/bluestore: refactor _collection_list
Mykola Golub [Wed, 19 Aug 2020 08:33:38 +0000 (09:33 +0100)]
os/bluestore: refactor _collection_list

Make it operate on oids only, hiding keys in CollectionListIterator.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit f2ccd547d8d0b1518f77a08b61f3c7f509af65d3)

4 years agoosd: add and utilize OSD_FIXED_COLLECTION_LIST feature
Mykola Golub [Thu, 30 Jul 2020 14:21:28 +0000 (15:21 +0100)]
osd: add and utilize OSD_FIXED_COLLECTION_LIST feature

If all OSDs in the upacting set have this feature set,
the backend can use the new "fixed" collection_list method;
otherwise it falls back to the legacy method.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 8f9d335bc7cccb221ca7316ee0c8f22198d0f9ef)

Conflicts:
src/include/ceph_features.h: trivial
src/osd/PGBackend.h: trivial
src/osd/PrimaryLogPG.h: PG::get_min_upacting_features instead of
PeeringState::get_min_upacting_features
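
The feature gate described in this commit can be illustrated with a minimal Python sketch. It is not the Ceph implementation: the bit value, the function names, and the representation of per-OSD features as integer bitmasks are all assumptions made for illustration. The idea shown is only that a feature is usable when it is present in the bitwise AND of the feature masks of every OSD in the set.

```python
from functools import reduce

# Hypothetical bit position, chosen for illustration only; the real
# feature bit is defined in the Ceph sources and is not shown in this log.
FEATURE_FIXED_COLLECTION_LIST = 1 << 0


def min_features(feature_masks):
    """Return the features common to every OSD (bitwise AND of all masks).

    Assumes a non-empty list of integer feature masks.
    """
    return reduce(lambda a, b: a & b, feature_masks)


def choose_listing_method(upacting_features):
    """Pick the "fixed" method only if every OSD advertises the feature."""
    common = min_features(upacting_features)
    if common & FEATURE_FIXED_COLLECTION_LIST:
        return "collection_list"
    return "collection_list_legacy"


# One OSD lacking the bit is enough to force the legacy method.
all_new = choose_listing_method([0b01, 0b11])
mixed = choose_listing_method([0b01, 0b10])
```

Gating on the intersection rather than on any single OSD's features means a mixed-version set keeps using the legacy method until the last OSD is upgraded.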

4 years agokv/RocksDBStore: make options compaction_threads/disableWAL/flusher_threads/compact_o... 37055/head
Jianpeng Ma [Fri, 10 Jan 2020 06:33:55 +0000 (14:33 +0800)]
kv/RocksDBStore: make options compaction_threads/disableWAL/flusher_threads/compact_on_mount work.

This bug was introduced by commit 5f72c376deb64562e.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit ce3a1ca56df917bc3fa186d0e8c7b4256d2a37e9)

4 years agoos: add collection_list_legacy
Mykola Golub [Thu, 30 Jul 2020 07:39:45 +0000 (08:39 +0100)]
os: add collection_list_legacy

which provides the old collection_list behaviour on the bluestore.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit fb3c7d062e305d577286e9788e5d7536ad44364d)

Conflicts:
src/os/bluestore/BlueStore.cc: (std::shared_lock vs RWLock)
src/os/bluestore/BlueStore.h: trivial (std::vector vs vector)

4 years agoos/bluestore: fix collection_list properly set next if end reached
Mykola Golub [Fri, 31 Jul 2020 15:05:10 +0000 (16:05 +0100)]
os/bluestore: fix collection_list properly set next if end reached

Previously it was setting it to GMAX (happened when one had end
set to not GMAX and max set to INT_MAX).

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 46d73a806c90b944c8596ad3d9dae3f5cf78d915)

4 years agoos/bluestore: make get_key_object work with temp keys
Mykola Golub [Wed, 29 Jul 2020 18:02:54 +0000 (19:02 +0100)]
os/bluestore: make get_key_object work with temp keys

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit fc3faf34332b9d6b684419256825f07e98f7aa19)

4 years agoos/bluestore: fix collection_list ordering
Mykola Golub [Mon, 13 Jul 2020 06:33:07 +0000 (07:33 +0100)]
os/bluestore: fix collection_list ordering

Fixes: https://tracker.ceph.com/issues/43174
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit a3d94deddaa5a56ade4a4a8a94d31424238b62ee)

4 years agoMerge pull request #36930 from aclamk/wip-bluefs-log-replay-rescue-nautilus
Yuri Weinstein [Tue, 8 Sep 2020 17:15:44 +0000 (10:15 -0700)]
Merge pull request #36930 from aclamk/wip-bluefs-log-replay-rescue-nautilus

nautilus: Rescue procedure for extremely large bluefs log

Neha Ojha <nojha@redhat.com>

4 years agolibrbd: properly unregister failed async_complete request 37040/head
Mykola Golub [Thu, 9 Jul 2020 13:36:39 +0000 (14:36 +0100)]
librbd: properly unregister failed async_complete request

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 143179ffa92ce64d89fc98d6d6c9e7a3722a12f2)

Conflicts:
src/librbd/ImageWatcher.cc: RWLock vs std::unique_lock

4 years agolibrbd: don't resend async_complete if watcher is unregistered
Mykola Golub [Wed, 8 Jul 2020 15:04:12 +0000 (16:04 +0100)]
librbd: don't resend async_complete if watcher is unregistered

Also wait for pending async_complete after unregistering the
watcher.

Fixes: https://tracker.ceph.com/issues/45268
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 5b6804e19a8f524ab1528a638eb286482e12fe48)

Conflicts:
src/librbd/ImageWatcher.cc: FunctionContext vs LambdaContext
src/librbd/ImageWatcher.h: headers

4 years agoinclude/encoding: Fix encode/decode of float types on big-endian systems 37033/head
Ulrich Weigand [Fri, 4 Sep 2020 13:42:41 +0000 (15:42 +0200)]
include/encoding: Fix encode/decode of float types on big-endian systems

Currently, floating-point types use "raw" encoding, which means they're
simply copied as a byte stream.

This means that if the decoding happens on a machine that differs in
byte order from the source machine, the returned value will be
incorrect. As one effect of this problem, a big-endian OSD node cannot
join a cluster where the MON node is little-endian (or vice versa),
because the OSDMap (incremental) structure contains floating-point
values, and as a result of this conversion problem, the OSD node will
crash with an assertion failure as soon as it receives any OSDMap update
from the MON.

This should be fixed by always encoding floating-point values in
little-endian byte order just as is done for integers. (Note that this
still assumes source and target machines used the same floating-point
format except for byte order. But given that nearly all platforms these
days use IEEE binary32/binary64 for float/double, that seems a
reasonable assumption.)

Fixes: https://tracker.ceph.com/issues/47302
Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
(cherry picked from commit f1f8b9f93b8c64a17c430a66e73e1e47a58781c7)
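
The byte-order problem described above can be reproduced with Python's `struct` module. This is a sketch of the general issue, not of the Ceph encoding code: the helper names and the sample value are assumptions. A "raw" copy made on a little-endian host is simulated with an explicit `<` format, and a big-endian reader with `>`.

```python
import struct


def encode_double_le(value):
    """Encode a double in little-endian byte order, regardless of host."""
    return struct.pack('<d', value)


def decode_double_le(data):
    """Decode a little-endian double, regardless of host."""
    return struct.unpack('<d', data)[0]


weight = 0.75  # arbitrary sample value

# "Raw" encoding: bytes as they sit in memory on a little-endian host.
raw_from_little_endian_host = struct.pack('<d', weight)

# A big-endian host that copies those bytes back unchanged interprets
# them in the opposite order and gets a different number.
misread_on_big_endian_host = struct.unpack('>d', raw_from_little_endian_host)[0]

# With a fixed wire byte order, both sides agree on the value.
roundtrip = decode_double_le(encode_double_le(weight))
```

As the commit message notes, fixing the byte order still assumes both sides use the same floating-point format (IEEE binary64 here).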

4 years agoceph-volume: simple scan should ignore tmpfs 36952/head
Andrew Schoen [Fri, 4 Sep 2020 14:44:49 +0000 (09:44 -0500)]
ceph-volume: simple scan should ignore tmpfs

When simple scan is run against a ceph-volume
OSD, util.encryption.legacy_encrypted returns
tmpfs. We want to avoid creating a Device
object with tmpfs and ignore the OSD as it's
not a ceph-disk created OSD.

Resolves: rhbz#1872983

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit ff4b7bfa573e95acf4cdc01ccf2881f0935e91d6)

4 years agoprometheus: Properly split the port off IPv6 addresses 36984/head
Matthew Oliver [Thu, 13 Aug 2020 00:41:44 +0000 (10:41 +1000)]
prometheus: Properly split the port off IPv6 addresses

The Prometheus module doesn't take IPv6 addresses into account when
splitting the port number off public and client networks/IPs.

This patch fixes this by using `rsplit(':', 1)` rather than `split(':')`;
the latter leads to bugs like:

  curl --silent http://localhost:9283/metrics | grep ceph_mon_metadata{
  ceph_mon_metadata{ceph_daemon="mon.mon2",hostname="mon2.example.net",public_addr="[2001",rank="0",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0

Note the public_addr above being split at the first ':' of an IPv6
address.

Signed-off-by: Matthew Oliver <moliver@suse.com>
Fixes: https://tracker.ceph.com/issues/46846
(cherry picked from commit 985cce055bcee60b843806291458517c7ee890a3)
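
The difference between the two calls can be seen in a short Python sketch. The helper name and the sample addresses are hypothetical and are not taken from the Prometheus module; only the `split(':')` versus `rsplit(':', 1)` behaviour is the point.

```python
def split_host_port(addr):
    """Split 'host:port' on the last colon only, so IPv6 hosts stay intact."""
    host, port = addr.rsplit(':', 1)
    return host, port


# Sample addresses (hypothetical, using documentation ranges).
ipv6_addr = "[2001:db8::1]:6789"
ipv4_addr = "192.0.2.10:6789"

# Splitting on every colon truncates the IPv6 host at its first ':'.
naive_host = ipv6_addr.split(':')[0]

# Splitting once from the right keeps the bracketed IPv6 host whole.
fixed_host, fixed_port = split_host_port(ipv6_addr)

# IPv4 addresses contain a single colon, so both approaches agree.
ipv4_host, ipv4_port = split_host_port(ipv4_addr)
```

Here `naive_host` is the truncated `[2001`, matching the `public_addr` value in the metrics output quoted above, while `fixed_host` is the full bracketed address.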

4 years agomon/OSDMonitor: only take in osd into consideration when trimming osdmaps 36982/head
Kefu Chai [Thu, 3 Sep 2020 15:02:58 +0000 (23:02 +0800)]
mon/OSDMonitor: only take in osd into consideration when trimming osdmaps

We should not take down OSDs into consideration when trimming osdmaps. In
e62269c892, we decrease the upper bound of the range of osdmaps to be trimmed
if the given osd is out, but we should decrease it only if the
osd in question is still *in*.

So, in this change, min_lec is decreased only if the osd in question
is *in*.

Fixes: https://tracker.ceph.com/issues/47290
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 79175740cf2394bba74163ca8a5131419b9641ed)

 Conflicts:
src/mon/OSDMonitor.cc - trivial resolution

4 years agoMerge pull request #36785 from callithea/wip-46937-nautilus
Yuri Weinstein [Wed, 2 Sep 2020 17:33:26 +0000 (10:33 -0700)]
Merge pull request #36785 from callithea/wip-46937-nautilus

nautilus: mgr: Add missing states to PG_STATES in mgr_module.py.

Reviewed-by: Neha Ojha <nojha@redhat.com>
4 years agoMerge pull request #36853 from dillaman/wip-46932-nautilus
Yuri Weinstein [Wed, 2 Sep 2020 16:42:53 +0000 (09:42 -0700)]
Merge pull request #36853 from dillaman/wip-46932-nautilus

nautilus: librados: add LIBRADOS_SUPPORTS_GETADDRS support

Reviewed-by: Mykola Golub <mgolub@mirantis.com>
4 years agoMerge pull request #36784 from callithea/wip-46752-nautilus
Yuri Weinstein [Wed, 2 Sep 2020 16:41:46 +0000 (09:41 -0700)]
Merge pull request #36784 from callithea/wip-46752-nautilus

nautilus: mgr/dashboard: wait longer for health status to be cleared

Reviewed-by: Stephan Müller <smueller@suse.com>
4 years agoMerge pull request #36578 from callithea/wip-46716-nautilus
Yuri Weinstein [Wed, 2 Sep 2020 16:40:39 +0000 (09:40 -0700)]
Merge pull request #36578 from callithea/wip-46716-nautilus

nautilus: mgr/diskprediction_local: Fix array size error

Reviewed-by: Josh Durgin <jdurgin@redhat.com>