Conflicts:
src/os/bluestore/bluestore_types.h
Caused by the lack of explicit std:: reference in headers - master has
got this as a part of crimson effort.
rbd: make common options override krbd-specific options
ceph-csi has added support for passing custom map and unmap options via
mapOptions and unmapOptions storage class parameters. However, it also
uses --read-only for implementing ROX (ReadOnlyMany) PVs. If the user
supplies "mapOptions: rw", they will get around the intended read-only
restriction (at least on the block device).
ceph-csi could be patched to use "-o ro", but it actually makes sense
for common options to win over device type-specific equivalents.
Conflicts:
src/tools/rbd/action/Kernel.cc [ snapshot quiesce support and
commit 34f539d8af33 ("rbd: delay parsing of default kernel map
options") not in octopus ]
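The precedence rule described above can be sketched as follows. This is a minimal illustration in Python, not the code in Kernel.cc; the helper name merge_map_options and the comma-separated option format are assumptions made for the example.

```python
def merge_map_options(device_options: str, read_only: bool) -> str:
    """Merge a device-specific option string with the common read-only flag.

    Hypothetical helper: when the common flag is set, it wins over any
    "rw"/"ro" token supplied in the device-specific options.
    """
    opts = [o for o in device_options.split(",") if o]
    if read_only:
        # Drop conflicting access-mode tokens, then force read-only.
        opts = [o for o in opts if o not in ("rw", "ro")]
        opts.append("ro")
    return ",".join(opts)
```

With this ordering, a user-supplied "rw" cannot bypass a read-only mapping requested through the common flag.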
Alfonso Martínez [Fri, 18 Sep 2020 15:16:34 +0000 (17:16 +0200)]
mgr/dashboard: fix performance issue when listing large amounts of buckets
Fixes: https://tracker.ceph.com/issues/47543 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 924368e1d0aebcb0d8f9747589d9048414d33080)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-details/rgw-bucket-details.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-details/rgw-bucket-details.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.ts
- Adapted changes in these files to octopus code.
Alfonso Martínez [Thu, 13 Aug 2020 12:29:38 +0000 (14:29 +0200)]
mgr/dashboard: Landing Page improvements
Fixes: https://tracker.ceph.com/issues/42072 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit d66e684b9ec83cca8a58b0a7b8661c568eb0cf6d)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health-pie/health-pie.component.scss
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/health/health.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/dashboard/info-card/info-card.component.scss
src/pybind/mgr/dashboard/frontend/src/styles/defaults/_bootstrap-defaults.scss
this file doesn't exist in octopus, so I moved the code into:
src/pybind/mgr/dashboard/frontend/src/styles/defaults.scss
Igor Fedotov [Tue, 9 Jun 2020 08:44:31 +0000 (11:44 +0300)]
os/bluestore: remove preextended WAL support.
Fixes: https://tracker.ceph.com/issues/45613 Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 7fdbf61860b9d9deaf8734cdd57cf5c8d5f93f81)
Conflicts:
src/common/options.cc
- option "bluefs_preextend_wal_files" has a different default value
("false") in octopus (but the whole option is being deleted, so it
doesn't matter)
Tiago Melo [Fri, 28 Aug 2020 13:59:47 +0000 (13:59 +0000)]
mgr/dashboard: Fix npm package's vulnerabilities
Manual update of some npm packages to fix their vulnerabilities.
This could not have been done by backport since master has a different list
of packages installed.
Greg Farnum [Wed, 12 Aug 2020 23:44:11 +0000 (23:44 +0000)]
mon: mark pgtemp messages as no_reply more consistently in preprocess_pgtemp
If a message is forwarded, it's conceivable the leader's and peon's evaluation
will disagree about whether the message is useful or not, which could result
in the leader ignoring it and the peon having a dangling forwarded message.
Fix this by marking the op as no_reply whenever ignoring it.
Conflicts:
src/common/admin_socket.cc
- octopus has a "while" block (instead of "if") under the comment
// make sure one of the registered commands with this prefix validates
but this is being removed
Matthew Oliver [Thu, 9 Jul 2020 06:13:05 +0000 (06:13 +0000)]
rgw: Swift API anonymous access should 401
There was a previous patch to fix this, but it turns out that it only fixed it
for the Swift V1 auth. It also actually broke keystone because it didn't
take into account the idiosyncrasies of multi-tenancy, which resulted in
incorrect behaviour for keystone. Worse, because it didn't take
tenants properly into account, keystone ACLs were broken.
This patch reworks and simplifies the original patch to work for both
auths. It even extends the ThirdPartyAccountApplier to check for an ANON
user and properly scope it to a tenant.
Fixes: https://tracker.ceph.com/issues/46295 Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit 67081098dc2dddd80d52d5acd166e68954cae618)
Casey Bodley [Mon, 31 Aug 2020 15:19:34 +0000 (11:19 -0400)]
radosgw-admin: period pull command is not always a raw_storage_op
if a --url is given, 'period pull' does not depend on any zone/period
configuration and can be a raw_storage_op. if we get a --remote instead,
we do need to initialize the zone/period configuration to find the
correct endpoint/access keys
mgr/dashboard: Fix many-to-many issue in host-details dashboard
The labels on one side do not match the labels of the other side, where
a label_replace is used. The fix uses the same label_replace on the
missing side.
Fixes: https://tracker.ceph.com/issues/47334 Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit fe64b9d1763ec9dbe78fe73c403929524ab4e253)
Yaarit Hatuka [Thu, 27 Aug 2020 03:04:34 +0000 (23:04 -0400)]
mgr/telemetry: fix device id splitting when anonymizing serial
Anonymizing the serial number in the device id string fails in rare
cases where 'vendor' and 'model' are missing from the device id
string. Ideally, device id is generated (in blkdev.cc) as
'vendor_model_serial', in case all fields were successfully retrieved
from the device. In cases where they were not, device id can also be
generated as 'model_serial' or 'serial'. Splitting by '_' fails in the
latter case (since 'serial' is the only element in the string).
In order to anonymize serial numbers in smartctl reports we now rely
on the serial number value as retrieved from the raw smartctl report
itself (as opposed to the one in device id). That's in order to avoid
possible inconsistencies between the serial retrieved from device id and
the one in the report.
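The splitting problem can be illustrated with a short sketch. This is not the telemetry module's code; the function name, the salt parameter and the use of a truncated SHA-256 digest are assumptions chosen for the example. Partitioning on the last '_' handles all three device id shapes, including a bare 'serial'.

```python
import hashlib


def anonymize_devid(devid: str, salt: str) -> str:
    """Replace the trailing serial of a device id with a salted digest.

    Hypothetical helper: handles 'vendor_model_serial', 'model_serial'
    and a bare 'serial' without assuming a fixed number of fields.
    """
    # rpartition returns ('', '', devid) when there is no '_' at all,
    # so a bare serial is anonymized without raising.
    prefix, sep, serial = devid.rpartition("_")
    digest = hashlib.sha256((salt + serial).encode("utf-8")).hexdigest()[:16]
    return prefix + sep + digest
```

The same serial yields the same digest regardless of which prefix fields are present, which keeps the anonymized id stable across reports.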
`ceph-volume simple activate --all` relies on the presence of json files
in `/etc/ceph/osd` that were created with the `ceph-volume simple scan`
command.
In a cluster lifecycle, it is very likely that an OSD which was deployed with
ceph-disk gets removed or replaced at some point. This means the corresponding
json file in `/etc/ceph/osd` becomes irrelevant. It makes `ceph-volume
simple activate --all` fail because it tries to mount nonexistent
partitions.
The idea here is to simply warn the user that the osd described in the
json file doesn't exist anymore and exit properly instead of throwing an
error.
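A minimal sketch of the warn-and-skip behaviour described above, in Python. It is not the ceph-volume implementation: the function name, the injected device_exists callable and the assumed JSON layout (a "data" object with a "path" key) are illustrative only.

```python
import json
import logging
import os
import tempfile
from pathlib import Path

logger = logging.getLogger("simple-activate-sketch")


def find_activatable(json_dir, device_exists=os.path.exists):
    """Return the names of JSON files whose data device still exists.

    Files that point at a missing device are reported with a warning
    and skipped instead of aborting the whole run.
    """
    ready = []
    for path in sorted(Path(json_dir).glob("*.json")):
        meta = json.loads(path.read_text())
        # Assumed layout: {"data": {"path": "/dev/..."}}
        device = meta.get("data", {}).get("path")
        if not device or not device_exists(device):
            logger.warning("%s: device %s not found, skipping", path.name, device)
            continue
        ready.append(path.name)
    return ready
```

Passing device_exists as a parameter keeps the check testable without real block devices.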
Patrick Donnelly [Wed, 16 Sep 2020 19:28:55 +0000 (12:28 -0700)]
mon: allow overriding the initial mon_host
This overrides what the CephContext believes to be the current quorum of
monitors (retrieved from other instances of the MonClient), introduced
by [1]. Tests need to be able to target a specific monitor for
exercising forwarding and other things.
mon: store mon updates in ceph context for future MonMap instantiation
MonMap builds initial mon list using provided sources, like
mon-host or monmap.
For future instantiations of MonClient, if mon addresses are
updated, stale information from the provided sources is used.
This commit retains mon updates that are processed by the
MonClient in CephContext, for use in MonMap instantiations
and hence uses updated information as required.
This is helpful in cases where librados or libcephfs
instantiate MonClient in the ceph-mgr daemon as required.
split mempool allocation for bluestore_cache_other
While doing root cause analysis, bluestore_cache_other gives a bit of
a crude estimate; something more helpful would be to have it split into
the following fields:
ceph.spec.in, debian/control: add smartmontools and nvme-cli dependencies
These packages are needed in order to scrape device health metrics from
devices used by OSD and MON daemons.
smartmontools' smartctl is what we use in order to scrape devices' SMART
attributes and general health metrics.
In addition, we use nvme-cli tool on NVMe devices, which fetches
vendor specific NVMe related health metrics.
Ceph relies on these tools for proper functioning of the underlying layers
of devicehealth mgr module, and other mgr modules which use devicehealth
functionality (such as diskprediction_local, telemetry, dashboard).
Essentially, most of devicehealth commands rely on proper functioning of
smartctl, otherwise they lack the device health metrics.
For example, in case smartctl is missing, the commands:
ceph device scrape-daemon-health-metrics <who>
ceph device scrape-health-metrics [<devid>]
will not be able to scrape health metrics, and the command:
ceph device predict-life-expectancy <devid>
will not provide any meaningful output (since there are no metrics).
In short, when we scrape a device by its daemon (be it an OSD or a MON):
ceph device scrape-daemon-health-metrics <who>
The devicehealth module command eventually invokes a
block_device_get_metrics() call in either osd/OSD.cc or mon/Monitor.cc,
which wraps calls to both
block_device_run_smartctl() (spawns smartctl)
block_device_run_vendor_nvme() (spawns nvme)
in common/blkdev.cc.
Minimum version requirements:
'smartmontools' is the package name, which contains two utility
programs: 'smartd' and 'smartctl'. Ceph uses the latter.
Version 6.7 of smartctl first introduced the --json option (beta), which
allows outputting the metrics in JSON format. Since then a few
adjustments were made and the feature was officially launched in smartctl
version 7.0.
Since we rely on the JSON format to process the metrics, we must have
smartmontools' smartctl version >= 7.
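As an illustration of the version requirement, the following Python sketch decides whether a given smartctl reports a version with stable JSON output. The function name is hypothetical, and the sketch assumes the first line of `smartctl --version` begins with "smartctl <major>.<minor>"; it is not part of Ceph.

```python
import re


def smartctl_supports_json(version_output: str) -> bool:
    """Return True if the reported smartctl major version is at least 7."""
    # Assumed banner format: "smartctl 7.1 2019-12-30 r5022 [...]"
    match = re.match(r"smartctl (\d+)\.(\d+)", version_output)
    if match is None:
        # Unknown format: treat as unsupported rather than guess.
        return False
    return int(match.group(1)) >= 7
```
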
That said, we choose not to specify smartmontools version here on
purpose, since there might be a scenario where:
We specified smartmontools version to be >= 7.
smartmontools 7 is not available yet in rhel 8 / centos 8.
A user installs via rpm ceph-osd, for example.
smartmontools will not be installed (since version >= 7 is not available
in this repo yet).
Then the user upgrades to 8.3 (which should have smartmontools >= 7),
but smartmontools will not get upgraded (since it's not installed).
In the scenario where we do not specify a version, smartmontools 6.6
will be installed, but it will be upgraded to >= 7 when a user upgrades
(and if it's a fresh installation - version >= 7 would be installed
anyway).
nvme-cli does not have a minimum version.
We use 'Recommends' for both rpm and deb packages since we do not want
the installation to fail in case of conflicts. 'Recommends' weakens the
dependency to be installed in case possible, but ignores it in cases of
conflicts with other dependencies.
It's worth mentioning that smartmontools and nvme-cli dependencies exist
in ceph-container builds.
We add them here for the cases of bare metal installations.
In the future we will add a separate package (with smartmontools and
nvme-cli dependencies) that can be installed on any node (running
rbd-mirror, rgw, mds, mgr, etc.), in order to be able to collect the
health metrics of its devices and offer their life expectancy
prediction.
Bryan Stillwell [Tue, 24 Mar 2020 21:15:41 +0000 (15:15 -0600)]
compressor: Set the Zstd default compression level to 1
The default compression level of 5 for Zstandard is too high for the majority
of use cases since it requires too many CPU cycles. This patch switches the
default to 1.
Jason Dillaman [Wed, 5 Aug 2020 16:36:26 +0000 (12:36 -0400)]
test/rbd-mirror: pool watcher registration error might result in race
The init finish context should be swapped out before it attempts to
re-register the watcher. This affects the test case which mocks the
timer to fire immediately instead of after 30 seconds.
Fixes: https://tracker.ceph.com/issues/46669 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c89d31ebf6c412d609123979c63ebc600b70e179)
RTD does not support installing system packages; the only ways to install
dependencies are setuptools and pip, while ditaa is a tool written in
Java. So we need to find a native Python tool allowing us to render ditaa
images. plantweb is able to use the web service for rendering the ditaa
diagram, so let's use it as a fallback if "ditaa" is not around.
Also start a new line after the directive, otherwise the plantweb server will
return 500 on seeing the diagram.
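The fallback decision can be sketched like this. It is a simplified illustration, not the code in the docs build; the function name and the injected `which` parameter are assumptions, and only the standard library's shutil.which is used to look for the binary.

```python
import shutil


def pick_ditaa_renderer(which=shutil.which):
    """Return "ditaa" when the binary is on PATH, otherwise "plantweb"."""
    # shutil.which returns the executable path, or None if not found.
    return "ditaa" if which("ditaa") else "plantweb"
```
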
doc/conf.py: exclude pybindings docs from build for RTD
because it'd be difficult to prepare (dummy) librados, libcephfs and librbd for
their python bindings in the building environment offered by Read the Docs.