Kefu Chai [Mon, 22 Jun 2020 01:34:53 +0000 (09:34 +0800)]
doc/conf.py: s/add_javascript/add_js_file/
to address following warning:
jenkins-build/build/workspace/ceph-pr-docs/doc/conf.py:102: RemovedInSphinx40Warning: The app.add_javascript() is deprecated. Please use app.add_js_file() instead.
Kefu Chai [Sun, 6 Mar 2022 06:27:50 +0000 (14:27 +0800)]
doc/conf.py: silence warnings from breathe
breathe calls doxygen for extracting/generating docs from code.
while doxygen complains at seeing undocumented fields/func. these
warnings could fail the sphinx-build command, if it takes warnings
as errors.
Kefu Chai [Sun, 6 Mar 2022 06:23:42 +0000 (14:23 +0800)]
mgr/cephadm: add empty line after param list in docstring
this helps to silence the warning from sphinx, like
src/pybind/mgr/orchestrator/_interface.py:docstring of orchestrator._interface.Orchestrator.remove_osds:9: WARNING: Field list ends without a blank line; unexpected unindent.
Kefu Chai [Sat, 5 Mar 2022 17:44:30 +0000 (01:44 +0800)]
admin/doc-requirements: bump sphinx to 4.4.0
bump sphinx to latest stable. to address following build failure
ERROR: sphinx-autodoc-typehints 1.17.0 has requirement Sphinx>=4, but you'll have sphinx 3.5.4 which is incompatible.
ERROR: sphinx-substitution-extensions 2022.2.16 has requirement sphinx>=4.0.0, but you'll have sphinx 3.5.4 which is incompatible.
also bump bump sphinx-rtd-theme, otherwise we'd have following
build failure:
ERROR: sphinx-rtd-theme 0.5.2 has requirement docutils<0.17, but you'll have docutils 0.17.1 which is incompatible.
Nizamudeen A [Thu, 24 Mar 2022 08:01:18 +0000 (13:31 +0530)]
mgr/dashboard: fix "NullInjectorError: No provider for I18n
Although I am not sure what's the root cause of this but this seems to
fix the test failure. I don't know if this is caused by the differnce in
angular versions between master and octopus but I still don't understand
why it didn't catch in the recent PR to this file (https://github.com/ceph/ceph/pull/44763)
Fixes: https://tracker.ceph.com/issues/55011 Signed-off-by: Nizamudeen A <nia@redhat.com>
Ilya Dryomov [Thu, 10 Mar 2022 11:40:34 +0000 (12:40 +0100)]
qa/suites: clean up client-upgrade-octopus-pacific test
- fix .qa symlinks
- rename nautilus-client-x.yaml to octopus-client-x.yaml
- fix typos and remove stale comment
- remove 2-features permutation (it doesn't do anything useful as the
workunit is run with RBD_FEATURES environment variable set and those
features are explicitly passed to RBD.create and RBD.clone calls;
the net effect is that the exact same job is run twice)
Kefu Chai [Sat, 5 Mar 2022 04:49:57 +0000 (12:49 +0800)]
cmake: pass RTE_DEVEL_BUILD=n when building dpdk
ceph is still using the Makefile based building system for building
DPDK. and DPDK enables -Werror if RTE_DEVEL_BUILD is 'y' which is
enabled by default when the dpdk is built from a git repo.
but newer GCC is more picky than the older versions, to prevent
the possible FTBFS when we switch to newer GCC for building old
branches whose dpdk submodule might be include the changes addressing
those warnings. let's just disable this option.
the only effect of this option is to add -Werror to CFLAGS. but
the building warnings from DPDK is not our focus when developing
Ceph in the most cases. so it should be fine.
Kotresh HR [Thu, 10 Feb 2022 05:34:41 +0000 (11:04 +0530)]
mgr/volumes: Fix clone uid/gid mismatch
This is the regression caused by commit 18b85c53a.
The 'set_attrs' function sets the uid/gid of the
group to the subvolume if uid/gid is not passed.
The attrs of the clone should match the source
snapshot. Hence, don't use the 'set_attrs'
function to set only the quota attrs for the
clone.
Kotresh HR [Wed, 12 Jan 2022 09:31:53 +0000 (15:01 +0530)]
mgr/volumes: Fix subvoume snapshot clone failure
Problem:
The subvolume snapshot clone fails if the quota on the source
has exceeded. Since the quota is not strictly enforced at the
byte range, this is a possibility.
Cause:
The quota on the clone is set prior to copying the data
from the source. Hence the quota mostly get enforced before
copying the entire data from the source resulting in the
clone failure.
Solution:
Enforce quota on the clone after the data is copied.
Conflicts:
src/pybind/mgr/volumes/fs/async_cloner.py: The commit cf2a1ad65120 is
not backported
src/pybind/mgr/volumes/fs/async_job.py: The commit cf2a1ad65120 is not
backported
Conflicts:
qa/tasks/cephfs/test_volumes.py: Conflicts due to tests ordering
src/pybind/mgr/volumes/fs/volume.py: The commit e308bf898955 is not
backported
src/pybind/mgr/volumes/module.py: The commit f002c6ce4033 is not
backported
Jan Fajerski [Wed, 18 Dec 2019 10:35:40 +0000 (11:35 +0100)]
mgr_util: add CephfsClient implementation
This pulls parts of the VolumesClient implementation into mgr_util to
make the CephFS specific pieces available to other mgr modules. To
reduce code duplication the VolumeClient now extends the CephfsClient
class to add the volume specific methods.
src/pybind/mgr/mgr_util.py
src/pybind/mgr/tox.ini
src/pybind/mgr/volumes/fs/operations/volume.py
src/pybind/mgr/volumes/fs/volume.py
Trivial conflicts because ofthe order of backports to octopus
Mykola Golub [Fri, 18 Feb 2022 10:42:23 +0000 (10:42 +0000)]
rbd-mirror: make mirror properly detect pool replayer needs restart
When a PoolReplayer detects remote pool metadata change it
sets "stopping" flag expecting the Mirror will restart it.
Although setting "stopping" flag makes the PoolReplayer::run
thread to terminate, the thread's is_started function will still
return true until join is called (and reset the thread id).
This made impossible for the Mirror to detect (by calling
PoolReplayer::is_running) that the PoolReplayer needed restart.
Sarthak0702 [Wed, 16 Feb 2022 12:45:35 +0000 (18:15 +0530)]
mgr/dashboard: Contact Info should be visible only when Ident channel is checked
Fixes:https://tracker.ceph.com/issues/54133 Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit 15211a6378a6fee9316f79ba0b27821891527c38)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/telemetry/telemetry.component.ts
- `this.loading` used in Octopus instead of `this.loadingReady()`
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/telemetry/telemetry.component.ts
- `this.i18n()` was used in Octopus instead of `$localize`
both a < b and b < a evaluate to true. This violates STL strict weak
ordering requirements which is a problem because GroupSnapshotNamespace
is used as a key in std::map (ictx->snap_ids at least), etc.
Although the code received tremendous changes meantime
and this error shouldn't show up again, we need to handle
the case where this tag wouldn't have been set.
ee8887f4c0ff4f91117f31b621b95c8d08019130 was intended for adding
mpath devices support in ceph-volume but it has missed the lvm batch scenario.
This also fixes the zapping of mpath devices prepared with `ceph-volume raw`
Ilya Dryomov [Tue, 8 Feb 2022 09:11:49 +0000 (10:11 +0100)]
rbd: mark optional positional arguments as such in help output
Currently at least five commands have optional positional arguments.
Overloading po::value<std::string>()->default_value("") for this
is a bit sneaky but nothing better fits into the existing Shell.cc
framework.
Note that strictly speaking "[<interval>] [<start-time>]" should be
"[<interval> [<start-time>]]" but we aren't doing that here because
"ceph" command doesn't do it either.
Ilya Dryomov [Mon, 31 Jan 2022 13:08:26 +0000 (14:08 +0100)]
qa/suites/krbd: add legacy+rxbounce and crc+rxbounce coverage
For basic, rbd and rbd-nomount subsuites, replace legacy and crc
facets with "legacy or legacy+rxbounce" and "crc or crc+rxbounce"
facets (chosen at random).
For fsx, singleton and thrash subsuites, add legacy+rxbounce and
crc+rxbounce facets and drop prefer-crc facet. The expected behaviour
of the latter depends on cluster configuration and should be tested
separately.
mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.
As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk. This leads to issues which cannot be
approached by PromQL alone (many-to-many PromQL erros). The data we
have expected is simply different in some rare cases.
I have not found a sole PromQL solution to this issue. What we basically
need is the following.
1. Match on labels `host` and `instance` to get one or more OSD names
from a metadata metric (`ceph_disk_occupation`) to let a user know
about which OSDs belong to which disk.
2. Match on labels `ceph_daemon` of the `ceph_disk_occupation` metric,
in which case the value of `ceph_daemon` must not refer to more than
a single OSD. The exact opposite to requirement 1.
As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.
Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk). This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.
`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.
`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).
- Octopus does not generate Grafana dashboards using jsonnet, hence
grafana_dashboards.jsonnet was removed.
- Octopus does not support features, hence hosts_overview.feature was
removed.
- Features implemented in prometheus/module.py that never were
backported to Octopus were removed.
- `tox.ini` file adapted to include mgr/prometheus tests introduced by
the backport.
- Add `cherrypy` to src/pybind/mgr/requirements.txt to fix Prometheus
unit testing.
where the numbers of scrubbed object, clones, dirty and omap are always
less than the total number of corresponding numbers, if the PG contains
object(s) whose hash happens to be 0xffffffff.
in this change, if the calculated hash of the upper bound is greater
than the maximum possible number represented by uint32_t, in addition to
setting the hash of the upper bound hobj to 0xffffffff, we also set the
nspace of hobj of the upper bound to "\xff", so that the upper bound
is greater than an hobj whose hash happens to be 0xfffffff. please note,
the nspace of "\xff" is not an ascii string, so it's not likely to be
less than a real-world nspace of an hobj.
with this new *greater* upper bound, we are able to include the previous
missing hobj when listing the objects in a PG. so the scrub won't be
annoyed when the number of objects does not match.
Venky Shankar [Mon, 13 Dec 2021 06:15:19 +0000 (01:15 -0500)]
mds: ignore unknown client op when tracking op latency
Server::handle_client_request() ignores unknown client operation
by returning -ENOTSUPP, however, Server::perf_gather_op_latency()
aborts on unknown client op, thereby causing -ENOTSUPP to never
reach the client. ceph_abort() seems unnecessary here.
Note, we could have invoked Server::perf_gather_op_latency()
when the return value to client is not -ENOTSUPP, however,
a valid client operation *might* just return -ENOTSUPP in
some cases.
@mchangir ran into this with his getvxattr op changes (PR #42001).