cephadm/ingress: make frontend stat bind on localhost
The current configuration of keepalived makes it do
a curl on localhost:9999 in order to check the endpoint is alive.
Given the endpoint only binds on the vip addr, that doesn't work.
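A minimal sketch of the idea, assuming the stats frontend is rendered from the VIP and the monitor port. The function name and the example VIP are hypothetical, and this is not the actual cephadm template; it only shows the extra localhost bind that lets the keepalived check reach the endpoint.

```python
def render_stats_frontend(vip: str, monitor_port: int) -> str:
    """Render a haproxy stats frontend bound on the VIP and on localhost.

    Hypothetical helper for illustration only.
    """
    lines = [
        "frontend stats",
        "    mode http",
        f"    bind {vip}:{monitor_port}",
        # Extra bind so a health check against localhost can connect.
        f"    bind localhost:{monitor_port}",
    ]
    return "\n".join(lines) + "\n"


# Example VIP is made up; 9999 is the port mentioned in the commit message.
print(render_stats_frontend("10.0.0.100", 9999))
```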
which specifies the app config value of "graphviz_dot". This annoys
sphinx:
WARNING: while setting up extension breathe: node class 'graphviz' is already registered, its visitors will be overridden
Traceback (most recent call last):
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/cmd/build.py", line 276, in build_main
app = Sphinx(args.sourcedir, args.confdir, args.outputdir,
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/application.py", line 245, in __init__
self.setup_extension(extension)
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/application.py", line 402, in setup_extension
self.registry.load_extension(self, extname)
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/registry.py", line 430, in load_extension
metadata = setup(app)
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/breathe/__init__.py", line 14, in setup
renderer_setup(app)
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/breathe/renderer/sphinxrenderer.py", line 2613, in setup
app.add_config_value("graphviz_dot", "dot", "html")
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/application.py", line 535, in add_config_value
self.config.add(name, default, rebuild, types)
File "/home/docs/checkouts/readthedocs.org/user_builds/ceph/envs/44951/lib/python3.8/site-packages/sphinx/config.py", line 282, in add
raise ExtensionError(__('Config value %r already present') % name)
sphinx.errors.ExtensionError: Config value 'graphviz_dot' already present
Extension error:
Config value 'graphviz_dot' already present
This issue has been reported upstream, see
https://github.com/michaeljones/breathe/issues/803
Until it is fixed upstream, let's stick with 4.32.0,
which is known to work.
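The failure mode in the traceback can be reproduced with a toy registry. This is a simplified model, not Sphinx's real code: the ToyConfig class is hypothetical and only mimics the "already present" check seen in the last frames.

```python
class ExtensionError(Exception):
    """Stand-in for sphinx.errors.ExtensionError in this toy model."""


class ToyConfig:
    """Hypothetical registry that refuses duplicate config value names."""

    def __init__(self):
        self.values = {}

    def add(self, name, default):
        if name in self.values:
            raise ExtensionError(f"Config value {name!r} already present")
        self.values[name] = default


config = ToyConfig()
config.add("graphviz_dot", "dot")      # first registration succeeds
try:
    config.add("graphviz_dot", "dot")  # second registration is rejected
except ExtensionError as exc:
    message = str(exc)
    print(message)
```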
Volker Theile [Wed, 9 Feb 2022 08:37:48 +0000 (09:37 +0100)]
mgr/dashboard: "Please expand your cluster first" shouldn't be shown if cluster is already meaningfully running
This PR assumes that a cluster is already up and fully running. If this is not the expected behaviour, deployment tools have to set 'INSTALLED' explicitly. Without this assumption, upgraded and fully running clusters, e.g. Octopus -> Pacific, might show the 'Expand Cluster' page on first login.
cephadm will take care that the bootstrap phase will write the necessary key to show the 'Expand cluster' page.
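The decision described above could be sketched as follows. The function name and the modelling of the stored key as an optional string are assumptions made for illustration; only the 'INSTALLED' value comes from the commit message.

```python
from typing import Optional


def show_expand_cluster(stored_status: Optional[str]) -> bool:
    """Return True only when a deployment tool explicitly wrote 'INSTALLED'.

    A missing key (None) is treated as "cluster already up and running",
    so upgraded clusters do not see the 'Expand Cluster' page.
    """
    return stored_status == "INSTALLED"


print(show_expand_cluster(None))         # upgraded cluster, key never written
print(show_expand_cluster("INSTALLED"))  # fresh bootstrap wrote the key
```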
Kotresh HR [Thu, 10 Feb 2022 05:34:41 +0000 (11:04 +0530)]
mgr/volumes: Fix clone uid/gid mismatch
This is a regression caused by commit 18b85c53a.
The 'set_attrs' function sets the uid/gid of the
group on the subvolume if no uid/gid is passed.
The attrs of the clone should match the source
snapshot. Hence, don't use the 'set_attrs'
function when only the quota attrs of the clone
need to be set.
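A toy model of the mismatch, under stated assumptions: the Attrs dataclass and both helper functions are hypothetical and do not mirror the mgr/volumes code. It shows how a group fallback overwrites the uid/gid, while a quota-only update keeps the values taken from the source snapshot.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Attrs:
    uid: Optional[int] = None
    gid: Optional[int] = None
    quota: Optional[int] = None


def set_attrs_with_group_fallback(requested: Attrs, group: Attrs) -> Attrs:
    """Model of the fallback: a missing uid/gid is taken from the group."""
    return Attrs(
        uid=group.uid if requested.uid is None else requested.uid,
        gid=group.gid if requested.gid is None else requested.gid,
        quota=requested.quota,
    )


def set_quota_only(current: Attrs, quota: Optional[int]) -> Attrs:
    """Change the quota and leave uid/gid untouched."""
    return replace(current, quota=quota)


group = Attrs(uid=0, gid=0)
snapshot = Attrs(uid=1000, gid=1000)

# Passing only a quota through the fallback path picks up the group's ids.
via_set_attrs = set_attrs_with_group_fallback(Attrs(quota=1 << 30), group)
# Copying ids from the snapshot and then setting only the quota keeps them.
clone = set_quota_only(Attrs(uid=snapshot.uid, gid=snapshot.gid), 1 << 30)

print(via_set_attrs)
print(clone)
```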
Adam King [Thu, 10 Feb 2022 01:42:42 +0000 (20:42 -0500)]
qa/tasks/cephadm_cases: increase timeouts in test_cli.py
These tests seem to fail sometimes, but in my testing
the events often happen just a few seconds after
we hit the timeout. Let's see if increasing the timeouts
makes the tests more consistent. There is no need to mark a test
as failed if we report something up in 34 seconds vs 25, especially
when cephadm works on a cyclic daemon refresh.
* refs/pull/42000/head:
qa: update rhel kclient to setup container tools
qa: stop overriding distro for k-testing
qa: only use RHEL for workload testing
qa: convert fs:workload to use cephadm
qa: split fs begin task
qa/tasks/cephadm: setup CephManager when OSDs are provisioned
qa/tasks/cephadm: setup file system if MDS are provisioned
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
mds: add an option to decide whether to prefetch the entire dir or not
Accessing a single dentry can be sped up by setting this option to
false when the dir is not in memory.
Signed-off-by: "Shen, Hang" <shenhang@kuaishou.com>
Nizamudeen A [Tue, 8 Feb 2022 06:20:29 +0000 (11:50 +0530)]
mgr/dashboard: set appropriate baseline branch for applitools
All the dashboard PRs are checked against a baseline branch called
'default' in the visual regression testing. This causes issues when
testing PRs in different branches. For example, master and
pacific currently have to save two different screenshots since the two of them
differ slightly.
Also disable the applitools logs because they are too 'noisy'.
Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
Soumya Koduri [Thu, 3 Feb 2022 18:55:22 +0000 (00:25 +0530)]
rgw/dbstore: Handle read vs delete races
Now that tail objects are associated with objectID, they are not deleted
as part of this DeleteObj operation. Such tail objects (with no head object
in *.object.table) are cleaned up later by the GC thread.
To avoid races between writes/reads & GC delete, mtime is maintained for each
tail object. This mtime is updated when the tail object is written and also when
its corresponding head object is deleted (like here in this case).
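A rough sketch of how such an mtime could protect in-flight readers. The commit message does not say how the GC thread uses the mtime, so the min_wait grace period is an assumption, and the data layout and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

TailKey = Tuple[str, int]  # (object_id, part_number), hypothetical layout


def touch_tails(tails: Dict[TailKey, float], object_id: str, now: float) -> None:
    """Refresh the mtime of every tail object of object_id."""
    for key in tails:
        if key[0] == object_id:
            tails[key] = now


def gc_sweep(tails: Dict[TailKey, float], head_ids: Set[str],
             now: float, min_wait: float) -> List[TailKey]:
    """Delete headless tails whose mtime is at least min_wait old.

    min_wait is an assumed grace period, not taken from the commit.
    """
    doomed = [key for key, mtime in tails.items()
              if key[0] not in head_ids and now - mtime >= min_wait]
    for key in doomed:
        del tails[key]
    return doomed


tails = {("obj1", 0): 100.0, ("obj1", 1): 100.0, ("obj2", 0): 100.0}
heads = {"obj1", "obj2"}

# Deleting the head of obj1 at t=500 also refreshes its tails' mtime.
heads.discard("obj1")
touch_tails(tails, "obj1", 500.0)

early = gc_sweep(tails, heads, now=510.0, min_wait=60.0)  # too recent
late = gc_sweep(tails, heads, now=600.0, min_wait=60.0)   # old enough
print(early, late)
```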
Vicente Cheng [Fri, 10 Dec 2021 06:49:55 +0000 (06:49 +0000)]
msg/async: fix outgoing_bl overflow and reset state_offset
- reset state_offset when the read is done.
- check outgoing_bl before we try to write a message.
In some environments, the network can temporarily block and return EAGAIN.
For the async msgr, we would call back the write event directly, but that still
grows outgoing_bl.
Consider the case where the sender is congested or the network driver
has problems: data keeps being appended to outgoing_bl while outgoing_bl
is not consumed in time.
The size of outgoing_bl then grows over time and eventually overflows.
An overflowed outgoing_bl would cause problems, so we need to wait
for outgoing_bl to drain before we append another message.
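The second point can be illustrated with a toy connection. This is a Python sketch of the general idea only, not the msg/async C++ code: ToyConnection and its members are hypothetical, and a send callback returning 0 stands in for EAGAIN.

```python
from collections import deque


class ToyConnection:
    """Appends a new message only after the outgoing buffer has drained."""

    def __init__(self, send):
        self._send = send          # returns bytes accepted; 0 models EAGAIN
        self.outgoing = bytearray()
        self.pending = deque()

    def _flush(self) -> bool:
        while self.outgoing:
            sent = self._send(bytes(self.outgoing))
            if sent == 0:
                return False       # socket would block, try again later
            del self.outgoing[:sent]
        return True

    def handle_write(self) -> None:
        # Drain leftovers first; queue more data only once that succeeds.
        if not self._flush():
            return
        while self.pending:
            self.outgoing += self.pending.popleft()
            if not self._flush():
                return


state = {"blocked": True}
conn = ToyConnection(lambda data: 0 if state["blocked"] else len(data))
conn.pending.extend([b"aaaa", b"bbbb", b"cccc"])

for _ in range(5):                 # repeated write events while blocked
    conn.handle_write()
blocked_size = len(conn.outgoing)  # buffer holds one message, not five

state["blocked"] = False
conn.handle_write()
print(blocked_size, len(conn.outgoing), len(conn.pending))
```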
Ronen Friedman [Sun, 30 Jan 2022 15:23:39 +0000 (15:23 +0000)]
osd/scrub: fix unintended changes to scrub (cluster)logs
OSD logs and cluster logs are monitored by some scrub tests:
some specific strings are required to either appear or not appear in
the logs. The Scrubber backend PR has unintentionally modified some
of these logs, and here we restore the exact log text.
Ilya Dryomov [Tue, 8 Feb 2022 09:11:49 +0000 (10:11 +0100)]
rbd: mark optional positional arguments as such in help output
Currently at least five commands have optional positional arguments.
Overloading po::value<std::string>()->default_value("") for this
is a bit sneaky but nothing better fits into the existing Shell.cc
framework.
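The effect on the help output can be sketched in a few lines. This is a Python illustration only, not the Shell.cc implementation, and format_positionals is a hypothetical helper: optional positional arguments are wrapped in square brackets, required ones are not.

```python
from typing import Iterable, Tuple


def format_positionals(args: Iterable[Tuple[str, bool]]) -> str:
    """Format (name, is_optional) pairs for a usage line."""
    return " ".join(
        f"[<{name}>]" if optional else f"<{name}>"
        for name, optional in args
    )


usage = format_positionals([
    ("image-spec", False),   # hypothetical required argument
    ("interval", True),
    ("start-time", True),
])
print(usage)
```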
Note that strictly speaking "[<interval>] [<start-time>]" should be
"[<interval> [<start-time>]]" but we aren't doing that here because
the "ceph" command doesn't do it either.
Rishabh Dave [Mon, 7 Feb 2022 18:44:42 +0000 (00:14 +0530)]
monitoring: mention PyYAML only once in requirements
The following error occurs while running "sudo install-deps.sh" -
ERROR: Double requirement given: PyYAML==6.0 (from -r requirements-lint.txt (line 5)) (already in pyyaml (from -r requirements-alerts.txt (line 1)), name='PyYAML')
PyYAML is mentioned twice as a requirement. It is mentioned once in both
the following files -
monitoring/ceph-mixin/requirements-lint.txt
monitoring/ceph-mixin/requirements-alerts.txt
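A small sketch of how such a double requirement could be detected before pip complains. The helper names are hypothetical, the file contents below are made up apart from the two PyYAML entries quoted in the error, and names are only lowercased here, which is simpler than pip's own name normalization.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional


def requirement_name(line: str) -> Optional[str]:
    """Extract a lowercased project name from one requirements line."""
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):   # skip blanks and options like -r
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0).lower() if match else None


def find_duplicates(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each project name listed more than once to the files naming it."""
    seen = defaultdict(list)
    for filename, text in files.items():
        for line in text.splitlines():
            name = requirement_name(line)
            if name:
                seen[name].append(filename)
    return {name: names for name, names in seen.items() if len(names) > 1}


files = {
    "requirements-lint.txt": "yamllint\nPyYAML==6.0\n",   # partly made up
    "requirements-alerts.txt": "pyyaml\n",
}
duplicates = find_duplicates(files)
print(duplicates)
```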