Kefu Chai [Tue, 25 May 2021 06:31:02 +0000 (14:31 +0800)]
mon/OSDMonitor: drop stale failure_info even if can_mark_down()
in a124ee85b03e15f4ea371358008ecac65f9f4e50, we add a check to drop
stale failure_info reports. but if osdmap does not prohibit us from
marking the osd in question down, the branch checking the stale info
is not executed. in general, it is allowed to mark an osd down, so
the fix of a124ee85b03e15f4ea371358008ecac65f9f4e50 just fails to
work.
in this change, we check for stale failure report of osd in question
as long as the osd is not marked down in the same function. this should
address the slow ops of failure report issue.
we should not proceed against user's will if dual stack is specified but
only one network for a network family can be found. the right fix is
have better error message and documentation, not to tolerate the
failure.
Matthew Oliver [Mon, 10 Aug 2020 04:46:21 +0000 (04:46 +0000)]
pick_address: Warn and continue when you find at least 1 IPv4 or IPv6 address
Currently if specify a single public or cluster network, yet have both
`ms bind ipv4` and `ms bind ipv6` set daemons crash when they can't find
both IPs from the same network:
unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''
And rightly so, of course it can't find an IPv4 network in an IPv6
network.
This patch, adds a new helper method, networks_address_family_coverage,
that takes the list of networks and returns a bitmap of address families
supported.
We then check to see if we have enough networks defined and if you don't
it'll warn and then continue.
Also update the network-config-ref to mention having to define both
address family addresses for cluster and or public networks.
As well as a warning about `ms bind ipv4` being enabled by default which
is easy to miss, there by enabling dual stack when you may only be
expect single stack IPv6.
Thee is also a drive by to fix a `note` that wan't being displayed due
to missing RST syntax.
Signed-off-by: Matthew Oliver <moliver@suse.com> Fixes: https://tracker.ceph.com/issues/46845 Fixes: https://tracker.ceph.com/issues/39711
(cherry picked from commit 9f75dfbf364f5140b3f291e0a2c6769bc3d8cbac)
Dan van der Ster [Thu, 29 Apr 2021 23:06:17 +0000 (01:06 +0200)]
mgr/progress: ensure progress stays between [0,1]
If _original_pg_count is 0 then progress can be negative.
Fixes: https://tracker.ceph.com/issues/50591 Related-to: https://tracker.ceph.com/issues/50587 Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 20990a94598d0249745e2ec25c9197d842119d92)
os/FileStore: fix to handle readdir error correctly
Currently filestore code does not handle readdir error.
As man readdir(3) says, we need to check errno after readdir
returns NULL to determine if error happens or not.
This patch fixes the all readdir() calls to check errono and
handle it appropriately:
- FileStore.cc ... abort if EIO error happens
- BtrfsFileStoreBAckend.cc/LFNindex.cc
... return error to upper layer
Without this fixes, primary PG could fail to correctly perform
backfill operation and could lead data loss propagation described
in #50558.
Kefu Chai [Thu, 11 Mar 2021 13:13:13 +0000 (21:13 +0800)]
mon/OSDMonitor: drop stale failure_info
failure_info keeps strong references of the MOSDFailure messages
sent by osd or peon monitors, whenever monitor starts to handle
an MOSDFailure message, it registers it in its OpTracker. and
the failure report messageis unregistered when monitor acks them
by either canceling them or replying the reporters with a new
osdmap marking the target osd down. but if this does not happen,
the failure reports just pile up in OpTracker. and monitor considers
them as slow ops. and they are reported as SLOW_OPS health warning.
in theory, it does not take long to mark an unresponsive osd down if
we have enough reporters. but there is chance, that a reporter fails
to cancel its report before it reboots, and the monitor also fails
to collect enough reports and mark the target osd down. so the
target osd never gets an osdmap marking it down, so it won't send
an alive message to monitor to fix this.
in this change, we check for the stale failure info in tick(), and
simply drop the stale reports. so the messages can released and
marked "done".
will add a trim failures call in the loop, which mutates failure_info,
while we are still iterating this map. so have to restructure the loop
a little bit.
Mark Houghton [Tue, 3 Nov 2020 15:14:06 +0000 (15:14 +0000)]
rgw: rename variable for clarity
Signed-off-by: Mark Houghton <mhoughton@microfocus.com>
(cherry picked from commit 5d22b7d29a25db4f648daf0c51be74702d4149a2) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Conflicts:
src/rgw/rgw_op.cc
Mark Houghton [Tue, 3 Nov 2020 11:10:04 +0000 (11:10 +0000)]
rgw: fix RGWDeleteMultiObj::verify_permission
Signed-off-by: Mark Houghton <mhoughton@microfocus.com>
(cherry picked from commit ba23750bea89a0e9818887abe62db0efef02fe3a) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Conflicts:
src/rgw/rgw_op.cc
Mark Houghton [Tue, 20 Oct 2020 16:54:32 +0000 (17:54 +0100)]
rgw: Honour governance retention override in multi-object delete.
Allow governance retention to be overridden by a suitably privileged user.
Fixes: http://tracker.ceph.com/issues/47586 Signed-off-by: Mark Houghton <mhoughton@microfocus.com>
(cherry picked from commit 6989da1bcbe59e4d561c9d16f0ff891f6c6ef567) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Conflicts:
src/rgw/rgw_op.cc
Mark Houghton [Thu, 15 Oct 2020 11:13:50 +0000 (12:13 +0100)]
rgw: Check S3 object lock date in multi-object delete
Multi-object delete (via the S3 API) will now check each object's retention date in the same way as single object delet does.
Fixes: http://tracker.ceph.com/issues/47586 Signed-off-by: Mark Houghton <mhoughton@microfocus.com>
(cherry picked from commit 1a3f08550813e719b34a8133b83eefa97dd43d3a) Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Conflicts:
src/rgw/rgw_common.h
src/rgw/rgw_common.cc
src/rgw/rgw_op.cc
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Kefu Chai [Thu, 21 Feb 2019 14:54:30 +0000 (22:54 +0800)]
pybind: set language_level for cythonize explicitly
Compiling rbd.pyx because it changed.
[1/1] Cythonizing rbd.pyx
/usr/lib/python2.7/dist-packages/Cython/Compiler/Main.py:367:
FutureWarning: Cython directive 'language_level' not set, using 2 for
now (Py2). This will change in a later re
lease! File: /var/ssd/ceph/src/pybind/rbd/rbd.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
J. Eric Ivancich [Wed, 14 Apr 2021 17:55:22 +0000 (13:55 -0400)]
rgw: during reshard lock contention, adjust logging
When RGW fails to get a lock on a reshard log, we log it in such a way
that it looks like an error. Instead we'll make sure that the log
message is informational.
mon: Modifying trim logic to change paxos_service_trim_max dynamically
Currently, the Paxos Service trim logic is bounded by a max value (paxos_service_trim_max). This change dynamically modifies the max value when the number of logs to be trimmed is higher than paxos_service_trim_max.
The paxos_service_trim_max_multiplier has been added in case we want to increase paxos_service_trim_max by a certain factor. If this option is enabled we get a new upper bound when trim sizes are high.
mon: Adding variables for Paxos trim
1. Define variables for paxos_service_trim_min and paxos_service_trim_max.
2. Use them in place of g_conf()→paxos_service_trim_min and g_conf()→paxos_service_trim_max
Kefu Chai [Tue, 30 Mar 2021 18:32:38 +0000 (02:32 +0800)]
mgr/PyModule: put mgr_module_path before Py_GetPath()
pip comes with _vendor/progress. so there is chance to import the vendored
version of "progress" module instead of the "progress" mgr module, and
fail to import the latter.
in this change, the order of paths are rearranged so the configured
`mgr_module_path` is put before the return value of `Py_GetPath()`.
Conflicts:
src/mgr/PyModule.cc
- nautilus has a preprocessor directive "#if PY_MAJOR_VERSION >= 3"
which is not there in master
- since we still need to support python2, apply the same change to
the #else branch at line 351
Conflicts:
src/pybind/mgr/dashboard/services/access_control.py
- Some of the changes are not backported because those features are
not implemented on nautilus. So I left them as it is
mgr/dashboard: filesystem pool size should use stored stat
Fixes: https://tracker.ceph.com/issues/50195 Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Replaces 'bytes_used' with 'stored' stat to see the correct results
of CephFS pool stats.
mon/MonClient: reset authenticate_err in _reopen_session()
Otherwise, if "mon host" list has at least one unqualified IP address
without a port and both msgr1 and msgr2 are turned on, there is a race
affecting MonClient::authenticate().
For backwards compatibility reasons such an address is expanded into
two entries, each being treated as a separate monitor. For example,
"mon host = 1.2.3.4" generates the following initial monmap:
0: v1:1.2.3.4:6789/0
1: v2:1.2.3.4:3300/0
See MonMap::_add_ambiguous_addr() for details.
Then, the following can happen:
1. we connect to both endpoints and attempt to authenticate
2. authenticate() sets authenticate_err to 1 and sleeps on auth_cond
3. msgr1 authenticates first (i.e. it gets the final MAuth message
before msgr2 gets the monmap)
4. active_con is set to msgr1 connection, msgr2 connection is closed
as redundant
5. _finish_auth() sets authenticate_err to 0 and signals auth_cond,
but before either the monmap is received or authenticate() wakes
up, msgr1 connection is closed due to a network hiccup
6. ms_handle_reset() calls _reopen_session() which clears active_con
and again connects to both endpoints and attempts to authenticate
7. authenticate() wakes up, sees that there is no active_con and goes
back to sleep, but this time with authenticate_err == 0
8. msgr2 authenticates first but doesn't call _finish_auth() because
it is called only if authenticate_err == 1
9. active_con is set to msgr2 connection, msgr1 connection is closed
as redundant
10. authenticate() hangs on auth_cond until timeout defaulting to 5
minutes
The discrepancy between msgr1 and msgr2 plays a key role. For msgr1,
authentication is considered to be complete as soon as the final MAuth
message is received -- the monmap is not waited for. For msgr2,
authentication is considered to be complete only after the monmap is
received.
Avoid the race by setting authenticate_err to 1 in _reopen_session(),
so that _finish_auth() is called on/after every authentication attempt
instead of just the first one.
Aashish Sharma [Thu, 25 Mar 2021 05:55:37 +0000 (11:25 +0530)]
mgr/dashboard:Simplify some complex calculations in test_alerts.yml
run-promtool-unittests is failing with difference in floating point values in some complex calculations. This PR intends to simplify those calculations and fix this issue.
Kefu Chai [Fri, 19 Mar 2021 02:32:16 +0000 (10:32 +0800)]
test: run promtool test without docker on ubuntu/focal
before this change, we use docker for running promtools offered by
a docker image, but this is not efficient, and quite a few developers
do not want to use docker for running "make check". this change was
introduced by #39246, the reason was that, in Ceph's CI process, we
are using Ubuntu/Bionic for running "make check" jobs, but prometheus
packaged by Bionic does not offer the "test rules" command. so, to
address problem, we are using "dnanexus/promtool:2.9.2" docker image
for verifying monitoring/prometheus/alerts/test_alerts.yml.
after this change, we use prometheus packaged by debian derivatives
instead of pulling a docker image.
* debian/control: add prometheus as a "make check" dependency
* install-deps.sh: partially revert 53a5816deda0874a3a37e131e9bc22d88bb2a588, as we don't need to
pull docker or start docker service for using promtool anymore.
* cmake: check if promtool is capable of running "test rules"
command, bail out if it is not.
Conflicts:
debian/control (python-cherrypy3 conficts with python-cherrypy3 | python3-cherrypy3 in nautilus)
install-deps.sh (preload_wheels_for_tox method not in nautilus so removed that)
src/test/CMakeLists.txt (#561-594 new changes overrid these lines, merged correctly now)
Conflicts:
install-deps.sh (changed dnf to yumdnf , preload_wheels method not in nautilus so removed that)
src/test/CMakeLists.txt (#561-594 new changes overrid these lines, merged correctly now)
Aashish Sharma [Mon, 8 Mar 2021 09:44:00 +0000 (15:14 +0530)]
mgr/dashboard: Remove username, password fileds from -Cluster/Manager Modules/dashboard
Username, password fields are empty in Cluster/Manager Modules/dashboard.Since this functionality is when dashboard supported single user-password, now we need to remove these fields from here.
Conflicts:
src/pybind/mgr/dashboard/services/access_control.py(no check_migrate_v0_to_current and check_migrate_v1_to_current methods in nautilus)
src/pybind/mgr/dashboard/tests/test_access_control.py(no test_load_v2 method in nautilus, 'time' import no longer needed with new changes)