Sage Weil [Tue, 10 Sep 2019 21:41:18 +0000 (16:41 -0500)]
osdc/Objecter: resend OSD tell commands on EAGAIN
Request map *and* resend. We don't have map epoch info about when the
reply was sent, and the OSD isn't ordering with respect to epochs anyway.
So, resend now, just in case we already saw a map change, or because we
were suffering from a peering vs command race on the OSD side, and then
also request a new map from the mon, in case we are missing a map update.
Sage Weil [Tue, 10 Sep 2019 03:07:03 +0000 (22:07 -0500)]
osd: route tell commands to asok; migrate commands
- move items from _do_command to asok_command in OSD.cc
- update PG::do_command to take a std::function on_finish
- sprinkle in some osd_lock locking (_do_command implicitly locks osd_lock,
asok_command() does not; most commands don't need it)
Sage Weil [Fri, 6 Sep 2019 20:18:12 +0000 (15:18 -0500)]
mon/MonClient: send tell commands out of band via MCommand
The current tell mon command handling is pretty fragile and semi-broken:
we force the client to (exclusively) connect to the target mon, disrupting
other monclient business, and the retry logic is fragile.
Instead, use entirely independent connections for each tell command, and
tear them down when we get a reply. Implement independent and simple
error handling and timeouts.
Keep most of the old behavior alive so that we can still use tell against
pre-octopus mons.
Sage Weil [Fri, 6 Sep 2019 15:36:31 +0000 (10:36 -0500)]
common/admin_socket: return int from hook call()
Previously, call() returned a bool. Return an int instead so we can
wire this up to tell command return values.
The admin socket 'ceph daemon ...' unix domain socket protocol does not
pass a return code, only data, so we cannot pass these errors that way.
We have two choices: make error codes silently succeed when accessed via
asok (so that we get an error string etc), or make them fail without any
specific error code or string.
Unfortunately, there are several cases where an exception was caught and
what() returned as a string, or where error strings are returned. These
would "blindly" fail if we took the latter approach.
So, for the asok interface, -ENOSYS means a "hard" error that gives the
user no data and makes the 'ceph daemon ...' command return an error code.
Other error codes are interpreted as a success. This is ONLY for the
asok interface; the tell interface has full fidelity with error codes and
error strings.
Note that this means that 'net new' tell-style commands that we move over
to this handler will also appear to succeed via the 'ceph daemon'
interface when they return error codes.
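
A minimal sketch of the return-code rule described above, for illustration only. The function names are hypothetical and are not taken from the Ceph tree; only the -ENOSYS rule comes from the commit message.

```python
import errno


def asok_reports_failure(hook_rc: int) -> bool:
    """Return True if 'ceph daemon ...' should fail for this hook result.

    Hypothetical helper: over the asok interface only -ENOSYS is a
    "hard" error; every other return code is treated as success so
    that any data or error string still reaches the user.
    """
    return hook_rc == -errno.ENOSYS


def tell_return_code(hook_rc: int) -> int:
    """The tell interface passes the hook's return code through unchanged."""
    return hook_rc
```

With this rule, a hook that returns -EINVAL appears to succeed via 'ceph daemon' while still reporting -EINVAL via tell.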
Sage Weil [Fri, 6 Sep 2019 14:44:23 +0000 (09:44 -0500)]
mgr/DaemonServer: route MCommand (for octopus+) to asok commands
Send mgr 'tell' commands (if they originate from an octopus+ client that
knows the difference between MCommand and MMgrCommand) to the asok
command queue.
Sage Weil [Tue, 10 Sep 2019 18:53:54 +0000 (13:53 -0500)]
pybind/ceph_argparse: disambiguate mgr tell and CLI commands
The mgr tell commands are somewhat special in that you can tell the mgr
with an empty id ('ceph tell mgr' or target ('mgr', '')) to get the
currently active mgr. This makes it hard to disambiguate between a tell
command and a CLI command.
Fix that by explicitly setting the target to 'mon-mgr' when a CLI command
is flagged as a mgr command.
Sage Weil [Thu, 5 Sep 2019 22:11:26 +0000 (17:11 -0500)]
common/admin_socket: simplify command routing
Back in e30e937c8962249af283a7571eb106ef444b79e3 we made it possible to
route a command via any prefix. This worked when we wanted to pass
arguments but were just dealing with a vector<string>. These days we have
an actual prefix followed by named arguments, so we don't need this
ad hoc routing.
Derive the prefix from the cmddesc at registration time, and match that
explicitly against the prefix at execution time.
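
A rough sketch of the routing idea, not the actual admin_socket code. It assumes a cmddesc is a space-separated string whose leading literal words form the prefix and whose remaining words are 'name=...' argument descriptors; the class and function names are hypothetical.

```python
def prefix_from_cmddesc(cmddesc: str) -> str:
    """Derive the command prefix: the leading words that carry no '='."""
    words = []
    for word in cmddesc.split():
        if "=" in word:
            break  # first argument descriptor ends the prefix
        words.append(word)
    return " ".join(words)


class CommandTable:
    """Hypothetical registry keyed by the prefix derived at registration."""

    def __init__(self):
        self._hooks = {}

    def register(self, cmddesc, hook):
        self._hooks[prefix_from_cmddesc(cmddesc)] = hook

    def call(self, cmd):
        # Exact match on the explicit prefix; no ad hoc prefix routing.
        hook = self._hooks[cmd["prefix"]]
        return hook(cmd)
```

Usage: registering "config set name=var,type=CephString" stores the hook under "config set", and a command dict with that prefix is dispatched to it.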
mgr/dashboard: Bucket names cannot be formatted as IP address
In general, bucket names should follow domain name constraints:
- Bucket names must be unique.
- Bucket names cannot be formatted as IP address.
- Bucket names can be between 3 and 63 characters long.
- Bucket names must not contain uppercase characters or underscores.
- Bucket names must start with a lowercase letter or number.
- Bucket names must be a series of one or more labels. Adjacent labels are separated by a single period (.). Bucket names can contain lowercase letters, numbers, and hyphens. Each label must start and end with a lowercase letter or a number.
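
The local rules above can be sketched as a small validator. This is an illustration only, not the dashboard's implementation; the function name is hypothetical, the IP check is assumed to mean IPv4 dotted-quad form, and uniqueness is not checked because it needs a lookup against existing buckets.

```python
import ipaddress
import re

# One label: lowercase letters, digits and hyphens, starting and
# ending with a lowercase letter or digit.
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_bucket_name(name: str) -> bool:
    """Check the length, IP-address and label rules (not uniqueness)."""
    if not 3 <= len(name) <= 63:
        return False
    try:
        ipaddress.IPv4Address(name)
        return False  # formatted as an IP address
    except ValueError:
        pass
    # Adjacent labels are separated by a single period, so an empty
    # label (from "..", or a leading or trailing ".") is rejected.
    return all(LABEL.fullmatch(label) for label in name.split("."))
```

For example, "my-bucket.logs" passes, while "192.168.1.1", "My_Bucket" and "a..b" are rejected.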
On IBM Z the Boost tagged pointer implementation cannot use
"pointer compression" as there are no unused bits in an address;
the whole 64-bit address space is available to user space code.
Instead, Boost uses 16-byte atomics. This is always supported
on IBM Z, but depending on the particular compiler (version)
it may require linking against libatomic. The existing checks
in CheckCxxAtomic.cmake do not catch this, however, as they only
test for (up to) 8-byte atomic support.
Fixed by adding a test for 16-byte atomic support on IBM Z.
The 'osd_op_queue_cut_off' config option determines which level of
high priority ops should use strict priority ordering and may change
from time to time. Since the main strategy of 'osd_kick_recovery_op_priority'
is simply to follow 'osd_op_queue_cut_off', we can instead make direct
use of 'osd_op_queue_cut_off' to achieve the same thing explicitly.
Sage Weil [Tue, 24 Sep 2019 17:05:24 +0000 (12:05 -0500)]
osd/PeeringState: skip wait state if osd set is empty
If there are no down OSDs from prior intervals, then the normal peering
process will end up contacting all of the prior OSDs and ensuring that
their prior interval is terminated during peering.
Sage Weil [Mon, 23 Sep 2019 19:46:07 +0000 (14:46 -0500)]
osd: is_replica() -> is_nonprimary()
The 'replica' term does not map well onto EC pools. More importantly,
the implementation is often wrong for EC pools, where role may be 0 or 1
independent of whether the OSD is the primary or not.
Introduce 'nonprimary' to mean an acting osd that is not the primary.
Sage Weil [Tue, 6 Aug 2019 22:04:44 +0000 (17:04 -0500)]
osd/PeeringState: piggyback lease and ack on activation messages
The lease goes out with the MOSDPGLog or info, and the ack comes back with
the info.
We no longer need to renew the lease explicitly in
all_activated_and_committed() because we *just* piggybacked on activation.
We can just wait for the normal renew event to fire.
Sage Weil [Tue, 6 Aug 2019 03:05:38 +0000 (22:05 -0500)]
osd/PeeringState: renew before activate messages; send after activated
We want to renew before we prepare or send activate messages so that we
have the opportunity to include leases in them (coming soon!).
And we do not want to send explicit lease messages until we know that the
peers have activated. In particular, we want to avoid queueing a notify
(via pending_activators) and then sending a lease that will arrive before
it.
If we see that a prior_readable_down_osd is known to be dead, we can
remove it from the set. And if the set is empty, we can skip the rest of
our waiting period and leave the WAIT state.
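
The pruning rule can be illustrated with a tiny sketch. The function name and the plain-set representation are assumptions made for this example and do not mirror the PeeringState code.

```python
def prune_dead_osds(prior_readable_down_osds, known_dead):
    """Drop OSDs known to be dead; report whether WAIT can be skipped.

    Returns (remaining, can_leave_wait). can_leave_wait is True only
    when no down OSD from a prior interval is still being tracked.
    """
    remaining = set(prior_readable_down_osds) - set(known_dead)
    return remaining, not remaining
```

For instance, with down OSDs {1, 2} and OSD 1 known dead, OSD 2 is still tracked and the wait continues; once OSD 2 is also known dead, the set is empty and the wait can end.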
Sage Weil [Tue, 23 Jul 2019 19:07:59 +0000 (14:07 -0500)]
osd/PeeringState: track down OSDs relevant to prior_readable_until_ub
Keep track of which OSDs from the prior set we care about that affect
the prior_readable_until_ub. Note that it is only the *down* OSDs that
we have to track here, since everything in the *probe* set we will already
contact during peering (they are still up), guaranteeing that those PGs
are aware of the interval change and are no longer readable in the prior
interval.
install-deps.sh: only install python-srpm-macros for required macros
the reason why we need to install these macros is to solve the
chicken-and-egg problem -- to set `_python_buildid` and `python3_pkgversion`
so that we can prepare the build dependencies and install them. of these,
`_python_buildid` is defined using `python3_pkgversion`. this macro is
offered by python-srpm-macros.
the other macros, like `python3_sitelib` and `__python3` are offered by
`python3-rpm-macros`.
this change also avoids the issue we hit if we install `*rpm-macros` on CentOS8:
Error:
Problem: package R-rpm-macros-1.1.0-2.el8.noarch requires /usr/bin/Rscript, but none of the providers can be installed
- package R-rpm-macros-1.1.0-2.el8.noarch requires R-core, but none of the providers can be installed
- conflicting requests
- nothing provides libRblas.so()(64bit) needed by R-core-3.6.1-1.el8.x86_64
- nothing provides openblas-Rblas needed by R-core-3.6.1-1.el8.x86_64
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)