Kefu Chai [Fri, 16 Oct 2020 17:10:24 +0000 (01:10 +0800)]
pybind/mgr/dashboard: use setUpClass for initializing class
instead of relying on __init__(), use setUpClass() to initialize the class
for testing. it turns out that in pytest > 4, __init__() is called for the
test class, but the attributes of the instantiated class are in turn overridden.
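A minimal sketch of the pattern described above, using only the standard `unittest` module. The class name, the `endpoints` attribute, and its values are hypothetical and are not taken from the dashboard test code.

```python
import unittest


class ControllerTestCase(unittest.TestCase):
    # Hypothetical test class; the names here are illustrative only.

    @classmethod
    def setUpClass(cls):
        # Runs once per class before any test method, so the attribute
        # is set on the class itself rather than in __init__().
        cls.endpoints = ["/api/health", "/api/summary"]

    def test_endpoints_are_initialized(self):
        # Every test instance sees the class-level attribute.
        self.assertEqual(len(self.endpoints), 2)
```

Because the attribute lives on the class, it does not depend on how the test runner constructs the individual test instances.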
Kefu Chai [Tue, 20 Oct 2020 10:33:29 +0000 (18:33 +0800)]
pybind/mgr/dashboard: refactor overlong statement
to silence lint warning like:
services/tcmu_service.py:64:39: E126 continuation line over-indented for hanging indent':'./services/tcmu_service.py:64:39: E126 continuation line over-indented for hanging indent'}
2: 1 E126 continuation line over-indented for hanging indent
Kefu Chai [Thu, 8 Oct 2020 07:13:36 +0000 (15:13 +0800)]
tools/setup-virtualenv.sh: pass --use-feature=2020-resolver to pip
as long as pip supports this option, pass it to `pip install`
to silence warnings and errors like:
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
autopep8 1.5.4 requires pycodestyle>=2.6.0, but you'll have pycodestyle 2.5.0 which is incompatible.
pytest-cov 2.10.1 requires pytest>=4.6, but you'll have pytest 3.10.1 which is incompatible.
Patrick Donnelly [Tue, 20 Oct 2020 02:26:52 +0000 (19:26 -0700)]
Merge PR #37529 into master
* refs/pull/37529/head:
qa: set rados op timeouts for mds/ceph-fuse
qa: print debug info on mount cleanup
qa: remove redundant rmr
qa: use null mode to prevent undesired changes to mountpoint
qa: unmount all clients before deleting the file system
osdc: add timeout configs for mons/osds
common: accept timespan for SaferCond.wait_for
mgr/cephadm: do not configure Dashboard Ganesha settings
The Dashboard can get cluster information from the Orchestrator.
For settings that are set by previous revisions, the Dashboard will
check them and ask user to remove them.
mgr/dashboard: support Orchestrator and user-defined Ganesha clusters
This change makes the Dashboard support two types of Ganesha clusters:
- Orchestrator clusters (Since Octopus)
- Deployed by the Orchestrator.
- The Dashboard gets the pool/namespace that stores Ganesha
configuration objects from the Orchestrator.
- The Dashboard gets the daemons in a cluster from the Orchestrator.
- User-defined clusters (Since Nautilus)
- Clusters defined by using the `ceph dashboard
set-ganesha-clusters-rados-pool-namespace` command are treated as
user-defined clusters.
- Each daemon has its own RADOS configuration objects. The
Dashboard uses these objects to deduce daemons.
Jason Dillaman [Fri, 16 Oct 2020 15:25:39 +0000 (11:25 -0400)]
journal: possible race condition between flush and append callback
When notifying the journal recorder of an overflow or if the object
close request has completed due to no more in-flight IO, it was
possible for a race between a flush request and the processing of
an append completion to attempt to kick off duplicate notifications.
Since the overflowed and closed callbacks are properly protected from
duplicates, use a counter instead of a boolean to track possible
in-flight handler callbacks.
Fixes: https://tracker.ceph.com/issues/47880
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
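The counter-versus-boolean idea in the commit above can be sketched as follows. This is an illustration in Python under stated assumptions, not the journal recorder's C++ code; `InFlightTracker` and its methods are hypothetical names.

```python
import threading


class InFlightTracker:
    """Hypothetical tracker for handler callbacks that may overlap."""

    def __init__(self):
        self._lock = threading.Lock()
        # A counter rather than a boolean, so two overlapping callbacks
        # (e.g. a flush and an append completion) are both accounted for.
        self._in_flight = 0

    def start(self):
        with self._lock:
            self._in_flight += 1

    def finish(self):
        """Return True only when the last in-flight callback finishes."""
        with self._lock:
            if self._in_flight == 0:
                raise RuntimeError("finish() called without start()")
            self._in_flight -= 1
            return self._in_flight == 0


tracker = InFlightTracker()
tracker.start()                 # flush path kicks off a notification
tracker.start()                 # append completion kicks off another
first_done = tracker.finish()   # one callback is still in flight
second_done = tracker.finish()  # now nothing is in flight
```

With a single boolean flag, the first `finish()` would have cleared the flag while the second callback was still running.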
Kefu Chai [Fri, 16 Oct 2020 14:07:50 +0000 (22:07 +0800)]
crimson/common: schedule action only if the future is not available
otherwise we could call do_until() recursively if we have other tasks
which need to preempt the reactor and the current future's state is actually
always available.
Kefu Chai [Fri, 16 Oct 2020 06:11:52 +0000 (14:11 +0800)]
crimson/common: do not take from a future twice
before this change, in our specialization of seastar::do_until(),
we access `f` after calling `f.get()`, this is not correct. as `f.get()`
actually moves `f._state` away and detaches the associated promise if any.
so we cannot call `f._then()` anymore after calling `f.get()`. as
`f._then()` schedules `f` by detaching the future from promise and
attaching the scheduled task to the promise. but `future_base::detach_promise()`
does not check `_promise` before accessing it, hence the segfault.
after this change, the order of the checks is rearranged so that
`f.get()` is called at the end. and also use `f.get0()` to be more
explicit, as we are accessing the only element of the returned
value.
Adam C. Emerson [Thu, 15 Oct 2020 16:03:13 +0000 (12:03 -0400)]
Merge pull request #37660 from adamemerson/wip-datalog-fix
cls/fifo: Switch use CLS_ERR for errors
rgw/fifo: Fix a few missed return value assignments
rgw/fifo: Add some error logging
rgw/fifo: Catch two instances journaling a new part
rgw/fifo: Use unique_ptr and explicit release for callbacks
Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>
Matthew Oliver [Mon, 10 Aug 2020 04:46:21 +0000 (04:46 +0000)]
pick_address: Warn and continue when you find at least 1 IPv4 or IPv6 address
Currently, if you specify a single public or cluster network, yet have both
`ms bind ipv4` and `ms bind ipv6` set, daemons crash when they can't find
both IPs from the same network:
unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''
And rightly so: of course it can't find an IPv4 address in an IPv6
network.
This patch adds a new helper method, networks_address_family_coverage,
that takes the list of networks and returns a bitmap of address families
supported.
We then check whether enough networks are defined to cover the enabled
address families; if not, it warns and then continues.
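A rough Python sketch of the coverage check described above, using the standard `ipaddress` module. The real helper is C++; the flag values, the comma-separated input format, and the `missing_families` function are assumptions made for illustration.

```python
import ipaddress

# Bit flags are assumptions for this sketch; the real values may differ.
FAMILY_IPV4 = 0x1
FAMILY_IPV6 = 0x2


def networks_address_family_coverage(networks):
    """Return a bitmap of address families covered by a comma-separated
    list of CIDR networks."""
    coverage = 0
    for net in networks.split(","):
        net = net.strip()
        if not net:
            continue
        version = ipaddress.ip_network(net, strict=False).version
        coverage |= FAMILY_IPV4 if version == 4 else FAMILY_IPV6
    return coverage


def missing_families(networks, want_ipv4, want_ipv6):
    """Hypothetical check: list the enabled families with no network."""
    coverage = networks_address_family_coverage(networks)
    missing = []
    if want_ipv4 and not coverage & FAMILY_IPV4:
        missing.append("IPv4")
    if want_ipv6 and not coverage & FAMILY_IPV6:
        missing.append("IPv6")
    return missing


# The single IPv6 network from the error message, with both binds enabled.
missing = missing_families("2001:db8:11d::/120", True, True)
```

A caller could log a warning for each entry in `missing` and carry on, instead of aborting.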
Also update the network-config-ref to mention having to define both
address family addresses for cluster and/or public networks,
as well as a warning about `ms bind ipv4` being enabled by default, which
is easy to miss, thereby enabling dual stack when you may only be
expecting single-stack IPv6.
There is also a drive-by fix for a `note` that wasn't being displayed due
to missing RST syntax.
Signed-off-by: Matthew Oliver <moliver@suse.com>
Fixes: https://tracker.ceph.com/issues/46845
Fixes: https://tracker.ceph.com/issues/39711
Patrick Donnelly [Tue, 13 Oct 2020 17:09:41 +0000 (10:09 -0700)]
qa: set rados op timeouts for mds/ceph-fuse
Now that the osdc Objecter obeys updates to these configs, let's use
them to avoid having them block forever on operations that may never
complete (or should complete in a timely manner).
Fixes: https://tracker.ceph.com/issues/47734
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Have the Objecter track the rados_(mon|osd)_op_timeout configs so that
it can be configured at runtime/startup. This is useful for the
MDS/ceph-fuse so that we can avoid waiting forever for a response from
the Monitors that will never come (statfs on a deleted file system's
pools).
Also: make these configs take a time value rather than double. This is
simpler to deal with in the code and allows time units to be used (e.g.
"5m" for 5 minutes).
Fixes: https://tracker.ceph.com/issues/47734
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
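To illustrate why a time value is easier to handle than a bare double, here is a small Python sketch of parsing a value such as "5m". It is not Ceph's parser: the accepted unit suffixes and the choice to treat a bare number as seconds are assumptions of this example.

```python
import re
from datetime import timedelta

# Unit suffixes accepted by this sketch (an assumption, not Ceph's list).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_timespan(value):
    """Parse strings like '5m' or '30' into a timedelta.

    A bare number is treated as seconds (an assumption of this sketch).
    """
    match = re.fullmatch(r"\s*(\d+)\s*([smhd]?)\s*", value)
    if match is None:
        raise ValueError(f"invalid timespan: {value!r}")
    amount, unit = match.groups()
    return timedelta(seconds=int(amount) * _UNIT_SECONDS[unit or "s"])


timeout = parse_timespan("5m")
```

Here `timeout` equals `timedelta(minutes=5)`, so callers compare durations directly rather than interpreting a raw number of seconds.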
Neha Ojha [Tue, 13 Oct 2020 15:52:20 +0000 (15:52 +0000)]
qa/suites/crimson-rados: add .qa helper
Fixes:
OSError: /home/nojha/src/github.com_ceph_ceph_master/qa/suites/crimson-rados/basic/centos_latest.yaml
does not exist (abs /home/nojha/src/github.com_ceph_ceph_master/qa/suites/crimson-rados/basic/centos_latest.yaml)
Yan, Zheng [Fri, 7 Aug 2020 15:58:19 +0000 (23:58 +0800)]
mds: distribute dirfrags for ephemeral distributed directory
Instead of distributing individual dir inodes inside the ephemeral
distributed dir, distribute dirfrags. This can limit the number of subtrees
created by the ephemeral dist pin.
This patch also unifies the code that handles export pin and ephemeral pin.
Jason Dillaman [Mon, 5 Oct 2020 18:04:14 +0000 (14:04 -0400)]
librbd: support preprocessing source object data prior to deep-copy
Let object dispatch layers potentially mutate the data read from the
source image prior to issuing the actual deep-copy operations against
the destination image.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
The write-ops now only store write vs zero ops and the type of
zero operation is delayed until the actual op is sent. This will
make the state machine compatible with the copyup process hook.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>