Due to bugs in cache management in blkid, it is possible to have
nonexistent entries. These entries break ceph-volume operations by
producing two or more outputs instead of one (e.g. /dev/sdk2).
Kefu Chai [Thu, 20 May 2021 05:55:13 +0000 (13:55 +0800)]
os/bluestore/bluestore_tool: compare retval of stat() with -1
before this change, stat() is always called to check if the
file specified by --dev-target exists even if this option is not
specified. also, we compared the retval of stat() with ENOENT, while
stat() returns -1 on error.
after this change, stat() is called only if --dev-target is specified,
and we compare the retval of stat() with -1 and 0 only, so if the
--dev-target option is not specified, the tool still behaves as
expected.
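A minimal sketch of the corrected pattern (illustrative only; the
function and variable names are assumptions, not the tool's actual
code):

    #include <sys/stat.h>
    #include <cerrno>
    #include <string>

    int check_dev_target(const std::string& dev_target) {
      if (dev_target.empty())
        return 0;  // option not given: nothing to stat(), prior behavior kept
      struct stat st;
      if (::stat(dev_target.c_str(), &st) == -1) {
        // stat() returns -1 on failure and sets errno (e.g. to ENOENT);
        // it never returns ENOENT itself, so comparing the retval with
        // ENOENT was wrong
        return -errno;
      }
      return 0;
    }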
Igor Fedotov [Fri, 19 Feb 2021 11:31:52 +0000 (14:31 +0300)]
ceph-volume: implement bluefs volume migration.
This is a wrapper over ceph-bluestore-tool's bluefs-bdev-migrate command.
It is primarily intended to introduce the LVM tag manipulation that
ceph-bluestore-tool lacks.
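An illustrative invocation (following the shape documented in the
ceph-volume man page; verify the flags against your release):

    # move the DB volume of OSD 1 to a new LV, updating LVM tags on the way
    ceph-volume lvm migrate --osd-id 1 --osd-fsid <osd-fsid> --from db --target vgname/new_db_lv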
Conflicts:
doc/man/8/ceph-volume.rst - a bit different formatting is in use
src/ceph-volume/ceph_volume/api/lvm.py - get_single_lv is the
new name for get_first_lv
Igor Fedotov [Mon, 17 May 2021 19:23:26 +0000 (22:23 +0300)]
os/bluestore: fix unexpected ENOSPC in Avl/Hybrid allocators.
The Avl allocator was returning an unexpected ENOSPC in first-fit mode
if all size-matching available extents were unaligned and applying the
alignment made all of them shorter than required. Since no lookup retry
with a smaller size was attempted, ENOSPC was returned.
Additionally, we should proceed with a lookup in best-fit mode even when
the original size has been truncated to match the available size
(force_range_size_alloc==true).
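A sketch of the failure mode (generic arithmetic, not the allocator's
code):

    #include <cstdint>

    // usable length of extent [start, end) once its start is rounded up
    // to `align` (align must be a power of two)
    uint64_t usable_after_align(uint64_t start, uint64_t end, uint64_t align) {
      uint64_t aligned_start = (start + align - 1) & ~(align - 1);
      return end > aligned_start ? end - aligned_start : 0;
    }
    // if this value is < the wanted size for every size-matching extent
    // and no retry with a smaller size is attempted, ENOSPC is reported
    // prematurely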
Fixes: https://tracker.ceph.com/issues/50656
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit 0eed13a4969d02eeb23681519f2a23130e51ac59)
Conflicts:
src/test/objectstore/Allocator_test.cc - legacy INSTANTIATE_TEST_CASE_P clause is still used in Nautilus
Ilya Dryomov [Wed, 26 May 2021 12:21:22 +0000 (14:21 +0200)]
librbd: don't stop at the first unremovable image when purging
As there is no inherent ordering, there may be multiple removable
images past the unremovable image. On top of that, removing a clone
may make its parent removable, so perform an additional pass if any
image gets removed.
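The general shape of the fix, as a sketch (hypothetical types and
helpers, not librbd's API):

    #include <vector>

    struct Image { /* ... */ };
    bool try_remove(Image& image);  // hypothetical: true if removal succeeded

    void purge(std::vector<Image>& trash_images) {
      bool removed;
      do {                   // sweep again whenever a pass removed anything,
        removed = false;     // since removing a clone may unblock its parent
        for (auto& image : trash_images)
          if (try_remove(image))
            removed = true;
      } while (removed);
    }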
monitoring/grafana: Remove erroneous elements in hosts-overview Grafana dashboard
The hosts-overview Grafana dashboard json file contains a repeated element, making
it invalid JSON. Some JSON parsers handle this. However, this prevents Jsonnet
from parsing the dashboard, which prevents the deployment of this dashboard via
Jsonnet.
Deepika Upadhyay [Wed, 26 May 2021 19:25:13 +0000 (00:55 +0530)]
nautilus: qa/upgrade: disable update_features test_notify with older client as lockowner
* with the recent support for async rbd operations in pacific+, when an
older client (without async support) is upgraded and simultaneously
interacts with a newer client that expects the requests to be async, it
experiences a hang: it takes the return code for request completion as
the acknowledgement for the async request, and then keeps waiting for
another acknowledgement of request completion.
this, if it happens, should be rare and only occur when the lock owner
is an old client; the test should be deferred if compatibility issues
arise.
* amend upgrade test workunits to use respective stable branches
Xiubo Li [Fri, 7 Aug 2020 07:45:52 +0000 (15:45 +0800)]
msg: throw a system error when center.init fails
In the libcephfs test case, hundreds of threads run in
parallel, so the open files limit may possibly be reached, but there
are no useful logs about what has happened.
With this change a system error is thrown instead, like:
C++ exception with description "(24) Too many open files" thrown in the test body.
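A minimal sketch of the pattern (the helper stands in for the failing
center.init() call; names are assumptions):

    #include <system_error>

    void throw_on_init_failure(int r) {
      // r is the negative errno from the failed init
      if (r < 0)
        throw std::system_error(-r, std::generic_category());
        // for EMFILE the message reads "Too many open files"
    }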
Fixes: https://tracker.ceph.com/issues/43039
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 6338050)
Conflicts:
src/msg/async/Stack.cc
- nautilus uses a plain "i" as the for loop counter variable, while
master uses the more descriptive "worker_id"
Xiubo Li [Fri, 7 Aug 2020 08:21:07 +0000 (16:21 +0800)]
libcephfs: ignore restoring the open files limit
Let's just skip restoring the open files limit: the kernel will defer
releasing the file descriptors, and the process could then possibly
reach the open files limit again.
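A sketch of the idea (an assumption about the fix's shape, not its
exact code):

    #include <sys/resource.h>

    void raise_nofile_limit() {
      struct rlimit rl;
      getrlimit(RLIMIT_NOFILE, &rl);
      rl.rlim_cur = rl.rlim_max;    // raise the soft limit for the test
      setrlimit(RLIMIT_NOFILE, &rl);
      // intentionally never restored: the kernel may release fds lazily,
      // so restoring the lower limit could push the process back over it
    }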
Fixes: https://tracker.ceph.com/issues/43039
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit c871d68)
Patrick Donnelly [Mon, 22 Mar 2021 16:17:43 +0000 (09:17 -0700)]
mgr/pybind/volumes: avoid acquiring lock for thread count updates
Perform thread count updates in a dedicated tick thread. This prevents
the mgr Finisher thread from potentially getting hung via a mutex
deadlock in the cloner thread management.
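The general shape, as a sketch (the actual module is Python; names here
are illustrative):

    #include <atomic>
    #include <chrono>
    #include <thread>

    std::atomic<bool> stopping{false};
    void update_cloner_thread_count();  // hypothetical work item

    std::thread start_ticker() {
      return std::thread([] {
        while (!stopping) {
          // runs outside the Finisher thread, so a stuck mutex in the
          // cloner management cannot hang the mgr's Finisher
          update_cloner_thread_count();
          std::this_thread::sleep_for(std::chrono::seconds(1));
        }
      });
    }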
Fixes: https://tracker.ceph.com/issues/49605
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b27ddfaed4a3c66bac2343c8315a1fe542edb63e)
Kefu Chai [Tue, 25 May 2021 06:31:02 +0000 (14:31 +0800)]
mon/OSDMonitor: drop stale failure_info even if can_mark_down()
in a124ee85b03e15f4ea371358008ecac65f9f4e50, we added a check to drop
stale failure_info reports. but if the osdmap does not prohibit us from
marking the osd in question down, the branch checking the stale info
is not executed. in general, it is allowed to mark an osd down, so
the fix of a124ee85b03e15f4ea371358008ecac65f9f4e50 simply fails to
work.
in this change, we check for a stale failure report of the osd in
question as long as the osd is not marked down in the same function.
this should address the slow ops caused by stale failure reports.
Ilya Dryomov [Mon, 17 May 2021 19:16:16 +0000 (21:16 +0200)]
mon/MonClient: tolerate a rotating key that is slightly out of date
Commit 918c12c2ab5d ("monclient: avoid key renew storm on clock skew")
made wait_auth_rotating() wait for a key set with a valid "current" key
(instead of any key set, including with all keys expired if the clocks
are skewed). While a good idea in general, this is a bit too stringent
because the monitors will hand out key sets with "current" key that is
_just_ about to expire. There is nothing wrong with that as "next" key
is also there, valid for the entire auth_service_ticket_ttl. So even
if the daemon is talking to the leader, it is possible to get a key set
with an expired "current" key. If the daemon is talking to a peon, it
is pretty easy to run into in practice. This, coupled with the fact
that _check_auth_rotating() explicitly allows the keys to go slightly
out of date, can lead to wait_auth_rotating() stalling the boot for up
to 30 seconds:
15:41:11.824+0000 1 ... ==== auth_reply(proto 2 0 (0) Success)
15:41:41.824+0000 0 monclient: wait_auth_rotating timed out after 30
15:41:41.824+0000 -1 mds.b unable to obtain rotating service keys; retrying
Apply the same 30 second or less tolerance in wait_auth_rotating().
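A sketch of the relaxed cutoff (the ttl/4 term is an assumption based
on the tolerance _check_auth_rotating() is described as allowing):

    #include <algorithm>
    #include <ctime>

    // accept a "current" key that went stale within the last 30 s (or less)
    time_t rotating_cutoff(time_t now, double auth_service_ticket_ttl) {
      double slack = std::min(30.0, auth_service_ticket_ttl / 4.0);
      return now - static_cast<time_t>(slack);
    }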
client: Fix executable access check for the root user
The executable permission check always returned success for root
even when the executable bit is not set for any of the user,
group, or others bits. This patch fixes it by applying the root
override for the executable permission check only if at least one
of the executable bits is set.
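A sketch of the corrected logic (helper name and calling convention are
assumptions, not Client.cc verbatim):

    #include <sys/stat.h>
    #include <cerrno>

    // `mode` is the inode's st_mode; root may execute a file only if at
    // least one execute bit is set on it
    int root_exec_check(unsigned mode) {
      return (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) ? 0 : -EACCES;
    }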
Conflicts:
src/client/Client.cc: The commit 6aa78836548f (cephfs errno aliases) is not present in
nautilus, plus some other trivial conflicts, possibly because some patches are
missing in nautilus.
Alfonso Martínez [Fri, 22 May 2020 11:36:10 +0000 (13:36 +0200)]
mgr/dashboard: grafana panels for rgw multisite sync performance
* RGW sync perf. counters are now exposed through grafana panels.
* Sync Performance tab is only shown if rgw realm is detected.
* Prometheus module: added metrics suitable for prometheus consumption (derived from the existing ones rather than replacing them, for backward compatibility).
Fixes: https://tracker.ceph.com/issues/45310
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit cf4ff7d2f03bc285a3fae3f27577333f11dab58a)
Conflicts:
- Solved conflicts from cherry-pick:
src/pybind/mgr/dashboard/controllers/rgw.py
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-form/rgw-bucket-form.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-form/rgw-bucket-form.component.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-daemon-list/rgw-daemon-list.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-site.service.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-site.service.ts
src/pybind/mgr/dashboard/services/rgw_client.py
src/pybind/mgr/dashboard/tests/test_rgw_client.py
- src/pybind/mgr/dashboard/tools.py: added method included in a feature not to be backported.
- src/pybind/mgr/dashboard/module.py: fixed linting issue.
we should not proceed against the user's will if dual stack is
specified but a network can be found for only one address family. the
right fix is to have a better error message and documentation, not to
tolerate the failure.
Matthew Oliver [Mon, 10 Aug 2020 04:46:21 +0000 (04:46 +0000)]
pick_address: Warn and continue when you find at least 1 IPv4 or IPv6 address
Currently, if you specify a single public or cluster network yet have
both `ms bind ipv4` and `ms bind ipv6` set, daemons crash when they
can't find addresses of both families in the same network:
unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''
And rightly so; of course it can't find an IPv4 address in an IPv6
network.
This patch adds a new helper method, networks_address_family_coverage,
that takes the list of networks and returns a bitmap of the address
families covered.
We then check whether enough networks are defined; if not, we warn and
continue.
Also update the network-config-ref to mention having to define
addresses of both address families for cluster and/or public networks,
as well as to add a warning about `ms bind ipv4` being enabled by
default, which is easy to miss, thereby enabling dual stack when you
may only expect single-stack IPv6.
There is also a drive-by fix for a `note` that wasn't being displayed
due to missing RST syntax.
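A sketch of the helper's idea (the Network stand-in and return
convention are assumptions, not the actual prototype):

    #include <vector>

    struct Network { bool v4, v6; };  // stand-in for a parsed CIDR entry

    // bitmap of address families covered by the configured networks
    unsigned networks_address_family_coverage(const std::vector<Network>& nets) {
      unsigned coverage = 0;
      for (const auto& n : nets) {
        if (n.v4) coverage |= 1u << 0;
        if (n.v6) coverage |= 1u << 1;
      }
      // the caller warns, rather than crashes, when a requested family
      // is not covered
      return coverage;
    }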
Signed-off-by: Matthew Oliver <moliver@suse.com>
Fixes: https://tracker.ceph.com/issues/46845
Fixes: https://tracker.ceph.com/issues/39711
(cherry picked from commit 9f75dfbf364f5140b3f291e0a2c6769bc3d8cbac)
Dan van der Ster [Thu, 29 Apr 2021 23:06:17 +0000 (01:06 +0200)]
mgr/progress: ensure progress stays between [0,1]
If _original_pg_count is 0 then progress can be negative.
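The guard, as a sketch (the module itself is Python; names are
hypothetical):

    #include <algorithm>

    double progress(int pending_pgs, int original_pg_count) {
      // avoid dividing by zero, then clamp the ratio into [0, 1]
      double p = original_pg_count > 0
          ? 1.0 - static_cast<double>(pending_pgs) / original_pg_count
          : 1.0;
      return std::min(1.0, std::max(0.0, p));
    }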
Fixes: https://tracker.ceph.com/issues/50591
Related-to: https://tracker.ceph.com/issues/50587
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 20990a94598d0249745e2ec25c9197d842119d92)
Conflicts:
src/test/libcephfs/test.cc
- octopus is missing a bunch of tests, but this doesn't matter because the
commit being cherry-picked did not touch those
Jeff Layton [Mon, 1 Feb 2021 16:04:07 +0000 (11:04 -0500)]
client: add ceph_ll_lookup_vino
Add a new API function for looking up an inode via a vinodeno_t. This
should give ganesha a way to reliably look up snapshot inodes.
We do need to add some special handling for CEPH_SNAPDIRs. If we're
looking for one, then find the non-snapped parent, and then call
open_snapdir to get the snapdir inode.
Also, have the function check the local cache before calling the MDS
to look up an inode.
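Illustrative use of the new entry point (check the libcephfs header of
your release for the exact prototype; this shape is an assumption):

    struct Inode *inode = NULL;
    vinodeno_t vino;
    vino.ino = ino;        // inode number
    vino.snapid = snapid;  // a CEPH_SNAPDIR is resolved via open_snapdir
    int r = ceph_ll_lookup_vino(cmount, vino, &inode);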
Fixes: https://tracker.ceph.com/issues/48991
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit 70622079c2ec55222a139fa5042902e0b19bd839)
Jeff Layton [Mon, 1 Feb 2021 15:41:14 +0000 (10:41 -0500)]
client: make _lookup_ino take a vinodeno_t
Currently, it always leaves the snapid as 0. Rename it to
_lookup_vino and make it fill the snapid from the vinodeno_t
instead, but only when it's a "real" snapid.
Change the existing callers to pass in a vinodeno_t with the
snapid set to CEPH_NOSNAP.
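The caller-side change, sketched (the signature is assumed from the
description above):

    // existing callers keep their old behavior by passing CEPH_NOSNAP
    vinodeno_t vino(ino, CEPH_NOSNAP);
    int r = _lookup_vino(vino, perms, &in);  // fills snapid only for "real" snapids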
Jeff Layton [Wed, 3 Feb 2021 13:12:41 +0000 (08:12 -0500)]
client: stop doing unnecessary work in ll_lookup_inode
It's not clear to me why we're looking up the parent and name of the
inode in ll_lookup_inode, as we don't actually do anything with them.
Just return once we get an inode reference.