Adam Kupczyk [Mon, 24 May 2021 12:27:05 +0000 (14:27 +0200)]
os/bluestore/bluefs: Add test that detects bluefs inconsistency
Add a test that detects a scenario in which BlueFS ends up with a file
whose contents have never been written. This is done by tricking the
replay log into accepting file metadata (size, allocations) while the data
stored in those allocations is not yet synced to disk.
Scenario:
1) write to file h1 on SLOW device
2) flush h1 (which marks h1 for inclusion in the bluefs replay log)
3) write to file h2
4) fsync h2 (forces replay log to be written)
The result is:
- bluefs log now has stable state of h1
- SLOW device is not yet flushed (no fdatasync())
Adam Kupczyk [Mon, 24 May 2021 12:49:51 +0000 (14:49 +0200)]
os/bluestore/bluefs: Remove possibility of bluefs replay log containing files without data
It was possible for the bluefs replay log to serialize file metadata (size, allocations)
while the actual data stored in those allocations was not yet synced to disk.
This could happen if _flush_range(h1) allocated space for file h1 on a device (such as SLOW) that will not
be used when flushing the next replay log. This can happen when h2 wrote to WAL and
our replay log is on DB. After fsync(h2) we write to the replay log and wait for fdatasync on WAL and DB.
There is no waiting on SLOW, but h1 was dirty and has been serialized to the replay log.
The solution is to delay notifying the replay log that it has to include h1 until after fdatasync has finished.
Fixes: https://tracker.ceph.com/issues/50965
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 03ac53f7d4c83e56f664ad371ffe3bc2d40e1837)
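The scenario and the fix described above can be sketched with a toy model. This is not BlueFS code: the class `ToyBlueFS`, its method names, and the `delay_dirty_marking` switch are all hypothetical and exist only to illustrate why delaying the "dirty" notification until after fdatasync keeps h1 out of the replay log.

```python
class ToyBlueFS:
    """Toy model only; names and structure are assumptions, not the BlueFS API."""

    def __init__(self, delay_dirty_marking):
        self.delay_dirty_marking = delay_dirty_marking
        self.unsynced = {}        # device -> files whose data is not yet on disk
        self.pending_dirty = {}   # device -> files waiting for that device's fdatasync
        self.dirty_files = set()  # files the next replay log write will serialize
        self.log = []             # files whose metadata is stable in the replay log

    def flush_range(self, name, device):
        # Data is handed to the device but not yet synced.
        self.unsynced.setdefault(device, set()).add(name)
        if self.delay_dirty_marking:
            self.pending_dirty.setdefault(device, set()).add(name)
        else:
            self.dirty_files.add(name)  # old behaviour: dirty immediately

    def fdatasync(self, device):
        self.unsynced.pop(device, None)
        # New behaviour: files become dirty only once their data is synced.
        self.dirty_files |= self.pending_dirty.pop(device, set())

    def fsync(self, name, device, log_device="DB"):
        self.flush_range(name, device)
        if self.delay_dirty_marking:
            self.fdatasync(device)
        # Write the replay log with every file currently marked dirty.
        self.log.extend(sorted(self.dirty_files))
        self.dirty_files.clear()
        # Only the file's device and the log device are synced; SLOW is not.
        for dev in (device, log_device):
            self.fdatasync(dev)

    def files_in_log_without_data(self):
        unsynced = set().union(*self.unsynced.values())
        return sorted(set(self.log) & unsynced)


def run_scenario(delay_dirty_marking):
    fs = ToyBlueFS(delay_dirty_marking)
    fs.flush_range("h1", "SLOW")  # steps 1-2: write and flush h1 on SLOW
    fs.fsync("h2", "WAL")         # steps 3-4: write and fsync h2
    return fs.files_in_log_without_data()
```

In this model `run_scenario(False)` reports h1 as logged without synced data, while `run_scenario(True)` reports nothing.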
This _mkdir_p should never have worked as the first directory it tries
to stat/mkdir is "", the empty string. This causes an assertion in the
client. I'm not sure how this code ever functioned without causing
faults. They look like:
2021-07-01 02:15:04.449 7f7612b5ab80 3 client.178735 statx enter (relpath want 2047)
胡玮文 [Mon, 21 Jun 2021 13:31:49 +0000 (21:31 +0800)]
mgr/dashboard: fix OSD out count
Suppose we have 3 OSDs out but up (being prepared for re-formatting to change min_alloc_size), and another OSD down but in
(during reboot). The dashboard will display "1 down, 2 out", which is incorrect. It should be "1 down, 3 out".
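A minimal sketch of the expected counting, assuming each OSD is represented as a dict with boolean `up` and `in` fields (a hypothetical shape, not the dashboard's actual data model). The point is that "down" and "out" are counted independently of each other.

```python
def osd_summary(osds):
    # "down" and "out" are independent states, so count each on its own.
    down = sum(1 for osd in osds if not osd["up"])
    out = sum(1 for osd in osds if not osd["in"])
    return f"{down} down, {out} out"


# The example from the commit message: 3 OSDs out but up, 1 OSD down but in.
example_osds = [
    {"up": True, "in": False},
    {"up": True, "in": False},
    {"up": True, "in": False},
    {"up": False, "in": True},
]
summary = osd_summary(example_osds)
```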
The rgw bucket creation form has a Name field which has an async
validator. The validator fetches all the bucket names and checks whether the
entered name is unique. This happens on every keystroke. So if
there are 100 or more buckets, the async validation can be really
slow and cause misvalidations in different fields.
I changed the validation logic and did some cleanups to improve the
performance of the async validation.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-form/rgw-bucket-form.component.ts
- Solved some import conflicts. Used the I18N import and removed the
forkJoin import
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.spec.ts
- Don't need ${RgwHelper.DAEMON_QUERY_PARAM}
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.ts
- Removed enumerate function
Ilya Dryomov [Sun, 2 May 2021 21:13:29 +0000 (23:13 +0200)]
qa/workunits/rbd: disable qemu-iotest test 055 globally
It doesn't work on Focal and is already disabled on CentOS 7 and 8. More
importantly, it doesn't actually test rbd -- it always tests "file", no
matter which protocol is specified in IMGPROTO.
Aaryan Porwal [Wed, 26 May 2021 08:58:15 +0000 (14:28 +0530)]
mgr/dashboard: fix for right sidebar nav icon not clickable
Fixed the responsive sidebar not opening on click, and close the sidebar when a tasks or notification list item is clicked, because it would otherwise be overshadowed by the sidebar.
Signed-off-by: Aaryan Porwal <aaryanporwal2233@gmail.com>
(cherry picked from commit 4e53a139d96215477d00eb709c1662d8277cba1d)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/core/navigation/navigation/navigation.component.html
- Adopt the master branch changes.
Patrick Donnelly [Fri, 18 Jun 2021 16:27:54 +0000 (09:27 -0700)]
mds: avoid journaling overhead for ceph.dir.subvolume for no-op case
In preparation for acquiring the xlock on the directory inode, the MDS
must journal a few events before continuing on with the setvxattr. This
can cause significant delays in the volumes ceph-mgr module which needs
to regularly enable this vxattr from multiple code paths. We could cache
in that module whether the vxattr is set but it's also pretty easy to
adjust the MDS to acquire a rdlock on the directory to check if the
subvolume flag is already set. That is much lighter weight and the lock
is generally readily available.
Fixes: https://tracker.ceph.com/issues/51276
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b5f736eee408c220ffdfb67b10667a7b553dac25)
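The idea of checking the flag cheaply before taking the expensive path can be illustrated with a toy model. Everything here is hypothetical (`ToyDirInode`, `set_subvolume`, the event counter); it is not MDS code and only stands in for "check under a rdlock, journal and xlock only when the value actually changes".

```python
class ToyDirInode:
    """Hypothetical stand-in for a directory inode with a subvolume flag."""

    def __init__(self):
        self.subvolume = False
        self.journaled_events = 0  # counts how often the expensive path ran


def set_subvolume(inode, value):
    # Cheap path: a read-only check (stands in for acquiring a rdlock).
    if inode.subvolume == value:
        return False  # no-op, nothing journaled
    # Expensive path: stands in for journaling and acquiring the xlock.
    inode.journaled_events += 1
    inode.subvolume = value
    return True


inode = ToyDirInode()
results = [set_subvolume(inode, True) for _ in range(3)]
```

In this sketch, repeated requests to set an already-set flag return early and never reach the expensive path.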
Deepika Upadhyay [Wed, 26 May 2021 09:11:55 +0000 (14:41 +0530)]
rados/cephadm/qa/distros: update to latest distros
- removes ubuntu_18.04 support for podman, instead we move to focal.
- use rhel_8.3 for all rhel_8 references
- use {centos/rhel}_8 instead of {rhel/centos}_latest: to keep things
same in master and octopus since we use: rhel_8 and centos_8 as latest
version symlinks, which differentiated after an octopus only commit.
this was not cherry picked from master as octopus had some of the
symlinks, not in sync with master, this commit does cleanup for them,
and tries to make them similar to master.
Sage Weil [Wed, 3 Mar 2021 14:14:29 +0000 (08:14 -0600)]
qa: new kubic distro files; use kubic podman for centos/rhel
The current centos/rhel version of podman (2.2.1) is broken.
- create new qa/distros/podman/* files that install kubic podman
- include centos/rhel variants
- adjust cephadm jobs to use new yaml files
- remove old qa/distros/all/*_podman.yaml files
trivial fix: we do not have cephadm/thrash suite in octopus(removed)
- distro(from octopus) renamed to 0-distro(from pacific)
Tatjana Dehler [Thu, 27 May 2021 09:46:50 +0000 (11:46 +0200)]
mgr/dashboard: show partially deleted RBDs
An RBD might be partially deleted if the deletion
process has been started but was interrupted. In
this case return the RBD as part of the RBD list
and mark it as partially deleted.
Fixes: https://tracker.ceph.com/issues/48603
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit d83c277ac1861df31d2a39d16e20c7bebbea676e)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-details/rbd-details.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/block/rbd-list/rbd-list.component.ts
src/pybind/mgr/dashboard/services/rbd.py
src/pybind/mgr/dashboard/tests/test_rbd_service.py
Resolved various conflicts because octopus and
master diverged a lot.
Conflicts:
src/mds/Server.cc
- most of the master commit was already backported via c5362b8464bdafbea7556acdee9e877b71ed4f8d
This backports just one small part that was missed in that commit.
Kefu Chai [Fri, 4 Jun 2021 03:25:12 +0000 (11:25 +0800)]
debian/control: ceph-mgr-modules-core does not Recommend ceph-mgr-rook anymore
per https://www.debian.org/doc/debian-policy/ch-relationships.html
> Recommends
> This declares a strong, but not absolute, dependency.
>
> The Recommends field should list packages that would be found together
> with this one in all but unusual installations.
ceph-mgr-modules-core provides a set of ceph-mgr modules which are
always enabled. But the rook module enables ceph-mgr to install and
configure a Ceph cluster using Rook. This module is very useful, but
it does not have such a strong connection with ceph-mgr-modules-core.
We can always install it separately for better integration with
Rook.
Sage Weil [Fri, 4 Jun 2021 17:49:40 +0000 (12:49 -0500)]
mgr/telemetry: pass leaderboard flag even w/o ident
Allow non-identified clusters to appear in the leaderboard.
The leaderboard option still defaults to false, so the change here
is that if they opt in to leaderboard but not ident we'll see
that on the backend.
Note that a leaderboard still does not exist (yet), so this doesn't
have any immediate impact. But if/when we do create one, it will
allow us to show big clusters (that opt in) on the leaderboard
as 'unidentified' or similar.
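A rough sketch of the behaviour described above, assuming a report is a plain dict and that identifying fields are only attached when the ident channel is enabled. The function `build_report` and its parameters are hypothetical and do not mirror the telemetry module's actual code.

```python
def build_report(channels, leaderboard, description=None):
    # The leaderboard flag is passed through regardless of the ident channel.
    report = {"leaderboard": leaderboard}
    # Identifying information is still only included when ident is enabled.
    if "ident" in channels:
        report["description"] = description
    return report


anonymous_opt_in = build_report(channels=["basic"], leaderboard=True)
default_report = build_report(channels=["basic"], leaderboard=False)
```

Here a cluster that opts in to the leaderboard without ident still reports the flag, but carries no identifying fields.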
liu shi [Fri, 14 May 2021 07:51:01 +0000 (03:51 -0400)]
cpu_profiler: fix asok command crash
Fixes: https://tracker.ceph.com/issues/50814
Signed-off-by: liu shi <liu.shi@navercorp.com>
(cherry picked from commit be7303aafe34ae470d2fd74440c3a8d51fcfa3ff)
Cory Snyder [Fri, 28 May 2021 19:08:49 +0000 (15:08 -0400)]
mgr/DaemonServer.cc: prevent integer underflow that is triggered by large increases to pg_num/pgp_num
This fixes a scenario where mgrs continually crash while attempting to apply large increases to pg_num/pgp_num. The max step size (estmax) for each incremental update to the pgp_num is calculated as a percentage of the pg_num, so when the increase is large, estmax can be greater than the current pgp_num. Subtracting estmax from the pgp_num to calculate the next step with std::clamp then causes an integer underflow. The underflow results in hi < lo in the arguments passed to std::clamp, which causes a failed assertion, a SIGABRT, and ultimately a crashed mgr.
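The underflow can be reproduced in a small model. This is an illustration, not the DaemonServer.cc code: Python integers do not wrap, so unsigned 32-bit arithmetic is simulated with a mask, and the guarded variant is only one plausible way to avoid the underflow. The function names and the example numbers are assumptions.

```python
UINT32_MASK = 0xFFFFFFFF  # simulate unsigned 32-bit wraparound


def clamp(value, lo, hi):
    # Mirrors the precondition of std::clamp: lo must not exceed hi.
    if lo > hi:
        raise ValueError(f"clamp precondition violated: lo={lo} > hi={hi}")
    return max(lo, min(value, hi))


def next_pgp_num_unguarded(pgp_num, target, estmax):
    lo = (pgp_num - estmax) & UINT32_MASK  # wraps when estmax > pgp_num
    hi = (pgp_num + estmax) & UINT32_MASK
    return clamp(target, lo, hi)


def next_pgp_num_guarded(pgp_num, target, estmax):
    # Avoid the subtraction when it would go below zero.
    lo = pgp_num - estmax if pgp_num > estmax else 0
    hi = pgp_num + estmax
    return clamp(target, lo, hi)


# Hypothetical numbers: a large increase where estmax exceeds the current pgp_num.
guarded_step = next_pgp_num_guarded(pgp_num=32, target=4096, estmax=204)
```

With these numbers the unguarded version computes a wrapped lower bound far above the upper bound and trips the precondition check, while the guarded version returns a step bounded by `pgp_num + estmax`.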
Jonas Jelten [Mon, 15 Mar 2021 22:21:07 +0000 (23:21 +0100)]
os/bluestore: strip trailing slash for directory listings
Calls to BlueRocksEnv::GetChildren may contain a trailing / in the
queried directory, which is stripped away with this patch.
If it's not stripped, the directory entry is not found in BlueFS:
```
10 bluefs readdir db/
20 bluefs readdir dir db/ not found
3 rocksdb: [db/db_impl/db_impl_open.cc:1785] Persisting Option File error: OK
```
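The effect of stripping the trailing slash can be sketched as follows. This is a toy lookup, not BlueRocksEnv code: `normalize_dir`, `readdir`, the directory map, and the file names in it are hypothetical.

```python
def normalize_dir(path):
    # Strip a single trailing slash, but leave a bare "/" untouched.
    if len(path) > 1 and path.endswith("/"):
        return path[:-1]
    return path


def readdir(dirs, path):
    # Directory names are stored without a trailing slash in this toy map.
    return dirs.get(normalize_dir(path))


toy_dirs = {"db": ["CURRENT", "MANIFEST-000001"]}
with_slash = readdir(toy_dirs, "db/")
without_slash = readdir(toy_dirs, "db")
```

Without the normalization step, a lookup of `"db/"` in this map would miss, which matches the "not found" message in the log above.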