- Remove npm-force-resolutions: no resolutions are needed anymore, and every run was modifying package-lock.json (stripping its last empty line).
- Add .npmrc: save exact versions by default; do not run an audit report when installing (see the example below).
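A minimal .npmrc achieving this could look as follows (the exact file contents here are an assumption; `save-exact` and `audit` are standard npm options):
```
# always pin exact dependency versions instead of ^ranges
save-exact=true
# do not print an audit report on `npm install`
audit=false
```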
Fixes: https://tracker.ceph.com/issues/48005
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit f08c0db689dc6bd29323ac03a91c69e2fe7365a2)
Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json
- Accept version from master branch.
src/pybind/mgr/dashboard/frontend/package.json
- Accept version from master branch.
Jianpeng Ma [Wed, 8 Sep 2021 01:51:19 +0000 (09:51 +0800)]
librbd: read requests need the exclusive lock when pwl-cache is enabled.
TestLibRBD.TestFUA describes the following workload:
a) write/read the same image w/ pwl-cache:
write_image = open(image_name);
read_image = open(image_name);
b) the I/O workload is:
write(write_image)
The write needs the exclusive lock, so it acquires it.
read(read_image)
In ExclusiveLock<I>::init(), the first read also needs the exclusive
lock and requests it. write_image releases the lock (flushing its data
to the OSDs and removing its cache), read_image initializes its
pwl-cache, and the read first checks the pwl-cache, misses, and then
reads from the OSDs.
write(write_image)
The write needs the exclusive lock and requests it. This makes
read_image drop its (empty) cache; write_image initializes its cache
pool and writes the data into the cache.
read(read_image)
send_set_require_lock() only marks writes as requiring the exclusive
lock, so this read does not request the lock and reads stale data
straight from the OSDs. Because the second read never requests the
lock, write_image never releases it, which is what would flush the
dirty data to the OSDs and shut down its pwl-cache. So the second read
does not see the latest data.
Therefore reads must also require the exclusive lock when pwl-cache is
enabled.
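As an illustration only (a simplified Python model of the dispatch-time check, not the actual librbd C++ code), the essence of the fix is that the lock-requirement test must treat reads like writes whenever the pwl-cache is enabled:
```
def io_requires_exclusive_lock(op: str, pwl_cache_enabled: bool) -> bool:
    """Simplified model of 'does this I/O need the exclusive lock?'."""
    if op == "write":
        return True  # writes always need the exclusive lock
    # Before the fix, reads never required the lock, so a reader could
    # fetch stale data from the OSDs while another client still held
    # dirty data in its persistent write-log cache.
    return pwl_cache_enabled  # after the fix: reads need it too
```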
Fixes: https://tracker.ceph.com/issues/51438
Tested-by: Feng Hualong <hualong.feng@intel.com>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit 621facb6e66ce92ca36d566c78bc065a9666639e)
Jianpeng Ma [Mon, 1 Nov 2021 00:33:23 +0000 (08:33 +0800)]
librbd: send FLUSH_SOURCE_INTERNAL when doing copy/deep_copy.
copy/deep_copy use the object map to judge whether an object exists.
With the librbd pwl cache enabled, a regular flush does not push data
down to the OSDs, and it is reaching the OSDs that updates the
object-map state. So we should send the flush with
FLUSH_SOURCE_INTERNAL to force the data out to the OSDs.
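A minimal illustration of the intent (Python-flavored pseudocode with invented helper names; only FLUSH_SOURCE_INTERNAL itself comes from librbd):
```
FLUSH_SOURCE_USER = 0      # may be satisfied by the pwl cache alone
FLUSH_SOURCE_INTERNAL = 1  # must push dirty data down to the OSDs

def prepare_for_copy(image):
    # copy/deep_copy trust the object map, and the object map is only
    # updated when writes actually reach the OSDs. A user-source flush
    # can stop at the pwl cache, so force an internal-source flush
    # before consulting the object map.
    image.flush(source=FLUSH_SOURCE_INTERNAL)  # hypothetical wrapper
```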
Fixes: https://tracker.ceph.com/issues/53057
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit a2ae83f8aab18933eae77cf3034b740082a39e4f)
Jianpeng Ma [Mon, 29 Nov 2021 07:16:21 +0000 (15:16 +0800)]
librbd/cache/pwl: use BlockGuard to order overlapping ops when flushing to the OSDs.
During testing we hit some data-inconsistency problems. The test case
mainly uses a write followed by a discard to detect inconsistent data.
W/o pwl, write/discard are synchronous ops: after the write returns,
the data is already on the OSDs. But w/ pwl, we use the asynchronous
API to send ops to the OSDs.
Although we ensure the send order, send order does not guarantee
completion order. That is, pwl preserves the order in which
write/discard are issued, but it does not keep the semantics of the
synchronous API: pwl turns synchronous ops into asynchronous ones. For
normal ops this is not a problem, but for consecutive commands with
overlapping extents it can leave the data inconsistent.
So we use BlockGuard to solve this issue (see the sketch below).
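For illustration, a deliberately simplified Python model of the idea (librbd's real BlockGuard is C++ and differs in detail): ops whose extents overlap are serialized, while disjoint ops proceed freely.
```
import threading

class SimpleBlockGuard:
    """Toy extent-overlap guard, not librbd's BlockGuard."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = []  # one (offset, end, done_event) per op

    def acquire(self, offset, length):
        """Block until no in-flight op overlaps, then register this op."""
        end = offset + length
        while True:
            with self._lock:
                blockers = [ev for (o, e, ev) in self._in_flight
                            if o < end and offset < e]
                if not blockers:
                    handle = (offset, end, threading.Event())
                    self._in_flight.append(handle)
                    return handle
            blockers[0].wait()  # wait for one overlapping op, re-check

    def release(self, handle):
        """Mark the op complete and wake anything queued behind it."""
        with self._lock:
            self._in_flight.remove(handle)
        handle[2].set()
```
A write or discard would acquire() its extent before being issued to the OSDs and release() it from its completion callback, so two overlapping ops can no longer complete out of order.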
Fixes: https://tracker.ceph.com/issues/49876
Fixes: https://tracker.ceph.com/issues/53108
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit 8e8f3ef516e98da011f3086f8e78a2fa261293ed)
The backport of the lvm migrate feature to pacific was merged after the
get_first_*() refactor backport, so we still have some old references
to `get_single_lv()`.
Yaarit Hatuka [Wed, 25 Aug 2021 02:12:08 +0000 (02:12 +0000)]
rpm, debian: move smartmontools and nvme-cli to ceph-base
We wish to be able to scrape SMART and NVMe metrics from OSD and MON
nodes. For this we require / recommend smartmontools and nvme-cli
dependencies for both the ceph-osd and ceph-mon packages. However, the
sudoers file (which is required for invoking `smartctl` by user 'ceph')
was installed only in the ceph-osd package. Since different packages
cannot own the same file, and because we want to be able to scrape from
every daemon, we move the dependencies and the sudoers installation to
ceph-base. For generalization, we rename:
sudoers.d/ceph-osd-smartctl -> sudoers.d/ceph-smartctl
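The renamed file keeps the same shape as before; a representative entry (an assumption about the exact contents, not a verbatim copy) would be:
```
## allow the ceph daemon user to gather device health metrics
ceph ALL=NOPASSWD: /usr/sbin/smartctl -x --json=o /dev/*
ceph ALL=NOPASSWD: /usr/sbin/nvme * smart-log-add --json /dev/*
```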
Neha Ojha [Mon, 9 Aug 2021 14:35:01 +0000 (14:35 +0000)]
qa/suites/rados/perf/ceph.yaml: remove rgw
This is no longer needed because we removed the cosbench workloads in
fd350fd0150a2d4072f055658c20314a435a19ba. Removing it is also required
to prevent failures like the following, or any other changes that break
the rgw task:
```
2021-08-06T20:13:25.812 INFO:teuthology.orchestra.run.smithi060.stderr:curl: (7) Failed to connect to smithi060.front.sepia.ceph.com port 80: Connection refused
2021-08-06T20:15:33.813 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_git_teuthology_04c2febe7099917d97a71271f17abb5710030132/teuthology/contextutil.py", line 31, in nested
vars.append(enter())
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/rgw.py", line 191, in start_rgw
wait_for_radosgw(url, remote)
File "/home/teuthworker/src/github.com_ceph_ceph-c_3c0f8c8164075af7aac4d1f2805d3f4580709461/qa/tasks/util/rgw.py", line 94, in wait_for_radosgw
assert exit_status == 0
AssertionError
```
Alfonso Martínez [Wed, 24 Nov 2021 14:36:50 +0000 (15:36 +0100)]
mgr/dashboard: upgrade Cypress to the latest stable version
- Remove unneeded dependency that was causing UI performance issues: zone.js
- Ignore 'ResizeObserver loop limit exceeded' error.
- run-frontend-e2e-tests.sh refactoring: create the rgw dashboard user
through 'ceph dashboard set-rgw-credentials' and use it in the rgw
buckets tests.
Fixes: https://tracker.ceph.com/issues/53357
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 3e4e29590aa1742fc3b44d21389325a13cca8199)
Conflicts:
src/pybind/mgr/dashboard/frontend/package-lock.json
- Regenerate the file to align with pacific.
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
Nizamudeen A [Thu, 18 Nov 2021 07:13:39 +0000 (12:43 +0530)]
mgr/dashboard: fix flaky inventory e2e test
When the line
`inventory.getTableCount('total').should('be.eq', totalDiskCount);`
is executed, the table has not loaded properly yet, so getTableCount
returns 0 on the first try; on a second try it passes, since by then
the table is loaded. But in the orch e2es the retries are set to 0, and
I am not sure it makes sense to set them to 1. Instead, I am adapting
the test a bit to expect the count to become equal to totalDiskCount,
so that the test waits a bit.
Avan Thakkar [Tue, 9 Nov 2021 21:37:33 +0000 (03:07 +0530)]
mgr/dashboard: provisioned values are misleading in the RBD image table
Add a hint to the image table similar to the one in rbd-details.
Fixes: https://tracker.ceph.com/issues/46617
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Adam Kupczyk [Sat, 13 Nov 2021 10:28:18 +0000 (11:28 +0100)]
os/bluestore: Fix omap upgrade to per-pg scheme
This is a fix for a regression introduced by the omap upgrade fix in
https://github.com/ceph/ceph/pull/43687
The problem was that we always skipped the first omap entry. This
worked fine for objects that have an omap header key, but for objects
without a header key we skipped the first actual omap key.
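In outline (a toy Python model; BlueStore itself is C++), the off-by-one looked like this:
```
def upgraded_omap_entries(entries, has_header):
    """Yield the omap records to migrate to the per-pg scheme.

    The buggy code behaved as if has_header were always true, i.e. it
    unconditionally skipped entries[0], so objects without a header
    key lost their first real omap key.
    """
    start = 1 if has_header else 0  # fix: skip only an actual header
    yield from entries[start:]
```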
Fixes: https://tracker.ceph.com/issues/53307
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 65a3f374aa1c57c5bb9401e57dab98a643b4360a)
The calls to remove a bucket had parameters to specify a prefix and
delimiter, which does not make sense. This was precipitated by some
existing Swift protocol logic, but buckets are removed irrespective of
prefix and delimiter. So the functions and calls are adjusted to
remove those parameters. Additionally, the same parameters were
removed for aborting incomplete multipart uploads.
Additionally, a bug is fixed in which, during bucket removal, multipart
uploads were only removed if the prefix was non-empty.
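A toy Python rendering of the multipart part of the change (hedged: rgw is C++ and every name below is invented for illustration):
```
def remove_bucket(bucket):
    """Remove a bucket, aborting its incomplete multipart uploads."""
    # The old code gated the abort on a non-empty prefix, roughly
    # "if prefix: abort_multiparts(...)", so plain bucket removals
    # (empty prefix) leaked incomplete uploads. Prefix and delimiter
    # play no role here: removal is unconditional.
    for upload in bucket.list_multipart_uploads():
        upload.abort()
    bucket.delete()
```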
Conflicts:
src/rgw/rgw_sal_rados.cc
src/rgw/rgw_sal.h
src/rgw/rgw_sal_rados.h
- Alterations due to Zipper 7 code refactoring
src/rgw/rgw_sal_dbstore.cc
src/rgw/rgw_sal_dbstore.h
- Did not exist before Zipper 7 code refactoring
ceph-volume: fix bug with miscalculation of required db/wal slot size for VGs with multiple PVs
The previous logic for calculating db/wal slot sizes assumed that only
a single PV would back each db/wal VG. This wasn't the case for OSDs
deployed prior to v15.2.8, since ceph-volume used to deploy multiple
SSDs in the same VG. This fix removes the assumption and does the
correct calculation in either case (see the sketch below).
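A hedged sketch of the corrected arithmetic (Python with invented names, not ceph-volume's actual implementation):
```
def db_slot_size_bytes(pv_sizes, num_slots):
    """Size one db/wal slot from the VG as a whole.

    The old logic assumed a single PV backed the VG, so a VG spanning
    several SSDs was sized from just one device. Summing every PV
    gives the right answer in both cases.
    """
    vg_size = sum(pv_sizes)      # total capacity across all PVs
    return vg_size // num_slots  # each slot's share of the VG

# e.g. a VG over two 1 TiB SSDs shared by 10 OSDs:
# db_slot_size_bytes([2**40, 2**40], 10) -> about 204.8 GiB per slot
```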
Sage Weil [Fri, 5 Nov 2021 15:39:07 +0000 (11:39 -0400)]
mgr/cephadm: allow osd spec removal
OSD specs/drivegroups are essentially templates for OSD creation but do
not map to the full lifecycle of the OSDs that they create. When a spec
is removed, remove it immediately.
If no --force is provided, the error lists which OSDs will be left behind.
If --force is passed, the service is removed.
This leaves behind a few oddities:
- When you list services, OSDs that were created by the drivegroup may
still exist, causing the drivegroup to appear in the list as
unmanaged services.
- If you create a new drivegroup with the same name, the prior OSDs will
appear to belong to the new spec instance, regardless of whether the
spec/drivegroup parameters are the same.
AndrewSharapov [Fri, 29 Oct 2021 15:10:20 +0000 (18:10 +0300)]
mgr/cephadm: fixed the ever-growing list of ip addresses for the public network interface.
Every call of find_ip_on_host() actually duplicates the list of public
ip addresses in self.networks, while it should NOT change it. As a
result, the value of the key mgr/cephadm/host.<hostname> in the kv
store becomes very large and may cause a crash of the ceph mgr.
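In outline (a hedged Python sketch, not the actual cephadm code), the buggy helper mutated the cache it was reading; the fix is a read-only lookup:
```
def find_ip_on_host(networks, hostname, subnet):
    """Return the public addresses cached for hostname/subnet.

    The buggy version appended into the cached list on every call, so
    the value persisted under mgr/cephadm/host.<hostname> grew without
    bound. A lookup must never mutate the cache it reads.
    """
    ips = networks.get(hostname, {}).get(subnet, [])
    return list(ips)  # hand back a copy; the cache stays untouched
```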
Fix tox test failure: AttributeError: 'HostCache' object has no
attribute 'update_host_networks', which was introduced in
78983ad0d0cce422da32dc4876ac186f6d32c3f5 (not yet in pacific).