myoungwon oh [Fri, 13 Aug 2021 08:15:54 +0000 (17:15 +0900)]
seastore: separate Journal interface from SegmentedJournal implementation
A subsequent PR will introduce a CircularBoundedJournal implementation
for fast nvme devices.
SegmentCleaner no longer needs a reference to Journal, so dispense with
the set_segment_provider machinery and simply pass it in the
constructor.
Move responsibility for finding the journal segments into the journal
itself. This does mean that we check the segment headers on the journal
device twice, but that should be a negligible amount of overhead on mount.
SegmentCleaner::init_segments no longer needs to return Journal
segments, so merge with mount().
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Kotresh HR [Thu, 10 Feb 2022 05:34:41 +0000 (11:04 +0530)]
mgr/volumes: Fix clone uid/gid mismatch
This is a regression caused by commit 18b85c53a.
The 'set_attrs' function sets the uid/gid of the
subvolume to those of the group if no uid/gid is passed.
The attrs of the clone should match the source
snapshot. Hence, don't use the 'set_attrs'
function to set only the quota attrs for the
clone.
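The mismatch described above can be sketched roughly as follows. This is a toy model, not the mgr/volumes code: only the name 'set_attrs' comes from the message, while the dict-based attrs, GROUP_ATTRS, apply_clone_attrs and all values are hypothetical.

```python
# Hypothetical model of the behaviour described above; not the real mgr/volumes code.
GROUP_ATTRS = {"uid": 1000, "gid": 1000}  # assumed attrs of the subvolume group


def set_attrs(target, uid=None, gid=None, quota=None):
    """Model of the fallback: a missing uid/gid is taken from the group."""
    target["uid"] = GROUP_ATTRS["uid"] if uid is None else uid
    target["gid"] = GROUP_ATTRS["gid"] if gid is None else gid
    if quota is not None:
        target["quota"] = quota
    return target


def apply_clone_attrs(clone, source_snapshot):
    """Copy ownership and quota straight from the source snapshot."""
    clone["uid"] = source_snapshot["uid"]
    clone["gid"] = source_snapshot["gid"]
    if "quota" in source_snapshot:
        clone["quota"] = source_snapshot["quota"]
    return clone


snapshot = {"uid": 0, "gid": 0, "quota": 1 << 30}
# Passing only the quota lets uid/gid silently fall back to the group's values.
buggy = set_attrs({}, quota=snapshot["quota"])
# Copying from the snapshot keeps the clone consistent with its source.
fixed = apply_clone_attrs({}, snapshot)
```

In this model the "buggy" clone ends up owned by the group's uid/gid, while the "fixed" clone matches the snapshot.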
Adam King [Thu, 10 Feb 2022 01:42:42 +0000 (20:42 -0500)]
qa/tasks/cephadm_cases: increase timeouts in test_cli.py
These tests seem to fail sometimes, but in my testing
the expected events often happen just a few seconds after
we hit the timeout. Trying to see if this makes the tests
more consistent. There is no need to mark a test as failed
because something was reported after 34 seconds rather than 25,
especially when cephadm works on a cyclic daemon refresh.
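As an illustration of why the timeout matters, here is a minimal polling sketch driven by a fake clock. It is not the test_cli.py code: wait_for, FakeClock and the 40 second value are assumptions, and only the 25 and 34 second figures come from the message above.

```python
def wait_for(condition, timeout, interval, now, sleep):
    """Poll condition() until it is true or `timeout` seconds have elapsed."""
    deadline = now() + timeout
    while True:
        if condition():
            return True
        if now() >= deadline:
            return False
        sleep(interval)


class FakeClock:
    """Deterministic clock so the sketch runs instantly."""

    def __init__(self):
        self.t = 0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


def reported_at(clock, when):
    """Condition that becomes true once the fake clock reaches `when`."""
    return lambda: clock.now() >= when


clock = FakeClock()
short = wait_for(reported_at(clock, 34), timeout=25, interval=1,
                 now=clock.now, sleep=clock.sleep)
clock = FakeClock()
longer = wait_for(reported_at(clock, 34), timeout=40, interval=1,
                  now=clock.now, sleep=clock.sleep)
```

With the event arriving at 34 seconds, the 25 second wait gives up while the longer (assumed) timeout succeeds.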
* refs/pull/42000/head:
qa: update rhel kclient to setup container tools
qa: stop overriding distro for k-testing
qa: only use RHEL for workload testing
qa: convert fs:workload to use cephadm
qa: split fs begin task
qa/tasks/cephadm: setup CephManager when OSDs are provisioned
qa/tasks/cephadm: setup file system if MDS are provisioned
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
mds: add an option to decide whether to prefetch the entire dir or not.
Accessing a single dentry can be sped up by setting this option to
false when the dir is not in memory.
Signed-off-by: "Shen, Hang" <shenhang@kuaishou.com>
Nizamudeen A [Tue, 8 Feb 2022 06:20:29 +0000 (11:50 +0530)]
mgr/dashboard: set appropriate baseline branch for applitools
All the dashboard PRs are checked against a baseline branch called
'default' in the visual regression testing. This will cause issues when
testing PRs in different branches. For example, our master and
pacific branches currently have to save two different screenshots,
since the two of them differ slightly.
Also disable the applitools logs because they are too 'noisy'.
Fixes: https://tracker.ceph.com/issues/54190
Signed-off-by: Nizamudeen A <nia@redhat.com>
Soumya Koduri [Thu, 3 Feb 2022 18:55:22 +0000 (00:25 +0530)]
rgw/dbstore: Handle read vs delete races
Now that tail objects are associated with an objectID, they are not deleted
as part of this DeleteObj operation. Such tail objects (with no head object
in *.object.table) are cleaned up later by the GC thread.
To avoid races between writes/reads & GC delete, an mtime is maintained for each
tail object. This mtime is updated when the tail object is written and also when
its corresponding head object is deleted (as in this case).
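A rough sketch of how an mtime could gate GC deletion is shown below. The message only says that an mtime is kept per tail object; the grace period (min_wait), the function name gc_candidates and the data layout are assumptions made for illustration, not the dbstore implementation.

```python
def gc_candidates(tail_mtimes, head_ids, now, min_wait):
    """Return IDs of tail objects that have no head object and whose mtime
    is at least `min_wait` seconds old (assumed grace period)."""
    return [
        obj_id
        for obj_id, mtime in tail_mtimes.items()
        if obj_id not in head_ids and now - mtime >= min_wait
    ]


# obj-a: head deleted long ago; obj-b: written moments ago, head not yet
# committed; obj-c: still has a head object.
tail_mtimes = {"obj-a": 100, "obj-b": 190, "obj-c": 50}
head_ids = {"obj-c"}
victims = gc_candidates(tail_mtimes, head_ids, now=200, min_wait=60)
```

Under these assumptions only obj-a is collected: obj-b is protected by its recent mtime and obj-c by its head object.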
Vicente Cheng [Fri, 10 Dec 2021 06:49:55 +0000 (06:49 +0000)]
msg/async: fix outgoing_bl overflow and reset state_offset
- reset state_offset when the read is done.
- check outgoing_bl before we try to write a message.
In some environments, the network may block temporarily and return EAGAIN.
For the async msgr, we would call back the write event directly, but that still
increases outgoing_bl.
Consider the case where the sender is congested or the network driver
has a problem: data keeps being appended to outgoing_bl, but outgoing_bl
is not consumed immediately.
The size of outgoing_bl then grows over time until it overflows.
An overflowed outgoing_bl would cause problems, so we need to wait
for outgoing_bl to drain before appending another message.
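The idea of checking the backlog before appending can be sketched as below. This is a toy Python model, not the msg/async code: the Connection class, max_outgoing limit and network_blocked flag are hypothetical, and a flush that writes 0 bytes stands in for a send that hit EAGAIN.

```python
class Connection:
    """Toy sender with a bounded outgoing buffer (assumed limit)."""

    def __init__(self, max_outgoing):
        self.outgoing = bytearray()   # stands in for outgoing_bl
        self.max_outgoing = max_outgoing
        self.network_blocked = False
        self.sent = bytearray()       # what actually reached the "wire"

    def _try_flush(self):
        """Return bytes written; 0 models a send that returned EAGAIN."""
        if self.network_blocked:
            return 0
        written = len(self.outgoing)
        self.sent += self.outgoing
        self.outgoing.clear()
        return written

    def _has_room(self, msg):
        return len(self.outgoing) + len(msg) <= self.max_outgoing

    def write_message(self, msg):
        """Append msg only if the backlog has room; False means 'wait'."""
        if not self._has_room(msg):
            self._try_flush()
            if not self._has_room(msg):
                return False
        self.outgoing += msg
        self._try_flush()
        return True


conn = Connection(max_outgoing=8)
conn.network_blocked = True
results = [conn.write_message(m) for m in (b"abcd", b"efgh", b"i")]
backlog_while_blocked = len(conn.outgoing)
conn.network_blocked = False
retry = conn.write_message(b"i")
```

While the network is blocked the third message is refused instead of growing the buffer without bound; once the block clears, the retry succeeds and the backlog drains.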
Ronen Friedman [Sun, 30 Jan 2022 15:23:39 +0000 (15:23 +0000)]
osd/scrub: fix unintended changes to scrub (cluster)logs
OSD logs and cluster logs are monitored by some scrub tests:
some specific strings are required to either appear or not appear in
the logs. The Scrubber backend PR unintentionally modified some
of these log messages; here we restore their exact text.
Ilya Dryomov [Tue, 8 Feb 2022 09:11:49 +0000 (10:11 +0100)]
rbd: mark optional positional arguments as such in help output
Currently at least five commands have optional positional arguments.
Overloading po::value<std::string>()->default_value("") for this
is a bit sneaky but nothing better fits into the existing Shell.cc
framework.
Note that, strictly speaking, "[<interval>] [<start-time>]" should be
"[<interval> [<start-time>]]", but we aren't doing that here because
the "ceph" command doesn't do it either.
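The difference between the two notations can be illustrated with a small formatter. This is a Python sketch for illustration only and has no relation to Shell.cc; the function names are hypothetical.

```python
def flat_usage(names):
    """Mark each optional positional independently: [<a>] [<b>]."""
    return " ".join(f"[<{name}>]" for name in names)


def nested_usage(names):
    """Nest the optionals so a later one implies the earlier: [<a> [<b>]]."""
    usage = ""
    for name in reversed(names):
        usage = f"[<{name}> {usage}]" if usage else f"[<{name}>]"
    return usage


optional_args = ["interval", "start-time"]
flat = flat_usage(optional_args)
nested = nested_usage(optional_args)
```

The flat form is what the help output uses; the nested form makes explicit that a start time can only be given after an interval.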
Rishabh Dave [Mon, 7 Feb 2022 18:44:42 +0000 (00:14 +0530)]
monitoring: mention PyYAML only once in requirements
The following error occurs while running "sudo install-deps.sh" -
ERROR: Double requirement given: PyYAML==6.0 (from -r requirements-lint.txt (line 5)) (already in pyyaml (from -r requirements-alerts.txt (line 1)), name='PyYAML')
PyYAML is mentioned twice as a requirement, once in each of
the following files -
monitoring/ceph-mixin/requirements-lint.txt
monitoring/ceph-mixin/requirements-alerts.txt
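A duplicate like this can be detected up front with a small check. The sketch below is not part of install-deps.sh; the helper names are hypothetical, and the file contents are made up except for the PyYAML lines and their line numbers, which follow the error message above.

```python
import re
from collections import defaultdict


def requirement_name(line):
    """Return the normalized project name of a requirement line, or None."""
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):   # skip blanks, comments, options
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    if match is None:
        return None
    # Names compare case-insensitively, with runs of '-', '_', '.' equivalent.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def find_duplicates(files):
    """Map each name listed more than once to its (file, line) locations."""
    seen = defaultdict(list)
    for filename, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            name = requirement_name(line)
            if name is not None:
                seen[name].append((filename, lineno))
    return {name: locs for name, locs in seen.items() if len(locs) > 1}


# Placeholder packages pkg-a..pkg-d are hypothetical filler.
files = {
    "requirements-lint.txt": "pkg-a\npkg-b\npkg-c\npkg-d\nPyYAML==6.0\n",
    "requirements-alerts.txt": "pyyaml\n",
}
duplicates = find_duplicates(files)
```

Because the names are normalized before comparison, "PyYAML==6.0" and "pyyaml" are reported as the same requirement.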
Nizamudeen A [Mon, 7 Feb 2022 10:53:29 +0000 (16:23 +0530)]
cephadm: change shared_folder directory for prometheus and grafana
After https://github.com/ceph/ceph/pull/44059 the monitoring/prometheus
and monitoring/grafana/dashboards directories were moved to
monitoring/ceph-mixins. That broke the shared_folders in the cephadm
bootstrap script.
Change all instances of monitoring/prometheus and
monitoring/grafana/dashboards to monitoring/ceph-mixins.
Also rename all instances of prometheus_alerts.yaml to
prometheus_alerts.yml.
Fixes: https://tracker.ceph.com/issues/54176
Signed-off-by: Nizamudeen A <nia@redhat.com>
Soumya Koduri [Thu, 13 Jan 2022 20:44:19 +0000 (02:14 +0530)]
rgw/dbstore: Use Object ID to handle racing writes
Create a unique ID for each object upload, which will be
atomically updated in the head object at the end. This will
prevent data corruption during concurrent writes.
In case of Multipart Uploads, upload_id is used as the ObjectID.
XXX: The stale or obsolete tail data needs to be deleted.
Also addressed invalid usage of CephContext in dbstore tests.
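The scheme can be sketched with a toy in-memory store. This is an assumption-laden illustration, not the dbstore code: the Store class, its method names and the use of uuid for ID generation are hypothetical, and a plain dict assignment stands in for the atomic head update.

```python
import uuid


class Store:
    """Toy store: tail data keyed by (name, object_id, part), one head per name."""

    def __init__(self):
        self.tails = {}
        self.heads = {}

    def new_object_id(self, upload_id=None):
        # Multipart uploads reuse upload_id; otherwise generate a fresh ID.
        return upload_id or uuid.uuid4().hex

    def write_tail(self, name, object_id, part, data):
        self.tails[(name, object_id, part)] = data

    def complete(self, name, object_id):
        # A single assignment stands in for the atomic head update.
        self.heads[name] = object_id

    def read(self, name):
        object_id = self.heads[name]
        keys = sorted(k for k in self.tails if k[0] == name and k[1] == object_id)
        return b"".join(self.tails[k] for k in keys)


store = Store()
first = store.new_object_id()
second = store.new_object_id()
# Two writers interleave their tail writes for the same object name.
store.write_tail("k", first, 0, b"AAAA")
store.write_tail("k", second, 0, b"BB")
store.write_tail("k", first, 1, b"aaaa")
store.write_tail("k", second, 1, b"bb")
store.complete("k", first)
store.complete("k", second)
data = store.read("k")
stale = [k for k in store.tails if k[1] != store.heads["k"]]
```

The reader sees only the parts written under the ID recorded in the head, so the interleaved writes do not mix; the first writer's tail data is left behind as the stale data mentioned in the XXX note.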
Vicente Cheng [Thu, 16 Dec 2021 03:20:05 +0000 (03:20 +0000)]
test/msgr: add unittest to simulate network block temporarily
Add a new test case that simulates a temporary network block.
That case would make outgoing_bl overflow, so add an assert
check to claim_append.
Use just 2 connections, because we could not generate a
large enough data set to verify it.
Simulate the EAGAIN situation by skipping the call to
cs.send(), because on EAGAIN the send would return size 0 and keep the
outgoing_bl.