Sebastian Wagner [Wed, 26 Aug 2020 12:53:09 +0000 (14:53 +0200)]
mgr/cephadm: Call cephadm with --container-init
The kernel treats any process with PID 1 differently. In particular,
it does not generate a core dump for it. Call podman / docker with
--init in order to get core dumps.
In addition, we can now properly reap zombie processes.
Fixes: https://tracker.ceph.com/issues/44231
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Wed, 26 Aug 2020 12:45:34 +0000 (14:45 +0200)]
cephadm: Add --container-init
The kernel treats any process with PID 1 differently. In particular,
it does not generate a core dump for it. Call podman / docker with
--init in order to get core dumps.
In addition, we can now properly reap zombie processes.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
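A minimal sketch of the idea, assuming a hypothetical helper `build_run_args()` and a hypothetical `use_init` parameter (neither is taken from cephadm itself). Only the `--init` option of `podman run` / `docker run` comes from the commit message:

```python
from typing import List


def build_run_args(engine: str, image: str, command: List[str],
                   use_init: bool = False) -> List[str]:
    """Build an argv for `podman run` / `docker run` (illustrative helper)."""
    if engine not in ("podman", "docker"):
        raise ValueError(f"unsupported container engine: {engine}")
    args = [engine, "run", "--rm"]
    if use_init:
        # With --init the engine starts a small init process as PID 1.
        # The daemon then runs as an ordinary child, so it can dump core,
        # and the init process reaps zombie processes.
        args.append("--init")
    return args + [image] + list(command)
```

Without `use_init=True` the daemon itself would be PID 1 inside the container, which is the situation the commit describes.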
Sebastian Wagner [Wed, 26 Aug 2020 10:56:26 +0000 (12:56 +0200)]
Merge pull request #36571 from pcuzner/cephadm-tox-update
cephadm: remove py2 from tox tests
Reviewed-by: Juan Miguel Olmo Martínez <jolmomar@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Tim Serong <tserong@suse.com>
Kefu Chai [Tue, 25 Aug 2020 09:30:00 +0000 (17:30 +0800)]
crimson/osd: drop misdirected ops
see also `PrimaryLogPG::do_op()`. we should ignore the ops that reach us
if we are not supposed to serve them. this happens when the client is
using a stale osdmap.
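A rough sketch of the check, not the crimson implementation: the `Op` type, the `primary_by_pg` mapping, and `handle_op()` are hypothetical names used only for illustration. An op is treated as misdirected when this OSD is not the primary for the target PG in its own view of the map:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Op:
    pgid: str          # placement group the client targeted
    client_epoch: int  # osdmap epoch the client used to pick the target


def is_misdirected(op: Op, primary_by_pg: Dict[str, int], whoami: int) -> bool:
    """True if this OSD is not the primary for op.pgid in its current map."""
    return primary_by_pg.get(op.pgid) != whoami


def handle_op(op: Op, primary_by_pg: Dict[str, int], whoami: int) -> Optional[str]:
    if is_misdirected(op, primary_by_pg, whoami):
        # Drop silently: a client with a stale osdmap is expected to
        # resend once it learns about the newer map.
        return None
    return f"served {op.pgid}"
```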
Kefu Chai [Tue, 25 Aug 2020 04:34:15 +0000 (12:34 +0800)]
cmake: silence "detached HEAD" warning
git complains when checking out a tag in "detached HEAD", like:
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them
...
but this advice does not help, as in general we don't hack on fio in
Ceph, so disable this warning. also clone the repo in shallow mode
for the same reason: we don't care about the whole history of the
fio repo, we just use it for testing.
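The change itself lives in cmake. As an illustration only, the equivalent git invocation can be sketched in Python. `shallow_clone_args()` is a hypothetical helper, and the URL, tag, and destination are placeholders supplied by the caller. `advice.detachedHead` and `--depth` are standard git settings:

```python
from typing import List


def shallow_clone_args(repo_url: str, tag: str, dest: str) -> List[str]:
    """Build a git argv for a quiet, shallow checkout of a single tag."""
    return [
        "git",
        "-c", "advice.detachedHead=false",  # suppress the detached HEAD advice
        "clone",
        "--depth", "1",                     # shallow: skip the full history
        "--branch", tag,                    # --branch also accepts a tag name
        repo_url,
        dest,
    ]
```

The resulting list can be passed to `subprocess.run()` as-is.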
Patrick Donnelly [Mon, 24 Aug 2020 20:18:13 +0000 (13:18 -0700)]
Merge PR #36684 into master
* refs/pull/36684/head:
qa/tasks/nfs: Test mounting of export created with nfs command
qa/tasks/nfs: Add helper method to check nfs cluster status
qa/tasks/nfs: Cleanup created filesystem
qa/tasks/nfs: Remove unused port status function and 'stdin' keyword argument
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Fri, 21 Aug 2020 04:31:16 +0000 (12:31 +0800)]
crimson/osd: check for DNE object and return ENOENT in read ops
* omap_get_keys()
this change addresses the failure of
test_rados.py:TestIoctx.test_get_omap_keys
* omap_get_vals_by_keys()
this change addresses the failure of
test_rados.py:TestIoctx.test_get_omap_vals_by_keys
* read()
this change addresses the failure of
test_rados.py:TestIoctx.test_write_ops
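The behaviour can be sketched with a toy in-memory store. `ToyStore` and its `(rval, payload)` return convention are assumptions made for this example and do not mirror the crimson code. Each read op first checks that the object exists and returns `-ENOENT` otherwise:

```python
import errno
from typing import Dict, Iterable, List, Optional, Tuple


class ToyStore:
    """In-memory stand-in for an object store (illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}

    def create(self, oid: str, data: bytes = b"",
               omap: Optional[Dict[str, bytes]] = None) -> None:
        self._objects[oid] = {"data": data, "omap": dict(omap or {})}

    def _lookup(self, oid: str) -> Tuple[int, Optional[dict]]:
        # A does-not-exist (DNE) object yields -ENOENT for every read op.
        obj = self._objects.get(oid)
        return (0, obj) if obj is not None else (-errno.ENOENT, None)

    def read(self, oid: str, offset: int, length: int) -> Tuple[int, bytes]:
        rval, obj = self._lookup(oid)
        if rval < 0:
            return rval, b""
        return 0, obj["data"][offset:offset + length]

    def omap_get_keys(self, oid: str) -> Tuple[int, List[str]]:
        rval, obj = self._lookup(oid)
        if rval < 0:
            return rval, []
        return 0, sorted(obj["omap"])

    def omap_get_vals_by_keys(self, oid: str,
                              keys: Iterable[str]) -> Tuple[int, Dict[str, bytes]]:
        rval, obj = self._lookup(oid)
        if rval < 0:
            return rval, {}
        return 0, {k: obj["omap"][k] for k in keys if k in obj["omap"]}
```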
Kefu Chai [Fri, 21 Aug 2020 05:08:42 +0000 (13:08 +0800)]
crimson/osd: update oi.size after truncating an object
* update oi.size if the object size changes after the object is truncated
* do not add a truncate op to the transaction if the size of the object
does not change because of the truncate op.
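Both points can be sketched together. `ObjectInfo`, `Transaction`, and `do_truncate()` are simplified, hypothetical stand-ins for the crimson types, used only to show the control flow:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ObjectInfo:
    size: int = 0


@dataclass
class Transaction:
    ops: List[Tuple[str, int]] = field(default_factory=list)


def do_truncate(oi: ObjectInfo, txn: Transaction, new_size: int) -> None:
    if new_size == oi.size:
        # Size is unchanged: queue nothing and leave oi.size alone.
        return
    txn.ops.append(("truncate", new_size))
    # Keep the object info in sync with the queued truncate.
    oi.size = new_size
```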
Kefu Chai [Thu, 20 Aug 2020 17:29:51 +0000 (01:29 +0800)]
crimson/osd: return rval which is negative
a less-than-zero rval indicates an error, and should not be normalized
to 0 if allows_returnvec() evaluates to false. probably we need a better
way to return a negative error code which does not fall into any known
error. but at this moment, grabbing the last rval and returning it if it
is less than zero can be used as a short-term solution.
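A sketch of the short-term rule described above. The function name `final_result()` and the list of per-op rvals are assumptions for illustration. Only the rule itself (a negative last rval is returned as-is; other results are normalized to 0 unless return vectors are allowed) follows the commit message:

```python
from typing import Sequence


def final_result(rvals: Sequence[int], allows_returnvec: bool) -> int:
    """Pick the result code to return for a sequence of per-op rvals."""
    last = rvals[-1] if rvals else 0
    if last < 0:
        # An error must survive normalization.
        return last
    # Non-negative results are only passed through when the client
    # allows return vectors; otherwise they are normalized to 0.
    return last if allows_returnvec else 0
```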
Kefu Chai [Sat, 22 Aug 2020 05:59:19 +0000 (13:59 +0800)]
qa/tasks/workunit: allow passing optional args to workunit
* add comment to _run_tests()
* use `os.path.commonpath()` instead of using string matching directly
for matching the given workunit spec with executables.
* allow passing optional args to workunit
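To illustrate why `os.path.commonpath()` is preferable to plain string matching, here is a small sketch. `matches_spec()` and the example paths are hypothetical, and the real `_run_tests()` logic may differ:

```python
import os.path


def matches_spec(spec: str, workunit: str) -> bool:
    """True if `workunit` is `spec` itself or lives below the `spec` directory.

    Both paths are expected to be relative (or both absolute);
    os.path.commonpath() raises ValueError when they are mixed.
    """
    spec = os.path.normpath(spec)
    workunit = os.path.normpath(workunit)
    # commonpath() compares whole path components, so "fs/mis" does not
    # match "fs/misc/..." the way a str.startswith() check would.
    return os.path.commonpath([spec, workunit]) == spec
```

For example, `matches_spec("fs/misc", "fs/misc/trivial_sync.sh")` is true, while `matches_spec("fs/mis", "fs/misc/trivial_sync.sh")` is false even though the second string starts with the first.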
Kefu Chai [Fri, 21 Aug 2020 12:22:23 +0000 (20:22 +0800)]
qa/tasks/ceph: create a log file before redirecting to it
it is the shell that interprets ">>" and redirects stderr to the given
file, but the shell process is launched by ubuntu:ubuntu without using
sudo, so the command fails with a "Permission denied" failure. to address
this issue, in this change, a file with proper privileges is created
beforehand using `install`, so the shell is able to write to it.
also, instead of creating this file in `maybe_redirect_stderr()`, it
returns the command to create the log file.
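A simplified sketch of this arrangement. The function name `redirect_stderr_sketch()`, its signature, and the exact `install` options are assumptions and not the code from qa/tasks/ceph. It returns the command that creates the log file together with the argument list that appends stderr to that file:

```python
import shlex
from typing import List, Optional, Tuple


def redirect_stderr_sketch(args: List[str], log_path: str, redirect: bool,
                           owner: str = "ubuntu",
                           group: str = "ubuntu") -> Tuple[Optional[str], List[str]]:
    """Return (create_log_cmd, new_args); create_log_cmd is None if unused."""
    if not redirect:
        return None, list(args)
    # Copying /dev/null with install(1) creates an empty file owned by
    # owner:group, so the unprivileged shell can append to it later.
    create_log_cmd = (
        f"sudo install -o {owner} -g {group} /dev/null {shlex.quote(log_path)}"
    )
    # "2>>" is interpreted by the shell that launches the daemon.
    return create_log_cmd, list(args) + ["2>>", log_path]
```

The caller is expected to run `create_log_cmd` first and only then start the daemon with the returned arguments.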
Patrick Donnelly [Sat, 22 Aug 2020 01:44:06 +0000 (18:44 -0700)]
Merge PR #36681 into master
* refs/pull/36681/head:
mds: don't track change of config 'mds_replay_unsafe_with_closed_session'
mds: fix 'forward loop' when forward_all_requests_to_auth is set
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 22 Aug 2020 01:40:42 +0000 (18:40 -0700)]
Merge PR #36131 into master
* refs/pull/36131/head:
doc: document cephfs mirroring dev work
test: add tests for `ceph fs mirror` family of commands
mds: track filesystem mirror peers in fsmap
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Thu, 20 Aug 2020 11:39:15 +0000 (19:39 +0800)]
crimson/osd: update size of object after writing to object
in the writesame op implemented in
6f7d1a435c1e80ee7ad6a9fca898d686255cc206, we failed to update the OI of
the object after appending to it. in this change `oi.size` is updated
accordingly.
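The size bookkeeping can be sketched as follows, assuming a hypothetical `size_after_write()` helper. The rule that a write extends the object only when it ends past the current end is a general assumption, not code taken from crimson:

```python
def size_after_write(old_size: int, offset: int, length: int) -> int:
    """Object size after writing `length` bytes at `offset`."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    # Appending or writing past the end grows the object; overwriting
    # inside the existing extent leaves the size unchanged.
    return max(old_size, offset + length)
```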
Patrick Donnelly [Fri, 21 Aug 2020 23:12:32 +0000 (16:12 -0700)]
Merge PR #36472 into master
* refs/pull/36472/head:
qa/workunits/fs: add test for subvolume
mds: don't move inode with nlink > 1 to global snaprealm if it's in subvolume
mds: disallow hardlink across subvolume
mds: disallow across subvolume rename
mds: disallow creating snapshot on descendent directory of subvolume
mds: add vxattr that marks/clears subvolume flag
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Rishabh Dave [Mon, 17 Aug 2020 09:58:12 +0000 (15:28 +0530)]
qa: add method run ceph cluster command with better interface
This new method should allow better control over the process launched by
the passed command. This is achieved by accepting the arguments supported
by teuthology.orchestra.run.run().
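A sketch of the pass-through pattern only. `run_cluster_cmd_sketch()` is a hypothetical name, and `run` is an injected callable standing in for teuthology.orchestra.run.run(), whose keyword arguments are treated as opaque here:

```python
from typing import Any, Callable, Sequence


def run_cluster_cmd_sketch(run: Callable[..., Any],
                           cluster_args: Sequence[str],
                           **kwargs: Any) -> Any:
    """Prefix the command with `ceph` and forward every other keyword."""
    if "args" in kwargs:
        raise TypeError("pass the command via cluster_args, not args=")
    # Everything the caller supplies is handed to the runner unchanged,
    # which is what gives the caller control over the launched process.
    return run(args=["ceph"] + list(cluster_args), **kwargs)
```

In a test, `run` can be replaced by a recording stub to check which keywords were forwarded.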
Jason Dillaman [Fri, 21 Aug 2020 14:37:41 +0000 (10:37 -0400)]
librbd: flush requests could race past initiation of write ops
Now that IO is being processed by multiple threads, it's possible
that a write operation that was issued prior to a flush would not
have been started prior to the processing of the flush.
Fixes: https://tracker.ceph.com/issues/47050
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
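One general way to avoid this kind of race is to register a write as in flight when it is issued rather than when a worker thread starts processing it, and to have each flush wait for the writes registered before it. The sketch below shows that idea with a hypothetical `FlushTracker` class. It is not the librbd fix, and it omits locking for brevity:

```python
from typing import FrozenSet, Set


class FlushTracker:
    """Tracks writes from issue time so later flushes cannot overtake them."""

    def __init__(self) -> None:
        self._next_tid = 0
        self._in_flight: Set[int] = set()

    def issue_write(self) -> int:
        # Registered immediately, before any worker thread picks it up.
        tid = self._next_tid
        self._next_tid += 1
        self._in_flight.add(tid)
        return tid

    def complete_write(self, tid: int) -> None:
        self._in_flight.discard(tid)

    def issue_flush(self) -> FrozenSet[int]:
        # Snapshot of every write issued before this flush.
        return frozenset(self._in_flight)

    def flush_ready(self, waiting_on: FrozenSet[int]) -> bool:
        # The flush may complete once none of those writes is in flight.
        return not (waiting_on & self._in_flight)
```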