git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
Radoslaw Zarzynski [Sat, 17 Apr 2021 17:14:06 +0000 (17:14 +0000)]
crimson/mgr: don't report if there is no connection available.
During a teuthology run [1] the following crash happened:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-04-08_10:14:11-rados-master-distro-basic-smithi/6028696$ less remote/smithi052/log/ceph-osd.3.log.gz
...
DEBUG 2021-04-08 10:32:58,548 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62168 >> mon.0 v2:172.21.15.52:3300/0] <== #3 === mgrmap(e 4) v1 (1796)
INFO 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] closing: reset no, replace no
DEBUG 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] TRIGGER CLOSING, was READY
INFO 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] execute_ready(): protocol aborted at CLOSING -- std::system_error (error crimson::net:4, read eof)
DEBUG 2021-04-08 10:32:58,549 [shard 0] ms - [osd.3(client) v2:172.21.15.52:6813/30889@62056 >> mgr.4100 v2:172.21.15.52:6800/30259] closed!
Segmentation fault on shard 0.
Backtrace:
0x000000000151765c
0x00000000014d9600
0x00000000014d9902
0x00000000014d9972
/lib64/libpthread.so.0+0x0000000000012b1f
0x0000000000e59cba
0x00000000014dc8a6
0x00000000014cdd1c
0x0000000001503053
0x000000000149fab7
0x00000000006e0ef5
/lib64/libc.so.6+0x00000000000237b2
0x000000000072a23d
daemon-helper: command crashed with signal 11
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-04-08_10:14:11-rados-master-distro-basic-smithi/6028696/
GDB shows that `conn` was null during the execution of `ceph::mgr:report()`:
```
(gdb) frame 7
154 in /usr/src/debug/ceph-17.0.0-2935.g4153f8c2.el8.x86_64/src/crimson/mgr/client.cc
(gdb) print conn
$1 = {_b = 0x0, _p = 0x0}
```
Taken together with the `mgr.4100 v2:172.21.15.52:6800/30259] closed!`
debug message, this suggests that a call to `report()` occurred (likely
from the timer) while we were in the middle of the non-atomic reconnect
sequence:
```cpp
seastar::future<> Client::reconnect()
{
  if (conn) {
    conn->mark_down();
    conn = {};
  }
  // ...
  return seastar::sleep(a_while).then([this] {
    // ...
    conn = msgr.connect(peer, CEPH_ENTITY_TYPE_MGR);
  });
}
```
This commit alters `mgr::report()` to skip reporting if the `conn`
is unavailable.
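The race and the guard can be modeled in a few lines. This is a toy
Python/asyncio sketch, not the crimson code: the `Client` class, its `sent`
list and the `demo()` driver are hypothetical names made up for illustration,
and the real change is in C++ on top of Seastar futures.

```python
import asyncio


class Client:
    """Toy stand-in for the mgr client (hypothetical, for illustration)."""

    def __init__(self):
        self.conn = "old-connection"
        self.sent = []

    async def reconnect(self, delay=0.01):
        # Drop the old connection first, like `conn = {}` in the snippet.
        self.conn = None
        # Window in which `conn` is unset while we wait to reconnect.
        await asyncio.sleep(delay)
        self.conn = "new-connection"

    def report(self, payload):
        # The guard: skip the report when no connection is available.
        if self.conn is None:
            return False
        self.sent.append(payload)
        return True


async def demo():
    client = Client()
    task = asyncio.create_task(client.reconnect())
    await asyncio.sleep(0)              # let reconnect() reach its sleep
    during = client.report("stats-1")   # timer fires mid-reconnect
    await task
    after = client.report("stats-2")    # connection is back
    return during, after, client.sent
```

Without the `None` check in `report()`, the first call would operate on a
missing connection, which is the analogue of the null `conn` seen in GDB.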
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Wed, 14 Apr 2021 07:06:50 +0000 (15:06 +0800)]
Merge pull request #40149 from tchaikov/wip-cmake-job-pool
cmake: use ninja job pool
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Wed, 14 Apr 2021 04:42:31 +0000 (12:42 +0800)]
Merge pull request #40785 from tchaikov/wip-doc-list-spacing
doc/_themes: remove spacing after `ul li p`
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Kefu Chai [Wed, 14 Apr 2021 01:33:35 +0000 (09:33 +0800)]
doc/radosgw: fix formatting of command line block
should add an empty line after the directive.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 14 Apr 2021 00:51:08 +0000 (08:51 +0800)]
admin: require sphinx>=3.2.1
loosen the required sphinx version.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 11 Apr 2021 03:09:25 +0000 (11:09 +0800)]
doc/_themes: remove spacing after `ul li p`
in the latest document generated from RtD, the spacing after `ul li p`
elements is set to 24px, the same as plain `p` elements. but this makes
the lists more sparse and difficult to read.
in this change, the spacing is restored to 0 as it was in old theme.css
in sphinx_rtd_theme.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Gregory Farnum [Tue, 13 Apr 2021 22:38:01 +0000 (15:38 -0700)]
Merge pull request #40835 from gregsfortytwo/wip-stretch-mon-state
Fix issues with in-memory monitor stretch state
Reviewed-by: Sam Just <sjust@redhat.com>
Patrick Donnelly [Tue, 13 Apr 2021 20:26:43 +0000 (13:26 -0700)]
Merge PR #40833 into master
* refs/pull/40833/head:
test/librados/tier_cxx: use non-deprecated wait_for_complete
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Samuel Just [Tue, 13 Apr 2021 18:18:17 +0000 (18:18 +0000)]
test/librados/tier_cxx: use non-deprecated wait_for_complete
Fixes: https://tracker.ceph.com/issues/50342
Signed-off-by: Samuel Just <sjust@redhat.com>
Yuval Lifshitz [Tue, 13 Apr 2021 16:00:49 +0000 (19:00 +0300)]
Merge pull request #40798 from yuvalif/wip-yuval-fix-49800
rgw/test: use 'localhost' for amqp ssl test
Kefu Chai [Tue, 13 Apr 2021 13:14:44 +0000 (21:14 +0800)]
Merge pull request #40832 from wjwithagen/wjw-fix-y2c.py
common: make y2c.py work on FreeBSD
Reviewed-by: Kefu Chai <kchai@redhat.com>
Willem Jan Withagen [Tue, 13 Apr 2021 08:17:01 +0000 (10:17 +0200)]
common: make y2c.py work on FreeBSD
1) make reference to python3 independent of explicit path
2) add required py-yaml module to install list
fixes: #40731
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Greg Farnum [Tue, 13 Apr 2021 06:25:47 +0000 (06:25 +0000)]
mon: set_healthy_stretch_mode in update_from_paxos, not random leader calls!
Add header comment describing how this works now.
Fixes: https://tracker.ceph.com/issues/50308
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Greg Farnum [Tue, 13 Apr 2021 05:28:31 +0000 (05:28 +0000)]
mon: maintain stretch_recovery_triggered in new OSDMon::set_*_stretch_mode
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Greg Farnum [Tue, 13 Apr 2021 05:27:32 +0000 (05:27 +0000)]
mon: add a set_recovery_stretch_mode function
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Greg Farnum [Fri, 9 Apr 2021 21:35:03 +0000 (21:35 +0000)]
mon: rename maybe_engage_stretch_mode to try_engage_stretch_mode
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Kefu Chai [Tue, 13 Apr 2021 03:38:49 +0000 (11:38 +0800)]
Merge pull request #40731 from tchaikov/wip-yamlize-options
common: extract options into yaml
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Kefu Chai [Tue, 13 Apr 2021 03:06:53 +0000 (11:06 +0800)]
Merge pull request #40739 from jdurgin/wip-alienstore-debug
crimson/os/alienstore: use bluestore debug prefix
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Tue, 13 Apr 2021 02:38:49 +0000 (22:38 -0400)]
Merge PR #40626 into master
* refs/pull/40626/head:
qa/suites/rados/objectstore: separate store_test tests
qa/standalone: split osd/ into 2 directories
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Kefu Chai [Tue, 13 Apr 2021 01:15:17 +0000 (09:15 +0800)]
common/options: s/immutable-objet-cache/immutable-object-cache/
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 9 Apr 2021 10:54:46 +0000 (18:54 +0800)]
common: extract options into yaml
extract the options in common/options.cc into separate .yaml.in
files, and preprocess them using CMake before translating them into .cc
files using a python script.
this change paves the road to render the options using sphinx, and
will allow us to further annotate the options to include more metadata.
also, these YAML files can be consumed by applications like dashboard
and Sphinx to access this metadata in a simpler way.
* use @variable-name@ for substituting the variables in .yaml.in file
* use cmake variable of `mgr_disabled_modules` instead of C macro
to define `mgr_disabled_modules` in global.yaml.in
* debian/control, ceph.spec.in, win32_deps_build.sh: add python3-yaml
as build dep
* add y2c.py (short for YAML to C++) to translate .yaml to .cc file
* common/options/*.yaml.in: extract and split options into .yaml.in
files, the substitution variables in them are then replaced with CMake
variables, and the results are copied to the corresponding .yaml files
* include/config-h.in.cmake: remove MGR_DISABLED_MODULES, as it
is not a CMake variable.
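The `@variable-name@` substitution step can be sketched as follows. This is
only an illustration in Python of what the preprocessing does to a .yaml.in
file; in the build the replacement is performed by CMake, and the helper name
`substitute`, the sample YAML fragment and the value `restful` are assumptions
made up for the example. Undefined variables are replaced with an empty
string here, which is also what CMake's `configure_file()` does.

```python
import re


def substitute(template: str, variables: dict) -> str:
    """Replace each @name@ with variables[name]; unknown names become ''."""
    return re.sub(
        r"@([A-Za-z0-9_]+)@",
        lambda match: str(variables.get(match.group(1), "")),
        template,
    )


# Hypothetical fragment of a .yaml.in file and the resulting .yaml text.
yaml_in = "- name: mgr_disabled_modules\n  default: @mgr_disabled_modules@\n"
yaml_out = substitute(yaml_in, {"mgr_disabled_modules": "restful"})
```

The resulting .yaml text is what a translator such as y2c.py would then
consume.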
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 11 Apr 2021 04:54:18 +0000 (12:54 +0800)]
cmake: use the same name for macros and cmake variables
for two reasons,
* consolidate the namings
* pave the road to yamlize options where we will use cmake variables
to substitute the @<variable-name>@ in .in files instead of relying
on C/C++ macros
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 9 Apr 2021 04:14:23 +0000 (12:14 +0800)]
cmake: introduce WITH_EC_ISA_PLUGIN
instead of checking "HAVE_NASM_X64_AVX2 OR HAVE_ARMV8_SIMD" everywhere,
use a single cached variable of WITH_EC_ISA_PLUGIN. so it's more
consistent when checking the availability of ec_isa plugin.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Tue, 6 Apr 2021 14:41:09 +0000 (09:41 -0500)]
qa/suites/rados/objectstore: separate store_test tests
This takes 5 hours currently.
- Separate out filestore and memstore into separate task (~1 hr)
- Split bluestore into -a and -b (a tests exclude SyntheticMatrixC,
b tests include it)
Signed-off-by: Sage Weil <sage@newdream.net>
Ilya Dryomov [Mon, 12 Apr 2021 19:23:06 +0000 (21:23 +0200)]
Merge pull request #40576 from idryomov/wip-no-cephxv2-for-unmap
qa/suites/krbd: don't require CEPHX_V2 for unmap subsuite
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Yuval Lifshitz [Mon, 12 Apr 2021 19:19:10 +0000 (22:19 +0300)]
Merge pull request #40800 from yuvalif/wip-yuval-fix-50291
rgw/amqp/test: fix mock prototype for librabbitmq-0.11.0
Patrick Donnelly [Mon, 12 Apr 2021 19:11:05 +0000 (12:11 -0700)]
Merge PR #40651 into master
* refs/pull/40651/head:
doc/cephadm: fix a typo
Reviewed-by: Sebastian Wagner <swagner@suse.com>
Samuel Just [Mon, 12 Apr 2021 18:50:06 +0000 (11:50 -0700)]
Merge pull request #40067 from myoungwon/wip-49726
osd: avoid for the two copy to cancel each other
Reviewed-by: Samuel Just <sjust@redhat.com>
Sage Weil [Mon, 12 Apr 2021 16:36:48 +0000 (12:36 -0400)]
Merge PR #40546 into master
* refs/pull/40546/head:
SECURITY.md: Create SECURITY.md
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Sage Weil [Mon, 12 Apr 2021 15:53:02 +0000 (11:53 -0400)]
Merge PR #40726 into master
* refs/pull/40726/head:
doc: Add GPG Keys
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 12 Apr 2021 15:45:50 +0000 (11:45 -0400)]
Merge PR #40736 into master
* refs/pull/40736/head:
mgr/cephadm: rewrite/simplify describe_service
mgr/orchestrator: report osds as osd.unmanaged as appropriate
mgr/orchestrator: remove IMAGE ID from 'orch ls'
Reviewed-by: Michael Fritch <mfritch@suse.com>
Ernesto Puerta [Mon, 12 Apr 2021 15:33:52 +0000 (17:33 +0200)]
Merge pull request #40721 from rhcs-dashboard/49925-fix-nfs-ganesha
mgr/dashboard: fix errors when creating NFS export.
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Sage Weil [Tue, 6 Apr 2021 14:47:10 +0000 (09:47 -0500)]
qa/standalone: split osd/ into 2 directories
The whole osd/ directory takes 3 hours to run. Of that, about half is
osd-backfill*:
2021-04-05T20:38:55.932 INFO:tasks.workunit:Running workunit osd/osd-backfill-prio.sh...
2021-04-05T20:47:27.184 INFO:tasks.workunit:Running workunit osd/osd-backfill-recovery-log.sh...
2021-04-05T20:55:59.497 INFO:tasks.workunit:Running workunit osd/osd-backfill-space.sh...
2021-04-05T21:48:47.549 INFO:tasks.workunit:Running workunit osd/osd-backfill-stats.sh...
2021-04-05T22:17:09.197 INFO:tasks.workunit:Running workunit osd/osd-bench.sh...
Signed-off-by: Sage Weil <sage@newdream.net>
Kefu Chai [Mon, 12 Apr 2021 14:40:13 +0000 (22:40 +0800)]
Merge pull request #40814 from tchaikov/wip/ceph-object-corpus/pacific
ceph-object-corpus: pick up 16.2.0-90-g50f1821b4c
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
J. Eric Ivancich [Mon, 12 Apr 2021 14:34:50 +0000 (10:34 -0400)]
Merge pull request #40801 from ivancich/wip-radoslist-incomplete-multipart-parts-marker
rgw: radoslist incomplete multipart parts marker
Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 13:48:12 +0000 (21:48 +0800)]
ceph-object-corpus: pick up 16.2.0-90-g50f1821b4c
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 13:47:04 +0000 (21:47 +0800)]
Merge pull request #40806 from rhcs-dashboard/fix-mailmap-master
mailmap: sort alphabetically & add Pere and Waad
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 13:42:43 +0000 (21:42 +0800)]
Merge pull request #40656 from tchaikov/wip-qa-upgrade-focal
qa/suites: test upgrade/octopus-x on focal instead bionic
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 13:40:29 +0000 (21:40 +0800)]
Merge pull request #40811 from tchaikov/wip-gen-corpus
script/gen-corpus.sh: set CEPH_CONF
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Hardik Vyas [Fri, 9 Apr 2021 11:41:26 +0000 (17:11 +0530)]
doc: Add GPG Keys
Replaced my GPG key with ceph.com and David's GPG keys
Signed-off-by: Hardik Vyas <hvyas@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 11:47:02 +0000 (19:47 +0800)]
Merge pull request #40795 from wjwithagen/wjw-fix-ceph-dencoder
tools: do not unload plugins during destruction.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 11:00:18 +0000 (19:00 +0800)]
qa/workunits/cls: add executable bit to script
all the scripts except for test_cls_cas.sh under qa/workunits/cls
are executable. to be more consistent, add the executable bit to
test_cls_cas.sh as well.
also, these scripts are launched by src/script/gen-corpus.sh directly,
so it's convenient to just call them.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 10:55:08 +0000 (18:55 +0800)]
script/gen-corpus.sh: set CEPH_CONF
if we happen to run this script on a host where /etc/ceph/ceph.conf is
available, ceph CLI would use it instead. so, point it to $PWD/ceph.conf
instead.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 10:35:47 +0000 (18:35 +0800)]
Merge pull request #40797 from wjwithagen/wjw-fix-monmap-retval
test: Undo the FreeBSD specific retval test
Reviewed-by: Kefu Chai <kchai@redhat.com>
Ernesto Puerta [Mon, 12 Apr 2021 09:22:36 +0000 (11:22 +0200)]
mailmap: add Dashboard members: Waad and Pere
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Ernesto Puerta [Mon, 12 Apr 2021 09:19:25 +0000 (11:19 +0200)]
mailmap: sort alphabetically
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 09:06:47 +0000 (17:06 +0800)]
Merge pull request #40794 from wjwithagen/wjw-fix-promtool
test: Run Dockers only on Linux platforms
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 08:42:52 +0000 (16:42 +0800)]
Merge pull request #38120 from kiizawa/wip-cls-remote-read
osd: allow remote read by calling cls method from within cls context
Reviewed-by: Samuel Just <sjust@redhat.com>
Willem Jan Withagen [Sun, 11 Apr 2021 14:27:35 +0000 (16:27 +0200)]
test: Undo the FreeBSD specific retval test
Changes to the socket code now result in EINVAL being returned.
In the past ENOENT was returned, which is the FreeBSD error code
if DNS lookup does not work.
That change is probably because somewhere in the code the
error code is not passed verbatim from the system call, but is
rewritten during extra evaluation.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Willem Jan Withagen [Sun, 11 Apr 2021 13:01:22 +0000 (15:01 +0200)]
tools: do not unload plugins during destruction.
FreeBSD ceph-dencoder crashes in the exit() calls, due to
invalid pointer references during the release process of
the loaded libraries.
Often this is signaled by libc reporting:
__cxa_thread_call_dtors: dtr 0x47efc0 from unloaded dso, skipping
The cause for this is different behaviour between FreeBSD and Linux:
https://groups.google.com/g/bsdmailinglist/c/22ncTZAbDp4/m/Dii_pII5AwAJ
_The FreeBSD implementation here looks racy. If one thread dlcloses an
object while another thread is exiting, we can end up calling a
function at an invalid memory address. It also looks as if it may
be possible to unload one library, load another at the same address,
and end up executing entirely the wrong code, which would have some
serious security implications.
The GNU/Linux equivalent of this function locks the DSO in memory
until all references to it have gone away. A call to dlclose() on
GNU/Linux will not actually unload the library until all threads
with destructors in that library have been unloaded. I believe
that this reuses the same reference counting mechanism that
allows the same library to be dlopened and dlclosed multiple times.
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Kefu Chai [Mon, 12 Apr 2021 08:26:34 +0000 (16:26 +0800)]
Merge pull request #37016 from zhangdaolong/subcommon-bulefs-import
os/bluestore:Add subcommand bluefs-import in ceph-bluestore-tool.
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 08:23:48 +0000 (16:23 +0800)]
Merge pull request #40644 from SMIL-Infra/cleanup-slash
cephadm: cleanup extra slash in runtime dir
Reviewed-by: Adam King <adking@redhat.com>
Kefu Chai [Mon, 12 Apr 2021 08:09:33 +0000 (16:09 +0800)]
Merge pull request #40658 from tchaikov/wip-systemd
cmake: s/HAVE_MSGHDR/WITH_SYSTEMD/
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Yuval Lifshitz [Sun, 11 Apr 2021 16:37:39 +0000 (19:37 +0300)]
rgw/amqp/test: fix mock prototype for librabbitmq-0.11.0
also use extern "C" to get compilation errors when a
function prototype changes
Fixes: https://tracker.ceph.com/issues/50291
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
pcuzner [Mon, 12 Apr 2021 00:02:03 +0000 (12:02 +1200)]
Merge pull request #40635 from pcuzner/prometheus_add_pool_metadata
mgr/prometheus:Improve the pool metadata
Yuval Lifshitz [Sun, 11 Apr 2021 18:52:06 +0000 (21:52 +0300)]
Merge pull request #39944 from yuvalif/wip-yuval-fix-49650
rgw/notifications: delete bucket notification object when empty
Willem Jan Withagen [Sun, 11 Apr 2021 12:46:17 +0000 (14:46 +0200)]
test: Run Dockers only on Linux platforms
Running a docker alternative only works if the platform
is Linux
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Kefu Chai [Sun, 11 Apr 2021 16:43:46 +0000 (00:43 +0800)]
Merge pull request #40786 from tchaikov/wip-script-bit
build-integration-branch: retry when running into network failures
Reviewed-by: Sage Weil <sage@redhat.com>
Yuval Lifshitz [Sun, 11 Apr 2021 15:12:35 +0000 (18:12 +0300)]
rgw/test: use 'localhost' for amqp ssl test
also move the amqp ssl tests to 'test_bn.py'
this fix combines commit:
1418bcc1dc3f22257fec840556902b4bf88932b8
with commit:
979335f60a9ab771973c303277be4edb4da55c01
Fixes: https://tracker.ceph.com/issues/49800
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
Kefu Chai [Sun, 11 Apr 2021 03:59:41 +0000 (11:59 +0800)]
build-integration-branch: retry when running into network failures
Signed-off-by: Kefu Chai <kchai@redhat.com>
zhangdaolong [Mon, 7 Sep 2020 01:00:10 +0000 (09:00 +0800)]
os/bluestore/bluestore_tool: Add subcommand bluefs-import
Examples
ceph-bluestore-tool bluefs-import --path /var/lib/ceph/osd/ceph-1 --input-file ./db/CURRENT --dest-file db/CURRENT
Signed-off-by: zhang daolong <zhangdaolong@fiberhome.com>
Kefu Chai [Sun, 11 Apr 2021 03:52:16 +0000 (11:52 +0800)]
build-integration-branch: define variables for pr_number and friends
so they can be reused later.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Fri, 9 Apr 2021 20:26:00 +0000 (16:26 -0400)]
mgr/cephadm: rewrite/simplify describe_service
The prior implementation first tried to fabricate services based on the
running daemons, and then filled in defined services on top. This led
to duplication and a range of small errors.
Instead, flip this around: start with the services that are defined,
and only fill in 'unmanaged' services where we need to.
Drop the osd kludges and instead rely on DaemonDescription.service_id to
return the right thing.
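The flipped order can be sketched roughly like this. It is a simplified
illustration, not the cephadm implementation: the function name
`describe_services`, the dict layout and the `service_name` key are
hypothetical.

```python
def describe_services(defined_specs, running_daemons):
    """Summarize services, starting from the defined specs (hypothetical)."""
    # 1. Start with the services that are defined.
    services = {
        name: {"managed": True, "running": 0} for name in defined_specs
    }
    # 2. Count daemons; add an 'unmanaged' entry only when no defined
    #    service covers the daemon.
    for daemon in running_daemons:
        name = daemon["service_name"]
        entry = services.setdefault(name, {"managed": False, "running": 0})
        entry["running"] += 1
    return services
```

Because each service name is a single dict key, a defined service and its
running daemons cannot end up as two separate entries, which is the kind of
duplication the previous ordering was prone to.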
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Sat, 10 Apr 2021 13:01:58 +0000 (09:01 -0400)]
Merge PR #40577 into master
* refs/pull/40577/head:
cephadm: normalize unqualified repo digests to docker.io
mgr/cephadm/upgrade: normalize unqualified target image
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Sage Weil [Sat, 10 Apr 2021 13:01:27 +0000 (09:01 -0400)]
Merge PR #40537 into master
* refs/pull/40537/head:
cephadm:persist the grafana.db file
Reviewed-by: Sage Weil <sage@redhat.com>
J. Eric Ivancich [Fri, 9 Apr 2021 19:37:24 +0000 (15:37 -0400)]
rgw: test `radosgw-admin radoslist` and incomplete multiparts better
Make sure there are more than 1000 incomplete multiparts and also make
sure one of the incomplete multiparts has at least 1000 parts. This
test is done indirectly through rgw-orphan-list, which invokes
`radosgw-admin radoslist`.
Also, clean up shell flags, so script output is less verbose.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Patrick Donnelly [Sat, 10 Apr 2021 03:08:51 +0000 (20:08 -0700)]
Merge PR #40653 into master
* refs/pull/40653/head:
mon: check mdsmap is resizeable before promoting standby-replay
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 10 Apr 2021 03:06:21 +0000 (20:06 -0700)]
Merge PR #40642 into master
* refs/pull/40642/head:
client: don't allow access to MDS-private inodes
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Patrick Donnelly [Sat, 10 Apr 2021 03:04:32 +0000 (20:04 -0700)]
Merge PR #40481 into master
* refs/pull/40481/head:
qa: test standby-replay with fs:workloads
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 10 Apr 2021 03:02:27 +0000 (20:02 -0700)]
Merge PR #40431 into master
* refs/pull/40431/head:
qa/cephfs: remove create_keyring_file from cephfs_test_case.py
qa/cephfs: don't use sudo to write files in /tmp
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Patrick Donnelly [Sat, 10 Apr 2021 02:58:40 +0000 (19:58 -0700)]
Merge PR #40389 into master
* refs/pull/40389/head:
mds: reject lookup ino requests for mds dirs
test: add test for invalid lookup of mdsdir
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Neha Ojha [Sat, 10 Apr 2021 00:24:41 +0000 (17:24 -0700)]
Merge pull request #40723 from zdover23/wip-doc-second-attempt-mclock-rewrite-second-half-2021-Apr-09
doc/rados: rewrite mclock docs (2 of 2)
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Fri, 9 Apr 2021 23:48:13 +0000 (16:48 -0700)]
Merge pull request #40738 from jdurgin/wip-librados-docs
include/librados: fix doxygen syntax for docs build
Reviewed-by: Neha Ojha <nojha@redhat.com>
Josh Durgin [Fri, 9 Apr 2021 22:11:52 +0000 (18:11 -0400)]
crimson/os/alienstore: use bluestore debug prefix
Filestore is never accurate. Since we only intend to use bluestore
with alien mode, it's not worth introducing a separate debug
subsystem.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Josh Durgin [Fri, 9 Apr 2021 22:01:32 +0000 (18:01 -0400)]
include/librados: fix doxygen syntax for docs build
The docs build is now warning about these like:
WARNING: Unparseable C cross-reference: '[in]'
Invalid C declaration: Expected identifier in nested name.
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
Sage Weil [Sat, 3 Apr 2021 13:14:00 +0000 (09:14 -0400)]
cephadm: normalize unqualified repo digests to docker.io
A RepoDigests returned by docker|podman image inspect can either include
the docker.io/ prefix or not. For reasons that aren't entirely clear,
this may vary between hosts in a cluster. However, ceph/ceph@sha256:abc...
is the same thing as docker.io/ceph/ceph@sha256:abc..., and should be
treated as such. Otherwise, upgrade can get into a loop where it pulls
the image on a new host, finds the other variant of the repodigests,
sees no overlap, updates target_digests, and restarts. (It will then
find the first variant again on the first host and loop.)
Avoid this by normalizing any docker.io digests by always including the
docker.io/ prefix.
Note that it is technically possible that this assumption is wrong: it
may be that the image that already exists on the local host is from a
different registry in registries.conf's unqualified-search-registries.
However, we don't know which, since this is a search list. In practice,
it should be exceedingly rare that an image that *we* are installing using
a fully-qualified image name will end up having an unqualified repodigest
in the local registry. Hopefully!
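A minimal sketch of the normalization, under stated assumptions: the function
name `normalize_digest` and the rule for detecting a registry host (the first
path component contains a dot or a colon, or is `localhost`) are choices made
for this example and are not taken from the cephadm code.

```python
def normalize_digest(digest: str, registry: str = "docker.io") -> str:
    """Prefix `registry/` when the digest names no registry (assumed rule)."""
    first, sep, _rest = digest.partition("/")
    # Treat the first component as a registry host only if a '/' follows
    # it and it looks like a host name (assumption, see lead-in).
    has_registry = bool(sep) and (
        "." in first or ":" in first or first == "localhost"
    )
    return digest if has_registry else f"{registry}/{digest}"


# Two hosts reporting the two variants of the same image.
host_a = {normalize_digest("ceph/ceph@sha256:abc")}
host_b = {normalize_digest("docker.io/ceph/ceph@sha256:abc")}
overlap = host_a & host_b
```

After normalization the two hosts' digest sets overlap, so an upgrade loop
that compares them no longer sees a mismatch between the two spellings.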
Fixes: https://tracker.ceph.com/issues/50114
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Apr 2021 13:36:31 +0000 (09:36 -0400)]
mgr/cephadm/upgrade: normalize unqualified target image
If we get an unqualified target image, assume it's docker.io. This
ensures that we're passing a fully-qualified target to docker|podman on
the various hosts and don't end up with something different based on the
per-host search path for unqualified image names.
Signed-off-by: Sage Weil <sage@newdream.net>
Zac Dover [Fri, 9 Apr 2021 10:49:15 +0000 (20:49 +1000)]
doc/rados: rewrite mclock docs (2 of 2)
This is my second attempt to rewrite the
second half of the mclock docs. The first attempt
is enshrined in https://github.com/ceph/ceph/pull/40571,
in which I got cute with git and got burned.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Sage Weil [Fri, 9 Apr 2021 20:22:49 +0000 (16:22 -0400)]
mgr/orchestrator: report osds as osd.unmanaged as appropriate
If there is no osdspec_affinity or service_name (from unit.meta), then
report as 'osd.unmanaged'.
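As a rough sketch of that fallback: the helper name `osd_service_name`, the
precedence of the unit.meta name over the affinity, and the `osd.<affinity>`
format are assumptions made for illustration; only the `osd.unmanaged`
default comes from the description above.

```python
from typing import Optional


def osd_service_name(
    osdspec_affinity: Optional[str] = None,
    meta_service_name: Optional[str] = None,
) -> str:
    """Pick a service name for an OSD daemon (hypothetical helper)."""
    if meta_service_name:
        # Name recorded in unit.meta, if any (assumed to take precedence).
        return meta_service_name
    if osdspec_affinity:
        # Assumed naming scheme: 'osd.' followed by the spec affinity.
        return f"osd.{osdspec_affinity}"
    # Neither is known: report the daemon as unmanaged.
    return "osd.unmanaged"
```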
Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Fri, 9 Apr 2021 19:42:18 +0000 (12:42 -0700)]
Merge pull request #39216 from myoungwon/wip-manifest-dedup-test
osd, test: reworks for manifest dedup test cases
Reviewed-by: Samuel Just <sjust@redhat.com>
Sage Weil [Fri, 9 Apr 2021 19:35:17 +0000 (15:35 -0400)]
mgr/orchestrator: remove IMAGE ID from 'orch ls'
This is not very useful at this level:
- we see it from 'orch ps'
- it can be a mix of ids during upgrade
- some services may have multiple images at steady state (e.g., ingress)
Signed-off-by: Sage Weil <sage@newdream.net>
J. Eric Ivancich [Thu, 8 Apr 2021 23:19:36 +0000 (19:19 -0400)]
rgw: fix radoslist stuck loop
When an incomplete multipart upload has in excess of 1000 parts,
looping over those parts was not handled properly, causing an infinite
loop. The paging/marker is now handled correctly.
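The general pattern of marker-based paging can be sketched as below. This is
a generic Python illustration, not the rgw code: `list_parts_page` is a
hypothetical stand-in for a listing call that returns at most 1000 entries,
and part numbers are assumed to be sorted positive integers.

```python
def list_parts_page(all_parts, marker, max_parts=1000):
    """Hypothetical paged listing: parts numbered above `marker`.

    Returns (page, next_marker, truncated).
    """
    page = [p for p in all_parts if p > marker][:max_parts]
    next_marker = page[-1] if page else marker
    truncated = bool(page) and next_marker < all_parts[-1]
    return page, next_marker, truncated


def list_all_parts(all_parts, max_parts=1000):
    marker = 0
    collected = []
    while True:
        page, marker, truncated = list_parts_page(all_parts, marker, max_parts)
        collected.extend(page)
        # Advancing `marker` each round is what lets the loop terminate;
        # re-sending the same marker would return the first page forever.
        if not truncated:
            return collected
```

With 2501 parts this takes three rounds, and the loop ends when the last page
reports that the listing is no longer truncated.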
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Kefu Chai [Thu, 8 Apr 2021 04:19:42 +0000 (12:19 +0800)]
qa/suites: test upgrade/octopus-x on focal instead bionic
so we can solely build on focal in future once all other bionic facets
in qa/ are removed or replaced.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yuri Weinstein [Fri, 9 Apr 2021 14:45:55 +0000 (07:45 -0700)]
Merge pull request #40623 from ronen-fr/wip-ronenf-revert-40077
osd: Revert "osd: Try other PGs when reservation failures occur"
Reviewed-by: Neha Ojha <nojha@redhat.com>
Yuri Weinstein [Fri, 9 Apr 2021 14:44:42 +0000 (07:44 -0700)]
Merge pull request #40606 from myoungwon/wip-49427-2
osd: recover unreadable snapshot before reading refcount info
Reviewed-by: Samuel Just <sjust@redhat.com>
Alfonso Martínez [Fri, 9 Apr 2021 08:51:21 +0000 (10:51 +0200)]
mgr/dashboard: fix errors when creating NFS export.
- Fix daemon raw config parsing.
- Handle error when no rgw daemons found.
Fixes: https://tracker.ceph.com/issues/49925
Signed-off-by: Alfonso Martínez <almartin@redhat.com>
myoungwon oh [Fri, 12 Mar 2021 13:01:23 +0000 (22:01 +0900)]
osd: avoid for the two copy to cancel each other
add the op to blocked_list, then return early
if the destination of copy-from already exists and
the head object is the same
fixes: https://tracker.ceph.com/issues/49726
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Josh Durgin [Thu, 8 Apr 2021 22:36:27 +0000 (15:36 -0700)]
Merge pull request #40510 from aclamk/wip-bluestore-sharding-rst
doc: Add BlueStore sharding documentation
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Patrick Donnelly [Wed, 24 Mar 2021 20:54:17 +0000 (13:54 -0700)]
mds: reject lookup ino requests for mds dirs
Fixes: https://tracker.ceph.com/issues/49922
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 29 Mar 2021 22:08:28 +0000 (15:08 -0700)]
qa: test standby-replay with fs:workloads
Fixes: https://tracker.ceph.com/issues/50045
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 8 Apr 2021 18:35:05 +0000 (11:35 -0700)]
Merge PR #40486 into master
* refs/pull/40486/head:
mds: trim cache regularly for standby-replay
mds: remove extra heap release
Reviewed-by: Sidharth Anupkrishnan <sanupkri@redhat.com>
Patrick Donnelly [Thu, 8 Apr 2021 18:34:21 +0000 (11:34 -0700)]
Merge PR #40520 into master
* refs/pull/40520/head:
mds/scrub: background scrub error fixes
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 8 Apr 2021 18:33:41 +0000 (11:33 -0700)]
Merge PR #40633 into master
* refs/pull/40633/head:
mds: ensure export_pin rank < max_mds
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 8 Apr 2021 18:33:08 +0000 (11:33 -0700)]
Merge PR #40638 into master
* refs/pull/40638/head:
mds: do not show the default auth if it's unambiguous
mds: switch to rank number instead
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Wed, 7 Apr 2021 19:27:05 +0000 (12:27 -0700)]
mon: check mdsmap is resizeable before promoting standby-replay
If any MDS is up:creating, some rank data structures may not exist yet.
Fixes: https://tracker.ceph.com/issues/50215
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Thu, 8 Apr 2021 14:57:23 +0000 (22:57 +0800)]
Merge pull request #40617 from tchaikov/wip-system-pmem
install-deps.sh: install libpmem libraries if WITH_PMEM is set
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Thu, 8 Apr 2021 14:28:10 +0000 (07:28 -0700)]
Merge PR #40467 into master
* refs/pull/40467/head:
doc: detail `fs snapshot mirror daemon status` mgr command
doc: s/<fs>/<fs_name>/g
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sébastien Han <seb@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Kefu Chai [Thu, 8 Apr 2021 12:37:41 +0000 (20:37 +0800)]
Merge pull request #40654 from rzarzynski/wip-crimson-notify-lifetime
crimson/osd: fix the lifetime of Notify during timeouts
Reviewed-by: Kefu Chai <kchai@redhat.com>
Ilya Dryomov [Thu, 8 Apr 2021 11:07:44 +0000 (13:07 +0200)]
Merge pull request #40641 from idryomov/wip-require-ceph-common-for-ioc
packaging: require ceph-common for immutable object cache daemon
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Wed, 7 Apr 2021 19:49:17 +0000 (19:49 +0000)]
crimson/osd: improve debugs around Watch / Notify.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>