git.apps.os.sepia.ceph.com Git - ceph.git/log
Daniel Pivonka [Mon, 28 Jun 2021 14:12:13 +0000 (10:12 -0400)]
mgr/cephadm: add ceph orch host drain and limit host removal to empty hosts
ceph orch host drain removes all daemons from a host so it can be safely removed
ceph orch host rm will only remove hosts that are safe to remove
Signed-off-by: Daniel Pivonka <dpivonka@redhat.com>
Kefu Chai [Tue, 22 Jun 2021 12:59:51 +0000 (20:59 +0800)]
Merge pull request #41967 from sandrobonazzola/patch-1
doc/install/get-packages: point to current stable release
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sandro Bonazzola [Tue, 22 Jun 2021 07:49:30 +0000 (09:49 +0200)]
doc/install/get-packages: point to current stable release
Point to pacific for downloading the cephadm script
Co-authored-by: Kefu Chai <tchaikov@gmail.com>
Signed-off-by: Sandro Bonazzola <sbonazzo@redhat.com>
Kefu Chai [Tue, 22 Jun 2021 11:58:01 +0000 (19:58 +0800)]
Merge pull request #41960 from rzarzynski/wip-crimson-alienstore-sync-shardedwq
crimson/os: synchronize producers with consumers in AlienStore's queues.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 22 Jun 2021 11:43:55 +0000 (19:43 +0800)]
Merge pull request #41957 from liewegas/fix-vstart-registry-url
vstart.sh: fix docker url
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 22 Jun 2021 10:38:43 +0000 (18:38 +0800)]
Merge pull request #41961 from rzarzynski/wip-crimson-intcltreq-fix-assert
crimson/osd: fix construction of InternalClientRequest in DEBUG builds.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 22 Jun 2021 10:37:37 +0000 (18:37 +0800)]
Merge pull request #41962 from rzarzynski/wip-crimson-intcltreq-more-asserts
crimson/osd: introduce more asserts to the Watch timeout handling.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Mon, 21 Jun 2021 12:28:52 +0000 (12:28 +0000)]
crimson/osd: introduce more asserts to the Watch timeout handling.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Patrick Donnelly [Mon, 21 Jun 2021 23:54:59 +0000 (16:54 -0700)]
Merge PR #41860 into master
* refs/pull/41860/head:
qa: log messages when falling back to force/lazy umount
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Radoslaw Zarzynski [Mon, 21 Jun 2021 18:59:45 +0000 (18:59 +0000)]
crimson/osd: fix construction of InternalClientRequest in DEBUG builds.
The assert in the ctor of `InternalClientRequest` actually operates on
the ctor's argument we `std::moved` from, not on the class' member.
When a debug build is used, this translates into failures like the one
below:
```
2021-06-16T22:53:03.410 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:02 smithi170 conmon[43770]: ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4987-gec8844b6/rpm/el8/BUILD/ceph-17.0.0-4987-gec8844b6/src/crimson/osd/osd_operations/internal_client_request.cc:19: crimson::osd::InternalClientRequest::InternalClientRequest(Ref<crimson::osd::PG>): Assertion `bool(pg)' failed.
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 0# 0x0000558BE7BBF68F in /usr/bin/ceph-osd
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 1# FatalSignal::signaled(int, siginfo_t const*) in /usr/bin/ceph-osd
2021-06-16T22:53:05.363 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in /usr/bin/ceph-osd
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 3# 0x00007F8AD7535B20 in /lib64/libpthread.so.0
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 4# gsignal in /lib64/libc.so.6
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 5# abort in /lib64/libc.so.6
2021-06-16T22:53:05.364 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 6# 0x00007F8AD5B2EC89 in /lib64/libc.so.6
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 7# 0x00007F8AD5B3CA76 in /lib64/libc.so.6
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 8# crimson::osd::InternalClientRequest::InternalClientRequest(boost::intrusive_ptr<crimson::osd::PG>) in /usr/bin/ceph-osd
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 9# crimson::osd::Watch::do_watch_timeout(boost::intrusive_ptr<crimson::osd::PG>) in /usr/bin/ceph-osd
2021-06-16T22:53:05.365 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 10# seastar::noncopyable_function<void ()>::direct_vtable_for<crimson::osd::Watch::Watch(crimson::osd::Watch::private_ctag_t, boost::intrusive_ptr<crimson::osd::ObjectContext>, watch_info_t const&, entity_name_t const&, boost::intrusive_ptr<crimson::osd::PG>)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 11# 0x0000558BED653759 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 12# 0x0000558BED61B148 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 13# 0x0000558BED61B576 in /usr/bin/ceph-osd
2021-06-16T22:53:05.366 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 14# 0x0000558BED7C93C9 in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 15# 0x0000558BED326D5A in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 16# 0x0000558BED330E7E in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 17# main in /usr/bin/ceph-osd
2021-06-16T22:53:05.367 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 18# __libc_start_main in /lib64/libc.so.6
2021-06-16T22:53:05.368 INFO:journalctl@ceph.osd.6.smithi170.stdout:Jun 16 22:53:05 smithi170 conmon[43770]: 19# _start in /usr/bin/ceph-osd
```
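The failure mode described above can be reproduced in isolation. The sketch below is a minimal illustration, not the crimson code: `PG` and `Request` are hypothetical stand-ins, and `std::shared_ptr` takes the place of `boost::intrusive_ptr` (both leave the moved-from pointer null). Because the constructor parameter shadows the member of the same name, a check written as `bool(pg)` in the constructor body inspects the moved-from parameter rather than the member.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for crimson::osd::PG.
struct PG {};

struct Request {
  std::shared_ptr<PG> pg;
  bool param_valid_in_body = false;   // what `assert(bool(pg))` would test
  bool member_valid_in_body = false;  // what `assert(bool(this->pg))` would test

  // The parameter deliberately shares its name with the member.
  explicit Request(std::shared_ptr<PG> pg) : pg(std::move(pg)) {
    // Unqualified `pg` names the parameter, which was just moved from.
    param_valid_in_body = bool(pg);
    // `this->pg` names the member that now owns the object.
    member_valid_in_body = bool(this->pg);
  }
};
```

Constructing `Request` with a valid pointer leaves `param_valid_in_body` false and `member_valid_in_body` true, which is why an assert on the unqualified name only fires in builds where asserts are enabled.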
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Mon, 21 Jun 2021 21:16:35 +0000 (21:16 +0000)]
crimson/os: synchronize producers with consumers in AlienStore's queues.
Some time ago we replaced the single, `boost::lockfree`-based queue
in `ThreadPool` with the in-house, lockish `ShardedWorkQueue` vector.
Unfortunately, pushing into such a queue isn't synchronized with
consuming from it -- the former happens without locking the `mutex`.
As the underlying primitive behind `ShardedWorkQueue::pending` is a
plain `std::deque`, it's unsafe to operate that way in a multi-threaded
environment. Indeed, weird-looking crashes have been spotted at Sepia:
```
(virtualenv) rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-06-21_14:49:36-rados-master-distro-basic-smithi/6182668 $ less ./remote/smithi196/log/ceph-osd.7.log.gz
...
0# 0x000055862FD67ADF in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
4# 0x00005586357540E4 in ceph-osd
5# 0x00007FB22CF36B20 in /lib64/libpthread.so.0
6# pthread_cond_timedwait in /lib64/libpthread.so.0
7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
8# 0x00005586313E303B in ceph-osd
9# 0x00007FB22CC51BA3 in /lib64/libstdc++.so.6
10# 0x00007FB22CF2C14A in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6
Fault at location: 0x18
daemon-helper: command crashed with signal 11
```
This fix introduces the synchronization to the `push_back()` method of
`ShardedWorkQueue`. The side effect is that it may stall the reactor.
Therefore, a follow-up change that switches to e.g. `boost::lockfree`
is expected.
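The shape of such a fix can be sketched with standard-library primitives only. This is a simplified illustration under assumptions, not the actual `ShardedWorkQueue`: the class name `LockedWorkQueue` and its interface are hypothetical. The point is that the producer takes the same mutex the consumer holds while it inspects the `std::deque`.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical, simplified queue; not the crimson ShardedWorkQueue.
template <typename T>
class LockedWorkQueue {
  std::mutex mutex;
  std::condition_variable cond;
  std::deque<T> pending;

public:
  // Producer side: take the same mutex the consumer uses before touching
  // the deque, then wake one waiting consumer.
  void push_back(T item) {
    {
      std::lock_guard<std::mutex> lock(mutex);
      pending.push_back(std::move(item));
    }
    cond.notify_one();
  }

  // Consumer side: wait up to `timeout` for an item.
  // Returns false if the queue stayed empty for the whole timeout.
  bool pop_front(T& out, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex);
    if (!cond.wait_for(lock, timeout, [this] { return !pending.empty(); })) {
      return false;
    }
    out = std::move(pending.front());
    pending.pop_front();
    return true;
  }
};
```

Taking a mutex in `push_back()` means the calling thread can block while a consumer holds the lock, which is the reactor stall the message warns about.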
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Sage Weil [Mon, 21 Jun 2021 19:01:29 +0000 (15:01 -0400)]
vstart.sh: fix docker url
Signed-off-by: Sage Weil <sage@newdream.net>
Kefu Chai [Mon, 21 Jun 2021 12:46:41 +0000 (20:46 +0800)]
Merge pull request #41941 from tchaikov/wip-crimson-errorator-loop
crimson/common: extract parallel_for_each_state out
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Mon, 21 Jun 2021 08:36:20 +0000 (16:36 +0800)]
Merge pull request #41949 from tchaikov/wip-crimson-prometheus
crimson/osd: expose metrics using http server
Reviewed-by: Samuel Just <sjust@redhat.com>
Kefu Chai [Mon, 21 Jun 2021 06:50:10 +0000 (14:50 +0800)]
crimson/osd: expose metrics using http server
so we can query the metrics using the HTTP API, for example:
http://localhost:9180/metrics?name=io*
or
http://192.168.2.8:9180/metrics?name=io_queue_delay
or
http://localhost:9180/metrics
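The three URLs above differ only in host and in the optional `name` filter. A small sketch of how a client might assemble them follows; `metrics_url` is a hypothetical helper, not part of crimson, and it does no URL escaping.

```cpp
#include <string>

// Hypothetical helper: builds the query URL for the metrics endpoint.
// An empty `name` asks for all metrics; no URL escaping is performed.
std::string metrics_url(const std::string& host, int port,
                        const std::string& name = "") {
  std::string url = "http://" + host + ":" + std::to_string(port) + "/metrics";
  if (!name.empty()) {
    url += "?name=" + name;
  }
  return url;
}
```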
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Mon, 21 Jun 2021 04:34:02 +0000 (12:34 +0800)]
Merge pull request #41934 from cyx1231st/wip-seastore-onode-logs
crimson/onode-staged-tree: improve logs to understand inconsistent load from seastore
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng [Fri, 18 Jun 2021 08:47:05 +0000 (16:47 +0800)]
crimson/onode-staged-tree: print NodeExtent with the header
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Fri, 18 Jun 2021 08:33:00 +0000 (16:33 +0800)]
crimson/onode-staged-tree: validate node header when load
Add logs to detect corruption when loading nodes. assert() is not
informative enough to understand the context.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Fri, 18 Jun 2021 08:21:06 +0000 (16:21 +0800)]
crimson/onode-staged-tree: delete copy constructor of DummyNodeExtent
Dummy backend is used for unit tests without transactions, so there
should be no copy-on-write behavior.
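As a generic illustration of the technique (the class `DummyExtent` below is a hypothetical stand-in, not the real `DummyNodeExtent`), deleting the copy operations turns any accidental copy into a compile-time error. The sketch also deletes copy assignment, which goes beyond what the message states.

```cpp
#include <type_traits>

// Hypothetical stand-in for a test-only extent type.
class DummyExtent {
public:
  DummyExtent() = default;
  // Nothing should ever duplicate an extent in a backend without
  // transactions, so make accidental copies fail to compile.
  DummyExtent(const DummyExtent&) = delete;
  DummyExtent& operator=(const DummyExtent&) = delete;
};

static_assert(!std::is_copy_constructible<DummyExtent>::value,
              "copying a DummyExtent must not compile");
```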
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Fri, 18 Jun 2021 08:17:09 +0000 (16:17 +0800)]
crimson/onode-staged-tree: add trace logs when start to load nodes
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Amnon Hanuhov [Sun, 20 Jun 2021 18:30:36 +0000 (21:30 +0300)]
Merge pull request #41861 from AmnonHanuhov/wip-Refactor_crimson_internals
crimson/net: Complete the refactor to std::unique_ptr inside Messenger
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 14:57:07 +0000 (22:57 +0800)]
Merge pull request #41921 from gregsfortytwo/wip-mon-stretch-crush-rule
mon: Sanely set the default CRUSH rule when creating pools in stretch…
Reviewed-by: Samuel Just <sjust@redhat.com>
Amnon Hanuhov [Sat, 19 Jun 2021 14:56:13 +0000 (17:56 +0300)]
tools/crimson: Use crimson::make_message() in perf_crimson_msgr
Instead of ceph::make_message() because conn::send() in crimson expects
a std::unique_ptr and not boost::intrusive_ptr
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 14:54:25 +0000 (22:54 +0800)]
Merge pull request #41845 from agayev/zoned-revise-per-zone-naming-scheme
os/bluestore: Revise the naming scheme for per-zone cleaning informat…
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Amnon Hanuhov [Sat, 19 Jun 2021 14:52:54 +0000 (17:52 +0300)]
test/crimson: Use crimson::make_message() in test_alien_echo
Instead of ceph::make_message() because conn::send() in crimson expects
a std::unique_ptr and not boost::intrusive_ptr
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 14:51:59 +0000 (22:51 +0800)]
Merge pull request #41830 from tchaikov/wip-ceph-argparse-cleanup
pybind/ceph_argparse: cleanups preparing for type annotations
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Amnon Hanuhov [Thu, 3 Jun 2021 11:47:00 +0000 (14:47 +0300)]
crimson/net: Use MessageURef in messenger internals
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Amnon Hanuhov [Tue, 8 Jun 2021 12:51:33 +0000 (15:51 +0300)]
crimson/osd: Get rid of send_to_osd() overloading
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Amnon Hanuhov [Tue, 8 Jun 2021 12:48:59 +0000 (15:48 +0300)]
osd: Overload send_osd_message() in PeeringState
To allow passing MessageURef from crimson-osd and MessageRef from
ceph-osd
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Amnon Hanuhov [Tue, 8 Jun 2021 12:43:50 +0000 (15:43 +0300)]
crimson/osd: Move message to send_to_osd() in ShardServices
To avoid refcounting the underlying RefCountedObject
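The general idea can be sketched as follows, assuming `std::shared_ptr` as a stand-in for a reference-counted message pointer. `Message`, `MessageRef` and `Outbox` here are hypothetical illustration types, not the crimson ones. Passing the pointer by copy bumps the reference count, while moving it hands over ownership without touching the count.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct Message { int type; };
// std::shared_ptr stands in for a reference-counted message pointer.
using MessageRef = std::shared_ptr<Message>;

// Hypothetical sink that takes ownership of the message it is given.
struct Outbox {
  std::vector<MessageRef> queued;
  // By-value parameter: callers choose between copying and moving.
  void send(MessageRef m) { queued.push_back(std::move(m)); }
};
```

Calling `out.send(msg)` leaves two owners (the caller and the outbox), whereas `out.send(std::move(msg))` leaves the outbox as the only owner and the caller's pointer null.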
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 13:11:32 +0000 (21:11 +0800)]
Merge pull request #41923 from liewegas/fix-51234
ceph_test_librados_service: wait longer for servicemap to update
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 13:03:04 +0000 (21:03 +0800)]
Merge pull request #41914 from lxbsz/wip-51092
os/memstore: make the used_bytes to atomic
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 13:02:18 +0000 (21:02 +0800)]
Merge pull request #41896 from ifed01/wip-ifed-verbose-kernel-read
blk/KernelDevice: be more verbose on read errors.
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 11:13:17 +0000 (19:13 +0800)]
crimson/common: specialize errorator<> for future<>
otherwise it always needs a return value.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 11:12:36 +0000 (19:12 +0800)]
crimson/common: extract parallel_for_each_state out
if `parallel_for_each_state` is defined as a nested class in errorator,
clang fails to compile it:
../src/crimson/common/errorator.h:716:47: error: no class named 'parallel_for_each_state' in 'errorator<AllowedErrors...>'
friend class errorator<AllowedErrors...>::parallel_for_each_state;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
and the forward declaration does not help. so we have to extract it
out of the errorator. to speed up the compilation, it is moved into
errorator-loop.h. its name mirrors `include/seastar/core/loop.h`.
we could extract the `errorator<>::parallel_for_each()` out as well,
as its return type can be deduced from the type of Iterator and Func.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 08:51:58 +0000 (16:51 +0800)]
Merge pull request #41920 from ljflores/patch-1
doc: fixed a small typo in Perf Counters documentation
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 19 Jun 2021 08:49:31 +0000 (16:49 +0800)]
Merge pull request #41925 from tchaikov/wip-fmtlib
fmt: pickup fix of link failure with clang
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:54:09 +0000 (19:54 -0700)]
Merge PR #41900 into master
* refs/pull/41900/head:
qa: use centos latest for fs:upgrade
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:52:54 +0000 (19:52 -0700)]
Merge PR #41899 into master
* refs/pull/41899/head:
mon/MDSMonitor: check fscid exists for legacy case
Reviewed-by: Ramana Raja <rraja@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:52:24 +0000 (19:52 -0700)]
Merge PR #41898 into master
* refs/pull/41898/head:
mon/MDSMonitor: fix whitespace in debug message
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:51:55 +0000 (19:51 -0700)]
Merge PR #41892 into master
* refs/pull/41892/head:
client: remove unused include from barrier.cc
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:51:05 +0000 (19:51 -0700)]
Merge PR #41833 into master
* refs/pull/41833/head:
cephfs-mirror: silence warnings when connecting via mon host
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:50:22 +0000 (19:50 -0700)]
Merge PR #41723 into master
* refs/pull/41723/head:
mds: to print the unknown type value
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:49:15 +0000 (19:49 -0700)]
Merge PR #40997 into master
* refs/pull/40997/head:
test: add test to verify adding an active peer back to source
pybind/mirroring: disallow adding an active peer back to source
pybind/cephfs: interface to fetch file system id
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Sat, 19 Jun 2021 02:47:53 +0000 (19:47 -0700)]
Merge PR #36823 into master
* refs/pull/36823/head:
qa : add a test for the cmd, dump cache
mds : add timeout to the command, dump cache, to prevent it from running too long and affecting the service
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Laura Flores [Thu, 17 Jun 2021 19:23:45 +0000 (14:23 -0500)]
doc: fixed a small typo in Perf Counters documentation
There is a small typo in the Perf Counters documentation. Gauge was spelled incorrectly.
Signed-off-by: Laura Flores <lflores@redhat.com>
Ernesto Puerta [Fri, 18 Jun 2021 18:08:11 +0000 (20:08 +0200)]
Merge pull request #40506 from p-se/pse-update-grafana-deprecated-variables
mgr/dashboard: deprecated variable usage in Grafana dashboards
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: p-se <NOT@FOUND>
Ernesto Puerta [Fri, 18 Jun 2021 18:07:11 +0000 (20:07 +0200)]
Merge pull request #41808 from rhcs-dashboard/51164-show-only-days-in-bucket-details
mgr/dashboard: bucket details: show lock retention period only in days
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Ernesto Puerta [Fri, 18 Jun 2021 18:04:16 +0000 (20:04 +0200)]
Merge pull request #41758 from rhcs-dashboard/support-multiple-crush-trees
mgr/dashboard: crushmap tree doesn't display crush type other than root
Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Kefu Chai [Fri, 18 Jun 2021 08:26:09 +0000 (16:26 +0800)]
Merge pull request #35903 from agayev/fix-deployment-guide
doc: Add a missing instruction to manual deployment guide.
Reviewed-by: Kefu Chai <kchai@redhat.com>
Abutalib Aghayev [Thu, 2 Jul 2020 15:35:10 +0000 (11:35 -0400)]
doc: Add a missing instruction to manual deployment guide.
Following the instructions as is results in the following error at step 15:
$ sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
global_init: error reading config file.
Signed-off-by: Abutalib Aghayev <agayev@cs.cmu.edu>
Kefu Chai [Fri, 18 Jun 2021 03:14:29 +0000 (11:14 +0800)]
ceph.spec.in: bump up the required version of fmt-devel to 6.2.1
6.2.1 is the version packaged by EPEL8, in other words, this is the
version we've been testing. so to be more consistent with the
known-to-be-good version, let's bump up the required version.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Jun 2021 03:09:15 +0000 (11:09 +0800)]
debian/control: add libfmt-dev for "make check"
so, on debian derivatives, we can use the libfmt-dev package for
building Ceph. this change is made in the hope of reducing the compile
time.
>= 6.1.2 is specified, as it is the version packaged by ubuntu focal,
which is used for running "make check" and integration tests.
find_package(fmt 6.0.0 QUIET)
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Jun 2021 02:59:33 +0000 (10:59 +0800)]
fmt: pickup fix of link failure with clang
fmtlib v7.1.3 contains the fix of https://github.com/fmtlib/fmt/issues/1753
so let's bump up the submodule to the latest master HEAD of fmtlib
for more fixes.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 18 Jun 2021 02:22:21 +0000 (10:22 +0800)]
Merge pull request #41911 from tchaikov/wip-crimson-nbd-cleanup
crimson/tools/store_nbd: better cleanup
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 07:08:32 +0000 (15:08 +0800)]
crimson/tools/store_nbd: replace wait_pending with seastar::gate
the inc_pending + promise<> solution is practically identical to
seastar::gate, so let's use the prepackaged solution instead.
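For readers unfamiliar with the pattern, the counter-plus-completion idea can be sketched without Seastar. `MiniGate` below is a hypothetical, single-threaded stand-in: the real `seastar::gate::close()` returns a future that resolves once all in-flight operations have left, whereas this sketch takes a callback.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical single-threaded stand-in; not seastar::gate.
class MiniGate {
  std::size_t in_flight = 0;
  bool closed = false;
  std::function<void()> on_drained;

  // Fire the completion callback once, when closed and fully drained.
  void maybe_finish() {
    if (closed && in_flight == 0 && on_drained) {
      auto cb = std::move(on_drained);
      on_drained = nullptr;
      cb();
    }
  }

public:
  // Register one in-flight operation; refused once the gate is closed.
  void enter() {
    if (closed) throw std::runtime_error("gate closed");
    ++in_flight;
  }
  // Mark one operation as finished.
  void leave() {
    --in_flight;
    maybe_finish();
  }
  // Stop admitting work; `cb` runs once the in-flight count reaches zero.
  void close(std::function<void()> cb) {
    closed = true;
    on_drained = std::move(cb);
    maybe_finish();
  }
  std::size_t count() const { return in_flight; }
};
```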
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 05:26:14 +0000 (13:26 +0800)]
crimson/tools/store_nbd: drop unnecessary seastar::now()
the body of handle_exception() is a synchronous operation, so there is no
need to return seastar::now() here.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 01:35:40 +0000 (09:35 +0800)]
crimson/tools/store_nbd: better cleanup
* remove unix domain socket file when cleanup
so we don't need to remove it manually after each run.
* shutdown input and output streams when cleanup
so reactor does not watch them anymore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 02:53:34 +0000 (10:53 +0800)]
crimson/tools/store_nbd: s/socket/server_socket/
to prepare for the next commit, which will keep track of the
connected_socket as another member variable.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Amnon Hanuhov [Thu, 17 Jun 2021 21:04:40 +0000 (00:04 +0300)]
Merge pull request #41916 from AmnonHanuhov/wip-Refactor_test_messenger
test/crimson: Use crimson's make_message in test_messenger
Sage Weil [Thu, 17 Jun 2021 21:01:09 +0000 (17:01 -0400)]
ceph_test_librados_service: wait longer for servicemap to update
mon thrashing may make this take a long time
Fixes: https://tracker.ceph.com/issues/51234
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Thu, 17 Jun 2021 19:56:20 +0000 (19:56 +0000)]
mon: Sanely set the default CRUSH rule when creating pools in stretch mode
If we get a pool create request while in stretch mode that does not explicitly
specify a crush rule, look at the stretch-mode pools and their rules, and
select the most common one.
Also update set_up_stretch_mode.sh to add a few more rules that let me test
this locally.
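The selection step described above amounts to a frequency count. The following is a minimal sketch, not the monitor's code: `most_common_rule` is a hypothetical helper that takes the rule id of each stretch-mode pool, and the tie-break (lowest rule id wins) is an assumption the message does not specify.

```cpp
#include <map>
#include <vector>

// Hypothetical helper: returns the rule id used by the most pools, or -1
// when there are no pools. Ties go to the lowest rule id (an assumption).
int most_common_rule(const std::vector<int>& pool_rules) {
  std::map<int, int> counts;
  for (int rule : pool_rules) {
    ++counts[rule];
  }
  int best_rule = -1;
  int best_count = 0;
  for (const auto& entry : counts) {  // std::map iterates in ascending rule id
    if (entry.second > best_count) {
      best_rule = entry.first;
      best_count = entry.second;
    }
  }
  return best_rule;
}
```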
Fixes: https://tracker.ceph.com/issues/51270
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Amnon Hanuhov [Thu, 17 Jun 2021 12:18:57 +0000 (15:18 +0300)]
test/crimson: Use crimson's make_message in test_messenger
Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
J. Eric Ivancich [Thu, 17 Jun 2021 19:37:30 +0000 (15:37 -0400)]
Merge pull request #41905 from ivancich/wip-improve-cls-rgw-tracing
rgw: clean-up logging of function entering to make thorough and consistent
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Reviewed-by: Ali Maredia <amaredia@redhat.com>
Abutalib Aghayev [Mon, 14 Jun 2021 18:11:52 +0000 (14:11 -0400)]
os/bluestore: Revise the naming scheme for per-zone cleaning information.
Use a single letter (G) for the namespace, and use zone_num+oid as the key.
Signed-off-by: Abutalib Aghayev <agayev@psu.edu>
J. Eric Ivancich [Mon, 10 May 2021 21:36:49 +0000 (17:36 -0400)]
rgw: clean-up logging of function entering to make thorough and consistent
This provides more thorough and consistent function tracing in CLS/RGW
when logging is set to 10 or higher.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Sebastian Wagner [Thu, 17 Jun 2021 14:53:56 +0000 (16:53 +0200)]
Merge pull request #41903 from liewegas/update-rook-client
rook-client-python: update to update-june-21
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Ali Maredia [Thu, 17 Jun 2021 14:30:28 +0000 (10:30 -0400)]
Merge pull request #41835 from TRYTOBE8TME/wip-rgw-keycloak-failure-fix
qa/tasks: Keycloak failure fix
Reviewed-by: Ali Maredia <amaredia@redhat.com>
Joseph Sawaya [Wed, 16 Jun 2021 16:49:53 +0000 (12:49 -0400)]
mgr/rook: comment out osd creation functions
This commit comments out the OSD creation functions in rook_cluster.py
and module.py, since the submodule update has broken them.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
Joseph Sawaya [Tue, 15 Jun 2021 22:07:51 +0000 (18:07 -0400)]
mgr/rook: fix some mypy typing errors in rook_cluster.py
This commit fixes some errors caught by mypy in rook_cluster.py. Most of the
errors were caused by the update of the rook-client-python submodule in a previous
commit.
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
Joseph Sawaya [Tue, 15 Jun 2021 18:14:40 +0000 (14:14 -0400)]
mgr/rook: pass zone attribute to CephObjectStore CR when creating rgw
This commit passes the zone attribute to the CephObjectStore CR when
creating a RGW instance using the Rook Orchestrator backend:
`ceph orch apply rgw <rgw-name> --realm=<realm-name> --zone=<zone-name>`
Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
Xiubo Li [Thu, 17 Jun 2021 11:20:29 +0000 (19:20 +0800)]
os/memstore: make the used_bytes to atomic
Fixes: https://tracker.ceph.com/issues/51092
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Ernesto Puerta [Thu, 17 Jun 2021 10:44:57 +0000 (12:44 +0200)]
Merge pull request #41856 from rhcs-dashboard/maintenance-bug-fix
mgr/dashboard: Fix 500 error while exiting out of maintenance
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: sebastian-philipp <NOT@FOUND>
Sebastian Wagner [Thu, 17 Jun 2021 09:37:26 +0000 (11:37 +0200)]
Merge pull request #41694 from jmolmo/kcli_cephadm_doc
doc: Add kcli utilization for development environments
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 04:23:29 +0000 (12:23 +0800)]
Merge pull request #41848 from xxhdx1985126/wip-errorator-parallel_for_each
crimson: errorator parallel_for_each
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 17 Jun 2021 02:31:27 +0000 (10:31 +0800)]
Merge pull request #41894 from tchaikov/wip-crimson-sigint
crimson/{osd,store_nbd}: handle SIGINT
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Thu, 17 Jun 2021 00:13:49 +0000 (08:13 +0800)]
Merge pull request #41895 from ceph/wip-cacephcom
MIRRORS: Add ca.ceph.com
Reviewed-by: Kefu Chai <kchai@redhat.com>
Nizamudeen A [Tue, 15 Jun 2021 08:47:58 +0000 (14:17 +0530)]
mgr/dashboard: Fix 500 error while exiting out of maintenance
When you add a host in maintenance mode and then exit maintenance
mode, a 500 server error pops up, which interrupts the whole
exit-maintenance process and leaves the host in an unknown/offline state.
This happened when I was setting the status of the host through
HostSpec(). With this change, I am using the enter_maintenance API of
the orch to enable maintenance.
Fixes: https://tracker.ceph.com/issues/51218
Signed-off-by: Nizamudeen A <nia@redhat.com>
Sage Weil [Wed, 16 Jun 2021 20:11:34 +0000 (16:11 -0400)]
rook-client-python: update to update-june-21
Signed-off-by: Sage Weil <sage@newdream.net>
Patrick Donnelly [Wed, 16 Jun 2021 19:16:36 +0000 (12:16 -0700)]
qa: use centos latest for fs:upgrade
Fixes: https://tracker.ceph.com/issues/51250
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Wed, 16 Jun 2021 16:30:41 +0000 (09:30 -0700)]
mon/MDSMonitor: check fscid exists for legacy case
If a client does not have permission to see the legacy fs, the monitor
will throw an exception when looking up the mdsmap later in the code.
We need to check existence for both code paths.
Fixes: https://tracker.ceph.com/issues/51077
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Wed, 16 Jun 2021 16:30:01 +0000 (09:30 -0700)]
mon/MDSMonitor: fix whitespace in debug message
So it doesn't look like:
2021-06-16T14:37:52.953+0000 7fec41d7c700 10 mon.a@0(leader).mds e10 check_sub: is_mds=0, fscid= 1
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Igor Fedotov [Wed, 16 Jun 2021 18:12:23 +0000 (21:12 +0300)]
blk/KernelDevice: be more verbose on read errors.
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Sage Weil [Wed, 16 Jun 2021 18:01:26 +0000 (14:01 -0400)]
Merge PR #41890 into master
* refs/pull/41890/head:
doc/cephadm: removing "Octopus" from procedure
Reviewed-by: Sage Weil <sage@redhat.com>
David Galloway [Wed, 16 Jun 2021 16:30:22 +0000 (12:30 -0400)]
MIRRORS: Add ca.ceph.com
Signed-off-by: David Galloway <dgallowa@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 16:04:37 +0000 (00:04 +0800)]
crimson/osd: use stop_signal from seastar
and disable app_cfg.auto_handle_sigint_sigterm, otherwise the app template
handles SIGINT and SIGTERM by itself, and calls app.stop(). but we don't
use this machinery at all. we use seastar::defer() instead of
seastar::at_exit() for doing graceful shutdown and cleanup.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 13:59:59 +0000 (21:59 +0800)]
crimson/tools/store_nbd: update example usage in comment
--total-device-size is not supported any more.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 13:51:57 +0000 (21:51 +0800)]
crimson/tools/store_nbd: add graceful shutdown support
we could have more sophisticated machinery for interrupting a fio job,
but so far it is able to stop itself when it is idle by handling Ctrl-C.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng [Wed, 16 Jun 2021 08:00:44 +0000 (16:00 +0800)]
crimson/seastore/nbd: destruct the store before create
Otherwise the store will register the conflicting metrics and result in
double_registration.
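Why the order matters can be shown with a small standalone sketch. The types below (`MetricRegistry`, `Store`) and the metric name are hypothetical, not the seastore or Seastar ones. Assigning a freshly constructed object to a `std::unique_ptr` constructs the new object while the old one is still alive, so both try to hold the same metric name at once.

```cpp
#include <memory>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry that rejects a second registration of a name.
struct MetricRegistry {
  std::set<std::string> names;
  void add(const std::string& n) {
    if (!names.insert(n).second) {
      throw std::runtime_error("double registration: " + n);
    }
  }
  void remove(const std::string& n) { names.erase(n); }
};

// Hypothetical store that registers a metric for its whole lifetime.
struct Store {
  MetricRegistry& reg;
  explicit Store(MetricRegistry& r) : reg(r) { reg.add("store_ops"); }
  ~Store() { reg.remove("store_ops"); }
  Store(const Store&) = delete;
  Store& operator=(const Store&) = delete;
};

// New Store is built while the old one still exists: the names collide.
bool recreate_naive(std::unique_ptr<Store>& s, MetricRegistry& reg) {
  try {
    s = std::make_unique<Store>(reg);
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}

// Destroy the old Store first, then build the replacement.
bool recreate_destroy_first(std::unique_ptr<Store>& s, MetricRegistry& reg) {
  s.reset();
  try {
    s = std::make_unique<Store>(reg);
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}
```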
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Zac Dover [Wed, 16 Jun 2021 14:34:10 +0000 (00:34 +1000)]
doc/cephadm: removing "Octopus" from procedure
This PR removes "Octopus" from the curl-based installation
procedure.
After we moved on to Pacific, referring to Octopus looks wrong.
It looks wrong because it now is wrong.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Kefu Chai [Wed, 16 Jun 2021 14:09:28 +0000 (22:09 +0800)]
Merge pull request #41882 from tchaikov/wip-crimson-int-safty
crimson/osd: guard non-pg-op handling with with_sequencer()
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 12:36:32 +0000 (20:36 +0800)]
Merge pull request #41885 from tchaikov/wip-crimson-os-cleanups
crimson/os: cleanups and reformat
Reviewed-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Kefu Chai [Wed, 16 Jun 2021 10:27:07 +0000 (18:27 +0800)]
crimson/osd: reindent
for less indent, hence better readability
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 10:26:15 +0000 (18:26 +0800)]
crimson/osd: wrap line at 80
for better readability
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 10:25:55 +0000 (18:25 +0800)]
crimson/osd: guard non-pg-op handling with with_sequencer()
because we should only ensure the ordering of the requests touching
the objects, the other requests like pgls should not be ordered along
with them. so as the second step, guard the non-pg-op handling with
with_sequencer().
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 11:58:42 +0000 (19:58 +0800)]
crimson/os: use reference for loop variable
for better performance, also silences the warning like:
../src/crimson/os/seastore/random_block_manager/nvme_manager.cc:444:23: warning: loop variable ‘b’ creates a copy from type ‘const crimson::os::seastore::rbm_alloc_delta_t’ [-Wrange-loop-construct]
444 | for (const auto b : alloc_blocks) {
| ^
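The cost the warning points at can be made visible with a copy-counting type. This is an illustrative sketch: `Delta` is a hypothetical stand-in for `rbm_alloc_delta_t`, and the two functions differ only in `const auto` versus `const auto&`.

```cpp
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Delta {
  static int copies;
  int value = 0;
  Delta() = default;
  explicit Delta(int v) : value(v) {}
  Delta(const Delta& other) : value(other.value) { ++copies; }
  Delta& operator=(const Delta&) = default;
};
int Delta::copies = 0;

// Copies every element into `b` (the pattern -Wrange-loop-construct flags).
int sum_by_value(const std::vector<Delta>& deltas) {
  int total = 0;
  for (const auto b : deltas) {
    total += b.value;
  }
  return total;
}

// Binds `b` to the element in place; no copy is made.
int sum_by_reference(const std::vector<Delta>& deltas) {
  int total = 0;
  for (const auto& b : deltas) {
    total += b.value;
  }
  return total;
}
```

Both functions return the same sum; only the by-value loop increments `Delta::copies`, once per element.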
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sebastian Wagner [Wed, 16 Jun 2021 11:54:31 +0000 (13:54 +0200)]
Merge pull request #41859 from sebastian-philipp/mypy-constrains.txt
global,tox.ini: add mypy-constrains.txt
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Kefu Chai [Wed, 16 Jun 2021 11:05:53 +0000 (19:05 +0800)]
crimson/os: return seastar::now() in "finally()" block
so finally() is able to identify that the return value is a future, and discard it
manually.
otherwise the return value will be discarded even though the future is marked
[[nodiscard]], hence the C++ compiler warns.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 11:00:00 +0000 (19:00 +0800)]
crimson/os: remove unnecessary now()
the previous continuation in the chain already returns a future, no need
to hook up another now().
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 16 Jun 2021 10:38:22 +0000 (18:38 +0800)]
crimson/os: remove unnecessary parentheses
Signed-off-by: Kefu Chai <kchai@redhat.com>