Sage Weil [Tue, 18 Feb 2020 23:57:00 +0000 (17:57 -0600)]
mgr/pg_autoscaler: fix division by zero
Fixes: https://tracker.ceph.com/issues/44186
Signed-off-by: Sage Weil <sage@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 20:04:52 +0000 (12:04 -0800)]
Merge PR #33390 into master
* refs/pull/33390/head:
mds: remove dead get_commands code
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Sage Weil [Tue, 18 Feb 2020 19:54:28 +0000 (13:54 -0600)]
Merge PR #33336 into master
* refs/pull/33336/head:
osd: fix racy accesses to OSD::osdmap.
Reviewed-by: Sage Weil <sage@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 15:48:01 +0000 (07:48 -0800)]
mds: remove dead get_commands code
Left over from #31255,
d8c0bde04b88cb3ad96855bd0948ae10072c9da7.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Tue, 18 Feb 2020 15:29:11 +0000 (09:29 -0600)]
Merge PR #33369 into master
* refs/pull/33369/head:
cephadm: check for both chrony service names
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sebastian Wagner [Tue, 18 Feb 2020 14:14:03 +0000 (15:14 +0100)]
Merge pull request #33360 from liewegas/fix-prom
qa/suites/rados/cephadm/smoke: disable rgw role for now
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Tue, 18 Feb 2020 09:05:30 +0000 (10:05 +0100)]
Merge pull request #32817 from sebastian-philipp/rename-orchestrator_cli-orchestrator
pybind/mgr: Rename orchestrator_cli to orchestrator
Reviewed-by: Sage Weil <sage@redhat.com>
Kefu Chai [Tue, 18 Feb 2020 01:17:42 +0000 (09:17 +0800)]
Merge pull request #33167 from tchaikov/wip-formatter-string_view
common,mgr,osd: pass string_view as "name"
Reviewed-by: Xie Xingguo <xie.xingguo@zte.com.cn>
Patrick Donnelly [Tue, 18 Feb 2020 00:41:37 +0000 (16:41 -0800)]
Merge PR #33227 into master
* refs/pull/33227/head:
mds: remove unused CDir members
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 00:41:05 +0000 (16:41 -0800)]
Merge PR #33197 into master
* refs/pull/33197/head:
mount.ceph: fix incorrect options parsing
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 00:40:20 +0000 (16:40 -0800)]
Merge PR #33180 into master
* refs/pull/33180/head:
mds: add scrub_info_t into mempool
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 00:39:32 +0000 (16:39 -0800)]
Merge PR #33104 into master
* refs/pull/33104/head:
client: Fixes for missing consts SEEK_DATA and SEEK_HOLE on alpine linux
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 00:38:34 +0000 (16:38 -0800)]
Merge PR #33005 into master
* refs/pull/33005/head:
mds: fix 'can wrlock' check in Locker::acquire_locks()
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Tue, 18 Feb 2020 00:35:14 +0000 (16:35 -0800)]
Merge PR #32435 into master
* refs/pull/32435/head:
mds: Reorganize structure and class members in mdstypes header
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Tue, 18 Feb 2020 00:33:33 +0000 (08:33 +0800)]
Merge pull request #33344 from athanatos/sjust/wip-errorator-handlers
errorator: improve general error handlers
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Daniel Gryniewicz [Mon, 17 Feb 2020 18:11:25 +0000 (13:11 -0500)]
Merge pull request #31001 from rosinL/wip-set-radosgw-cpu-affinity
rgw/rgw_main: auto set radosgw's cpu affinity according to numa_node configuration
Sage Weil [Mon, 17 Feb 2020 03:26:03 +0000 (21:26 -0600)]
qa/suites/rados/cephadm/smoke: remove rgw
Fixes: https://tracker.ceph.com/issues/44168
Signed-off-by: Sage Weil <sage@redhat.com>
Lenz Grimmer [Mon, 17 Feb 2020 16:36:25 +0000 (16:36 +0000)]
Merge pull request #33206 from rhcs-dashboard/44075-rgw-user-system-field
mgr/dashboard: show correct RGW user 'system' info
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Kefu Chai [Mon, 17 Feb 2020 16:05:56 +0000 (00:05 +0800)]
Merge pull request #33125 from yaarith/telemetry-add-last-report-to-status
mgr/telemetry: add 'last_upload' to status
Reviewed-by: Kefu Chai <kchai@redhat.com>
Lenz Grimmer [Mon, 17 Feb 2020 15:34:11 +0000 (15:34 +0000)]
Merge pull request #33013 from s0nea/wip-dashboard-43912
mgr/dashboard: wait for PG unknown state to be cleared
Reviewed-by: Patrick Seidensal <pnawracay@suse.com>
Lenz Grimmer [Mon, 17 Feb 2020 14:48:45 +0000 (14:48 +0000)]
Merge pull request #33107 from votdev/cleanup_code
mgr/dashboard: Cleanup code
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Sage Weil [Mon, 17 Feb 2020 14:36:33 +0000 (08:36 -0600)]
cephadm: check for both chrony service names
Signed-off-by: Sage Weil <sage@redhat.com>
Daniel Gryniewicz [Mon, 17 Feb 2020 14:16:06 +0000 (09:16 -0500)]
Merge pull request #31901 from matthewoliver/rgw-admin-user-add-modify
rgw: make radosgw-admin user create and modify distinct
Kefu Chai [Mon, 17 Feb 2020 09:31:25 +0000 (17:31 +0800)]
Merge pull request #32632 from cyx1231st/rfc-seastar-test-socket-nevermove
crimson/net: configure seastar to accept on a fixed core
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sebastian Wagner [Wed, 12 Feb 2020 16:18:32 +0000 (17:18 +0100)]
mgr/dashboard: add pyyaml to requirements.txt
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:00:47 +0000 (13:00 +0100)]
doc: rename orchestrator_cli -> orchestrator
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 13:15:44 +0000 (14:15 +0100)]
doc: add prettytable to admin/doc-requirements.txt
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:03:00 +0000 (13:03 +0100)]
mon: always_on_modules: Rename orchestrator_cli to orchestrator
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:02:33 +0000 (13:02 +0100)]
debian,spec: Rename orchestrator_cli to orchestrator
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:02:04 +0000 (13:02 +0100)]
.github: Rename orchestrator_cli to orchestrator
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 13:10:24 +0000 (14:10 +0100)]
mgr/dashboard: add prettytable to requirements.txt
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:01:21 +0000 (13:01 +0100)]
mgr/dashboard: Fix doc urls to orchestrator
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Wed, 12 Feb 2020 15:21:05 +0000 (16:21 +0100)]
mgr/orchestrator: Use CLICommand, except its global variable
`CLICommand.COMMANDS` is a global variable that prevents
anyone from importing other modules, as the `COMMANDS` are then
merged together. Let's use a meta class instead of a global variable.
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 11:59:11 +0000 (12:59 +0100)]
pybind/mgr: orchestrator_cli rename: fix imports
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 11:58:21 +0000 (12:58 +0100)]
mgr/orchestrator: make parse_host_specs a classmethod
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Fri, 24 Jan 2020 12:08:02 +0000 (13:08 +0100)]
mgr/orchestrator_cli: rename to mgr/orchestrator
* Move `mgr/orchestrator.py` to `orchestrator/_interface.py`
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Kefu Chai [Mon, 17 Feb 2020 09:15:35 +0000 (17:15 +0800)]
Merge pull request #33361 from bk201/wip-44164
ceph.spec.in: fix python coverage dependency for non-rhel distros
Reviewed-by: Volker Theile <vtheile@suse.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng [Mon, 10 Feb 2020 09:00:31 +0000 (17:00 +0800)]
crimson/net: remove duplicated error codes and conditions
The duplicated error codes and conditions were originally introduced to
match connection errors with both system category (thrown by seastar)
and generic category (thrown by standard library). Since error_code
with system category can be matched by error_condition with generic
category (see std::errc and
system_error_category::default_error_condition(int)), our duplicated
counterparts are not actually needed.
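The matching rule described above can be checked with the standard library alone. The following is a minimal sketch, assuming a POSIX platform whose standard library maps system-category errno values onto the generic category; `is_connection_refused` and `demo_error_matching` are hypothetical names, not crimson code:

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// True when `ec` means "connection refused", whichever category produced it.
bool is_connection_refused(const std::error_code& ec) {
  // std::errc converts to an error_condition in the generic category; the
  // comparison asks ec's own category whether the two are equivalent.
  return ec == std::errc::connection_refused;
}

// Hypothetical self-check covering both categories.
void demo_error_matching() {
  const std::error_code from_system{ECONNREFUSED, std::system_category()};
  const std::error_code from_generic =
      std::make_error_code(std::errc::connection_refused);
  const std::error_code other{ECONNRESET, std::system_category()};
  assert(is_connection_refused(from_system));
  assert(is_connection_refused(from_generic));
  assert(!is_connection_refused(other));
}
```

A single comparison against `std::errc` therefore covers errors raised with either category, which is why a second, duplicated set of codes adds nothing.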
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Mon, 17 Feb 2020 04:38:13 +0000 (12:38 +0800)]
Merge pull request #33358 from howard0su/nvme1
os/bluestore: allocate Task on stack
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kiefer Chang [Mon, 17 Feb 2020 04:09:46 +0000 (12:09 +0800)]
ceph.spec.in: fix python coverage dependency for non-rhel distros
The coverage package under openSUSE (and other distros) is named
python{major_version}-coverage (without minor version).
Fixes: https://tracker.ceph.com/issues/44164
Signed-off-by: Kiefer Chang <kiefer.chang@suse.com>
Matthew Oliver [Fri, 10 Jan 2020 03:17:11 +0000 (03:17 +0000)]
rgw: make radosgw-admin user create and modify distinct
Currently, if you run 'radosgw-admin user create ..' when the user
already exists and you happen to specify at least '--uid' and
'--display-name' values that match the existing user, radosgw-admin will
actually go and modify the existing user.
This behaviour is a little confusing, hence the bug this patch is
fixing. This patch instead simplifies the tool to make
'create' create and 'modify' modify.
This means that when you 'create' a user that already exists, you'll get an
error, as expected. If you want to modify a user, you actually have to
use 'modify'.
For example, now:
$ radosgw-admin user create --uid="test-user" --display-name="test user"
could not create user: unable to create user, user: test-user exists
Signed-off-by: Matthew Oliver <moliver@suse.com>
Fixes: https://tracker.ceph.com/issues/38619
Sage Weil [Mon, 17 Feb 2020 03:51:27 +0000 (21:51 -0600)]
Merge PR #33342 into master
* refs/pull/33342/head:
cephadm: expose `timeout` for `is_available` check
cephadm: add `--retry` arg
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 17 Feb 2020 02:23:27 +0000 (20:23 -0600)]
qa/tasks/cephadm: fix prometheus shutdown
Signed-off-by: Sage Weil <sage@redhat.com>
Jun Su [Sun, 16 Feb 2020 14:26:44 +0000 (22:26 +0800)]
osd/bluestore: Avoid allocating Task on heap
When the I/O is synchronous, we can allocate the task object
on stack to avoid malloc calls.
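The idea can be sketched as follows. This is an illustration only: `Task`, `submit_and_wait`, `read_sync` and `read_async` are hypothetical stand-ins, not the BlueStore types or functions touched by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a per-request I/O descriptor.
struct Task {
  std::size_t len = 0;
  int return_code = -1;
};

// Hypothetical device call that completes the task before returning.
void submit_and_wait(Task& t) { t.return_code = 0; }

// Synchronous path: the caller blocks until the I/O is done, so the task
// can be an automatic variable and no allocator call is needed.
int read_sync(std::size_t len) {
  Task t;
  t.len = len;
  submit_and_wait(t);
  return t.return_code;
}

// Asynchronous path: the task must outlive this call, so it stays on the
// heap. Returning ownership stands in for handing it to a completion path.
std::unique_ptr<Task> read_async(std::size_t len) {
  auto t = std::make_unique<Task>();
  t->len = len;
  return t;
}
```

The stack variant is only safe because nothing keeps a pointer to the task after `read_sync` returns.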
Signed-off-by: Jun Su <howard0su@gmail.com>
Kefu Chai [Sun, 16 Feb 2020 17:26:03 +0000 (01:26 +0800)]
Merge pull request #33357 from tchaikov/wip-crimson-asok
crimson: clean up and refactor asok
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 11:05:09 +0000 (19:05 +0800)]
crimson/admin: no need to check for '\n'
as we don't need to mimic the behavior of classic OSD. what we need is
to fulfill the needs of ceph cli. see `admin_socket()` in
`src/pybind/ceph_daemon.py`, which sends a `\0` to indicate the end of a
command.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 10:03:24 +0000 (18:03 +0800)]
crimson/asok: disconnect client when shutdown
track the established connection as well. please note, the current asok
implementation only allows a single connection at a time, even
though unix domain socket allows multiple concurrent clients. so there
is no need to track multiple clients at this moment.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 08:40:04 +0000 (16:40 +0800)]
crimson/asok: do not assume the order of param eval
* do not assume the order of parameter evaluation, before this change,
we have `do_with(cn.input(), cn.output(), std::move(cn) ...)`, see
https://en.cppreference.com/w/cpp/language/eval_order,
> side effects of the initialization of every parameter are
> indeterminately sequenced with respect to value computations and side
> effects of any other parameter.
we cannot move `cn` out and then call its member functions. so
introduce a struct for capturing its input and output.
* move `do_until_gate()` into `start()`, no need to check if
gate is stopped in `safe_action`, as `seastar::do_until()` will do
this for us.
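The hazard in the first bullet, and one way around it, can be sketched like this. `Connection`, `Streams` and `capture` are hypothetical stand-ins rather than the Seastar or crimson types, and the struct here only illustrates the idea of capturing the input and output before the connection is handed over:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a connected socket; not a Seastar type.
struct Connection {
  std::string in = "in";
  std::string out = "out";
  std::string input() { return std::move(in); }
  std::string output() { return std::move(out); }
};

// Holds the two streams taken from a connection.
struct Streams {
  std::string input;
  std::string output;
};

// Risky pattern, shown only as a comment:
//   consume(cn.input(), cn.output(), std::move(cn));
// If consume() takes its third parameter by value, that parameter may be
// move-constructed before cn.input() or cn.output() runs, because parameter
// initializations are indeterminately sequenced.

// Safer pattern: each stream is extracted in its own full statement, so
// both calls are sequenced before `cn` is destroyed or given away.
Streams capture(Connection cn) {
  Streams s;
  s.input = cn.input();
  s.output = cn.output();
  return s;
}
```

After `capture()` returns, the caller owns both streams and no longer needs to touch the moved-from connection.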
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 08:39:26 +0000 (16:39 +0800)]
crimson: register commands separately
so we can do command registration in the same place, in future, we can
move all of them into another place if necessary
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 02:03:36 +0000 (10:03 +0800)]
crimson: refactor asok command
* do not define another iterator type, use `map::const_iterator`
directly
* do not register hooks/commands with server block, register them
one by one, much simpler this way.
* encapsulate the hook metadata in `AdminSocketHook`, so each
`AdminSocketHook` instance is self-contained in the sense that
we don't need to use an extra type for keeping track of them.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 06:12:37 +0000 (14:12 +0800)]
crimson/osd/pg_map: add 'PGMap::get_pgs() const'
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 06:11:18 +0000 (14:11 +0800)]
crimson/osd: send beacon only if active
mimic the behavior of classic osd, and this behavior does make sense.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 06:10:25 +0000 (14:10 +0800)]
crimson/osd: add OSD::dump_status()
so it can be used by asok command
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 01:00:47 +0000 (09:00 +0800)]
crimson/osd: refactor OSD::stop_asok_admin()
the comment does not apply anymore, since `admin` and `asok` are
created in the constructor.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 00:58:51 +0000 (08:58 +0800)]
cmake: s/thread/Threads::Threads/
Threads::Threads is cmake's way to represent the thread library. see
https://cmake.org/cmake/help/v3.1/module/FindThreads.html
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 16 Feb 2020 00:54:32 +0000 (08:54 +0800)]
crimson: remove unused include and forward decl
and add those used
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Sun, 16 Feb 2020 15:29:44 +0000 (09:29 -0600)]
Merge PR #33339 into master
* refs/pull/33339/head:
osd/PeeringState: require SERVER_OCTOPUS to respond to RenewLease
Reviewed-by: Sage Weil <sage@redhat.com>
Sage Weil [Sat, 15 Feb 2020 14:42:10 +0000 (08:42 -0600)]
Merge PR #33289 into master
* refs/pull/33289/head:
qa/tasks/cephadm: deploy rgw daemons too
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sage Weil [Sat, 15 Feb 2020 14:37:41 +0000 (08:37 -0600)]
Merge PR #32775 into master
* refs/pull/32775/head:
ceph.spec.in: fix python3 dependencies in centos7
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Sat, 15 Feb 2020 14:37:13 +0000 (08:37 -0600)]
Merge PR #33129 into master
* refs/pull/33129/head:
osd/PeeringState: do not start renewing leases until PG is activated
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Sage Weil [Sat, 15 Feb 2020 14:36:43 +0000 (08:36 -0600)]
Merge PR #33163 into master
* refs/pull/33163/head:
msg/Policy: limit unregistered anon connections to mon
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 14:17:47 +0000 (22:17 +0800)]
Merge pull request #33350 from rzarzynski/wip-crimson-clang-watchnotify
crimson/osd: fix the Clang build in create_watch_info().
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 14:16:50 +0000 (22:16 +0800)]
Merge pull request #33347 from tchaikov/wip-orch-addr
mgr/orchestrator: "addr" is optional for constructing InventoryNode
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Kefu Chai [Sat, 15 Feb 2020 12:48:50 +0000 (20:48 +0800)]
Merge pull request #33349 from ronen-fr/clang-5
crimson/osd: remove unneeded captures - pg.cc
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Sat, 15 Feb 2020 12:20:47 +0000 (13:20 +0100)]
crimson/osd: fix the Clang build in create_watch_info().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 10:46:51 +0000 (18:46 +0800)]
Merge pull request #33325 from tchaikov/wip-super-setup
qa/tasks: call super class's setUp()
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 10:43:32 +0000 (18:43 +0800)]
Merge pull request #33348 from tchaikov/wip-rbd-mirror-test
mgr/dashboard: s/fsid/mirror_uuid/
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 03:36:08 +0000 (11:36 +0800)]
mgr/orchestrator: "addr" is optional for constructing InventoryNode
this addresses a regression introduced by
5276871e15
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 10:41:16 +0000 (18:41 +0800)]
qa/tasks/mgr/orch: s/service ls/ps/
to fix the broken test of "test_load_data". it's a regression introduced
by
aacc9a650f052fd5be543e9265ec94833b8e8bb3
Signed-off-by: Kefu Chai <kchai@redhat.com>
Ronen Friedman [Sat, 15 Feb 2020 10:37:28 +0000 (12:37 +0200)]
crimson/osd: remove unneeded captures - pg.cc
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 04:55:39 +0000 (12:55 +0800)]
Merge pull request #33243 from tchaikov/wip-43795
ceph_argparse: put args from env before existing ones
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Fri, 14 Feb 2020 12:31:25 +0000 (20:31 +0800)]
qa/tasks: call super class's setUp()
to address the regression introduced by
87292811215f6ded9a784d3216a910faaef648e2
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 04:29:07 +0000 (12:29 +0800)]
mgr/dashboard: s/fsid/mirror_uuid/
to fix the broken rbd-mirror test. it's a regression introduced by
7b07e3c9dcf1eda325fc4fe7960765c019243076
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 15 Feb 2020 03:49:59 +0000 (11:49 +0800)]
Merge pull request #33345 from athanatos/sjust/wip-fix-crimson-build
Crimson build fixes
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Sat, 15 Feb 2020 01:52:57 +0000 (19:52 -0600)]
Merge PR #33343 into master
* refs/pull/33343/head:
qa/suites/rados/cephadm/upgrade: add simple upgrade test
qa/tasks/cephadm: improve shell subcommand
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sage Weil [Thu, 13 Feb 2020 19:47:23 +0000 (13:47 -0600)]
qa/tasks/cephadm: deploy rgw daemons too
Signed-off-by: Sage Weil <sage@redhat.com>
Samuel Just [Fri, 17 Jan 2020 21:04:30 +0000 (13:04 -0800)]
crimson/common/errorator: restrict all_same_way to valid types
As with pass_further/discard_all, we don't want the returned handler
to work on types outside of the errorator at all. Otherwise, the
handler will transparently apply to any error.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 17 Jan 2020 20:56:46 +0000 (12:56 -0800)]
common/crimson/errorator: add universal pass_further_all, discard_all, all_same_way
In many cases, we simply want to add catch-all handling.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 17 Jan 2020 20:33:42 +0000 (12:33 -0800)]
crimson/common/errorator: fix errorator::pass_further and discard_all
Previously, both of these were invocable on all errors, but would
static_assert on invalid ones. What we actually want is for them
to only be invocable on valid errors. That way, we can do, for
instance:
}).handle_error(
roll_journal_segment_ertr::pass_further{},
SegmentManager::open_ertr::all_same_way([this](auto &&e) {
logger().error(
"error {} in close segment {}",
e,
current_journal_segment_id);
ceph_assert(0 == "error in close");
return;
})
to explicitly propagate any errors in roll_journal_segment_ertr
while asserting on anything else.
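The difference between "invocable everywhere but asserting" and "invocable only on valid errors" can be illustrated with plain C++17 type traits. This is a sketch under stated assumptions: `enoent`, `eio`, `asserting_handler` and `constrained_handler` are hypothetical stand-ins and do not reproduce the errorator implementation.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical error tags standing in for errorator error types.
struct enoent {};
struct eio {};

// Accepts any argument in overload resolution and rejects invalid ones only
// when the body is instantiated, so traits see it as invocable on anything.
struct asserting_handler {
  template <class E>
  void operator()(E&&) const {
    static_assert(std::is_same_v<std::decay_t<E>, enoent>,
                  "handler used on an error it does not cover");
  }
};

// Takes part in overload resolution only for the error it covers, so a
// dispatcher can fall through to the next handler for everything else.
struct constrained_handler {
  template <class E,
            std::enable_if_t<std::is_same_v<std::decay_t<E>, enoent>, int> = 0>
  void operator()(E&&) const {}
};

// The unconstrained handler claims to handle eio although calling it would
// fail to compile; the constrained one reports the truth.
static_assert(std::is_invocable_v<asserting_handler, eio>);
static_assert(std::is_invocable_v<constrained_handler, enoent>);
static_assert(!std::is_invocable_v<constrained_handler, eio>);
```

With the constrained form, a list of handlers can be probed with `std::is_invocable_v` to pick the first one that applies, which is what makes chaining a specific handler with a catch-all workable.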
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 17 Jan 2020 20:31:02 +0000 (12:31 -0800)]
crimson/common/errorator: add pass_further and discard to unthrowable_wrapper
This lets us use, for instance, ct_error::enoent::pass_further{} to
explicitly propagate enoent.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Sat, 15 Feb 2020 00:35:29 +0000 (16:35 -0800)]
test_alien_echo: convert Condition to use readable_eventfd
Should have been included in
5f05a50bae8bb4889dba0d249ed5fc3a2fcdcfa5.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Sat, 15 Feb 2020 00:34:36 +0000 (16:34 -0800)]
crimson/CMakeLists.txt: link pthread to libcrimson for setting thread affinity
Signed-off-by: Samuel Just <sjust@redhat.com>
Michael Fritch [Wed, 12 Feb 2020 17:39:27 +0000 (10:39 -0700)]
cephadm: expose `timeout` for `is_available` check
allow for the `--timeout` arg to override the default 30sec timeout
while waiting for a service to become available
Signed-off-by: Michael Fritch <mfritch@suse.com>
Michael Fritch [Wed, 12 Feb 2020 17:14:21 +0000 (10:14 -0700)]
cephadm: add `--retry` arg
enables overriding the default number of retries when waiting for a
service to become available
Signed-off-by: Michael Fritch <mfritch@suse.com>
Sage Weil [Fri, 14 Feb 2020 21:26:35 +0000 (21:26 +0000)]
qa/suites/rados/cephadm/upgrade: add simple upgrade test
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Fri, 14 Feb 2020 21:10:36 +0000 (21:10 +0000)]
qa/tasks/cephadm: improve shell subcommand
- pass teuth job params through environment
- run commands via bash -c
Signed-off-by: Sage Weil <sage@redhat.com>
Neha [Fri, 14 Feb 2020 19:09:14 +0000 (19:09 +0000)]
osd/PeeringState: require SERVER_OCTOPUS to respond to RenewLease
To avoid sending pg_lease to pre-octopus OSDs during upgrades.
Fixes: https://tracker.ceph.com/issues/44156
Signed-off-by: Neha Ojha <nojha@redhat.com>
Sage Weil [Fri, 14 Feb 2020 18:52:26 +0000 (12:52 -0600)]
Merge PR #33073 into master
* refs/pull/33073/head:
qa/suites/rados/cephadm: deploy prometheus.a
mgr/cephadm: implement prometheus add/update
mgr/cephadm: teach _create_daemon how to provision prometheus
mgr/orch: add prom hooks
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Jan Fajerski [Fri, 14 Feb 2020 16:13:00 +0000 (17:13 +0100)]
Merge pull request #33332 from jan--f/c-v-filestore-zap-fix
ceph-volume: don't remove vg twice when zapping filestore
Radoslaw Zarzynski [Mon, 10 Feb 2020 20:40:06 +0000 (21:40 +0100)]
osd: fix racy accesses to OSD::osdmap.
According to cppreference.com [1]:
"If multiple threads of execution access the same std::shared_ptr
object without synchronization and any of those accesses uses
a non-const member function of shared_ptr then a data race will
occur (...)"
[1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic
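One conventional way to satisfy that rule is to funnel every access to the shared pointer through a small synchronized accessor pair. The following is a minimal sketch, not necessarily the approach taken by this commit; `Map` and `MapHolder` are hypothetical names standing in for the real types:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the shared, immutable map type.
struct Map {
  explicit Map(unsigned e) : epoch(e) {}
  unsigned epoch;
};

class MapHolder {
  mutable std::mutex lock_;
  std::shared_ptr<const Map> map_;

public:
  // Readers copy the pointer under the lock and then use their own copy,
  // so they never touch the shared shared_ptr object while it is written.
  std::shared_ptr<const Map> get() const {
    std::lock_guard<std::mutex> g(lock_);
    return map_;
  }

  // The writer swaps under the same lock. The previous map ends up in
  // `next`, which is destroyed after the guard, so a potentially expensive
  // destructor does not run while the lock is held.
  void set(std::shared_ptr<const Map> next) {
    std::lock_guard<std::mutex> g(lock_);
    map_.swap(next);
  }
};
```

Each thread then works on the copy returned by `get()`, and the reference count in the control block is only ever adjusted through distinct `shared_ptr` objects, which is safe.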
One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap`
with healthy looking content but damaged control block:
```
[Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))]
(gdb) bt
#0 0x0000559cb81c3ea0 in ?? ()
#1 0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
#2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
#3 0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
#4 std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167
#5 std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103
#6 OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053
#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
#8 0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96
#9 0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342
#10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677
#11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311
#12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706
#13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0
#14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()
(gdb) frame 7
#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...)
at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665
9665 in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc
(gdb) print osdmap
$24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000}
(gdb) print *osdmap
# pretty sane OSDMap
(gdb) print sizeof(osdmap)
$26 = 16
(gdb) x/2a &osdmap
0x559ca22acef0: 0x559cba028000 0x559cba0ec900
(gdb) frame 2
#2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148
148 /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory.
(gdb) disassemble
Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release():
...
0x0000559c97675b1e <+62>: mov (%rdi),%rax
0x0000559c97675b21 <+65>: mov %rdi,%rbx
0x0000559c97675b24 <+68>: callq *0x10(%rax)
=> 0x0000559c97675b27 <+71>: test %rbp,%rbp
...
End of assembler dump.
(gdb) info registers rdi rbx rax
rdi 0x559cba0ec900
94131624790272
rbx 0x559cba0ec900
94131624790272
rax 0x559cba0ec8a0
94131624790176
(gdb) x/a 0x559cba0ec8a0 + 0x10
0x559cba0ec8b0: 0x559cb81c3ea0
(gdb) bt
#0 0x0000559cb81c3ea0 in ?? ()
...
(gdb) p $_siginfo._sifields._sigfault.si_addr
$27 = (void *) 0x559cb81c3ea0
```
Helgrind seems to agree:
```
==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread #90
==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
==00:00:02:54.519 510301== at 0x7218DD: operator= (shared_ptr_base.h:1078)
==00:00:02:54.519 510301== by 0x7218DD: operator= (shared_ptr.h:103)
==00:00:02:54.519 510301== by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
==00:00:02:54.519 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
==00:00:02:54.519 510301== by 0x72A06C: Context::complete(int) (Context.h:77)
==00:00:02:54.519 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
==00:00:02:54.519 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
==00:00:02:54.519 510301==
==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread #117
==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0
==00:00:02:54.519 510301== at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165)
==00:00:02:54.519 510301== by 0x6B5842: shared_ptr (shared_ptr.h:129)
==00:00:02:54.519 510301== by 0x6B5842: get_osdmap (OSD.h:1700)
==00:00:02:54.519 510301== by 0x6B5842: OSD::create_context() (OSD.cc:9053)
==00:00:02:54.519 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
==00:00:02:54.519 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
==00:00:02:54.519 510301== by 0x70E62E: run (OpSchedulerItem.h:148)
==00:00:02:54.519 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
==00:00:02:54.519 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
==00:00:02:54.519 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
==00:00:02:54.519 510301== Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd
==00:00:02:54.519 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
==00:00:02:54.519 510301== by 0x66F766: main (ceph_osd.cc:688)
==00:00:02:54.519 510301== Block was alloc'd by thread #1
```
There are plenty of similar issues reported, such as:
```
==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread #119
==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0
==00:00:05:04.903 510301== at 0x753165: clear (hashtable.h:2051)
==00:00:05:04.903 510301== by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102)
==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100)
==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155)
==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728)
==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167)
==00:00:05:04.903 510301== by 0x6B58A9: ~shared_ptr (shared_ptr.h:103)
==00:00:05:04.903 510301== by 0x6B58A9: OSD::create_context() (OSD.cc:9053)
==00:00:05:04.903 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665)
==00:00:05:04.903 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701)
==00:00:05:04.903 510301== by 0x70E62E: run (OpSchedulerItem.h:148)
==00:00:05:04.903 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677)
==00:00:05:04.903 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311)
==00:00:05:04.903 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706)
==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
==00:00:05:04.903 510301==
==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread #90
==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8
==00:00:05:04.903 510301== at 0x7531E1: clear (hashtable.h:2054)
==00:00:05:04.903 510301== by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369)
==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102)
==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350)
==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100)
==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471)
==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155)
==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148)
==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:747)
==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:1078)
==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr.h:103)
==00:00:05:04.903 510301== by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116)
==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77)
==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
==00:00:05:04.903 510301== Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd
==00:00:05:04.903 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433)
==00:00:05:04.903 510301== by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606)
==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:699)
==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:1732)
==00:00:05:04.903 510301== by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076)
==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678)
==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77)
==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66)
==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389)
==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so)
==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so)
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Jan Fajerski [Fri, 14 Feb 2020 14:55:13 +0000 (15:55 +0100)]
Merge pull request #33320 from jan--f/c-v-fix-filestore-journal-size
ceph-volume: pass journal_size as Size not string
Jan Fajerski [Fri, 14 Feb 2020 13:10:36 +0000 (14:10 +0100)]
ceph-volume: don't remove vg twice when zapping filestore
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Fixes: https://tracker.ceph.com/issues/44149
Kefu Chai [Fri, 14 Feb 2020 13:46:26 +0000 (21:46 +0800)]
Merge pull request #32679 from rzarzynski/wip-crimson-watchnotify_part1
crimson: add support for watch / notify, part 1
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 14 Feb 2020 13:44:01 +0000 (21:44 +0800)]
Merge pull request #33296 from tchaikov/wip-crimson-cmake
cmake: compile crimson-auth with crimson::cflags
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Mykola Golub [Fri, 14 Feb 2020 13:38:32 +0000 (15:38 +0200)]
Merge pull request #33097 from dillaman/wip-43933
librbd: tweak deep-copy to avoid creating last snapshot until sync is complete
Reviewed-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Fri, 14 Feb 2020 12:22:02 +0000 (20:22 +0800)]
Merge pull request #33298 from sebastian-philipp/debian-rook-jsonpatch
debian: add python3-jsonpatch as dependency
Reviewed-by: Kefu Chai <kchai@redhat.com>
Jan Fajerski [Fri, 14 Feb 2020 11:50:47 +0000 (12:50 +0100)]
ceph-volume: pass journal_size as Size not string
Fixes: https://tracker.ceph.com/issues/44148
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Kefu Chai [Fri, 14 Feb 2020 11:32:20 +0000 (19:32 +0800)]
Merge pull request #32174 from ronen-fr/admin_commands_3
common,crimson: supporting admin-socket commands
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sebastian Wagner [Fri, 14 Feb 2020 08:43:32 +0000 (09:43 +0100)]
debian: add python3-jsonpatch as dependency
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Jan Fajerski [Fri, 14 Feb 2020 07:46:27 +0000 (08:46 +0100)]
Merge pull request #33283 from jan--f/c-v-zap-on-vg-without-lv
ceph-volume: avoid calling zap_lv with a LV-less VG