git.apps.os.sepia.ceph.com Git - ceph.git/log

Radoslaw Zarzynski [Wed, 12 May 2021 16:02:29 +0000 (16:02 +0000)]

crimson/osd: unify the interruption handling between {Internal,}ClientRequest.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 12 May 2021 14:29:25 +0000 (14:29 +0000)]

crimson/osd: share do_recover_missing() between {Internal,}ClientRequest.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 12 May 2021 13:38:32 +0000 (13:38 +0000)]

crimson/osd: ClientRequest::do_recover_missing doesn't depend on OSD anymore.

This commit enables unifying the recovery of missing objects between
`ClientRequest` and `InternalClientRequest`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 7 Apr 2021 11:41:39 +0000 (11:41 +0000)]

crimson/osd: sending EVENT_DISCONNECT becomes implementation detail of Watch.

In contrast to ceph-osd, crimson sends CEPH_WATCH_EVENT_DISCONNECT directly
from the timeout handler and after CEPH_WATCH_EVENT_NOTIFY_COMPLETE.
This simplifies the Watch::remove() interface as callers are no longer
obliged to decide whether EVENT_DISCONNECT needs to be sent or not -- it
becomes an implementation detail of Watch.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Mon, 15 Mar 2021 11:59:54 +0000 (11:59 +0000)]

crimson/osd: wire up handling of watch timeouts.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Mon, 15 Mar 2021 11:54:22 +0000 (11:54 +0000)]

crimson/osd: s/do_timeout/do_notify_timeout/ per the upcoming do_watch_timeout().

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 18 Mar 2021 09:49:39 +0000 (09:49 +0000)]

crimson/osd: introduce the InternalClientRequest infrastructure.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 31 Mar 2021 17:47:00 +0000 (17:47 +0000)]

crimson/osd: PG::with_locked_obc() doesn't depend on MOSDOp anymore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 31 Mar 2021 17:40:26 +0000 (17:40 +0000)]

crimson/osd: PG::get_oid() doesn't depend on MOSDOp anymore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Wed, 31 Mar 2021 15:28:54 +0000 (15:28 +0000)]

osd: introduce OpInfo filling from a vector of OSDOps.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 30 Mar 2021 18:38:47 +0000 (18:38 +0000)]

crimson/osd: expose the non-MOSDOp-taking variant of do_osd_ops() externally.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 21:39:58 +0000 (21:39 +0000)]

crimson/osd: PG::do_osd_ops_execute() doesn't depend on MOSDOp anymore.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 21:27:16 +0000 (21:27 +0000)]

crimson/osd: pass std::vector<OSDOp>& to PG::submit_transaction().

This will shortly allow us to get rid of the dependency on
`MOSDOp` on all paths of `PG::do_osd_ops_execute()`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 21:21:02 +0000 (21:21 +0000)]

crimson/osd: PG::do_osd_ops_execute() doesn't directly take ObjectContextRef.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 20:45:17 +0000 (20:45 +0000)]

crimson/osd: PG::repair_object() doesn't depend on MOSDOp anymore.

Before this commit the method was depending on `MOSDOp::get_min_epoch()`
to start an `UrgentRecovery`. However, it seems `PG::get_osdmap_epoch()`
would be sufficient here as the very early stages of the processing
in `ClientRequest` ensure the PG fits the `get_min_epoch()` requirement.

In the classical OSD the counterpart code looks like below:

```
int PrimaryLogPG::rep_repair_primary_object(const hobject_t& soid, OpContext *ctx)
{
  // ...
  queue_peering_event(
      PGPeeringEventRef(
        std::make_shared<PGPeeringEvent>(
        get_osdmap_epoch(),
        get_osdmap_epoch(),
        PeeringState::DoRecovery())));

  return -EAGAIN;
}
```

In addition to the dependency minimalisation, the commit reformats
the code around `PG::repair_object()` to fit our guidelines.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 16:51:22 +0000 (16:51 +0000)]

crimson/osd: reload obc also when handling ct_error::object_corrupted.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 23 Mar 2021 13:26:53 +0000 (13:26 +0000)]

crimson/osd: introduce RollbackOrchestrator to OpsExecuter.

If the execution of an `OSDOp` fails, we're left with potentially
altered `ObjectContext`. We deal with that by reloading `obc` if
there was any modification to it. To figure this out, `has_seen_write()`
on `OpsExecuter` is being called. Unfortunately, the current implementation
has the following drawbacks:

* `has_seen_write()` can be called after `std::move(ox).flush_...()`
    which is very inelegant;
* it requires capturing both `ObjectContext` and `OpsExecuter` while
   the latter already references the former;
* there is no explicitly given reason in the header for justifying
   the presence of `has_seen_write()`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Mon, 29 Mar 2021 17:03:19 +0000 (17:03 +0000)]

crimson/osd: split PG::do_osd_ops() to facilitate InternalClientRequest.

This commit introduces `PG::do_osd_ops_execute()`, a subset of
`PG::do_osd_ops()`; it handles the ops execution through
`OpsExecuter` and the `submit_transaction()` but it stays
independent from `MOSDOp` and `MOSDOpReply`. This trait
facilitates the `InternalClientRequest`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 18 Mar 2021 13:17:35 +0000 (13:17 +0000)]

crimson/osd: erase the message type in OpsExecuter.

The reason is unification of infrastructure between external
client requests (everything represented by the `ClientRequest`)
and internal requests.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 18 Mar 2021 09:54:40 +0000 (09:54 +0000)]

crimson/osd: drop namespace for arg in PG::with_locked_obc().

It's unnecessary.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 18 Mar 2021 09:41:55 +0000 (09:41 +0000)]

crimson/osd: split ClientRequest::PGPipeline into CommonPGPipeline.

This is another step towards the `InternalClientRequest`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 18 Mar 2021 09:34:54 +0000 (09:34 +0000)]

crimson/osd: the ClientRequest::do_recover_missing() takes oid externally.

This refactor is a first step towards sharing the recovery bits
with `InternalClientRequest`.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Mon, 29 Mar 2021 16:01:47 +0000 (16:01 +0000)]

crimson/osd: implement ObjectContext relocking.

This commit introduces an `ObjectContext`-taking variant of
`PG::with_locked_obc()`. The upcoming internal counterpart
for the `ClientRequest` is the intended audience.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Mon, 15 Mar 2021 19:56:43 +0000 (19:56 +0000)]

crimson/osd: ObjectContext allows the hobject_t to be std::moved in ctor.

Taken with "crimson/osd: use obc->get_oid() instead of passing
hobject_t around" and enriched with the move-constructing down
the `ObjectState` path, this should allow saving some work on
e.g. `std::string` instances that are part of the `hobject_t`.
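
The saving comes from the sink-parameter idiom: take the object by value and `std::move` it into the member. What follows is only a minimal sketch of that idiom; `ObjectId`, `Context` and `make_context()` are hypothetical stand-ins and not the real `hobject_t` or `ObjectContext`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for hobject_t: it owns std::string members,
// so copying it may allocate while moving it only steals the buffers.
struct ObjectId {
  std::string name;
  std::string nspace;
};

// Hypothetical stand-in for ObjectContext.
class Context {
public:
  // Take the id by value and move it into the member: callers passing an
  // rvalue pay for moves only, callers passing an lvalue pay one copy.
  explicit Context(ObjectId oid) : oid_(std::move(oid)) {}

  const ObjectId& get_oid() const { return oid_; }

private:
  ObjectId oid_;
};

// Builds a Context from a temporary id; once the ObjectId exists, its
// strings are moved down the constructor path rather than copied.
inline Context make_context(std::string name) {
  ObjectId oid{std::move(name), "default"};
  return Context{std::move(oid)};
}
```

Reading the id back through `get_oid()` instead of passing a second copy around keeps a single owner of the strings.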

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Tue, 9 Mar 2021 16:18:21 +0000 (16:18 +0000)]

crimson/osd: OpsExecuter retrieves PG when doing op effects.

This will be necessary to spawn the upcoming `InternalClientRequest`
from the `Watch`'s timeout handler.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Patrick Donnelly [Mon, 10 May 2021 14:43:43 +0000 (07:43 -0700)]

Merge PR #41128 into master

* refs/pull/41128/head:
qa/crontab: reduce frequency of pacific nightlies

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

commit | commitdiff | tree

J. Eric Ivancich [Mon, 10 May 2021 14:32:39 +0000 (10:32 -0400)]

Merge pull request #40563 from BryceCao/wip_add_check_for_sync_url

rgw : add check empty for sync url

Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

J. Eric Ivancich [Mon, 10 May 2021 14:32:11 +0000 (10:32 -0400)]

Merge pull request #38729 from rosinL/fix-rgw-file-read

rgw/rgw_file: Fix the return value of read() and readlink()

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

commit | commitdiff | tree

J. Eric Ivancich [Mon, 10 May 2021 14:31:42 +0000 (10:31 -0400)]

Merge pull request #36305 from ivancich/wip-ordered-list-map-efficiency

rgw: ordered list map efficiency

Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Daniel Gryniewicz [Mon, 10 May 2021 14:15:57 +0000 (10:15 -0400)]

Merge pull request #41108 from dang/wip-dang-zipper-link

RGW Zipper - Remove link/unlink from API

commit | commitdiff | tree

Kefu Chai [Mon, 10 May 2021 13:26:17 +0000 (21:26 +0800)]

Merge pull request #37720 from ifed01/wip-ifed-alloc-tool-fixes

os/bluestore: some minor fixes/improvements for allocator's stats inquiries

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

commit | commitdiff | tree

Amnon Hanuhov [Mon, 10 May 2021 10:37:29 +0000 (13:37 +0300)]

Merge pull request #40931 from AmnonHanuhov/wip-refactor_conn_send

crimson/net: Refactor conn::send()

commit | commitdiff | tree

Ernesto Puerta [Mon, 10 May 2021 08:28:41 +0000 (10:28 +0200)]

Merge pull request #41218 from rhcs-dashboard/revert-base-href

mgr/dashboard: fix base-href: revert it to previous approach

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

Ilya Dryomov [Sun, 9 May 2021 19:49:53 +0000 (21:49 +0200)]

Merge pull request #41185 from idryomov/wip-rbd-pwl-reopen

librbd/cache/pwl: fix parsing of cache_type in create_image_cache_state()

Reviewed-by: Mahati Chamarthy <mahati.chamarthy@intel.com>
Reviewed-by: Yin Congmin <congmin.yin@intel.com>

commit | commitdiff | tree

J. Eric Ivancich [Sat, 8 May 2021 14:55:37 +0000 (10:55 -0400)]

Merge pull request #41141 from ivancich/wip-listing-initial-marker

rgw: fix bucket object listing when marker matches prefix

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

commit | commitdiff | tree

J. Eric Ivancich [Sat, 8 May 2021 14:54:58 +0000 (10:54 -0400)]

Merge pull request #41140 from ivancich/wip-bucket-purge-paging

rgw: radosgw_admin remove bucket not purging past 1,000 objects

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

J. Eric Ivancich [Sat, 8 May 2021 14:54:08 +0000 (10:54 -0400)]

Merge pull request #40886 from pritha-srivastava/wip-rgw-mfa-pin-check

rgw: fix for mfa resync crash when supplied with only one totp_pin.

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 13:17:19 +0000 (21:17 +0800)]

Merge pull request #41166 from tchaikov/wip-cmake-cython-cflags

cmake: remove cflags from CC

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 12:00:37 +0000 (20:00 +0800)]

Merge pull request #41234 from tchaikov/wip-crimson-common

crimson/common: use string_view when appropriate

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 08:43:55 +0000 (16:43 +0800)]

crimson/common: use string_view when appropriate

the typical use case of get_val() passes a string literal as the key.
in that case, there is no need to create a std::string, as
md_config_t::get_val() always accepts a string_view as the option name.
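
a minimal sketch of why a string_view parameter avoids the temporary, assuming a hypothetical `OptionTable` and a free function `get_val()` that are not the real config API:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical option table; std::less<> enables heterogeneous lookup,
// so find() can take a std::string_view without building a std::string.
using OptionTable = std::map<std::string, int, std::less<>>;

// Accepting std::string_view means that a call with a string literal only
// wraps the literal's pointer and length; no std::string is created.
// Returns -1 when the key is absent (an arbitrary choice for this sketch).
inline int get_val(const OptionTable& table, std::string_view key) {
  auto it = table.find(key);
  return it == table.end() ? -1 : it->second;
}
```

a caller would write `get_val(table, "example_option")`; with a `const std::string&` parameter the same call would first have to construct a temporary string.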

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 08:36:51 +0000 (16:36 +0800)]

Merge pull request #41080 from t-msn/readdir-fix2

os/FileStore: fix to handle readdir error correctly

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 08:31:52 +0000 (16:31 +0800)]

Merge pull request #40993 from neha-ojha/wip-50466

osd/PG.cc: handle removal of pgmeta object

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 08:29:37 +0000 (16:29 +0800)]

Merge pull request #41143 from idryomov/wip-posix-memalign-fix

common/buffer: adjust align before calling posix_memalign()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sat, 8 May 2021 08:29:00 +0000 (16:29 +0800)]

Merge pull request #41155 from rzarzynski/wip-global-backtrace-bug-50653

log: fix the formatting when dumping thread IDs.

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 15:52:44 +0000 (23:52 +0800)]

Merge pull request #41220 from rzarzynski/wip-crimson-monc-honor-cancel

crimson/monc: honor auth_result_t::canceled as the result of do_auth().

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 15:39:32 +0000 (23:39 +0800)]

Merge pull request #41222 from tchaikov/wip-crimson-cleanups

crimson/os: cleanups

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Xuehan Xu <xxhdx1985126@gmail.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 15:38:32 +0000 (23:38 +0800)]

Merge pull request #41223 from rzarzynski/wip-crimson-alienstore-sighup

crimson/alienstore: block SIGHUP to coexist with Seastar's signal handling

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Fri, 7 May 2021 15:33:47 +0000 (17:33 +0200)]

Merge pull request #41177 from dsavineau/cv_remove_legacy_release_check

ceph-volume: remove legacy release check

commit | commitdiff | tree

Guillaume Abrioux [Fri, 7 May 2021 15:31:55 +0000 (17:31 +0200)]

Merge pull request #41178 from dsavineau/cv_tox_py3

ceph-volume: remove duplicate py3 env

commit | commitdiff | tree

Neha Ojha [Fri, 7 May 2021 15:08:39 +0000 (08:08 -0700)]

Merge pull request #40016 from neha-ojha/wip-default-mclock

use mclock_scheduler as the default scheduler

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Sunny Kumar <sunkumar@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Fri, 7 May 2021 00:08:19 +0000 (00:08 +0000)]

crimson/monc: honor auth_result_t::canceled as the result of do_auth().

An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third, treating cancellation like a failure. This was the root
cause of the following failure at Sepia:

```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN  2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO  2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO  2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
0# 0x00005618C973932F in ceph-osd
1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
4# gsignal in /lib64/libc.so.6
5# abort in /lib64/libc.so.6
6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
8# 0x00005618C9FF295C in ceph-osd
9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
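
The three-way outcome can be sketched as follows. This is an illustration only: the enum `auth_result` and the function `describe()` are hypothetical names, and what the real callers do on cancellation is not shown in this log.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the three outcomes named in the commit message.
enum class auth_result { success, failure, canceled };

// Hypothetical caller-side decision: cancellation gets its own branch
// instead of being folded into the failure path.
inline std::string describe(auth_result r) {
  switch (r) {
  case auth_result::success:
    return "authenticated";
  case auth_result::canceled:
    return "canceled, not an error";
  case auth_result::failure:
    return "negotiation failure";
  }
  return "unknown";
}
```

Listing every enumerator in the `switch` without a `default` label lets the compiler warn if a fourth state is added later and left unhandled.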

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Radoslaw Zarzynski [Thu, 6 May 2021 17:21:28 +0000 (17:21 +0000)]

crimson/alienstore: block SIGHUP to coexist with Seastar's signal handling.

In `crimson/osd/main.cc` we instruct Seastar to handle `SIGHUP`.

```
        // just ignore SIGHUP, we don't reread settings
        seastar::engine().handle_signal(SIGHUP, [] {})
```

This happens through Seastar's signal handling infrastructure,
which is incompatible with the alien world.

```
void
reactor::signals::handle_signal(int signo, noncopyable_function<void ()>&& handler) {
    // ...
    struct sigaction sa;
    sa.sa_sigaction = [](int sig, siginfo_t *info, void *p) {
        engine()._backend->signal_received(sig, info, p);
    };
    // ...
}
```

```
extern __thread reactor* local_engine;
extern __thread size_t task_quota;

inline reactor& engine() {
    return *local_engine;
}
```

The low-level signal handler above assumes `local_engine._backend`
is not null, which stays true only for threads from the S*'s world.
Unfortunately, as we don't block `SIGHUP` for alien threads, the
kernel is perfectly entitled to pick one of them to run the handler,
leading to weird-looking segfaults like this one:

```
INFO  2021-04-23 07:06:57,807 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:58,753 [shard 0] ms - [osd.1(client) v2:172.21.15.100:6802/30478@51064 >> mgr.4105 v2:172.21.15.109:6800/29891] --> #7 === pg_stats(0 pgs seq 55834574872 v 0) v2 (87)
...
INFO  2021-04-23 07:06:58,813 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,753 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:06:59,753 [shard 0] osd - asok response length: 2947
INFO  2021-04-23 07:06:59,817 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,865 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:06:59,866 [shard 0] osd - asok response length: 2947
DEBUG 2021-04-23 07:07:00,020 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:07:00,020 [shard 0] osd - asok response length: 2947
INFO  2021-04-23 07:07:00,820 [shard 0] bluestore - stat
...
Backtrace:
0# 0x00005600CD0D6AAF in ceph-osd
1# FatalSignal::signaled(int) in ceph-osd
2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
3# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
4# 0x00005600CD830B81 in ceph-osd
5# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
6# pthread_cond_timedwait in /lib64/libpthread.so.0
7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
8# 0x00007F5877999BA3 in /lib64/libstdc++.so.6
9# 0x00007F5877C7414A in /lib64/libpthread.so.0
10# clone in /lib64/libc.so.6
daemon-helper: command crashed with signal 11
```

Ultimately, it turned out the thread came out from a syscall (`futex`)
and started crunching the `SIGHUP` handler's code in which a nullptr
dereference happened.

This patch blocks `SIGHUP` for all threads spawned by `AlienStore`.
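
Blocking a signal for one thread is done with POSIX `pthread_sigmask()`. The sketch below shows the general technique only; the helper names are hypothetical and this is not the code from the patch.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Blocks SIGHUP for the calling thread only.
// pthread_sigmask() returns 0 on success.
inline bool block_sighup_in_this_thread() {
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGHUP);
  return pthread_sigmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Reports whether SIGHUP is blocked in the calling thread's signal mask.
inline bool sighup_is_blocked() {
  sigset_t current;
  sigemptyset(&current);
  // With a null new-set argument, pthread_sigmask() only queries the mask.
  pthread_sigmask(SIG_BLOCK, nullptr, &current);
  return sigismember(&current, SIGHUP) == 1;
}
```

A non-reactor worker thread would call `block_sighup_in_this_thread()` at the top of its entry function, so the kernel has to deliver `SIGHUP` to some thread that has not blocked it.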

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 13:39:36 +0000 (21:39 +0800)]

crimson/os: use this explicitly

to silence the warning from clang. it fails to figure out that this is
actually used, and complains that this is captured but not used.

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 13:36:48 +0000 (21:36 +0800)]

crimson/os/seastore: use map::merge() to merge maps

C++17's std::map allows us to merge two maps, and in this case, we can
even consume `child_result`. so map::merge() is used instead of insert()
in the hope of avoiding the memcpy and allocation of pair<> nodes.
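
a minimal sketch of the node-splicing behaviour of `std::map::merge()`; the alias `Result` and the function `absorb()` are hypothetical and only illustrate the standard library call:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Result = std::map<std::string, int>;

// Splices every node of `child` whose key is absent from `parent` into
// `parent`. Nodes are relinked rather than copied, so no pair<> is
// allocated; entries whose key already exists stay behind in `child`.
inline void absorb(Result& parent, Result&& child) {
  parent.merge(std::move(child));
}
```

unlike `insert()` with an iterator range, `merge()` modifies the source map, which is acceptable only when the source can be consumed.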

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 13:36:10 +0000 (21:36 +0800)]

crimson/os/seastore: do not capture unused variables

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 13:14:30 +0000 (21:14 +0800)]

crimson/os/seastore: do not start identifier with "__"

avoid starting identifiers with two underscores. these names are
reserved for the C/C++ compiler and standard library.

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 12:22:42 +0000 (20:22 +0800)]

Merge pull request #41217 from petrutlucian94/boost_url

win*.sh,cmake: Fix Windows build

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 11:51:33 +0000 (19:51 +0800)]

Merge pull request #41129 from athanatos/sjust/wip-seastore-osd

crimson: add initial osd support for seastore

Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 10:48:05 +0000 (18:48 +0800)]

Merge pull request #41215 from CloudFerro/new_boost_url

cmake: Replace boost download url

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Avan Thakkar [Fri, 7 May 2021 09:38:11 +0000 (15:08 +0530)]

mgr/dashboard: fix base-href: revert it to previous approach

Fixes: https://tracker.ceph.com/issues/50684
Signed-off-by: Avan Thakkar <athakkar@redhat.com>

commit | commitdiff | tree

Lucian Petrut [Fri, 7 May 2021 09:23:30 +0000 (09:23 +0000)]

win*.sh,cmake: Fix Windows linking errors

The Windows build is hitting linking errors after
bumping the Boost version to 1.75. The issue is that Boost
is now setting the zlib dependency using INTERFACE_LINK_LIBRARIES,
which means that it's no longer located using the standard
"find_package" mechanism.

In order for the linker to locate zlib, we'll add it to the
linker search path.

[1] https://github.com/boostorg/boost_install/issues/47

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>

commit | commitdiff | tree

Rafał Wądołowski [Fri, 7 May 2021 08:12:43 +0000 (10:12 +0200)]

cmake: Replace boost download url

Boost has moved downloads to JFrog Artifactory
https://www.boost.org/users/news/boost_has_moved_downloads_to_jfr.html

Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>

commit | commitdiff | tree

Samuel Just [Thu, 6 May 2021 07:13:54 +0000 (00:13 -0700)]

crimson/tools/store-nbd: fix help message for path

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 5 May 2021 05:22:18 +0000 (22:22 -0700)]

seastore: add comment to do_transaction outlining ordering TODO

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Mon, 3 May 2021 21:26:09 +0000 (14:26 -0700)]

crimson/os/seastore/segment_manager/block: create block device during mkfs

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Tue, 27 Apr 2021 22:00:11 +0000 (15:00 -0700)]

crimson/os/seastore: refactor segment_cleaner to init segment_manager params after mount

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Mon, 26 Apr 2021 20:52:46 +0000 (13:52 -0700)]

vstart.sh: add --seastore

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Thu, 22 Apr 2021 23:57:19 +0000 (16:57 -0700)]

crimson/os/seastore: add seastore to FuturizedStore::create

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Mon, 26 Apr 2021 20:49:54 +0000 (13:49 -0700)]

crimson/os/seastore/segment_manager/block: DSYNC not needed

We are expressly flushing, so this shouldn't be needed.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Thu, 22 Apr 2021 23:56:08 +0000 (16:56 -0700)]

crimson/os/seastore: refactor SegmentManager reference ownership

Gives SeaStore ownership over SegmentManager and rearranges mkfs/mount.
Replaces mkfs_config_t/mount_config_t with config params.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Thu, 22 Apr 2021 23:53:56 +0000 (16:53 -0700)]

test/crimson/seastore/transaction_manager_test_state: remove duplicate setup

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Thu, 22 Apr 2021 22:11:28 +0000 (15:11 -0700)]

test/crimson/gtest_seastar: init config and perf counters for crimson tests

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Tue, 27 Apr 2021 21:56:00 +0000 (14:56 -0700)]

crimson/tools/store-nbd: don't use detailed space tracker

Intended for debugging rather than performance testing.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 07:22:16 +0000 (00:22 -0700)]

crimson/os/seastore/journal: close open segment and reset soft state in close()

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 07:21:19 +0000 (00:21 -0700)]

crimson/os/journal: use SegmentManager methods for block size, etc

Otherwise, initializing those values needs to be done after the SegmentManager
instance is mounted. This is simpler for now.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 07:25:02 +0000 (00:25 -0700)]

crimson/os/seastore: only create handle in create_new_collection

FuturizedStore::create_new_collection isn't supposed to actually
create the collection. See OSD::mkfs for a usage example.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 07:24:00 +0000 (00:24 -0700)]

crimson/os/seastore: clean up meta implementation

There's really no reason to cache the decoded representation here since
the meta keys are only accessed during startup and mkfs. This approach is
much simpler.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 08:43:02 +0000 (01:43 -0700)]

crimson/.../transaction_manager: skip zero mappings in mount

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Wed, 28 Apr 2021 08:42:35 +0000 (01:42 -0700)]

crimson/os/seastore: fix read() to use onode.size for len=0

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Fri, 30 Apr 2021 21:12:27 +0000 (14:12 -0700)]

crimson/os/seastore: fix do_transaction -- transactions may be empty

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Fri, 30 Apr 2021 21:11:43 +0000 (14:11 -0700)]

crimson/os/seastore: wire up get_fsid

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Fri, 30 Apr 2021 21:10:57 +0000 (14:10 -0700)]

crimson/os/seastore: wire up stat

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Fri, 30 Apr 2021 07:31:45 +0000 (07:31 +0000)]

crimson/os/seastore/seastore.cc: update to use new debug macros

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Samuel Just [Thu, 29 Apr 2021 22:15:40 +0000 (15:15 -0700)]

crimson/os/seastore: convert cache and transaction_manager to use new debugging macros

The goal here is to capture the transaction address and to standardize the
prefix format.

Signed-off-by: Samuel Just <sjust@redhat.com>

commit | commitdiff | tree

Lucian Petrut [Fri, 7 May 2021 07:18:39 +0000 (07:18 +0000)]

win32*.sh: ensure that the build dir exists

The Windows build scripts try to use the build dir before
actually creating it.

We'll have to move up the "mkdir" command a few lines.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 06:59:41 +0000 (14:59 +0800)]

Merge pull request #41214 from tchaikov/wip-crimson-clang-cleanups

crimson: clang related cleanups

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 05:14:33 +0000 (13:14 +0800)]

crimson/common/config_proxy: add a helper for get_val<>()

otherwise we have to put something like

local_conf().template get_val<T>(name)

which is not quite convenient or readable.
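
a minimal sketch of such a helper, assuming a hypothetical `Config` type and a free function `get_conf()`; neither is claimed to match the real config_proxy interface:

```cpp
#include <cassert>
#include <string_view>

// Hypothetical config object with a member template, standing in for
// whatever local_conf() returns. The body is a dummy: it returns the
// length of the option name converted to T.
struct Config {
  template <typename T>
  T get_val(std::string_view name) const {
    return static_cast<T>(name.size());
  }
};

// Hypothetical free-function helper: the `template` disambiguator is
// written once here instead of at every call site.
template <typename T, typename Conf>
T get_conf(const Conf& conf, std::string_view name) {
  // `conf` has a dependent type, so the `template` keyword is required
  // to parse the following `<` as the start of a template argument list.
  return conf.template get_val<T>(name);
}
```

a call site then reads `get_conf<int>(conf, "name")`, with no `.template` in sight.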

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 05:03:24 +0000 (13:03 +0800)]

crimson/os/seastore: do not redefine default argument

we should not redefine a default argument of a method of templated class.

this change also addresses the following error from clang:

../src/crimson/os/seastore/onode_manager/staged-fltree/node.cc:621:30: error: template parameter redefines default argument
template <bool FORCE_MERGE = false>
                             ^
../src/crimson/os/seastore/onode_manager/staged-fltree/node.h:438:32: note: previous default template argument defined here
  template <bool FORCE_MERGE = false>
                               ^
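
a minimal sketch of the accepted form, using a hypothetical `Node` template with a `merge_cost()` member that only mimics the shape seen in the diagnostic:

```cpp
#include <cassert>

// Hypothetical class template with a member template.
template <typename T>
struct Node {
  // The default argument is stated once, on the declaration.
  template <bool FORCE_MERGE = false>
  int merge_cost(T size) const;
};

// The out-of-class definition repeats the parameter list without the
// default; restating `= false` here is what the compiler rejects.
template <typename T>
template <bool FORCE_MERGE>
int Node<T>::merge_cost(T size) const {
  return FORCE_MERGE ? 0 : static_cast<int>(size);
}
```

callers still get the default: `node.merge_cost(7)` instantiates the `FORCE_MERGE = false` variant.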

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 04:59:36 +0000 (12:59 +0800)]

crimson/os/seastore: do not capture non-variables

merge_stage and merge_size are structured bindings, they are not
variables. so they cannot be captured without defining variables in
the capture list.
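
a minimal sketch of the workaround under C++17 rules (later standards relax this restriction); `evaluate_merge()` and `merged_total()` are hypothetical and only reuse the binding names from the message:

```cpp
#include <cassert>
#include <utility>

// Hypothetical producer of the two values.
inline std::pair<int, int> evaluate_merge() { return {2, 128}; }

inline int merged_total() {
  auto [merge_stage, merge_size] = evaluate_merge();
  // In C++17 a structured binding cannot be named directly in a capture
  // list; init-captures create real variables that the lambda owns.
  auto total = [stage = merge_stage, size = merge_size] {
    return stage * size;
  };
  return total();
}
```

the init-captures copy the values, so the lambda stays valid even if it outlives the scope of the bindings.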

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Kefu Chai [Fri, 7 May 2021 04:59:14 +0000 (12:59 +0800)]

crimson/os/seastore: do not capture unused variable

Signed-off-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Neha Ojha [Fri, 7 May 2021 02:10:28 +0000 (19:10 -0700)]

Merge pull request #41211 from neha-ojha/wip-remove-mon-election

qa/suites/rados/standalone: remove mon_election symlink

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>

commit | commitdiff | tree

Neha Ojha [Fri, 7 May 2021 00:35:35 +0000 (00:35 +0000)]

qa/suites/rados/standalone: remove mon_election symlink

The standalone tests need parameters to be passed as ceph_args to
override defaults.

This was just doubling the number of standalone tests being run in each rados
run with no effect!

Signed-off-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Sage Weil [Thu, 6 May 2021 20:55:16 +0000 (16:55 -0400)]

Merge PR #41201 into master

* refs/pull/41201/head:
doc/releases: 16.2.3

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: David Galloway <dgallowa@redhat.com>

commit | commitdiff | tree

Sage Weil [Thu, 6 May 2021 20:01:57 +0000 (16:01 -0400)]

Merge PR #41179 into master

* refs/pull/41179/head:
qa/tasks/cephadm_cases: longer wait for osd to start

Reviewed-by: Sebastian Wagner <swagner@suse.com>

commit | commitdiff | tree

Sage Weil [Thu, 6 May 2021 15:26:51 +0000 (10:26 -0500)]

doc/releases: 16.2.3

Signed-off-by: Sage Weil <sage@newdream.net>

commit | commitdiff | tree

Sridhar Seshasayee [Thu, 6 May 2021 08:27:02 +0000 (13:57 +0530)]

qa/suites/rados/mgr/tasks/progress: use high_recovery_ops for faster recovery

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>

commit | commitdiff | tree

Neha Ojha [Tue, 6 Apr 2021 19:56:06 +0000 (19:56 +0000)]

PendingReleaseNotes: mclock_scheduler is the default scheduler for quincy

Signed-off-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Neha Ojha [Mon, 3 May 2021 19:28:27 +0000 (19:28 +0000)]

qa/standalone: use osd op queue = wpq

mclock_scheduler is now the default and some of these tests need to be modified
to run well with it. Continue using wpq until
https://tracker.ceph.com/issues/50574 is addressed.

Signed-off-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Neha Ojha [Mon, 3 May 2021 18:35:35 +0000 (18:35 +0000)]

common/options/global.yaml.in: use mclock_scheduler as the default scheduler

The aim is to default to mclock_scheduler in quincy, so start early and
get more testing.

Signed-off-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Amnon Hanuhov [Tue, 4 May 2021 13:20:15 +0000 (16:20 +0300)]

crimson/osd: Use crimson::net::make_message() in ReplicatedRecoveryBackend

Signed-off-by: Amnon Hanuhov <ahanukov@redhat.com>
