set(dpdk_LIBRARIES) does not reset this variable; it leaves it
unchanged.
If pkg-config manages to find the DPDK libraries,
pkg_check_modules(dpdk QUIET libdpdk) sets dpdk_LIBRARIES to a string
like "rte_node;rte_graph;...".
But we want this variable to hold the import paths of the required
libraries, so reset it before appending them to it.
This change helps to address the build failure when building Ceph with
DPDK installed system-wide along with its .pc file.
Patrick Donnelly [Tue, 30 Mar 2021 21:26:08 +0000 (14:26 -0700)]
mon,mds: use per-MDS compat to inform replacement
This diff makes the following changes:
- FSMap::compat is now just a "default compat" of currently unknown
utility. It is used when constructing a new file system but does
not really have any effect or current use.
- The `mds compat *` CLI commands are deprecated. They manipulate
the default compat which has no useful effect.
- Each MDS sends its compat to the mons in its beacon. This is from
MDSMap::get_compat_set_all() at MDS boot. This CompatSet does not
change for the duration of the MDS lifetime.
- Mons record each MDS compat in the FSMap to inform standby failover.
An MDS is only promoted if it is compatible with the file system
compat.
- Mons upgrade (merge) the file system compat when (a) the number of
  *in* MDS is 1 (effected by max_mds=1) and (b) the mons are promoting a
  standby with a new compat. A file system is never upgraded when there
  is more than 1 rank, to prevent two MDSs running with incompatible compats.
- A suite of `fs compat` commands exist to manipulate the file system
compat. These exist mostly for testing.
The consequence of these changes is that the MDS upgrade procedure
can be updated to no longer require turning off all MDSs except rank 0
before performing any upgrades. Previously, a CompatSet change would cause
any MDS receiving the new MDSMap to suicide if it was incompatible.
Instead, the monitors will no longer assign an incompatible MDS to a
file system and enforce an upgrade procedure if incompatibilities exist.
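As a rough illustration of the promotion rule above, a minimal sketch follows; the types and helper names are simplified stand-ins, not the actual MDSMonitor code.

  #include <cstdint>
  #include <optional>
  #include <set>
  #include <vector>

  using mds_gid_t = uint64_t;

  struct CompatInfo {
    std::set<uint64_t> incompat;   // incompat feature ids (required by the fs, or understood by an MDS)
  };

  struct StandbyInfo {
    mds_gid_t gid;
    CompatInfo compat;             // reported by the MDS in its beacon at boot
  };

  // A standby may only be promoted if it understands every incompat feature
  // recorded in the file system's compat.
  static bool is_compatible(const CompatInfo& mds, const CompatInfo& fs) {
    for (uint64_t f : fs.incompat) {
      if (!mds.incompat.count(f)) {
        return false;
      }
    }
    return true;
  }

  std::optional<mds_gid_t> pick_standby(const std::vector<StandbyInfo>& standbys,
                                        const CompatInfo& fs_compat) {
    for (const auto& s : standbys) {
      if (is_compatible(s.compat, fs_compat)) {
        return s.gid;              // promote this standby to the failed rank
      }
    }
    return std::nullopt;           // no compatible standby: leave the rank down
  }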
Fixes: https://tracker.ceph.com/issues/49720
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Tue, 30 Mar 2021 21:07:46 +0000 (14:07 -0700)]
mon: do not update inline incompat except via mds
The MDS_FEATURE_INCOMPAT_INLINE feature indicates that an MDS knows how
to read/write inline data and that the file system may have it. The
separate setting for inline_data protects this file system feature.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
With the implementation of DBStore, it was determined that the API used
for writing in Zipper was too tied to RADOS. Implement a clean writing
API named Writer.
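For illustration, a store-agnostic writing interface along these lines could look as follows; the method names and signatures are guesses for the sketch, not the exact Zipper (rgw_sal) API.

  #include <cstdint>
  #include <string>
  #include <string_view>

  class Writer {
  public:
    virtual ~Writer() = default;

    // Set up any per-object state before the first chunk is written.
    virtual int prepare() = 0;

    // Consume the next chunk of object data at the given offset.
    virtual int process(std::string_view data, uint64_t offset) = 0;

    // Finish the write: publish the object head/metadata atomically.
    virtual int complete(uint64_t accounted_size, const std::string& etag) = 0;
  };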
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
1. ver-health.sh:
a. TEST_check_version_health_1():
To avoid intermittent timeouts observed in wait_for_health_string(),
increase the wait time to 20 secs.
qa/standalone/scrub: Force a subset of scrub tests to use "wpq" scheduler
The following tests in the test files mentioned below use the
"osd_scrub_sleep" option to introduce delays during scrubbing to help
determine scrubbing states, validate reservations during scrubbing, etc.
This works when using the "wpq" scheduler.
But when the "mclock_scheduler" is enabled, "osd_scrub_sleep" is
disabled and overridden to 0. This is done to delegate the scheduling of
the background scrubs to the "mclock_scheduler" based on the configured QoS
parameters. Because of this, the checks that verify scrub states,
reservations, etc. fail, since the window in which to check them is very
short, with scrubs completing very quickly. This affects the small subset of
scrub tests mentioned below.
Only for the above tests, until there is a reliable way to query scrub
states with "--osd-scrub-sleep" set to 0, the "osd_op_queue" config
option is set to "wpq".
qa/standalone/erasure-code: Modify erasure-code tests for mclock scheduler
Modified test cases:
1. test-erasure-eio.sh:
a. Test_ec_backfill_unfound():
- Set osd_mclock_profile to high_recovery_ops profile.
- Increase the wait for backfill_unfound timeout to 240 secs.
qa/standalone/osd-backfill: Modify backfill tests for mclock scheduler
Modified test cases:
1. osd-backfill-prio.sh:
Set osd_op_queue = wpq for all tests since mclock doesn't
consider recovery priority as part of its scheduling algorithm.
2. osd-backfill-space.sh:
Set osd_mclock_profile to high_recovery_ops and increase the wait
for backfills timeout to 1200 secs for the following tests:
- TEST_backfill_test_simple()
- TEST_backfill_test_multi()
- TEST_backfill_test_sametarget()
- TEST_backfill_multi_partial()
- TEST_ec_backfill_simple()
- TEST_ec_backfill_multi()
- SKIP_TEST_ec_backfill_multi_partial()
3. osd-backfill-stats:
- TEST_backfill_ec_down_all_out():
Set osd_mclock_profile to high_recovery_ops and increase the wait
for recovery timeout to 240 secs.
qa/standalone/osd: Modify osd tests for mclock scheduler
Modified test cases:
1. osd-recovery-prio.sh:
Set osd_op_queue = wpq for all tests since mclock
doesn't consider recovery priority as part of its
scheduling algorithm.
2. osd-recovery-stats.sh:
a. TEST_recovery_undersized():
- Set osd_mclock_profile to high_recovery_ops profile.
- Increase wait for recovery timeout to 300 secs.
3. osd-rep-recov-eio.sh:
a. TEST_rep_backfill_unfound():
- Set osd_mclock_profile to high_recovery_ops profile.
- Increase wait for backfill_unfound to 360 secs.
4. repeer-on-acting-back.sh:
a. TEST_repeer_on_down_act():
- Set osd_mclock_profile to high_recovery_ops profile.
(To improve the test duration)
qa/standalone: Modify ceph-helpers.sh tests for mclock scheduler.
List of changes:
1. Remove the enforcement to use osd_op_queue=wpq when an osd is brought
up in the following functions:
- run_osd()
- run_osd_filestore() and
- activate_osd()
2. New functions:
- get_op_scheduler() - Get the current osd_op_queue for an osd.
3. Modified test cases:
- test_run_osd() - Add check for osd_max_backfill count.
The mclock scheduler overrides the count to 1000.
4. New test cases:
- test_activate_osd_after_mark_down()
- test_get_op_scheduler()
osd: Add a new config option to forcibly run OSD benchmark on init
The new config option "osd_mclock_force_run_benchmark_on_init" is
introduced to allow a user to force the OSD benchmark test to run on every
OSD boot-up, even if historical data about the OSD's iops capacity is
available in the MON config store. The 'force_run_benchmark' flag is set
to the value indicated by the new config option.
By default this new config option is set to false.
The utility of this option is to help refresh the OSD iops capacity
when the underlying device's performance characteristics have changed
significantly. In such cases, the OSD can be restarted with this option
enabled temporarily. Once the new iops capacity is updated to the MON
store, this option can be removed from the OSD's start-up config.
osd: Add mechanism to avoid running OSD benchmark on every OSD boot-up
Use "mon_cmd_set_config()" to store the OSD's max iops capacity to
the MON store during the first bring-up. Don't run the OSD benchmark
test on subsequent boot-ups if a previously persisted iops capacity is
available on the MON store and is different from the default iops
capacity.
Add the 'force_run_benchmark' flag to force a run of the benchmark
in case the default iops capacity cannot be determined.
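A minimal sketch of the resulting boot-time decision follows, using simplified stand-in names rather than the actual OSD code.

  #include <optional>

  bool should_run_osd_bench(bool force_run_benchmark_on_init,   // the new config option
                            bool force_run_benchmark,           // set when the default capacity is unusable
                            std::optional<double> persisted_iops_capacity,
                            double default_iops_capacity) {
    if (force_run_benchmark_on_init || force_run_benchmark) {
      return true;   // explicit request to (re)measure the device
    }
    if (!persisted_iops_capacity) {
      return true;   // first bring-up: nothing stored in the MON config store yet
    }
    // Skip the benchmark only if the persisted value is a real measurement,
    // i.e. it differs from the default iops capacity.
    return *persisted_iops_capacity == default_iops_capacity;
  }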
common/config: Add methods to return the default value of a config option
Add a wrapper method "get_val_default()" to the ConfigProxy class that takes
the config option key to search for. This method in turn calls another method
of the same name added to the md_config_t class, which does the actual work of
searching for the config option. If the option is valid, _get_val_default()
is used to get the default value. Otherwise, the wrapper method returns
std::nullopt.
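A minimal sketch of this lookup flow, with simplified types standing in for the real ConfigProxy/md_config_t classes:

  #include <map>
  #include <optional>
  #include <string>

  struct Option {
    std::string default_value;
  };

  struct md_config_t {
    std::map<std::string, Option> schema;    // stand-in for the option schema

    std::optional<std::string> get_val_default(const std::string& key) const {
      auto it = schema.find(key);
      if (it == schema.end()) {
        return std::nullopt;                 // not a valid config option
      }
      return it->second.default_value;       // the compiled-in default
    }
  };

  struct ConfigProxy {
    md_config_t config;

    // Thin wrapper that delegates to md_config_t, as described above.
    std::optional<std::string> get_val_default(const std::string& key) const {
      return config.get_val_default(key);
    }
  };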
osd: Add method to store config option key/value on the MON store
Add method mon_cmd_set_config() to save config option key and
value to the MON store. The ConfigMonitor command, 'config set' is
used to achieve this.
A corresponding get method is unnecessary since any config option
found on the MON store is loaded during OSD boot-up and set using
the md_config_t::set_mon_vals() method. Therefore, the existing
versions of ConfigProxy::get_val() method are sufficient to get
the latest value for the config option.
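For illustration only, the 'config set' mon command payload could be assembled roughly as below; the helper shape and hand-built JSON are assumptions for the sketch, not the actual mon_cmd_set_config() implementation.

  #include <sstream>
  #include <string>

  std::string make_config_set_cmd(const std::string& who,    // e.g. "osd.3"
                                  const std::string& name,   // e.g. "osd_mclock_max_capacity_iops_ssd"
                                  const std::string& value) {
    std::ostringstream cmd;
    cmd << "{\"prefix\": \"config set\", "
        << "\"who\": \"" << who << "\", "
        << "\"name\": \"" << name << "\", "
        << "\"value\": \"" << value << "\"}";
    // The resulting JSON string is what would be handed to the mon client
    // (e.g. via start_mon_command()) to persist the key/value pair.
    return cmd.str();
  }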
So we check SNAPPY_VERSION to tell whether we should use `uint32_t` or
`uint32`.
In this change, the snappy version used to build the win32 client is bumped
to the latest stable version, v1.1.9, to include the fix of
SNAPPY_VERSION. This paves the way for the fix of https://tracker.ceph.com/issues/50934
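An illustrative version gate for the type choice described above; the exact cutoff value and alias name are assumptions for the sketch, not the verbatim Ceph code. snappy encodes SNAPPY_VERSION as (major << 16) | (minor << 8) | patchlevel.

  #include <cstdint>
  #include <snappy.h>

  #if defined(SNAPPY_VERSION) && SNAPPY_VERSION >= 0x010109
  using snappy_len_t = uint32_t;        // 1.1.9 and later: standard fixed-width type
  #else
  using snappy_len_t = snappy::uint32;  // older releases: snappy's own typedef
  #endif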
The clean_cgroup method assumes that ctx.fsid is set. While this is
true for the bootstrap command, it isn't set for the adopt or deploy commands
(and maybe others).
This causes the adopt command to fail:
Traceback (most recent call last):
  File "/sbin/cephadm", line 8301, in <module>
    main()
  File "/sbin/cephadm", line 8289, in main
    r = ctx.func(ctx)
  File "/sbin/cephadm", line 1764, in _default_image
    return func(ctx)
  File "/sbin/cephadm", line 5091, in command_adopt
    command_adopt_ceph(ctx, daemon_type, daemon_id, fsid)
  File "/sbin/cephadm", line 5299, in command_adopt_ceph
    osd_fsid=osd_fsid)
  File "/sbin/cephadm", line 2884, in deploy_daemon_units
    clean_cgroup(ctx, unit_name)
  File "/sbin/cephadm", line 2724, in clean_cgroup
    if not ctx.fsid:
  File "/sbin/cephadm", line 155, in __getattr__
    return super().__getattribute__(name)
AttributeError: 'CephadmContext' object has no attribute 'fsid'
Since we already have the fsid value in deploy_daemon_units (which calls
clean_cgroup), we can pass it directly.
Adam King [Mon, 19 Jul 2021 16:07:39 +0000 (12:07 -0400)]
mgr/cephadm: stop removal of daemons from offline hosts
This check was only looking at the status of the
host and not at the offline_hosts set, so
it wasn't actually stopping daemons from being removed
from offline hosts.
Patrick Donnelly [Wed, 28 Jul 2021 17:45:08 +0000 (10:45 -0700)]
Merge PR #42349 into master
* refs/pull/42349/head:
mon/MDSMonitor: propose if FSMap struct_v is too old
mon/MDSMonitor: give a proper error message if FSMap struct_v is too old
mds/FSMap: use DECODE_OLDEST to gate FSMap version
qa: add tests for fs dump of epoch and trimming
qa: add file system support for dumping epoch
mon/MDSMonitor: return mon_mds_force_trim_to even if equal to current epoch
mon: add debugging for trimming methods
mon: fix debug spacing
qa: add nofs upgrade suite
Patrick Donnelly [Wed, 28 Jul 2021 17:34:12 +0000 (10:34 -0700)]
Merge PR #41025 into master
* refs/pull/41025/head:
qa: wait pgs to be clean before using the pools
qa: ignore PG_RECOVERY_FULL and PG_DEGRADED for mds-full
qa: wait more time since there are many more pgs than before
qa: do not multiply the full ratio twice
qa: do not raise for kclient for _fsync test
qa: use the pg autoscale mode to calculate the pg_num
qa: set the object_size to 1M
qa: move the is_full() to parent class
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sometimes, it can happen that the osds being destroyed in those tests
are not yet marked as 'down' for some reason. Let's add some retries on
those tasks to avoid CI failures.
Patrick Donnelly [Thu, 15 Jul 2021 01:02:20 +0000 (18:02 -0700)]
mon/MDSMonitor: propose if FSMap struct_v is too old
To flush older versions which may still be an empty MDSMap (for clusters
that have never used CephFS), we need to force a proposal so older
versions of the struct are trimmed.
This is the main fix of this branch. We removed code which processed old
encodings of the MDSMap in the mon store via 60bc524. That broke old
ceph clusters which never used CephFS (see cited ticket below). This is
because the initial epoch is an empty MDSMap (back in Infernalis/Hammer)
that is never updated. So, the fix here is to just do proposals
periodically until all of the old structs are automatically trimmed by
the mons.
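A simplified sketch of this periodic-proposal idea, with assumed names rather than the actual MDSMonitor code:

  #include <functional>

  void maybe_force_proposal(unsigned oldest_stored_struct_v,
                            unsigned current_struct_v,
                            const std::function<void()>& propose_pending) {
    // Old, pre-CephFS clusters can be stuck with an ancient (possibly empty)
    // MDSMap epoch that never gets rewritten. Keep proposing until those old
    // epochs age out of the mon store and are trimmed.
    if (oldest_stored_struct_v < current_struct_v) {
      propose_pending();
    }
  }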
Fixes: 60bc524827bac072658203e56b1fa3dede9641c5
Fixes: https://tracker.ceph.com/issues/51673
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>