git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
2 years ago crimson/monc: replace then_unpack() with discard_result() if possible 47285/head
Radoslaw Zarzynski [Tue, 26 Jul 2022 16:24:12 +0000 (16:24 +0000)]
crimson/monc: replace then_unpack() with discard_result() if possible

More understandable this way.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 years ago crimson/monc: renew_rotating_keyring() doesn't assume the clock is monotonic 47277/head
Radoslaw Zarzynski [Tue, 26 Jul 2022 09:39:08 +0000 (09:39 +0000)]
crimson/monc: renew_rotating_keyring() doesn't assume the clock is monotonic

According to the `seastar::lowres_system_clock` reference:

> This clock has the same granularity as lowres_clock,
> but it is not required to be monotonic and its time
> points correspond to system time.

we should perform a check similar to the one the classical `MonClient` does:

```cpp
  if ((now > last_rotating_renew_sent) &&
      double(now - last_rotating_renew_sent) < 1) {
    ldout(cct, 10) << __func__ << " called too often (last: "
                   << last_rotating_renew_sent << "), skipping refresh" << dendl;
    return 0;
  }
```

This check is rather paranoid; the main reason for having it
in crimson is to replicate the classical behaviour.
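
As a rough illustration of the guard quoted above, here is a minimal sketch in
standard C++. `std::chrono::system_clock` stands in for
`seastar::lowres_system_clock`, and the helper name `renewed_too_recently` is
hypothetical, not taken from the crimson sources.

```cpp
#include <cassert>
#include <chrono>

using sys_clock = std::chrono::system_clock;  // stand-in for a non-monotonic clock

// Returns true when a renewal should be skipped because the previous one was
// sent less than `min_interval` ago. If the clock stepped backwards
// (now <= last_sent) the elapsed time is meaningless, so nothing is skipped.
inline bool renewed_too_recently(
    sys_clock::time_point now,
    sys_clock::time_point last_sent,
    std::chrono::seconds min_interval = std::chrono::seconds{1}) {
  return now > last_sent && (now - last_sent) < min_interval;
}
```

The `now > last_sent` test is the only part that deals with non-monotonic time;
without it, a backwards clock step would yield a negative elapsed time that
always compares as "too recent".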

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 years ago crimson/monc: unify handling auth_service_ticket_ttl with classical OSD
Radoslaw Zarzynski [Tue, 26 Jul 2022 09:31:38 +0000 (09:31 +0000)]
crimson/monc: unify handling auth_service_ticket_ttl with classical OSD

In the classical `MonClient` the renewal margin is
`auth_service_ticket_ttl / 4`, upper bounded to `30` seconds, so the
cutoff is lower bounded by `now - 30`.

```cpp
  utime_t now = ceph_clock_now();
  utime_t cutoff = now;
  cutoff -= std::min(30.0, cct->_conf->auth_service_ticket_ttl / 4.0);
  utime_t issued_at_lower_bound = now;
  issued_at_lower_bound -= cct->_conf->auth_service_ticket_ttl;
  if (!rotating_secrets->need_new_secrets(cutoff)) {
    ldout(cct, 10) << "_check_auth_rotating have uptodate secrets (they expire after " << cutoff << ")" << dendl;
    rotating_secrets->dump_rotating();
    return 0;
  }
```
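
The cutoff arithmetic from the snippet above can be isolated as follows. This
is only a sketch: times are plain `double` seconds instead of `utime_t`, and
the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Margin (in seconds) subtracted from `now`: a quarter of the ticket TTL,
// but never more than 30 seconds.
inline double renewal_margin(double auth_service_ticket_ttl) {
  return std::min(30.0, auth_service_ticket_ttl / 4.0);
}

// Secrets expiring before this point in time are considered due for renewal.
inline double renewal_cutoff(double now, double auth_service_ticket_ttl) {
  return now - renewal_margin(auth_service_ticket_ttl);
}
```

With a one-hour TTL the margin saturates at 30 seconds; with a 60-second TTL it
is 15 seconds.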

The unification also affects the debug messages.

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 years ago crimson/monc: improve debugs around Connection::renew_tickets()
Radoslaw Zarzynski [Mon, 25 Jul 2022 14:48:07 +0000 (14:48 +0000)]
crimson/monc: improve debugs around Connection::renew_tickets()

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
2 years ago crimson/monc: renew tickets and rotating keys on MAuthReply
Radoslaw Zarzynski [Mon, 25 Jul 2022 14:42:00 +0000 (14:42 +0000)]
crimson/monc: renew tickets and rotating keys on MAuthReply

Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago Merge pull request #46736 from baergj/fix-log-buffer-resize
Ilya Dryomov [Tue, 19 Jul 2022 16:56:19 +0000 (18:56 +0200)]
Merge pull request #46736 from baergj/fix-log-buffer-resize

log: Make log_max_recent have an effect again

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #46666 from MrFreezeex/batch-blockdb-fix
Guillaume Abrioux [Tue, 19 Jul 2022 15:22:51 +0000 (17:22 +0200)]
Merge pull request #46666 from MrFreezeex/batch-blockdb-fix

ceph-volume: fix fast device alloc size on mulitple device

3 years ago Merge pull request #46928 from linuxbox2/wip-rgwlc-azone
Matt Benjamin [Tue, 19 Jul 2022 14:46:07 +0000 (10:46 -0400)]
Merge pull request #46928 from linuxbox2/wip-rgwlc-azone

rgwlc: permit lifecycle to reduce data conditionally in archive zone

3 years ago Merge pull request #47019 from badone/wip-get_or_fail-debug-louder
Yuri Weinstein [Tue, 19 Jul 2022 14:07:14 +0000 (07:07 -0700)]
Merge pull request #47019 from badone/wip-get_or_fail-debug-louder

msg: Log at higher level when Throttle::get_or_fail() fails

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #46866 from myoungwon/wip-fix-53294
Yuri Weinstein [Tue, 19 Jul 2022 14:05:47 +0000 (07:05 -0700)]
Merge pull request #46866 from myoungwon/wip-fix-53294

osd: return ENOENT if pool information is invalid during tier-flush

Reviewed-by: Laura Flores <lflores@redhat.com>
3 years ago Merge pull request #47116 from chrisphoffman/wip-rbd-56549
Ilya Dryomov [Tue, 19 Jul 2022 08:27:44 +0000 (10:27 +0200)]
Merge pull request #47116 from chrisphoffman/wip-rbd-56549

librbd: bail from schedule_request_lock() if already lock owner

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago Merge pull request #47139 from rishabh-d-dave/cephfs-top-man
Jos Collin [Tue, 19 Jul 2022 07:45:40 +0000 (13:15 +0530)]
Merge pull request #47139 from rishabh-d-dave/cephfs-top-man

doc/man/cephfs-top.rst: add missing options: --delay, --conffile

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
3 years ago Merge pull request #46889 from rhcs-dashboard/osd-followup-iops
Nizamudeen A [Tue, 19 Jul 2022 07:36:34 +0000 (13:06 +0530)]
Merge pull request #46889 from rhcs-dashboard/osd-followup-iops

mgr/dashboard: do not recommend throughput for ssd's only cluster

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
3 years ago Merge pull request #47135 from rhcs-dashboard/doc-default-main
Aashish Sharma [Tue, 19 Jul 2022 06:17:26 +0000 (11:47 +0530)]
Merge pull request #47135 from rhcs-dashboard/doc-default-main

mgr/dashboard: change doc service default release from master to main

3 years ago Merge pull request #47034 from tchaikov/wip-mon-signness-cleanup
Kefu Chai [Tue, 19 Jul 2022 04:27:06 +0000 (12:27 +0800)]
Merge pull request #47034 from tchaikov/wip-mon-signness-cleanup

mon: make paxos_size() unsigned

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Reviewed-by: Laura Flores <lflores@redhat.com>
3 years ago Merge pull request #47089 from athanatos/sjust/wip-pg-shard-manager
Samuel Just [Tue, 19 Jul 2022 03:31:26 +0000 (20:31 -0700)]
Merge pull request #47089 from athanatos/sjust/wip-pg-shard-manager

crimson: introduce pg_shard_manager, begin to separate osd-wide from core-local state

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years ago doc/man/cephfs-top.rst: add missing options: --delay, --conffile 47139/head
wangxinyu [Wed, 23 Mar 2022 01:21:36 +0000 (09:21 +0800)]
doc/man/cephfs-top.rst: add missing options: --delay, --conffile

add missing options: --delay, --conffile

Signed-off-by: wangxinyu <wangxinyu@inspur.com>
Signed-off-by: Rishabh Dave <ridave@redhat.com>
3 years ago Merge pull request #47152 from tchaikov/wip-crimson-close-later
Kefu Chai [Tue, 19 Jul 2022 02:30:49 +0000 (10:30 +0800)]
Merge pull request #47152 from tchaikov/wip-crimson-close-later

crimson/net: postpone the close() using yield()

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years ago Merge pull request #46493 from adk3798/tuned-profiles
Adam King [Mon, 18 Jul 2022 21:29:51 +0000 (17:29 -0400)]
Merge pull request #46493 from adk3798/tuned-profiles

mgr/cephadm: support for os tuning profiles

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
3 years ago Merge pull request #46924 from hookak/loki-support
Adam King [Mon, 18 Jul 2022 21:27:20 +0000 (17:27 -0400)]
Merge pull request #46924 from hookak/loki-support

mgr/cephadm: fix the loki address in grafana, promtail configuration file

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Adam King <adking@redhat.com>
3 years ago Merge pull request #46620 from rzarzynski/wip-tools-cot-force-pg-import
Yuri Weinstein [Mon, 18 Jul 2022 20:38:05 +0000 (13:38 -0700)]
Merge pull request #46620 from rzarzynski/wip-tools-cot-force-pg-import

tools: COT ignores fsid mismatch when importing PG with --force

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago Merge pull request #47044 from alimaredia/wip-rgw-suite-s3test-refactor
Ali Maredia [Mon, 18 Jul 2022 19:21:29 +0000 (15:21 -0400)]
Merge pull request #47044 from alimaredia/wip-rgw-suite-s3test-refactor

add s3tests-brach.yaml for rgw teuthology suites that run s3tests

Reviewed-by: Ali Maredia <amaredia@redhat.com>
3 years ago Merge pull request #46615 from selvakumaar5496/main
Ali Maredia [Mon, 18 Jul 2022 19:15:56 +0000 (15:15 -0400)]
Merge pull request #46615 from selvakumaar5496/main

rgw: fix api response in case of get and delete object tagging apis

Reviewed-by: Ali Maredia <amaredia@redhat.com>
3 years ago crimson/net: postpone the close() using yield() 47152/head
Kefu Chai [Mon, 18 Jul 2022 14:35:19 +0000 (22:35 +0800)]
crimson/net: postpone the close() using yield()

otherwise we'd erase an element in a container when we are still
iterating through it.
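
The hazard and the general shape of the fix can be sketched without Seastar.
The example below is an assumption-laden stand-in: instead of `yield()`, it
queues the erasure in a plain list of callbacks that runs after the loop, and
the `Registry` type with its members is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <vector>

struct Registry {
  std::map<int, bool> conns;                    // id -> "wants to be closed"
  std::vector<std::function<void()>> deferred;  // work postponed past the loop

  void visit_all() {
    for (auto& [id, wants_close] : conns) {
      if (wants_close) {
        // Erasing here would invalidate the iterator driving this loop,
        // so the erase is postponed instead of being done in place.
        deferred.push_back([this, key = id] { conns.erase(key); });
      }
    }
  }

  void run_deferred() {
    for (auto& fn : deferred) fn();
    deferred.clear();
  }
};
```

Calling `visit_all()` leaves the container untouched; the elements disappear
only once `run_deferred()` runs, which is the role the postponed `close()`
plays in the commit.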

Fixes: https://tracker.ceph.com/issues/56589
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
3 years ago Merge pull request #47136 from idryomov/wip-48038
Ilya Dryomov [Mon, 18 Jul 2022 14:42:42 +0000 (16:42 +0200)]
Merge pull request #47136 from idryomov/wip-48038

qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago librbd: bail from schedule_request_lock() if already lock owner 47116/head
Christopher Hoffman [Thu, 14 Jul 2022 18:20:29 +0000 (12:20 -0600)]
librbd: bail from schedule_request_lock() if already lock owner

A race condition may be hit if there are multiple pending locks for the
same image and pending callbacks. Abort the exclusive lock process if
we are already the exclusive lock owner.

Fixes: https://tracker.ceph.com/issues/56549
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
3 years ago Merge pull request #46911 from ifed01/wip-ifed-fix-mempool-cache-other
Igor Fedotov [Mon, 18 Jul 2022 09:06:46 +0000 (12:06 +0300)]
Merge pull request #46911 from ifed01/wip-ifed-fix-mempool-cache-other

os/bluestore: fix AU accounting in bluestore_cache_other mempool.

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
3 years ago Merge pull request #47106 from idryomov/wip-56561
Ilya Dryomov [Mon, 18 Jul 2022 08:08:20 +0000 (10:08 +0200)]
Merge pull request #47106 from idryomov/wip-56561

rbd: don't default empty pool name unless namespace is specified

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>
3 years ago Merge pull request #46741 from Matan-B/wip-matanb-docker-debug
Matan [Mon, 18 Jul 2022 07:40:06 +0000 (10:40 +0300)]
Merge pull request #46741 from Matan-B/wip-matanb-docker-debug

script: CentOS 8 EOL, use archived mirror in ceph-debug-docker.sh

Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago Merge pull request #46375 from ceph/cv-loop-devs
Guillaume Abrioux [Mon, 18 Jul 2022 07:24:24 +0000 (09:24 +0200)]
Merge pull request #46375 from ceph/cv-loop-devs

ceph-volume: Optionally consume loop devices

3 years ago mgr/dashboard: do not recommend throughput for ssd's only cluster 46889/head
Nizamudeen A [Mon, 18 Jul 2022 05:38:28 +0000 (11:08 +0530)]
mgr/dashboard: do not recommend throughput for ssd's only cluster

This is a bug fix: we recommended the throughput option even when
only SSDs are present in the cluster.

Fixes: https://tracker.ceph.com/issues/56413
Signed-off-by: Nizamudeen A <nia@redhat.com>
3 years ago Merge pull request #47113 from zhscn/fix-split-test
Yingxin [Mon, 18 Jul 2022 02:42:29 +0000 (10:42 +0800)]
Merge pull request #47113 from zhscn/fix-split-test

crimson/os/seastore: fix bugs in test_map_existing_extent_concurrent

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years ago rgwlc: return std::string_view from sal::Zone::get_tier_type() 46928/head
Matt Benjamin [Fri, 15 Jul 2022 15:01:15 +0000 (11:01 -0400)]
rgwlc: return std::string_view from sal::Zone::get_tier_type()

Valid values are all small strings, often static.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: count LCFilter flags towards multi-condition
Matt Benjamin [Fri, 15 Jul 2022 00:04:20 +0000 (20:04 -0400)]
rgwlc:  count LCFilter flags towards multi-condition

Found by Soumya Koduri <skoduri@redhat.com>

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: return real default zone type from sal_dbstore and sal_motr
Matt Benjamin [Wed, 13 Jul 2022 18:28:08 +0000 (14:28 -0400)]
rgwlc: return real default zone type from sal_dbstore and sal_motr

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: update LCFilter::dump_xml(...) to add flags/ArchiveZone
Matt Benjamin [Thu, 7 Jul 2022 21:29:27 +0000 (17:29 -0400)]
rgwlc:  update LCFilter::dump_xml(...) to add flags/ArchiveZone

Also updates the location of Prefix, which is supposed to *generate*
as <Filter><Prefix/></Filter>, regardless of how we parsed it.

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: activate lifecycle processing on non-master zones
Matt Benjamin [Tue, 5 Jul 2022 22:33:09 +0000 (18:33 -0400)]
rgwlc: activate lifecycle processing on non-master zones

The basic idea of this change is the same as the proposal by
Ilsoo Byun <ilsoobyun@linecorp.com>, but some details have changed.

The main differences are to use the existing
RGWLC::set(remove)_bucket_config methods, and to use the
RGWBucketInstanceMetadataHandler infrastructure to dispatch
the corresponding calls.  Thank you!

Fixes: https://tracker.ceph.com/issues/44268
Related PR: #33524

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: zone-conditional lifecycle processing
Matt Benjamin [Fri, 1 Jul 2022 21:55:33 +0000 (17:55 -0400)]
rgwlc:  zone-conditional lifecycle processing

Lifecycle rules with the ArchiveZone flag must execute on archive zones,
but must not execute on others.

Fixes: https://tracker.ceph.com/issues/56440
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: represent lc filter flags as XML elements
Matt Benjamin [Fri, 1 Jul 2022 12:54:32 +0000 (08:54 -0400)]
rgwlc: represent lc filter flags as XML elements

Suggested by Casey in review, this makes the XML prettier.

also: fix filter parsing, remove unused code

Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago rgwlc: introduce lifecycle config flags extension
Matt Benjamin [Sat, 20 Nov 2021 18:45:51 +0000 (13:45 -0500)]
rgwlc: introduce lifecycle config flags extension

rgwlc: add uint32_t flags bitmap to LCFilter

This is intended to support a concise set of extensions to S3
LifecycleConfiguration, initially, just a flag that indicates a
rule is intended for execution on RGW ArchiveZone.

rgwlc: add machinery to define and recognize LCFilter flags

Add a concept of filter flags to lifecycle filter rules, an RGW
extension.  The initial purpose of flags is to permit marking
specific lifecycle rules as specific to an RGW archive zone, but
other flags could be added in future.
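
A `uint32_t` flags bitmap of this kind could look roughly like the sketch
below. The names `LCFlag` and `FilterFlags`, and the choice of bit 0 for
`ArchiveZone`, are assumptions made for illustration; the actual `LCFilter`
definitions may differ.

```cpp
#include <cassert>
#include <cstdint>

// Each enumerator is a bit position in the 32-bit bitmap (hypothetical layout).
enum class LCFlag : std::uint32_t { ArchiveZone = 0 };

struct FilterFlags {
  std::uint32_t bits = 0;

  static constexpr std::uint32_t mask(LCFlag f) {
    return std::uint32_t{1} << static_cast<std::uint32_t>(f);
  }
  void set(LCFlag f) { bits |= mask(f); }
  bool has(LCFlag f) const { return (bits & mask(f)) != 0; }
};
```

Adding another flag later only requires a new enumerator, which is what makes
a bitmap a concise extension point.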

rgwlc: add new unittest_rgw_lc to run internal checks, add a few
valid and invalid lifecycle configuration XML parses for now.

Fixes: https://tracker.ceph.com/issues/53361
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
3 years ago qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache 47136/head
Ilya Dryomov [Fri, 20 May 2022 12:05:03 +0000 (14:05 +0200)]
qa/suites/rbd: disable workunit timeout for dynamic_features_no_cache

The I/O workload in this test is xfstests (qa/run_xfstests_qemu.sh)
which isn't subjected to any timeout other than global max_job_time
limit in any other subsuite (e.g. qemu/workloads/qemu_xfstests.yaml).
But here, there is a parallel "op" workload defined as a workunit.
The workunit task has a default timeout of 3 hours which is effectively
imposed on the entire job.  In the "rbd cache = false" configuration,
it's sometimes exceeded.

Fixes: https://tracker.ceph.com/issues/48038
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago mgr/dashboard: change doc service default release from master to main 47135/head
Nizamudeen A [Sun, 17 Jul 2022 15:12:28 +0000 (20:42 +0530)]
mgr/dashboard: change doc service default release from master to main

Signed-off-by: Nizamudeen A <nia@redhat.com>
3 years ago Merge pull request #47025 from adamemerson/wip-55765
Adam C. Emerson [Sun, 17 Jul 2022 02:17:05 +0000 (22:17 -0400)]
Merge pull request #47025 from adamemerson/wip-55765

rgw: Guard against malformed bucket URLs

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years ago Merge pull request #46897 from rkachach/fix_issue_55808
Adam King [Sat, 16 Jul 2022 22:43:11 +0000 (18:43 -0400)]
Merge pull request #46897 from rkachach/fix_issue_55808

mgr/cephadm: check for events key before accessing it

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
3 years ago doc/cephadm: os tuning profile documentation 46493/head
Adam King [Thu, 23 Jun 2022 19:46:22 +0000 (15:46 -0400)]
doc/cephadm: os tuning profile documentation

Signed-off-by: Adam King <adking@redhat.com>
3 years ago mgr/cephadm: unit tests for tuned os profiles
Adam King [Thu, 23 Jun 2022 16:57:14 +0000 (12:57 -0400)]
mgr/cephadm: unit tests for tuned os profiles

Signed-off-by: Adam King <adking@redhat.com>
3 years ago mgr/cephadm: support for os tuning profiles
Adam King [Tue, 31 May 2022 20:22:49 +0000 (16:22 -0400)]
mgr/cephadm: support for os tuning profiles

Fixes: https://tracker.ceph.com/issues/55819
Signed-off-by: Adam King <adking@redhat.com>
3 years ago Merge pull request #47011 from s0nea/wip-prevent-alert-redirects
Nizamudeen A [Sat, 16 Jul 2022 13:56:26 +0000 (19:26 +0530)]
Merge pull request #47011 from s0nea/wip-prevent-alert-redirects

mgr/dashboard: prevent alert redirect

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years ago Merge pull request #44217 from CongMinYin/fix-pwl-recovery-test
Ilya Dryomov [Sat, 16 Jul 2022 09:34:55 +0000 (11:34 +0200)]
Merge pull request #44217 from CongMinYin/fix-pwl-recovery-test

qa/suites/rbd/pwl-cache: ensure recovery is actually tested

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago rbd: drop unused default_empty_pool_name argument 47106/head
Ilya Dryomov [Thu, 14 Jul 2022 12:42:45 +0000 (14:42 +0200)]
rbd: drop unused default_empty_pool_name argument

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago rbd: don't default empty pool name unless namespace is specified
Ilya Dryomov [Thu, 14 Jul 2022 12:19:06 +0000 (14:19 +0200)]
rbd: don't default empty pool name unless namespace is specified

Commit 96f05a7956b3 ("rbd: delay determination of default pool name")
broke "rbd perf image iostat" and "rbd perf image iotop" GLOBAL_POOL_KEY
support (the ability to blend all rbd pools together into a single
view).

Fixes: https://tracker.ceph.com/issues/56561
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago qa/tasks: rename persistent write log cache trash task 44217/head
Ilya Dryomov [Sat, 16 Jul 2022 06:54:38 +0000 (08:54 +0200)]
qa/tasks: rename persistent write log cache trash task

It doesn't really thrash anything, just repeatedly restarts the
workload on top of a dirty cache file.  rbd_pwl_cache_recovery is
more on point and gets covered by existing CODEOWNERS.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago mon: make paxos_size() unsigned 47034/head
Kefu Chai [Sun, 10 Jul 2022 04:50:14 +0000 (00:50 -0400)]
mon: make paxos_size() unsigned

* make paxos_size() unsigned, as paxos_size() returns the size of
  MonMap::mon_info, so it should be always a non-negative value,
  and more importantly, it represents a size.
* change the type of MonMap::removed_ranks from std::set<int>
  to std::set<unsigned>, for two reasons:
  - removed_ranks only tracks ranks that are greater than or equal to 0
  - it helps to silence the warnings listed below.
  MonMap::removed_ranks is persisted using encode()/decode(), but this
  change is backward compatible: we use the raw encoder to encode
  signed and unsigned integers, so the difference between their encoding
  schemas only matters when the MSB of the number is used.
  That is not likely to happen, as we neither have a negative
  rank in removed_ranks, nor have a rank greater than `(unsigned)-1`,
  i.e., 0xffffffff.
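
The compatibility argument can be checked with a small sketch. It assumes a
fixed-width, little-endian 32-bit "raw" encoding; the helper names are
hypothetical and this is not Ceph's `encode()` itself.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Fixed-width little-endian encoding of a 32-bit value (assumed layout).
inline std::array<unsigned char, 4> raw_le32(std::uint32_t v) {
  return {static_cast<unsigned char>(v & 0xff),
          static_cast<unsigned char>((v >> 8) & 0xff),
          static_cast<unsigned char>((v >> 16) & 0xff),
          static_cast<unsigned char>((v >> 24) & 0xff)};
}

inline std::array<unsigned char, 4> encode_signed(std::int32_t v) {
  return raw_le32(static_cast<std::uint32_t>(v));
}

inline std::array<unsigned char, 4> encode_unsigned(std::uint32_t v) {
  return raw_le32(v);
}
```

For any non-negative rank the two functions produce identical bytes; they only
diverge in interpretation once the most significant bit is set, as with `-1`.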

this change partially reverts f75dfbc055ccf4e43b817ed5aa52898ff680e19e

to address the compiling warnings like:

/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::end_election_period()’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:173:23: error: comparison of integer expressions of different signedness: ‘std::set<int>::size_type’ {aka ‘long unsigned int’} and ‘int’ [-Werror=sign-compare]
  173 |       acked_me.size() > (elector->paxos_size() / 2)) {
      |       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::propose_connectivity_handler(int, epoch_t, const ConnectionTracker*)’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:338:28: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Werror=sign-compare]
  338 |     for (unsigned i = 0; i < elector->paxos_size(); ++i) {
      |                          ~~^~~~~~~~~~~~~~~~~~~~~~~
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc: In member function ‘void ElectionLogic::receive_ack(int, epoch_t)’:
/home/kefu/dev/ceph/src/mon/ElectionLogic.cc:469:25: error: comparison of integer expressions of different signedness: ‘std::set<int>::size_type’ {aka ‘long unsigned int’} and ‘int’ [-Werror=sign-compare]
  469 |     if (acked_me.size() == elector->paxos_size()) {
      |         ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[3]: *** [src/mon/CMakeFiles/mon.dir/build.make:328: src/mon/CMakeFiles/mon.dir/ElectionLogic.cc.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '/home/kefu/dev/ceph/build'
[ 48%] Built target libglobal_objs
/home/kefu/dev/ceph/src/mon/Elector.cc: In member function ‘void Elector::notify_rank_removed(int)’:
/home/kefu/dev/ceph/src/mon/Elector.cc:734:43: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Werror=sign-compare]
  734 |     for (unsigned i = rank_removed + 1; i <= paxos_size() ; ++i) {
      |                                         ~~^~~~~~~~~~~~~~~
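
The warnings above come from comparing an unsigned container size with a signed
`int`. A minimal sketch of the pattern after the change, with a hypothetical
free function in place of the `ElectionLogic` member code:

```cpp
#include <cassert>
#include <set>

// With an unsigned paxos_size, both operands of the comparison are unsigned,
// so -Wsign-compare has nothing to report.
inline bool has_majority(const std::set<int>& acked_me, unsigned paxos_size) {
  return acked_me.size() > paxos_size / 2;
}
```

Had `paxos_size` stayed an `int`, the same expression would mix
`std::set<int>::size_type` with `int` and trigger the diagnostic quoted in the
build log.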

Fixes: https://tracker.ceph.com/issues/56581
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
3 years ago Merge pull request #47109 from zdover23/wip-doc-2022-07-15-start-hw-recs-cleanup-1
zdover23 [Sat, 16 Jul 2022 02:28:43 +0000 (12:28 +1000)]
Merge pull request #47109 from zdover23/wip-doc-2022-07-15-start-hw-recs-cleanup-1

doc/start: update hardware recs

Reviewed-by: Anthony D'Atri
3 years ago Merge pull request #46908 from mlausch/snapshot_key_conversion
Neha Ojha [Fri, 15 Jul 2022 20:50:47 +0000 (13:50 -0700)]
Merge pull request #46908 from mlausch/snapshot_key_conversion

osd/SnapMapper: fix legacy key conversion in snapmapper class

Reviewed-by: Matan Breizman <mbreizma@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
3 years ago rgw: Guard against malformed bucket URLs 47025/head
Adam C. Emerson [Fri, 8 Jul 2022 18:58:16 +0000 (14:58 -0400)]
rgw: Guard against malformed bucket URLs

Misplaced colons can result in radosgw thinking it has a bucket URL
but with no bucket name, leading to a crash later on.

Fixes: https://tracker.ceph.com/issues/55765
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
3 years ago rgw: Fix `rgw::sal::Bucket::empty` static method signatures
Adam C. Emerson [Mon, 11 Jul 2022 15:52:09 +0000 (11:52 -0400)]
rgw: Fix `rgw::sal::Bucket::empty` static method signatures

`unique_ptr` overload should take by reference.

Both should be const.
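
The two signatures described here might look like the sketch below. The
`Bucket` struct is a hypothetical stand-in for `rgw::sal::Bucket`, and treating
"empty" as "null or without a name" is an assumption for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Bucket {
  std::string name;

  // Raw-pointer overload: the pointee is const because it is only inspected.
  static bool empty(const Bucket* b) { return !b || b->name.empty(); }

  // unique_ptr overload: taken by const reference so the caller keeps
  // ownership; passing by value would require moving the pointer in.
  static bool empty(const std::unique_ptr<Bucket>& b) { return empty(b.get()); }
};
```

Because the `unique_ptr` is passed by reference, the caller's pointer is still
valid after the check.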

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
3 years ago Merge pull request #47052 from neha-ojha/wip-cot-label
Neha Ojha [Fri, 15 Jul 2022 18:36:18 +0000 (11:36 -0700)]
Merge pull request #47052 from neha-ojha/wip-cot-label

.github/labeler.yml: add core label to some tools

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years ago Merge pull request #47115 from ceph/fix-mib
Ilya Dryomov [Fri, 15 Jul 2022 16:48:47 +0000 (18:48 +0200)]
Merge pull request #47115 from ceph/fix-mib

ceph.spec.in: fix path for mib file and properly mark in %files

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago Merge pull request #47035 from idryomov/wip-56516
Ilya Dryomov [Fri, 15 Jul 2022 13:46:24 +0000 (15:46 +0200)]
Merge pull request #47035 from idryomov/wip-56516

rbd-mirror: remove bogus completed_non_primary_snapshots_exist check

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
3 years ago ceph.spec.in: fix path for mib file and properly mark in %files 47115/head
Justin Caratzas [Thu, 14 Jul 2022 22:45:49 +0000 (18:45 -0400)]
ceph.spec.in: fix path for mib file and properly mark in %files

Fixes typos introduced in https://github.com/ceph/ceph/pull/46918

Signed-off-by: Justin Caratzas <jcaratza@redhat.com>
3 years ago osd: return ENOENT if pool information is invalid during tier-flush 46866/head
myoungwon oh [Tue, 28 Jun 2022 04:42:21 +0000 (13:42 +0900)]
osd: return ENOENT if pool information is invalid during tier-flush

During tier-flush, the OSD sends a reference increase message to the target OSD.
At this point, sending a message with invalid pool information (e.g., a deleted pool)
causes unexpected behavior.

Therefore, this commit returns ENOENT early, before sending the message.

fixes: https://tracker.ceph.com/issues/53294

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
3 years ago Merge pull request #47028 from xxhdx1985126/wip-seastore-backref-cache-refactor
Yingxin [Fri, 15 Jul 2022 07:16:42 +0000 (15:16 +0800)]
Merge pull request #47028 from xxhdx1985126/wip-seastore-backref-cache-refactor

crimson/os/seastore: simplify backref cache

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years ago crimson/os/seastore: fix bugs in test_map_existing_extent_concurrent 47113/head
Zhang Song [Fri, 15 Jul 2022 06:40:46 +0000 (14:40 +0800)]
crimson/os/seastore: fix bugs in test_map_existing_extent_concurrent

Signed-off-by: Zhang Song <zhangsong325@gmail.com>
3 years ago crimson/os/seastore: simplify backref cache 47028/head
Xuehan Xu [Thu, 7 Jul 2022 08:05:20 +0000 (16:05 +0800)]
crimson/os/seastore: simplify backref cache

Currently, the following transaction exec sequence would lead to
loss of backref:

1. Trans `A` merges an alloc backref for extent `X`
2. Trans `B` adds a release backref for extent `X` to the backref cache,
   during which it finds an in-cache alloc backref for extent `X` and
   decides not to add the release backref to the cache
3. Trans `A` commits

In the above sequence, the release backref for extent `X` is lost.

This is a regression introduced when we tried to optimize the backref cache.

This commit fixes the issue by caching inflight backrefs in a multiset:
alloc/release ops that happen on the same paddr are queued in the order
in which they happen. When doing gc, all those backrefs are merged.
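
The multiset idea can be sketched in standard C++. All names below are
hypothetical, and "merging" is reduced to a net alloc/release count for
illustration; the real seastore types and merge logic are not shown in this
log. The sketch relies on `std::multiset::insert` placing an element after
existing equivalent ones, which preserves arrival order per `paddr`.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

enum class BackrefOp { alloc, release };

struct InflightBackref {
  std::uint64_t paddr;
  BackrefOp op;
};

// Order only by paddr, so entries for the same paddr are "equivalent" and
// stay in insertion order inside the multiset.
struct ByPaddr {
  bool operator()(const InflightBackref& a, const InflightBackref& b) const {
    return a.paddr < b.paddr;
  }
};

using InflightBackrefs = std::multiset<InflightBackref, ByPaddr>;

// All ops recorded for one paddr, in the order they were inserted.
inline std::vector<BackrefOp> ops_for(const InflightBackrefs& cache,
                                      std::uint64_t paddr) {
  std::vector<BackrefOp> ops;
  auto range = cache.equal_range(InflightBackref{paddr, BackrefOp::alloc});
  for (auto it = range.first; it != range.second; ++it) ops.push_back(it->op);
  return ops;
}

// Simplified "merge at gc time": +1 per alloc, -1 per release.
inline int merge_for_gc(const InflightBackrefs& cache, std::uint64_t paddr) {
  int net = 0;
  for (BackrefOp op : ops_for(cache, paddr)) {
    net += (op == BackrefOp::alloc) ? 1 : -1;
  }
  return net;
}
```

Since nothing is dropped at insert time, a release recorded while an alloc for
the same `paddr` is still in flight survives until the merge.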

Fixes: https://tracker.ceph.com/issues/56519
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
3 years ago doc/start: update hardware recs 47109/head
Zac Dover [Thu, 14 Jul 2022 19:29:11 +0000 (05:29 +1000)]
doc/start: update hardware recs

This PR picks up the parts of
https://github.com/ceph/ceph/pull/44466
that were not merged back in January, when that
pull request was raised.

Changes added here:
* improved organization of the material
* emphasis on IOPS per core over cores per OSD

Signed-off-by: Zac Dover <zac.dover@gmail.com>
3 years ago Merge pull request #46546 from mgfritch/vstart-stop-mds
Adam King [Thu, 14 Jul 2022 18:02:32 +0000 (14:02 -0400)]
Merge pull request #46546 from mgfritch/vstart-stop-mds

src/stop.sh: stop existing ceph-mds daemons

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
3 years ago Merge pull request #46014 from vrushch/rbd_form_fix
Pere Diaz Bou [Thu, 14 Jul 2022 16:20:24 +0000 (18:20 +0200)]
Merge pull request #46014 from vrushch/rbd_form_fix

mgr/dashboard: rbd striping setting pre-population and pop-over

Reviewed-by: Pegonzal <NOT@FOUND>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years ago Merge pull request #47066 from adk3798/osd-removal-docs-update
Adam King [Thu, 14 Jul 2022 15:07:08 +0000 (11:07 -0400)]
Merge pull request #47066 from adk3798/osd-removal-docs-update

doc/cephadm: add note about OSDs being recreated to OSD removal section

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
3 years ago Merge pull request #47062 from guits/update-latest-stable-default
Adam King [Thu, 14 Jul 2022 15:03:39 +0000 (11:03 -0400)]
Merge pull request #47062 from guits/update-latest-stable-default

cephadm: update LATEST_STABLE_RELEASE

Reviewed-by: Adam King <adking@redhat.com>
3 years ago Merge pull request #46260 from CongMinYin/wip-enable-ndctl
Ilya Dryomov [Thu, 14 Jul 2022 13:21:29 +0000 (15:21 +0200)]
Merge pull request #46260 from CongMinYin/wip-enable-ndctl

cmake: enable ndctl when building PMDK for WITH_BLUESTORE_PMEM

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years ago Merge pull request #46644 from rhcs-dashboard/rbd-list-pagination
Nizamudeen A [Thu, 14 Jul 2022 11:34:31 +0000 (17:04 +0530)]
Merge pull request #46644 from rhcs-dashboard/rbd-list-pagination

mgr/dashboard: rbd image pagination

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years ago Merge pull request #45981 from rhcs-dashboard/box-no-more-ultron
Nizamudeen A [Thu, 14 Jul 2022 11:13:49 +0000 (16:43 +0530)]
Merge pull request #45981 from rhcs-dashboard/box-no-more-ultron

cephadm/box: Rootless podman box implementation

Reviewed-by: Pegonzal <NOT@FOUND>
Reviewed-by: anthonyeleven <NOT@FOUND>
Reviewed-by: melissa-kun-li <NOT@FOUND>
3 years ago osd/SnapMapper: fix pacific legacy key conversion and introduce test 46908/head
Manuel Lausch [Thu, 30 Jun 2022 12:29:53 +0000 (14:29 +0200)]
osd/SnapMapper: fix pacific legacy key conversion and introduce test

Octopus modified the SnapMapper key format from

  <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>

to

  <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>

When this change was introduced, 94ebe0ea also introduced a conversion
with a crucial bug which essentially destroyed legacy keys by mapping them
to

  <MAPPING_PREFIX><poolid>_<snapid>_

without the object-unique suffix.  This commit fixes this conversion going
forward, but a fix for existing clusters still needs to be developed.
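
To see why dropping the suffix is destructive, consider the sketch below. The
prefix value `"MAP_"`, the decimal number formatting, and the function names
are placeholders chosen for illustration; they are not the real `SnapMapper`
constants or encodings.

```cpp
#include <cassert>
#include <string>

const std::string kMappingPrefix = "MAP_";  // hypothetical placeholder value

// Intended new-format key: <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<object>
inline std::string new_key(long pool, unsigned long snapid,
                           const std::string& shard, const std::string& object) {
  return kMappingPrefix + std::to_string(pool) + "_" + std::to_string(snapid) +
         "_" + shard + "_" + object;
}

// Shape of the key the buggy conversion produced: no object-unique suffix.
inline std::string truncated_key(long pool, unsigned long snapid) {
  return kMappingPrefix + std::to_string(pool) + "_" + std::to_string(snapid) + "_";
}
```

Every object in the same pool and snapshot maps to the same truncated key, so
converted entries overwrite one another, whereas the full key stays unique per
object.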

Fixes: https://tracker.ceph.com/issues/56147
Signed-off-by: Manuel Lausch <manuel.lausch@1und1.de>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
3 years ago Merge pull request #46650 from faithuniterh/Modifying-boto3-examples
Yuval Lifshitz [Thu, 14 Jul 2022 08:03:20 +0000 (11:03 +0300)]
Merge pull request #46650 from faithuniterh/Modifying-boto3-examples

Modifying boto3 examples

reviewed-by: ylifshit@redhat.com

3 years ago mgr/cephadm: fix the loki address in grafana, promtail configuration files 46924/head
jinhong.kim [Fri, 24 Jun 2022 05:50:05 +0000 (14:50 +0900)]
mgr/cephadm: fix the loki address in grafana, promtail configuration files

- Use the loki address instead of the MGR address (grafana datasource, promtail loki host)
- Add a loki dependency on grafana and promtail services

Signed-off-by: jinhong.kim <jinhong.kim0@navercorp.com>
3 years agoMerge pull request #47039 from rosinL/fix-perf-cirmson-msgr
Yingxin [Thu, 14 Jul 2022 06:25:53 +0000 (14:25 +0800)]
Merge pull request #47039 from rosinL/fix-perf-cirmson-msgr

tools/crimson/perf_crimson_msgr:fix perf_crimson_msgr abort

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/osd: move start_pg_operation to pg_shard_manager 47089/head
Samuel Just [Tue, 12 Jul 2022 23:35:50 +0000 (23:35 +0000)]
crimson/osd: move start_pg_operation to pg_shard_manager

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: move pg_map and associated state to CoreState
Samuel Just [Fri, 8 Jul 2022 06:16:25 +0000 (06:16 +0000)]
crimson/osd: move pg_map and associated state to CoreState

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: move osdmap service to CoreState
Samuel Just [Fri, 1 Jul 2022 07:22:32 +0000 (00:22 -0700)]
crimson/osd: move osdmap service to CoreState

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: factor out open_meta_coll, open_or_create_meta_coll
Samuel Just [Fri, 1 Jul 2022 07:23:34 +0000 (00:23 -0700)]
crimson/osd: factor out open_meta_coll, open_or_create_meta_coll

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: move osd_state to CoreState
Samuel Just [Wed, 13 Jul 2022 00:04:45 +0000 (00:04 +0000)]
crimson/osd: move osd_state to CoreState

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: move osdmap_gate to CoreState
Samuel Just [Tue, 12 Jul 2022 23:51:01 +0000 (16:51 -0700)]
crimson/osd: move osdmap_gate to CoreState

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd: introduce pg_shard_manager to clarify shard-local vs osd-wide state
Samuel Just [Tue, 12 Jul 2022 22:35:44 +0000 (22:35 +0000)]
crimson/osd: introduce pg_shard_manager to clarify shard-local vs osd-wide state

This commit begins to change ShardServices to be the interface by which
PGs access shard local and osd wide state.  Future work will further
clarify this interface boundary and introduce machinery to mediate cold
path access to state on remote shards.

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agoMerge pull request #47081 from zhscn/fix-map-exist
Samuel Just [Wed, 13 Jul 2022 20:59:51 +0000 (13:59 -0700)]
Merge pull request #47081 from zhscn/fix-map-exist

crimson/os/seastore: fix bug of Transaction::is_retired

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agodoc/cephadm: add note about OSDs being recreated to OSD removal section 47066/head
Adam King [Tue, 12 Jul 2022 20:54:19 +0000 (16:54 -0400)]
doc/cephadm: add note about OSDs being recreated to OSD removal section

Signed-off-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #46918 from bigjust/wip-jcaratza-mib-rpm
Justin Caratzas [Wed, 13 Jul 2022 16:56:53 +0000 (12:56 -0400)]
Merge pull request #46918 from bigjust/wip-jcaratza-mib-rpm

monitoring:package SNMP MIB file as an rpm

3 years agomgr/dashboard: fix rbdconfiguration init type 46644/head
Pere Diaz Bou [Wed, 6 Jul 2022 15:47:31 +0000 (17:47 +0200)]
mgr/dashboard: fix rbdconfiguration init type

Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agomgr/dashboard: prevent alert redirect 47011/head
Tatjana Dehler [Thu, 7 Jul 2022 15:21:14 +0000 (17:21 +0200)]
mgr/dashboard: prevent alert redirect

Prevent Alertmanager alerts from being redirected to the active mgr
dashboard instance. There are two reasons for it:

1. It doesn't bring any additional benefit. The Alertmanager config
   includes all available mgr instances - active and passive ones. In
   case of an alert, it will be sent to all of them. It ensures that
   the active mgr dashboard will receive the alert in any case.
2. The redirect URL includes the mgr IP and NOT the FQDN. This leads
   to issues in environments where an SSL certificate is configured and
   matches only the FQDNs.

Fixes: https://tracker.ceph.com/issues/56401
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
3 years agocephadm/box: Choose between docker or podman with --engine 45981/head
Pere Diaz Bou [Tue, 12 Jul 2022 10:28:47 +0000 (12:28 +0200)]
cephadm/box: Choose between docker or podman with --engine

Passing --engine docker to ./box.py selects docker instead of podman.
When docker is used, box.py should be run with sudo.

Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agoMerge pull request #46114 from myoungwon/wip-dedup-tool-object-dedup-snapshot
Yuri Weinstein [Wed, 13 Jul 2022 14:32:32 +0000 (07:32 -0700)]
Merge pull request #46114 from myoungwon/wip-dedup-tool-object-dedup-snapshot

tool/ceph-dedup-tool: add performing dedup option on cloned object

Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agoMerge pull request #47021 from aisakaki/wip-cache-opt
Yingxin [Wed, 13 Jul 2022 14:01:13 +0000 (22:01 +0800)]
Merge pull request #47021 from aisakaki/wip-cache-opt

crimson/os/seastore/cache: fine-grained lru cache control with GC

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agocrimson/os/seastore: fix bug of Transaction::is_retired 47081/head
Zhang Song [Wed, 13 Jul 2022 12:45:05 +0000 (20:45 +0800)]
crimson/os/seastore: fix bug of Transaction::is_retired

The retired extent may exist as a RetiredExtentPlaceholder; casting
such an extent to LogicalCachedExtent causes undefined behavior.
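
One way to guard such a cast is sketched below. The types are simplified,
hypothetical stand-ins that only borrow the names from the message above,
and the `is_logical()` check and the `is_retired()` signature are
assumptions made for the example, not the actual seastore code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-ins for the seastore extent types.
struct CachedExtent {
  virtual ~CachedExtent() = default;
  virtual bool is_logical() const { return false; }
};

struct LogicalCachedExtent : CachedExtent {
  std::uint64_t laddr;
  explicit LogicalCachedExtent(std::uint64_t l) : laddr(l) {}
  bool is_logical() const override { return true; }
};

// Stands in for a retired extent that is not a LogicalCachedExtent.
// A static_cast of this object to LogicalCachedExtent is undefined behavior.
struct RetiredExtentPlaceholder : CachedExtent {};

// Returns true if a logical extent with the given laddr is in the retired
// set. Placeholders are skipped instead of being cast.
bool is_retired(const std::vector<std::unique_ptr<CachedExtent>>& retired_set,
                std::uint64_t laddr) {
  for (const auto& e : retired_set) {
    if (!e->is_logical()) {
      continue;  // never cast a placeholder to LogicalCachedExtent
    }
    const auto& logical = static_cast<const LogicalCachedExtent&>(*e);
    if (logical.laddr == laddr) {
      return true;
    }
  }
  return false;
}

// Usage: a set holding one placeholder and one logical extent.
void is_retired_example() {
  std::vector<std::unique_ptr<CachedExtent>> retired;
  retired.push_back(std::make_unique<RetiredExtentPlaceholder>());
  retired.push_back(std::make_unique<LogicalCachedExtent>(42));
  assert(is_retired(retired, 42));
  assert(!is_retired(retired, 7));
}
```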

Signed-off-by: Zhang Song <zhangsong325@gmail.com>
3 years agoMerge pull request #46898 from rhcs-dashboard/cleanup-55720-master
Pere Diaz Bou [Wed, 13 Jul 2022 12:43:06 +0000 (14:43 +0200)]
Merge pull request #46898 from rhcs-dashboard/cleanup-55720-master

mgr/dashboard: don't log tracebacks on 404s

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #46987 from rhcs-dashboard/fix-ingress-backend-service-filter
Pere Diaz Bou [Wed, 13 Jul 2022 12:41:31 +0000 (14:41 +0200)]
Merge pull request #46987 from rhcs-dashboard/fix-ingress-backend-service-filter

mgr/dashboard: ingress backend service should list all supported services

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: sunilangadi2 <NOT@FOUND>
3 years agoMerge pull request #47006 from teuchert/fix_56269
Venky Shankar [Wed, 13 Jul 2022 09:29:30 +0000 (14:59 +0530)]
Merge pull request #47006 from teuchert/fix_56269

mgr/snap_schedule: Use rados.Ioctx.remove_object() instead of remove().

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
3 years agoqa/tasks: add thrash test for persistent write log cache
Yin Congmin [Fri, 7 Jan 2022 07:03:44 +0000 (15:03 +0800)]
qa/tasks: add thrash test for persistent write log cache

Add a thrash test for the persistent write log cache. It runs rbd bench
on the persistent write log cache, thrashes rbd bench, and tests the
recovery function of the persistent write log cache.

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
3 years agocrimson/os/seastore/cache: fine-grained lru cache control with GC 47021/head
Xinyu Huang [Tue, 12 Jul 2022 10:19:04 +0000 (10:19 +0000)]
crimson/os/seastore/cache: fine-grained lru cache control with GC

GC transactions are not driven by user behavior, so extent reads
issued by GC transactions don't satisfy the temporal locality
principle. These extents should not be added to the LRU cache.
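
The idea can be sketched with a small LRU that ignores reads coming from
GC. Everything here is hypothetical and chosen for illustration only: the
`Source` enum, the `Lru` class, and the integer extent ids are not the
seastore cache interface.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical tag for who issued the read.
enum class Source { USER, GC };

class Lru {
  std::size_t capacity;
  std::list<int> order;  // most recently used id at the front
  std::unordered_map<int, std::list<int>::iterator> index;

public:
  explicit Lru(std::size_t cap) : capacity(cap) {}

  // Record a read of extent `id`. Reads issued by GC are ignored: they
  // neither insert the extent nor promote an existing entry.
  void touch(int id, Source src) {
    if (src == Source::GC || capacity == 0) {
      return;
    }
    auto it = index.find(id);
    if (it != index.end()) {
      order.erase(it->second);       // re-inserted at the front below
    } else if (order.size() == capacity) {
      index.erase(order.back());     // evict the least recently used id
      order.pop_back();
    }
    order.push_front(id);
    index[id] = order.begin();
  }

  bool contains(int id) const { return index.count(id) != 0; }
};

// Usage: the GC read of extent 2 leaves the cache untouched.
void lru_example() {
  Lru cache(2);
  cache.touch(1, Source::USER);
  cache.touch(2, Source::GC);
  assert(cache.contains(1));
  assert(!cache.contains(2));
}
```

Skipping the promotion as well as the insertion is a design choice of this
sketch; the message above only states that GC-read extents should not be
added.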

Signed-off-by: Xinyu Huang <xinyu.huang@intel.com>
3 years agocrimson/osd/osdmap_gate: remove ShardServices as a member
Samuel Just [Fri, 24 Jun 2022 23:33:53 +0000 (23:33 +0000)]
crimson/osd/osdmap_gate: remove ShardServices as a member

Instead, pass into get_map.

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/osd/osd_operations/peering_event: remove shard_services from constructor
Samuel Just [Fri, 24 Jun 2022 22:02:06 +0000 (15:02 -0700)]
crimson/osd/osd_operations/peering_event: remove shard_services from constructor

We'll want to supply this as part of with_pg_operation etc.

Signed-off-by: Samuel Just <sjust@redhat.com>