Greg Farnum [Fri, 12 Nov 2021 23:05:02 +0000 (23:05 +0000)]
mon: MonMap: display disallowed_leaders whenever they're set
In c59a6f89465e3933631afa2ba92e8c1ae1c31c06, I erroneously changed
the CLI display output so it would only dump disallowed_leaders in
stretch mode. But they can also be set in the connectivity and disallow
election modes, and we want users to be able to see them in those modes as well.
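For reference, disallowed leaders can be set outside stretch mode through the election commands; a minimal console sketch (the mon name `a` is hypothetical):
```
ceph mon set election_strategy disallow
ceph mon add disallowed_leader a
ceph mon dump    # disallowed_leaders should now appear here regardless of stretch mode
```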
Sage Weil [Thu, 11 Nov 2021 15:31:22 +0000 (10:31 -0500)]
Merge PR #43046 into master
* refs/pull/43046/head:
mgr/rook: get running pods, auth rm, better error checking for orch nfs
qa/tasks/rook: add apply nfs to rook qa task
mgr/rook: prevent creation of NFS clusters not in .nfs rados pool
mgr/rook, mgr/nfs: update rook orchestrator to create and use .nfs pool
Reviewed-by: Juan Miguel Olmo <jolmomar@redhat.com>
Reviewed-by: Varsha Rao <rvarsha016@gmail.com>
Roland Sommer [Fri, 8 Oct 2021 06:40:26 +0000 (08:40 +0200)]
mgr/prometheus: Make standby discoverable
Enable config settings to modify the standby's behaviour on the index page.
This makes the standby discoverable by reverse proxy or load balancer
setups. Testing the '/metrics' endpoint for an empty response would
instead trigger metric collection on the active manager instance.
The newly added configuration options standby_behaviour and
standby_error_status_code are documented and flagged as runtime, as
modifying either setting has an immediate effect (no restart required).
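A minimal sketch of switching a standby to return an error status instead of the empty default page (assuming the options live under the module's usual `mgr/prometheus/` config prefix):
```
ceph config set mgr mgr/prometheus/standby_behaviour error
ceph config set mgr mgr/prometheus/standby_error_status_code 503
```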
Co-authored-by: Ernesto Puerta <37327689+epuertat@users.noreply.github.com>
Signed-off-by: Roland Sommer <rol@ndsommer.de>
Fixes: https://tracker.ceph.com/issues/53229
Patrick Donnelly [Wed, 10 Nov 2021 18:58:48 +0000 (13:58 -0500)]
Merge PR #42520 into master
* refs/pull/42520/head:
test: add cephfs-mirror HA active/active workunit and test yamls
test: add cephfs_mirror thrasher
tasks/cephfs_mirror: optionally run in foreground
mgr/mirroring: throttle directory reassigment to mirror daemons
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Joseph Sawaya [Tue, 14 Sep 2021 18:54:41 +0000 (14:54 -0400)]
mgr/rook: get running pods, auth rm, better error checking for orch nfs
This commit updates `orch ls` to show the age and the number of running
NFS pods, removes auth entities when removing an NFS service, and
implements better error checking when creating NFS daemons.
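Illustrative invocation; the new fields appear when listing the NFS services:
```
ceph orch ls nfs    # RUNNING and AGE now reflect the actual nfs pods
```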
Joseph Sawaya [Fri, 30 Jul 2021 16:07:31 +0000 (12:07 -0400)]
mgr/rook, mgr/nfs: update rook orchestrator to create and use .nfs pool
This commit moves the functionality for creating the .nfs pool from the
nfs module to the rook module and makes the rook module use the .nfs
pool when creating an NFS daemon.
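A hedged sketch of the resulting flow (the cluster name `mynfs` is hypothetical):
```
ceph nfs cluster create mynfs
ceph osd pool ls | grep '^\.nfs$'    # daemons created via rook are backed by the shared .nfs pool
```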
mds/FSMap: assign v16.2.4 compat to pre-v16.2.5 standby daemons
With v16.2.5, the monitors store an MDS's CompatSet with its mds_info_t
in the MDSMap. If an older MDS fails and rejoins the cluster, it gets
assigned the empty CompatSet. This is problematic during upgrades as an
MDS failure may prevent the upgrade process from continuing and cause
file system unavailability.
This patch makes it so the mons will assign a reasonable default: the
CompatSet used from v14.2.0 through v16.2.5.
Fixes: https://tracker.ceph.com/issues/53150
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Fri, 5 Nov 2021 15:39:07 +0000 (11:39 -0400)]
mgr/cephadm: allow osd spec removal
OSD specs/drivegroups are essentially templates for OSD creation but do
not map to the full lifecycle of the OSDs that they create. When a spec
removal is requested, remove it immediately.
If --force is not provided, the error lists which OSDs would be left behind.
If --force is passed, the service is removed.
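A hedged console sketch (the spec name `my_drivegroup` is hypothetical):
```
ceph orch rm osd.my_drivegroup            # errors out, listing the OSDs that would be left behind
ceph orch rm osd.my_drivegroup --force    # removes the spec; the OSDs themselves keep running
```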
This leaves behind a few oddities:
- When you list services, OSDs that were created by the drivegroup may
still exist, causing the drivegroup to appear in the list as
unmanaged services.
- If you create a new drivegroup with the same name, the prior OSDs will
appear to belong to the new spec instance, regardless of whether the
spec/drivegroup parameters are the same.
Kamoltat [Fri, 29 Oct 2021 21:23:52 +0000 (21:23 +0000)]
pybind/mgr/pg_autoscaler: typo default option scale-up to scale-down
Typo: `scale-up` should be `scale-down` in Module
Option.
This typo doesn't trigger a bug because we create
a key-value of `scale-down` profile in
the function `create_initial()` in `src/mon/KVMonitor.cc`.
This will override whatever is the default option
in pg_autoscaler/module.py when we start the cluster and
the monitor gets created.
The command `ceph osd pool set autoscale-profile <option>` is still the
primary way to change the autoscale profile after the pool is created.
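For example, switching the cluster to the scale-down profile:
```
ceph osd pool set autoscale-profile scale-down
```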
crimson/net: drop crimson-specific check for the addr in ClientIdentFrame.
In crimson (but not in the classic OSD) we have an extra check that
verifies the address sent by our peer in `ClientIdentFrame` matches
`AsyncConnection::target_addr` at our side. In the Rook environment
this leads to problems with all cluster entities that are lacking
the `ms_learn_addr_from_peer=false` setting in their configurations.
This is true for `ceph-mgr`:
```
[root@rook-ceph-tools-698545dc56-zxrrx /]# ceph config show mgr.a ms_learn_addr_from_peer
true
```
Unfortunately, testing has shown that:
* clients in Rook also lack this extra bit of configuration, while
* removing the extra check in crimson removes the need for any
additional configuration on the client side.
Although this still might look like a workaround for Rook setting
`ms_learn_addr_from_peer=false` solely for OSDs, I think we should
drop the check to preserve both:
* consistency of behaviour between OSD implementations,
* compatibility with Ceph clients in existing k8s clusters.
```
INFO 2021-10-26 18:53:26,067 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] ProtocolV2::start_accept(): target_addr=172.17.0.5:59700/0
DEBUG 2021-10-26 18:53:26,067 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] TRIGGER ACCEPTING, was NONE
DEBUG 2021-10-26 18:53:26,067 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] SEND(26) banner: len_payload=16, supported=1, required=0, banner="ceph v2
"
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] RECV(10) banner: "ceph v2
"
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] GOT banner: payload_len=16
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] RECV(16) banner features: supported=1 required=0
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] WRITE HelloFrame: my_type=osd, peer_addr=172.17.0.5:59700/0
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> unknown.? -@59700] GOT HelloFrame: my_type=client peer_addr=v2:172.17.0.2:6800/1270141526
INFO 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
WARN 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] my_addr_from_peer v2:172.17.0.2:6800/1270141526 port/nonce DOES match myaddr v2:172.17.0.2:6800/1270141526
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
INFO 2021-10-26 18:53:26,068 [shard 0] monc - added challenge on [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700]
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] WRITE AuthReplyMoreFrame: payload_len=32
DEBUG 2021-10-26 18:53:26,068 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] GOT AuthRequestMoreFrame: payload_len=174
DEBUG 2021-10-26 18:53:26,069 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] WRITE AuthDoneFrame: gid=14788, con_mode=crc, payload_len=36
DEBUG 2021-10-26 18:53:26,069 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] WRITE AuthSignatureFrame: signature=975c5d3ae09036abcb2ca7d4f7704ee681ca13151d9de2ee29394ec8aed9950c
DEBUG 2021-10-26 18:53:26,069 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] GOT AuthSignatureFrame: signature=6209032314d560a21a3109ec6d7c0623ebd78cf1ea4fc9462411dbabe28b2d8d
DEBUG 2021-10-26 18:53:26,069 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] GOT ClientIdentFrame: addrs=172.17.0.1:0/1137248631, target=v2:172.17.0.2:6800/1270141526, gid=14788, gs=9, features_supported=4540138297136906239, features_required=576460752303427584, flags=1, cookie=0
WARN 2021-10-26 18:53:26,069 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] peer's address 172.17.0.1:0/1137248631 is not v2 or not the same host with 172.17.0.5:59700/0
INFO 2021-10-26 18:53:26,070 [shard 0] ms - [osd.0(client) v2:172.17.0.2:6800/1270141526 >> client.? -@59700] execute_accepting(): fault at ACCEPTING, going to CLOSING -- std::system_error (error crimson::net:2, bad peer address)
```
This connectivity issue has been overcome by appending
`--ms_learn_addr_from_peer=false` to the `argv`:
```
[root@rook-ceph-tools-698545dc56-zxrrx /]# bin/rados bench -p test-pool 5 rand --ms_learn_addr_from_peer=false
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 1235 1219 4.76106 4.76172 0.020999 0.0129408
2 16 2531 2515 4.91158 5.0625 0.0131776 0.0126796
3 16 3746 3730 4.8563 4.74609 0.0145268 0.0128361
4 16 4951 4935 4.81889 4.70703 0.0154604 0.0129421
5 15 6236 6221 4.85972 5.02344 0.0121689 0.0128415
Total time run: 5.01136
Total reads made: 6236
Read size: 4096
Object size: 4096
Bandwidth (MB/sec): 4.86083
Average IOPS: 1244
Stddev IOPS: 43.1706
Max IOPS: 1296
Min IOPS: 1205
Average Latency(s): 0.01284
Max latency(s): 0.0244048
Min latency(s): 0.00201867
```
However, with the classic OSD and **crimson with this patch applied**
there is no need for any configurables on the client side:
```
[rook@rook-ceph-tools-698545dc56-xkkpf /]$ bin/rados bench -p test-pool 5 rand
hints = 1
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 1124 1108 4.32747 4.32812 0.011878 0.0143472
2 16 2323 2307 4.50534 4.68359 0.0117413 0.0138221
3 16 3517 3501 4.55813 4.66406 0.0195142 0.0136663
4 16 4680 4664 4.55425 4.54297 0.0131425 0.0136958
5 16 5725 5709 4.45976 4.08203 0.0143174 0.0139868
Total time run: 5.01332
Total reads made: 5725
Read size: 4096
Object size: 4096
Bandwidth (MB/sec): 4.46077
Average IOPS: 1141
Stddev IOPS: 65.113
Max IOPS: 1199
Min IOPS: 1045
Average Latency(s): 0.0139892
Max latency(s): 0.0361518
Min latency(s): 0.00231195
```
During the documentation pass for the Zipper API, a number of cleanups
were found: APIs that should be slightly different, or that were
entirely unused. This is a rollup commit of all those cleanups.
- move get_multipart_upload() to Bucket
- remove unused defer_gc
- move create_bucket() into User
- rename get_bucket_info() to load_bucket() to match load_user()
- remove read_bucket_stats()
The codepaths using read_bucket_stats() used CLS data types, and the
function is confusingly named. Load the ent in load_bucket(), and use
an alternative data structure to get size stats for the bucket.
- rename get_bucket_stats to read_stats
- remove remove_metadata() from the API
- remove copy_obj_data() from the API
- rename get_obj_layout to dump_obj_layout
- use SAL range_to_ofs
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Venky Shankar [Tue, 10 Aug 2021 07:04:51 +0000 (03:04 -0400)]
tasks/cephfs_mirror: optionally run in foreground
The cephfs mirror daemon thrasher needs to send SIGTERM to mirror
daemons. The mirror daemon needs to run in the foreground to receive
signals via `daemon.signal`.
Sage Weil [Mon, 8 Nov 2021 19:43:25 +0000 (14:43 -0500)]
Merge PR #43827 into master
* refs/pull/43827/head:
qa/suites/orch/cephadm: add repave-all test case
mgr/cephadm/services/osd: less noisy
mgr/cephadm/services/osd: do not log ok-to-stop/safe-to-destroy failures
mgr/orchestrator: clean up 'orch osd rm status'
Xuehan Xu [Sun, 7 Nov 2021 07:47:02 +0000 (15:47 +0800)]
crimson/os/seastore/segment_cleaner: initialize segments' avail_bytes with segments' sizes
Currently, we initialize segments' avail_bytes with "segment_size * num_segments".
Both segment_size and num_segments are 32 bits wide, so multiplying them can
overflow; for example, with 64 MiB segments, 64 segments are already enough to
exceed the 4 GiB range of a 32-bit product.
Paul Cuzner [Wed, 3 Nov 2021 02:24:20 +0000 (15:24 +1300)]
mgr/prometheus: Update rule format and enhance SNMP support
Rules now adhere to the format defined by Prometheus.io.
This changes alert naming, and each alert now includes a
summary description to provide a quick one-liner.
In addition to the reformatting, some missing alerts for MDS and
cephadm have been added, along with corresponding tests.
The MIB has also been refactored so it now passes standard
lint tests, and a README has been included for devs to understand the
OID schema.
Fixes: https://tracker.ceph.com/issues/53111
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
Laura Flores [Thu, 4 Nov 2021 17:55:51 +0000 (17:55 +0000)]
mgr/telemetry: modify stats_per_pool
There is a much easier way to collect stats_per_pool than the current implementation. Fetching 'pg_dump' from the mgr module already provides a field called "pool_stats", which matches the aggregated pg stats that the implementation computed up until this commit.
All in all, this solution should provide the information we want, with a much cleaner implementation.
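The same per-pool aggregates can be inspected from the CLI (output elided):
```
ceph pg dump pools -f json-pretty    # per-pool stat sums, analogous to the pool_stats field used here
```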
Signed-off-by: Laura Flores <lflores@redhat.com>
Backport Message: In the case that this commit is backported, it is important to note that the commits in PR #42569 should be backported first, as the implementation of "get_stat_sum_per_pool()" in #42569 precedes the removal of it here.