git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
23 months agoosd/scrub: do not clear PG_STATE_REPAIR unconditionally 53843/head
Ronen Friedman [Sat, 28 Oct 2023 16:42:34 +0000 (11:42 -0500)]
osd/scrub: do not clear PG_STATE_REPAIR unconditionally

As we now call clear_pgscrub_state() at the end of each
'Session' state, we must not clear PG_STATE_REPAIR
unconditionally.

Previously - scrubs that reached normal completion, i.e.
reached PgScrubber::scrub_finish(), would have only cleared
that PG flag under specific conditions. That was changed in
previous commits of this PR, and is now fixed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: extend scrub reservation timeout
Ronen Friedman [Mon, 23 Oct 2023 15:38:18 +0000 (18:38 +0300)]
osd/scrub: extend scrub reservation timeout

As replicas are now reserved sequentially, the scrub reservation
timeout should be extended.

Taking into account the low priority of scrub-related messages,
modify both 'osd_scrub_reservation_timeout' and
'osd_scrub_slow_reservation_response'.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: modify slow-replica-reply warning implementation
Ronen Friedman [Sat, 14 Oct 2023 12:55:42 +0000 (07:55 -0500)]
osd/scrub: modify slow-replica-reply warning implementation

Merging the 'do once' functionality into the timeout duration.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: handle reservation completion within the Scrubber FSM
Ronen Friedman [Sat, 14 Oct 2023 12:36:06 +0000 (07:36 -0500)]
osd/scrub: handle reservation completion within the Scrubber FSM

with special handling for the 0-replica case.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: move ReplicaReservations into the Scrubber FSM
Ronen Friedman [Fri, 13 Oct 2023 17:14:31 +0000 (12:14 -0500)]
osd/scrub: move ReplicaReservations into the Scrubber FSM

Handle grant/deny messages within the FSM.
One exception at this point: the handling of "granted by everyone"
(due to the technical inconvenience of having to handle the
"0 replicas" case in the FSM state constructor).

Note: after this commit, ScrubMachineListener - an API which is
a subset of the Scrubber API to be used by the Scrubber FSM - no
longer makes sense. The FSM should now have full access to the
scrubber, and that interface will be removed in a subsequent PR.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: extract ReplicaReservations into separate files
Ronen Friedman [Fri, 13 Oct 2023 16:07:56 +0000 (11:07 -0500)]
osd/scrub: extract ReplicaReservations into separate files

As a preliminary step before ReplicaReservations ownership
is moved to the scrubber's FSM.

No code changes in this commit (apart from required 'include's).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: route grant/deny messages through the scrubber FSM
Ronen Friedman [Fri, 13 Oct 2023 12:48:44 +0000 (07:48 -0500)]
osd/scrub: route grant/deny messages through the scrubber FSM

The scrubber FSM will now be responsible for handling the grant/deny
ops received from the replica OSDs.
For this temporary step - the scrubber FSM will simply forward a
call to the ReplicaReservations object in the Scrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: reserve replicas one by one, and in consistent order
Ronen Friedman [Sun, 24 Sep 2023 12:34:14 +0000 (07:34 -0500)]
osd/scrub: reserve replicas one by one, and in consistent order

Issuing the reservation requests one by one - waiting for
approval from the secondary before the next request is sent.

The requests are sent in ascending target pg-shard-id order, reducing the
chance of having two PGs repeatedly competing for the same set of
OSDs - and doing so in an interleaved sequence.

Modifying the Session state in the scrubber FSM to react to interval
changes by discarding replica reservations.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
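
The ordering argument in the commit above can be illustrated with a small sketch. This is not the Ceph implementation (which is C++ and message-driven); `reserve_in_order` and the `try_reserve` / `release` callbacks are hypothetical names, and shard ids are modeled as plain integers.

```python
from typing import Callable, Iterable, List


def reserve_in_order(
    replicas: Iterable[int],
    try_reserve: Callable[[int], bool],
    release: Callable[[int], None],
) -> bool:
    """Reserve replicas one at a time, in ascending id order.

    Hypothetical sketch: returns True if every replica granted the
    request; on the first denial, releases what was already granted
    and returns False.
    """
    granted: List[int] = []
    for replica in sorted(replicas):
        # Wait for this replica's answer before contacting the next one.
        if try_reserve(replica):
            granted.append(replica)
        else:
            for held in reversed(granted):
                release(held)
            return False
    return True
```

Under these assumptions, two PGs that share replicas both ask the lowest-numbered shared replica first, so the loser is turned away before it holds anything the winner still needs.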
23 months agoosd/scrub: group all scrub session states into a Session state
Ronen Friedman [Mon, 2 Oct 2023 16:29:51 +0000 (11:29 -0500)]
osd/scrub: group all scrub session states into a Session state

The Session state now includes the ReservingReplicas & Active
sub-states.

This new state will hold (in future commits) most of the scrub
state information that relates to a specific scrub session (and
should be cleaned up when that session terminates).

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
23 months agoosd/scrub: modify 'a PG is reserving' to note PG
Ronen Friedman [Mon, 2 Oct 2023 09:43:54 +0000 (04:43 -0500)]
osd/scrub: modify 'a PG is reserving' to note PG

Only the PG that had set the 'I am in the process of
reserving my replicas' flag is allowed to clear it.

That will simplify the follow-up commit, setting this flag from
a specific scrub-FSM state.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoMerge pull request #53531 from ronen-fr/wip-rf-squeue2 53263/head
Ronen Friedman [Fri, 22 Sep 2023 13:46:38 +0000 (16:46 +0300)]
Merge pull request #53531 from ronen-fr/wip-rf-squeue2

osd/scrub: extract scrub initiation code out of the OSD

Reviewed-by: Samuel Just <sjust@redhat.com>
2 years agoMerge pull request #53311 from idryomov/wip-62711
Ilya Dryomov [Fri, 22 Sep 2023 13:44:35 +0000 (15:44 +0200)]
Merge pull request #53311 from idryomov/wip-62711

qa/suites/{rbd,krbd}: disable POOL_APP_NOT_ENABLED health check

Reviewed-by: Ramana Raja <rraja@redhat.com>
2 years agoMerge pull request #53528 from rishabh-d-dave/cephfs-qa-mdtest
Rishabh Dave [Fri, 22 Sep 2023 06:26:03 +0000 (11:56 +0530)]
Merge pull request #53528 from rishabh-d-dave/cephfs-qa-mdtest

qa/cephfs: fix build failure for mdtest project

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoqa/suites/krbd: disable POOL_APP_NOT_ENABLED health check 53311/head
Ilya Dryomov [Fri, 15 Sep 2023 13:33:27 +0000 (15:33 +0200)]
qa/suites/krbd: disable POOL_APP_NOT_ENABLED health check

... same as for rbd suite.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 years agoqa/suites/rbd: drop POOL_APP_NOT_ENABLED from ignorelists
Ilya Dryomov [Fri, 15 Sep 2023 13:33:27 +0000 (15:33 +0200)]
qa/suites/rbd: drop POOL_APP_NOT_ENABLED from ignorelists

With "mon warn on pool no app = false" in the config, it's obviously
redundant.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 years agoqa/suites/rbd: disable POOL_APP_NOT_ENABLED health check
Ilya Dryomov [Fri, 15 Sep 2023 13:33:27 +0000 (15:33 +0200)]
qa/suites/rbd: disable POOL_APP_NOT_ENABLED health check

Commit 990806e635a1 ("mon, qa: issue pool application warning even
if pool is empty") made it impossible to create a pool without raising
a (bogus) health alert.  See [1] for details.

[1] https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/ZTDYC5HN677RR26EB4P6PORN6L2IFH4R/

Fixes: https://tracker.ceph.com/issues/62711
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 years agoMerge pull request #53448 from cbodley/wip-62378
Casey Bodley [Thu, 21 Sep 2023 21:15:37 +0000 (22:15 +0100)]
Merge pull request #53448 from cbodley/wip-62378

rgw/crypt: don't deref null manifest_bl

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #53045 from smanjara/wip-shilpa-revert-51772
Casey Bodley [Thu, 21 Sep 2023 19:59:59 +0000 (20:59 +0100)]
Merge pull request #53045 from smanjara/wip-shilpa-revert-51772

rgw/multisite: fixes assertion failure during realm reload

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 years agoMerge pull request #53493 from zdover23/wip-doc-2023-09-18-architecture-7-of-x
zdover23 [Thu, 21 Sep 2023 17:41:21 +0000 (03:41 +1000)]
Merge pull request #53493 from zdover23/wip-doc-2023-09-18-architecture-7-of-x

doc/architecture: "Edit HA Auth" (one of several)

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 years agoMerge pull request #53505 from cbodley/wip-62771
Casey Bodley [Thu, 21 Sep 2023 17:30:07 +0000 (18:30 +0100)]
Merge pull request #53505 from cbodley/wip-62771

rgw/sal: get_placement_target_names() returns void

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge pull request #52392 from yuvalif/wip-yuval-trace-name
Yuval Lifshitz [Thu, 21 Sep 2023 17:12:01 +0000 (20:12 +0300)]
Merge pull request #52392 from yuvalif/wip-yuval-trace-name

rgw: rename request traces and change for tags

reviewed-by: cbodley

2 years agoMerge PR #50503 into main
Patrick Donnelly [Thu, 21 Sep 2023 15:51:31 +0000 (11:51 -0400)]
Merge PR #50503 into main

* refs/pull/50503/head:
mon: do not change pending if strategy is unchanged
mon/MonmapMonitor: do not propose on error in prepare_update
mon/MonmapMonitor: wait for commit before reply
mon: use wait_for_commit to reply
mon: add context list for commit wait
mon: remove unused method
test/mon: add commit benchmark script
mon/MonClient: provide config to target specific rank

Reviewed-by: Laura Flores <lflores@redhat.com>
Reviewed-by: Ramana Raja <rraja@redhat.com>
2 years agodoc/architecture: "Edit HA Auth" (one of several) 53493/head
Zac Dover [Sun, 17 Sep 2023 20:41:28 +0000 (06:41 +1000)]
doc/architecture: "Edit HA Auth" (one of several)

Edit "High Availability Authentication" in doc/architecture.rst.

Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
2 years agoosd/scrub: modify schedule_result_t to report error class 53531/head
Ronen Friedman [Thu, 21 Sep 2023 12:34:52 +0000 (07:34 -0500)]
osd/scrub: modify schedule_result_t to report error class

(which directly translates to the required followup action)
instead of reporting the exact failure. The specifics of the failure
were never used by the scrub scheduler.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: scheduler refactoring - cleanups
Ronen Friedman [Thu, 21 Sep 2023 09:57:30 +0000 (04:57 -0500)]
osd/scrub: scheduler refactoring - cleanups

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoqa/cephfs: add "set -x" in mdtest.yaml 53528/head
Rishabh Dave [Thu, 21 Sep 2023 12:41:41 +0000 (18:11 +0530)]
qa/cephfs: add "set -x" in mdtest.yaml

Set the flag for printing the commands that will be executed so that
it's easier to go through teuthology.log

Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 years agoqa/cephfs: fix build failure for mdtest project
Rishabh Dave [Wed, 20 Sep 2023 08:42:43 +0000 (14:12 +0530)]
qa/cephfs: fix build failure for mdtest project

To fix the mdtest job failure (which happens because building the mdtest
project fails), do the following:

1. Use the ior project instead of the mdtest project, because the latter
   was merged into the former. See:
   https://github.com/MDTEST-LANL/mdtest/blob/master/README.md

2. Purge mpich package and then install it again. This is a vital step
   that's needed to build ior project on Ubuntu 22.04.

Fixes: https://tracker.ceph.com/issues/61574
Signed-off-by: Rishabh Dave <ridave@redhat.com>
2 years agoMerge pull request #52863 from batrick/i62326
Adam King [Thu, 21 Sep 2023 12:14:54 +0000 (08:14 -0400)]
Merge pull request #52863 from batrick/i62326

pybind/mgr/cephadm/upgrade: stop disabling FSMap sanity checks

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoosd/scrub: scheduler: removing unused code
Ronen Friedman [Thu, 21 Sep 2023 09:59:11 +0000 (04:59 -0500)]
osd/scrub: scheduler: removing unused code

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: correct placement for some scheduler-related methods
Ronen Friedman [Tue, 19 Sep 2023 14:04:23 +0000 (09:04 -0500)]
osd/scrub: correct placement for some scheduler-related methods

Moving some member functions to their corresponding files.
Including ScrubQueue::dump_scrubs()
as it was moved in a previous commit,
and some ScrubJob code.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoMerge pull request #53501 from zhscn/wip-lba-backref-node-size
Yingxin [Thu, 21 Sep 2023 09:20:28 +0000 (17:20 +0800)]
Merge pull request #53501 from zhscn/wip-lba-backref-node-size

crimson/os/seastore: create page aligned bufferptr in copy ctor of CachedExtent

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
2 years agoMerge pull request #52150 from paulreece42/wip-grafana-quorum-fix
Nizamudeen A [Thu, 21 Sep 2023 07:06:21 +0000 (12:36 +0530)]
Merge pull request #52150 from paulreece42/wip-grafana-quorum-fix

monitoring: grafana mons out of quorum should be count - sum

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
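
The arithmetic in the title, as a sketch: assuming each monitor reports a quorum flag of 1 (in quorum) or 0 (out of quorum), the number of monitors out of quorum is the count of monitors minus the sum of the flags. The flag encoding and the function name are assumptions made for illustration; this is not the dashboard query itself.

```python
def mons_out_of_quorum(quorum_flags):
    """quorum_flags: one 0/1 value per monitor (1 = in quorum)."""
    return len(quorum_flags) - sum(quorum_flags)
```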
2 years agoMerge PR #53028 into main
Venky Shankar [Thu, 21 Sep 2023 00:41:54 +0000 (06:11 +0530)]
Merge PR #53028 into main

* refs/pull/53028/head:
Update MDSDaemon.cc
Update MDSRank.cc - Logoutput: Fix personal pronoun "I" to uppercase

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoMerge PR #53176 into main
Venky Shankar [Thu, 21 Sep 2023 00:33:42 +0000 (06:03 +0530)]
Merge PR #53176 into main

* refs/pull/53176/head:
doc: add note for removing (automatic) partitioning policy

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
2 years agoMerge pull request #53467 from phlogistonjohn/jjm-cephadm-move-get_data_dir
Adam King [Wed, 20 Sep 2023 18:20:09 +0000 (14:20 -0400)]
Merge pull request #53467 from phlogistonjohn/jjm-cephadm-move-get_data_dir

cephadm: move get data dir function to daemonidentity method

Reviewed-by: Adam King <adking@redhat.com>
2 years agoMerge pull request #53415 from rkachach/fix_issue_62814
Adam King [Wed, 20 Sep 2023 18:19:02 +0000 (14:19 -0400)]
Merge pull request #53415 from rkachach/fix_issue_62814

cephadm: fix cephadm binary mount when --shared_ceph_folder is used

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
2 years agoMerge pull request #53298 from phlogistonjohn/jjm-logging-2
Adam King [Wed, 20 Sep 2023 18:17:03 +0000 (14:17 -0400)]
Merge pull request #53298 from phlogistonjohn/jjm-logging-2

cephadm: enhance logging behavior

Reviewed-by: Adam King <adking@redhat.com>
2 years agoMerge pull request #52251 from rkachach/fix_issue_61856
Adam King [Wed, 20 Sep 2023 18:14:23 +0000 (14:14 -0400)]
Merge pull request #52251 from rkachach/fix_issue_61856

mgr/cephadm: Adding sort-by support for ceph orch ps

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
2 years agoMerge pull request #52982 from batrick/backport-cp-simplify
Ilya Dryomov [Wed, 20 Sep 2023 17:58:40 +0000 (19:58 +0200)]
Merge pull request #52982 from batrick/backport-cp-simplify

script/ceph-backport: perform cherry-pick in single command

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 years agoMerge PR #53145 into main
Patrick Donnelly [Wed, 20 Sep 2023 12:57:07 +0000 (08:57 -0400)]
Merge PR #53145 into main

* refs/pull/53145/head:
mds: log message when exiting due to asok command

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2 years agoMerge PR #53149 into main
Patrick Donnelly [Wed, 20 Sep 2023 12:31:17 +0000 (08:31 -0400)]
Merge PR #53149 into main

* refs/pull/53149/head:
qa: lengthen shutdown timeout for thrashed MDS

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2 years agodoc: add note for removing (automatic) partitioning policy 53176/head
Venky Shankar [Mon, 28 Aug 2023 10:42:57 +0000 (16:12 +0530)]
doc: add note for removing (automatic) partitioning policy

Signed-off-by: Venky Shankar <vshankar@redhat.com>
2 years agoMerge pull request #53309 from guits/bz2203397
Guillaume Abrioux [Wed, 20 Sep 2023 07:32:49 +0000 (09:32 +0200)]
Merge pull request #53309 from guits/bz2203397

ceph-volume: fix mpath device support

2 years agoosd/scrub: handle configuration changes in OsdScrub
Ronen Friedman [Tue, 19 Sep 2023 12:16:03 +0000 (07:16 -0500)]
osd/scrub: handle configuration changes in OsdScrub

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: move initiate_a_scrub() to OsdScrub
Ronen Friedman [Tue, 19 Sep 2023 11:55:25 +0000 (06:55 -0500)]
osd/scrub: move initiate_a_scrub() to OsdScrub

Scrub initiation is now fully owned by OsdScrub.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: add ScrubQueue::ready_to_scrub()
Ronen Friedman [Mon, 18 Sep 2023 16:31:37 +0000 (11:31 -0500)]
osd/scrub: add ScrubQueue::ready_to_scrub()

At this phase of the refactoring:
this is the main interface from the scrub scheduler in OsdScrub
to the ScrubQueue. The ScrubQueue provides the ordered list of
all targets (for now - PGs) that are ready for scrubbing.

Scrub initiation code is modified to use the new interface.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: move scrub_sleep_time() to OsdScrub
Ronen Friedman [Mon, 18 Sep 2023 15:11:21 +0000 (10:11 -0500)]
osd/scrub: move scrub_sleep_time() to OsdScrub

also scrub_time_permit().

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: extract restrictions_on_scrubbing()
Ronen Friedman [Mon, 18 Sep 2023 11:00:34 +0000 (06:00 -0500)]
osd/scrub: extract restrictions_on_scrubbing()

from ScrubQueue::select_pg_and_scrub().

Clearing the path to moving some ScrubQueue methods into
OsdScrub. Starting here with the CPU load tracker.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: introduce OsdService::get_locked_pg()
Ronen Friedman [Sat, 16 Sep 2023 16:53:35 +0000 (11:53 -0500)]
osd/scrub: introduce OsdService::get_locked_pg()

which returns an RAII-wrapper around a locked PG.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
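
For readers less familiar with RAII, a rough Python analogue of "a wrapper around a locked PG" is a context manager that releases the lock on every exit path. `FakePG` and `locked` are hypothetical names; the actual OsdService::get_locked_pg() is C++ and its interface is not shown in this log.

```python
import threading
from contextlib import contextmanager


class FakePG:
    """Hypothetical stand-in for a PG: just an id and a lock."""

    def __init__(self, pgid: str) -> None:
        self.pgid = pgid
        self.lock = threading.Lock()


@contextmanager
def locked(pg: FakePG):
    """Hold pg.lock for the duration of the with-block, then release it."""
    pg.lock.acquire()
    try:
        yield pg
    finally:
        # Runs on normal exit and on exceptions, much like a C++ destructor.
        pg.lock.release()
```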
2 years agoosd/scrub: move OSD::sched_scrub() to OsdScrub
Ronen Friedman [Fri, 15 Sep 2023 14:03:09 +0000 (09:03 -0500)]
osd/scrub: move OSD::sched_scrub() to OsdScrub

... (as OsdScrub::initiate_scrub()).

The random backoff dice roller (scrub_random_backoff())
is moved as well.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: declare OsdScrub, an OSD subobject
Ronen Friedman [Thu, 14 Sep 2023 14:09:43 +0000 (09:09 -0500)]
osd/scrub: declare OsdScrub, an OSD subobject

for all OSD scrub things.

For now: OsdScrub is mostly a forwarder to the ScrubQueue object
(which it now owns).
The resource counters moved into a separate object within OsdScrub.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: unify dout macros across scrub scheduling code
Ronen Friedman [Wed, 13 Sep 2023 07:09:14 +0000 (02:09 -0500)]
osd/scrub: unify dout macros across scrub scheduling code

to facilitate easy migration of code fragments between
related classes.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: complete ScrubJob transition from within ScrubQueue
Ronen Friedman [Mon, 11 Sep 2023 06:56:33 +0000 (01:56 -0500)]
osd/scrub: complete ScrubJob transition from within ScrubQueue

ScrubJob is now in the Scrub namespace.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: set_reserving_now() signature modified
Ronen Friedman [Sun, 10 Sep 2023 19:44:33 +0000 (14:44 -0500)]
osd/scrub: set_reserving_now() signature modified

set_reserving_now() can now return a failure status, indicating
a race between two PGs to start scrubbing on the same OSD.

The scrubber FSM is modified to handle the failure.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
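
A minimal sketch of a "set" call that can report losing a race, assuming a single per-OSD marker guarded by a mutex. The class name and the owner-only clear are illustrative assumptions (the latter mirrors the 'a PG is reserving' commit earlier in this log); the real code is C++ and may differ.

```python
import threading
from typing import Optional


class ReservingMarker:
    """Hypothetical per-OSD marker: at most one PG may be 'reserving now'."""

    def __init__(self) -> None:
        self._mutex = threading.Lock()
        self._owner: Optional[str] = None

    def set_reserving_now(self, pgid: str) -> bool:
        """Return True if pgid took the marker, False if another PG holds it."""
        with self._mutex:
            if self._owner is not None and self._owner != pgid:
                return False  # lost the race; the caller must handle this
            self._owner = pgid
            return True

    def clear_reserving_now(self, pgid: str) -> None:
        """Clear the marker, but only on behalf of the PG that set it."""
        with self._mutex:
            if self._owner == pgid:
                self._owner = None
```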
2 years agoosd/scrub: select_pg_and_scrub() moved into osd_scrub.cc
Ronen Friedman [Sun, 10 Sep 2023 18:50:35 +0000 (13:50 -0500)]
osd/scrub: select_pg_and_scrub() moved into osd_scrub.cc

No code changes.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: renaming & fmt support for restrictions structure
Ronen Friedman [Sun, 10 Sep 2023 18:32:53 +0000 (13:32 -0500)]
osd/scrub: renaming & fmt support for restrictions structure

Renaming ScrubPreconds, the collection of "environmental"
restrictions on possible scrubs, to OSDRestrictions.
Also - providing fmtlib support for that structure.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: introducing random_bool_with_probability()
Ronen Friedman [Sun, 10 Sep 2023 18:07:31 +0000 (13:07 -0500)]
osd/scrub: introducing random_bool_with_probability()

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
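
The commit gives only the helper's name. One plausible reading, sketched in Python rather than the C++ of the actual helper (whose signature is not shown here), is a function that returns true with a given probability:

```python
import random
from typing import Optional


def random_bool_with_probability(
    probability: float, rng: Optional[random.Random] = None
) -> bool:
    """Return True with the given probability (clamped to [0, 1])."""
    source = rng if rng is not None else random
    p = min(1.0, max(0.0, probability))
    # random() is uniform on [0.0, 1.0), so p == 0 never fires and p == 1 always does.
    return source.random() < p
```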
2 years agoosd/scrub: moving the resources counters code into a separate file
Ronen Friedman [Sun, 10 Sep 2023 17:55:59 +0000 (12:55 -0500)]
osd/scrub: moving the resources counters code into a separate file

No code changes in this commit.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: moving code as-is into osd_scrub.cc
Ronen Friedman [Sun, 10 Sep 2023 15:14:39 +0000 (10:14 -0500)]
osd/scrub: moving code as-is into osd_scrub.cc

Code from OSD.cc & osd_scrub_sched.cc moved as-is,
to be modified in followup commits.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agoosd/scrub: moving ScrubJob declaration as-is
Ronen Friedman [Sat, 9 Sep 2023 19:44:00 +0000 (14:44 -0500)]
osd/scrub: moving ScrubJob declaration as-is

from osd_scrub_sched.h into scrub_job.h.

A purely mechanical step. No code is changed.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
2 years agocrimson/os/seastore: create page aligned bufferptr in copy ctor of CachedExtent 53501/head
Zhang Song [Tue, 19 Sep 2023 06:08:51 +0000 (14:08 +0800)]
crimson/os/seastore: create page aligned bufferptr in copy ctor of CachedExtent

Signed-off-by: Zhang Song <zhangsong02@qianxin.com>
2 years agodoc/cephadm: document new cephadm logging destination settings 53298/head
John Mulligan [Wed, 6 Sep 2023 20:56:40 +0000 (16:56 -0400)]
doc/cephadm: document new cephadm logging destination settings

Add docs for setting the binary's log destination at cephadm bootstrap
or on a running cluster.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 years agodoc/cephadm: clarify what cephadm component writes to the cluster log channel
John Mulligan [Wed, 6 Sep 2023 20:18:37 +0000 (16:18 -0400)]
doc/cephadm: clarify what cephadm component writes to the cluster log channel

Clarify that the cephadm orchestrator module, a part of the ceph mgr,
logs to the cluster log channel. This prepares for adding a specific
section to cover logging for the cephadm "binary".

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 years agocephadm: remember log destination used during bootstrap
John Mulligan [Wed, 6 Sep 2023 18:15:41 +0000 (14:15 -0400)]
cephadm: remember log destination used during bootstrap

Store the log destination(s) specified on the CLI for cephadm bootstrap
as the manager configuration, unless the configuration key is explicitly
set by the input config.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 years agomgr/cephadm: add a module option for controlling cephadm log dest
John Mulligan [Wed, 6 Sep 2023 17:39:06 +0000 (13:39 -0400)]
mgr/cephadm: add a module option for controlling cephadm log dest

Now that cephadm has multiple possible persistent logging destinations
we need a way to choose which one to use when the command is started by
the mgr. Add the option 'cephadm_log_destination' which can take one
of 'file', 'syslog', or 'file,syslog'. If left unset (empty string)
then the behavior is equivalent to 'file' and that is the same as
previous cephadm versions.

Fixes: https://tracker.ceph.com/issues/62233
Signed-off-by: John Mulligan <jmulligan@redhat.com>
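
A sketch of how an option value like the one described above could be interpreted. The function name is hypothetical and this is not the mgr module's code; it is slightly more lenient than the commit message in that it also accepts 'syslog,file'.

```python
VALID_DESTS = ("file", "syslog")


def parse_log_destination(value: str) -> list:
    """Turn an option value such as 'file,syslog' into a list of destinations.

    An unset (empty) value falls back to 'file', as the commit message describes.
    """
    if not value:
        return ["file"]
    dests = [part.strip() for part in value.split(",")]
    unknown = [d for d in dests if d not in VALID_DESTS]
    if unknown:
        raise ValueError(f"unknown log destination(s): {unknown}")
    return dests
```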
2 years agocephadm: add cli option to enable logging to syslog
John Mulligan [Tue, 22 Aug 2023 19:11:35 +0000 (15:11 -0400)]
cephadm: add cli option to enable logging to syslog

Add the --log-dest option to cephadm. The --log-dest option can be
specified 0, 1 or more times. If unspecified, cephadm will log to
the default location, the log file. If specified one or more times,
each instance will enable the named logging destination.
Example:

```
cephadm bootstrap

cephadm --log-dest=syslog bootstrap

cephadm --log-dest=file bootstrap

cephadm --log-dest=syslog --log-dest=file bootstrap
```

Signed-off-by: John Mulligan <jmulligan@redhat.com>
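
An option that may be given zero, one, or more times maps naturally onto argparse's "append" action. The following is a sketch under that assumption, not cephadm's actual argument parser; the program name and the positional `command` argument are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cephadm-sketch")
    # action="append" collects every occurrence of the flag into a list;
    # with no occurrence, the attribute stays None.
    parser.add_argument("--log-dest", action="append", choices=["file", "syslog"])
    parser.add_argument("command")
    return parser


def log_destinations(argv):
    args = build_parser().parse_args(argv)
    # No --log-dest given: fall back to the default destination, the log file.
    return args.log_dest or ["file"]
```

For instance, `log_destinations(["--log-dest=syslog", "--log-dest=file", "bootstrap"])` yields both destinations in the order given.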
2 years agocephadm: add support for logging to syslog/journal
John Mulligan [Tue, 22 Aug 2023 19:11:16 +0000 (15:11 -0400)]
cephadm: add support for logging to syslog/journal

Add support to logging.py for persistent logging to syslog and thus to
journald. This is accomplished by switching logging handlers depending
on the log_dest attribute of the context. Setting this value is left
for a future patch.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 years agocephadm: move colored output support into logging.py
John Mulligan [Tue, 22 Aug 2023 16:42:14 +0000 (12:42 -0400)]
cephadm: move colored output support into logging.py

Rewrite cephadm's colored output support such that it abstracts away
the colorization into extra logging metadata. The new code will not
unconditionally put control characters into the log files. It will
only print the control chars if the stderr is a tty.
In theory this is probably more future proof as well, but it's only
got two callers so it is hard to say how useful it'll be.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
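
The idea of carrying colorization as logging metadata and emitting control characters only on a tty can be sketched with a custom formatter. The `highlight` attribute and the class name are hypothetical, chosen for illustration; cephadm's logging.py may structure this differently.

```python
import logging


class HighlightFormatter(logging.Formatter):
    """Wrap a record in ANSI color codes only when asked to and only on a tty."""

    CYAN = "\033[36m"
    RESET = "\033[0m"

    def __init__(self, stream, fmt="%(message)s"):
        super().__init__(fmt)
        self._stream = stream

    def format(self, record):
        text = super().format(record)
        # 'highlight' is hypothetical metadata, e.g. logger.info(msg, extra={"highlight": True}).
        wants_color = getattr(record, "highlight", False)
        is_tty = hasattr(self._stream, "isatty") and self._stream.isatty()
        if wants_color and is_tty:
            return f"{self.CYAN}{text}{self.RESET}"
        return text
```

A file handler would use a plain formatter, so log files never receive the escape sequences.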
2 years agorgw/crypt: don't deref null manifest_bl 53448/head
Casey Bodley [Wed, 13 Sep 2023 20:30:03 +0000 (16:30 -0400)]
rgw/crypt: don't deref null manifest_bl

with dbstore, the manifest_bl pointer was null; check for null before
dereferencing for read_manifest_parts()

Fixes: https://tracker.ceph.com/issues/62378
Signed-off-by: Casey Bodley <cbodley@redhat.com>
2 years agoMerge pull request #51921 from kamoltat/wip-ksirivad-fix-54136
Kamoltat (Junior) Sirivadhna [Tue, 19 Sep 2023 18:12:35 +0000 (14:12 -0400)]
Merge pull request #51921 from kamoltat/wip-ksirivad-fix-54136

pybind/mgr/pg_autoscaler: Use bytes_used for actual_raw_used

Reviewed-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
2 years agoMerge pull request #53532 from zdover23/wip-doc-2023-09-19-man-ceph-monstore-tool
Ilya Dryomov [Tue, 19 Sep 2023 14:35:18 +0000 (16:35 +0200)]
Merge pull request #53532 from zdover23/wip-doc-2023-09-19-man-ceph-monstore-tool

doc/man: s/kvstore-tool/monstore-tool/

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
2 years agodoc/man: s/kvstore-tool/monstore-tool/ 53532/head
Zac Dover [Tue, 19 Sep 2023 13:12:34 +0000 (23:12 +1000)]
doc/man: s/kvstore-tool/monstore-tool/

s/kvstore-tool/monstore-tool/ in accordance with Ilya's remark here:
https://github.com/ceph/ceph/pull/53450#discussion_r1329804085

Signed-off-by: Zac Dover <zac.dover@proton.me>
2 years agoMerge pull request #53411 from rhcs-dashboard/align-charts
Nizamudeen A [Tue, 19 Sep 2023 12:38:29 +0000 (18:08 +0530)]
Merge pull request #53411 from rhcs-dashboard/align-charts

mgr/dashboard: align charts of landing page

Reviewed-by: Nizamudeen A <nia@redhat.com>
2 years agoceph-volume: fix mpath device support 53309/head
Guillaume Abrioux [Wed, 6 Sep 2023 09:30:41 +0000 (09:30 +0000)]
ceph-volume: fix mpath device support

commit [1] broke mpath devices support in `disk.is_device()`

[1] https://github.com/ceph/ceph/commit/4fc6bc394dffaf3ad375ff29cbb0a3eb9e4dbefc

Fixes: https://tracker.ceph.com/issues/62722
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
2 years agoMerge pull request #53230 from myoungwon/fix-cbj-overflow-bug
Yingxin [Tue, 19 Sep 2023 08:03:34 +0000 (16:03 +0800)]
Merge pull request #53230 from myoungwon/fix-cbj-overflow-bug

crimson/os/seastore/cbj: fix a potential overflow bug on segment_seq

Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
2 years agoMerge PR #52631 into main
Milind Changire [Tue, 19 Sep 2023 07:51:09 +0000 (13:21 +0530)]
Merge PR #52631 into main

* refs/pull/52631/head:
mds: add debug logs to monitor ceph.dir.subvolume management
mds: dump subvolume flag for inode

Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoMerge PR #52892 into main
Milind Changire [Tue, 19 Sep 2023 07:23:29 +0000 (12:53 +0530)]
Merge PR #52892 into main

* refs/pull/52892/head:
qa: add test to validate periodic checks by async threads
mgr/volumes: periodically check for async work

Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
2 years agoMerge PR #52755 into main
Milind Changire [Tue, 19 Sep 2023 07:22:48 +0000 (12:52 +0530)]
Merge PR #52755 into main

* refs/pull/52755/head:
mds: adjust pre_segments_size for MDLog when trimming segments for standby-replay

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>
2 years agoMerge PR #52692 into main
Milind Changire [Tue, 19 Sep 2023 07:21:52 +0000 (12:51 +0530)]
Merge PR #52692 into main

* refs/pull/52692/head:
qa/tasks/cephfs: reset the client_inject_fixed_oldest_tid after test

Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoMerge pull request #53518 from likid0/patch-2
zdover23 [Mon, 18 Sep 2023 23:04:02 +0000 (09:04 +1000)]
Merge pull request #53518 from likid0/patch-2

doc/dev: Fix typos in cephfs-mirroring.rst and deduplication.rst

Reviewed-by: Zac Dover <zac.dover@proton.me>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
2 years agodoc/dev: Fix typos in files cephfs-mirroring.rst and deduplication.rst 53518/head
Daniel Parkes [Mon, 18 Sep 2023 21:03:28 +0000 (23:03 +0200)]
doc/dev: Fix typos in files cephfs-mirroring.rst and deduplication.rst

Typo in cephfs-mirroring.rst: replace RAODS with RADOS.
Typo in deduplication.rst: replace RAODS with RADOS.

Signed-off-by: Daniel Parkes <dparkes@redhat.com>
2 years agoMerge pull request #53502 from dang/wip-dang-cls-test
Casey Bodley [Mon, 18 Sep 2023 22:01:31 +0000 (23:01 +0100)]
Merge pull request #53502 from dang/wip-dang-cls-test

RGW - Fix cls test build on new gcc

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 years agoMerge pull request #53508 from dang/wip-dang-posix-cache
Daniel Gryniewicz [Mon, 18 Sep 2023 18:36:58 +0000 (14:36 -0400)]
Merge pull request #53508 from dang/wip-dang-posix-cache

RGW - Add wait backoff to posix bucket cache test

Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>

2 years agoRGW - Add wait backoff to posix bucket cache test 53508/head
Daniel Gryniewicz [Mon, 18 Sep 2023 15:47:54 +0000 (11:47 -0400)]
RGW - Add wait backoff to posix bucket cache test

The CI appears to be really slow, and even a second of wait for inotify
sometimes fails.  Add an exponential backoff wait of up to ~25 seconds
to hopefully make the test pass reliably.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
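
An exponential backoff wait of roughly this size might look like the sketch below. The starting delay, factor, and attempt count are assumptions picked so that the worst-case total lands near 25 seconds; the actual test (C++) may use different values.

```python
import time
from typing import Callable


def wait_with_backoff(
    condition: Callable[[], bool],
    first_delay: float = 0.1,
    factor: float = 2.0,
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll condition(), sleeping first_delay, then factor times longer, and so on.

    With the (assumed) defaults the total wait is at most
    0.1 + 0.2 + ... + 12.8 = 25.5 seconds.
    """
    delay = first_delay
    for _ in range(max_attempts):
        if condition():
            return True
        sleep(delay)
        delay *= factor
    # One last check after the final sleep.
    return condition()
```

A fast machine returns after the first short sleep or two, while a slow CI host gets the full budget before the wait gives up.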
2 years agorgw/sal: get_placement_target_names() returns void 53505/head
Casey Bodley [Mon, 18 Sep 2023 15:15:02 +0000 (11:15 -0400)]
rgw/sal: get_placement_target_names() returns void

the function returned an integer error code, but two callers were
incorrectly testing the return value as a boolean

the function just returns placement ids that are in-memory, so none of
the drivers have a failure case; change the return value to void

Fixes: https://tracker.ceph.com/issues/62771
Signed-off-by: Casey Bodley <cbodley@redhat.com>
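
The hazard described above, an integer error code tested as a boolean, can be shown in a few lines. This is a Python illustration with hypothetical function names and a sample placement id, not the RGW code: 0 means success but is falsy, so a truthiness test inverts the intended meaning.

```python
def get_names_with_status(names_out: list) -> int:
    """Old shape (hypothetical): fill names_out and return 0 on success."""
    names_out.append("default-placement")
    return 0


def buggy_caller() -> bool:
    names = []
    # Bug: 0 means success but is falsy, so success is treated as failure.
    return bool(get_names_with_status(names))


def get_names() -> list:
    """New shape: nothing can fail, so just return the in-memory names."""
    return ["default-placement"]
```

Dropping the status return removes the opportunity for the mistake altogether.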
2 years agoMerge pull request #53478 from EdwardVitor/cuiming_chinamobile
Casey Bodley [Mon, 18 Sep 2023 13:46:55 +0000 (14:46 +0100)]
Merge pull request #53478 from EdwardVitor/cuiming_chinamobile

auth:rectify a cmake compilation warning

Reviewed-by: Casey Bodley <cbodley@redhat.com>
2 years agoRGW - Fix cls test build on new gcc 53502/head
Daniel Gryniewicz [Mon, 18 Sep 2023 12:50:53 +0000 (08:50 -0400)]
RGW - Fix cls test build on new gcc

The new encoder types broke building the cls test on newer gcc (13+) due
to undefined encoder/decoder.  Add the file that defines those to the
test.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
2 years agoMerge PR #53175 into main
Patrick Donnelly [Mon, 18 Sep 2023 12:37:16 +0000 (08:37 -0400)]
Merge PR #53175 into main

* refs/pull/53175/head:
qa: increase the http postBuffer size and disable sslVerify

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
2 years agoauth:rectify a cmake compilation warning 53478/head
cuiming_yewu [Fri, 15 Sep 2023 06:14:39 +0000 (14:14 +0800)]
auth:rectify a cmake compilation warning

Fix one warning in src/auth/cephx/CephxProtocol.cc, where the variable
'ch' may be used before it is initialized:
auth/cephx/CephxProtocol.cc:595:57: warning: '*((void*)& ch +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
     msg.server_challenge_plus_one = ch.server_challenge + 1;
                                     ~~~~~~~~~~~~~~~~~~~~^~~

Signed-off-by: cuiming <cuiming_yewu@cmss.chinamobile.com>
2 years agoMerge pull request #53490 from zdover23/wip-doc-2023-09-17-architecture-6-of-x
Anthony D'Atri [Sun, 17 Sep 2023 14:40:26 +0000 (10:40 -0400)]
Merge pull request #53490 from zdover23/wip-doc-2023-09-17-architecture-6-of-x

doc/architecture: "Edit HA Auth" (one of several)

2 years agodoc/architecture: "Edit HA Auth" (one of several) 53490/head
Zac Dover [Sun, 17 Sep 2023 08:56:40 +0000 (18:56 +1000)]
doc/architecture: "Edit HA Auth" (one of several)

Edit "High Availability Authentication" in doc/architecture.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2 years agoMerge pull request #53487 from zdover23/wip-doc-2023-09-16-architecture-5-of-x
Anthony D'Atri [Sat, 16 Sep 2023 16:46:58 +0000 (12:46 -0400)]
Merge pull request #53487 from zdover23/wip-doc-2023-09-16-architecture-5-of-x

doc/architecture: Edit "HA Auth"

2 years agodoc/architecture: Edit "HA Auth" 53487/head
Zac Dover [Sat, 16 Sep 2023 12:27:29 +0000 (22:27 +1000)]
doc/architecture: Edit "HA Auth"

Edit "High Availability Authentication" in doc/architecture.rst.

Signed-off-by: Zac Dover <zac.dover@proton.me>
2 years agoMerge PR #52638 into main
Patrick Donnelly [Fri, 15 Sep 2023 18:38:02 +0000 (14:38 -0400)]
Merge PR #52638 into main

* refs/pull/52638/head:
mds: flush monc log before abort
qa: check for expected cluster log message
qa: ignore expected cluster warning from damage tests

Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Dhairya Parmar <dparmar@redhat.com>
2 years agoMerge PR #52199 into main
Patrick Donnelly [Fri, 15 Sep 2023 16:11:37 +0000 (12:11 -0400)]
Merge PR #52199 into main

* refs/pull/52199/head:
mds: continue linking if targeti is temporarily located in stray dir
Revert "mds: wait unlink to finish to avoid conflict when creating same dentries"
Revert "mds: clear the STATE_UNLINKING state when the unlink fails"
Revert "mds: wait reintegrate to finish when unlinking"
Revert "mds: notify the waiters in replica MDSs"
Revert "mds: wait the linkmerge/migrate to finish after unlink"

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
2 years agoMerge pull request #52933 from dang/wip-dang-posix-driver
Daniel Gryniewicz [Fri, 15 Sep 2023 12:58:52 +0000 (08:58 -0400)]
Merge pull request #52933 from dang/wip-dang-posix-driver

RGW - Add POSIX Driver

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Matt Benjamin <mbenjamin@redhat.com>
2 years agoMerge pull request #53296 from phlogistonjohn/jjm-fix-git-ls-files 52810/head
Adam King [Thu, 14 Sep 2023 17:10:16 +0000 (13:10 -0400)]
Merge pull request #53296 from phlogistonjohn/jjm-fix-git-ls-files

cephadm: remove duplicates when listing files in scan

Reviewed-by: Adam King <adking@redhat.com>
2 years agoRGW - Add POSIX Driver 52933/head
Daniel Gryniewicz [Tue, 24 Jan 2023 15:02:07 +0000 (10:02 -0500)]
RGW - Add POSIX Driver

This is the MVP for a driver for RGW that operates on top of a POSIX
filesystem.  It supports get, put, list, copy, multipart, external
access via the filesystem itself, and ordered bucket listings via an
LRU-based cache.

Note that this is currently a Filter, intended to run on top of dbstore.
This is because it currently doesn't have any User implementation, so it
depends on dbstore's User.  Everything else is implemented in
POSIXDriver.  Once there is a User implementation, this will become a
Store, instead of a Filter.

Commit messages from bucket listing cache:

  rgw/posixdriver: recycle lmdb database handles as required

    While LMDB workflows often do not close/return database handles,
    ours continually reuses them.  This requires us to close each
    handle (atomically) when a cache entry is recycled.

  rgw/posixdriver: don't instantiate bucket cache entries from notify events

  rgw/posixdriver: incorporate lmdb-safe for now

    The current inclusion is based on https://github.com/Martchus/lmdb-safe,
    which is actively maintained but currently has some packaging issues the
    author has agreed to accept fixes for.

    For now, skip the submodule to save time and remove an external dependency.

  rgw/posixdriver: fix listing of cached, empty bucket

    * check lmdb enumeration result in all cases and w/better style
    * add unit test for enumeration of an empty cached directory

  rgw/posixdriver: nest lmdbs in a directory under the dbroot path to avoid cleanup issues

  rgw/posixdriver: refactor for posix integration

    * Derive BucketCache types as templates on a SAL driver and SAL
      bucket pair.

    * Integrate cache fills as callbacks into SAL layer (or mock, for
      tests)

    * Renaming and cleanups

  rgw/posixdriver: add bucket cache implementation and tests

    Adds free-standing cache of buckets and object names, with
    bucket names (and listing attributes, upcoming) managed in
    a hashed set of lmdb databases, which provides ordering and
    a high-performance listing cache.

    A framework for notification on new object creation (e.g.,
    outside S3 workflow) is provided, and a Linux implementation
    using inotify.

    FindLMDB.cmake taken with attribution and license.

  rgw/posixdriver: add zpp_bits serialization (FAST)

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Signed-off-by: Ali Maredia <amaredia@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
2 years agocephadm: replace get_data_dir with DaemonIdentity.data_dir 53467/head
John Mulligan [Tue, 12 Sep 2023 18:23:06 +0000 (14:23 -0400)]
cephadm: replace get_data_dir with DaemonIdentity.data_dir

Replace the near-trivial get_data_dir call with a method on
the DaemonIdentity type.
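
The shape of this refactoring can be sketched as follows. This is not
the cephadm code: the field names, the `daemon_name` helper, the
argument order of the free function, and the `<base>/<fsid>/<type>.<id>`
path layout are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


def get_data_dir(fsid: str, base_data_dir: str, daemon_type: str, daemon_id: str) -> str:
    """Assumed shape of the free function being replaced."""
    return os.path.join(base_data_dir, fsid, f"{daemon_type}.{daemon_id}")


@dataclass(frozen=True)
class DaemonIdentity:
    # Field names are assumed for this sketch.
    fsid: str
    daemon_type: str
    daemon_id: str

    @property
    def daemon_name(self) -> str:
        return f"{self.daemon_type}.{self.daemon_id}"

    def data_dir(self, base_data_dir: str) -> str:
        """Build the path from the identity's own fields."""
        return os.path.join(base_data_dir, self.fsid, self.daemon_name)
```

A call site then passes only the base directory, since the identity
object already carries the remaining three values.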

Signed-off-by: John Mulligan <jmulligan@redhat.com>
2 years agocephadm: add a data_dir method to DaemonIdentity
John Mulligan [Tue, 12 Sep 2023 18:22:25 +0000 (14:22 -0400)]
cephadm: add a data_dir method to DaemonIdentity

This will replace `get_data_dir` in a future commit.

Signed-off-by: John Mulligan <jmulligan@redhat.com>