git.apps.os.sepia.ceph.com Git - ceph.git/log

Adam King [Thu, 8 Sep 2022 16:18:14 +0000 (12:18 -0400)]

Merge pull request #47659 from adk3798/wip-57102-quincy

quincy: mgr/cephadm: recreate osd config when redeploy/reconfiguring

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Adam King [Thu, 8 Sep 2022 16:16:21 +0000 (12:16 -0400)]

Merge pull request #47944 from adk3798/wip-57426-quincy

quincy: cephadm/mgr: adding logic to handle --no-overwrite for tuned profiles

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Adam King [Thu, 8 Sep 2022 16:14:20 +0000 (12:14 -0400)]

Merge pull request #47946 from adk3798/wip-57423-quincy

quincy: mgr/cephadm: Fix how we check if a host belongs to public network

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Adam King [Thu, 8 Sep 2022 16:13:04 +0000 (12:13 -0400)]

Merge pull request #47950 from adk3798/wip-57383-quincy

quincy: mgr/cephadm: Adding logic to store grafana cert/key per node

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Adam King [Thu, 8 Sep 2022 16:12:21 +0000 (12:12 -0400)]

Merge pull request #47951 from adk3798/wip-57382-quincy

quincy: mgr/cephadm: allow binding to loopback for rgw daemons

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Adam King [Thu, 8 Sep 2022 16:11:40 +0000 (12:11 -0400)]

Merge pull request #47952 from adk3798/wip-57378-quincy

quincy: cephadm: return nonzero exit code when applying spec fails in bootstrap

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Adam King [Thu, 8 Sep 2022 16:11:02 +0000 (12:11 -0400)]

Merge pull request #47953 from adk3798/wip-57377-quincy

quincy: mgr/cephadm: don't try to write client/os tuning profiles to known offline hosts

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Neha Ojha [Thu, 8 Sep 2022 15:26:21 +0000 (08:26 -0700)]

Merge pull request #48004 from sseshasa/wip-57461-quincy

quincy: PendingReleaseNotes: Note the fix for high CPU utilization during recovery

Reviewed-by: Neha Ojha <nojha@redhat.com>

Yuri Weinstein [Thu, 8 Sep 2022 14:25:26 +0000 (07:25 -0700)]

Merge pull request #47894 from kotreshhr/wip-57242-quincy

quincy: mgr/volumes: Few mgr volumes backports

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Thu, 8 Sep 2022 13:53:08 +0000 (06:53 -0700)]

Merge pull request #47996 from idryomov/wip-52810-quincy

quincy: librbd: retry ENOENT in V2_REFRESH_PARENT as well

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Sridhar Seshasayee [Tue, 16 Aug 2022 11:45:29 +0000 (17:15 +0530)]

PendingReleaseNotes: Note the fix for high CPU utilization during recovery

Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit d6982022325a17dbe91e760530ab21832459a817)

Conflicts:
PendingReleaseNotes
- Moved the note under ">=17.2.4" section

Yuri Weinstein [Wed, 7 Sep 2022 21:01:26 +0000 (14:01 -0700)]

Merge pull request #47993 from soumyakoduri/wip-skoduri-quincy

rgw/backport/quincy: Fix crashes with Sync policy APIs

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Yuri Weinstein [Wed, 7 Sep 2022 15:00:54 +0000 (08:00 -0700)]

Merge pull request #46005 from rzarzynski/wip-common-no-cpp17-second_round-quincy

quincy: common/bl: fix FTBFS on C++11 due to C++17's if-with-initializer

Reviewed-by: Kefu Chai <kchai@redhat.com>

Yuri Weinstein [Wed, 7 Sep 2022 14:57:41 +0000 (07:57 -0700)]

Merge pull request #47901 from amathuria/wip-56736-quincy

quincy: osd/PeeringState: fix missed recheck_readable from laggy

Reviewed-by: Neha Ojha <nojha@redhat.com>

Yuri Weinstein [Wed, 7 Sep 2022 14:55:32 +0000 (07:55 -0700)]

Merge pull request #45892 from nkshirsagar/wip-55297-quincy

quincy: Catch exception if thrown by __generate_command_map()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Yuri Weinstein [Tue, 6 Sep 2022 20:41:58 +0000 (13:41 -0700)]

Merge pull request #46559 from pdvian/wip-55305-quincy

quincy: mgr, mon: Keep up-to-date metadata with mgr for MONs

Reviewed-by: Laura Flores <lflores@redhat.com>

Ilya Dryomov [Sun, 4 Sep 2022 17:14:04 +0000 (19:14 +0200)]

librbd: make RefreshRequest tests compatible with clone v1

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 36f1d515ca92784631d29fa1c5d1465e957af2a7)

Ilya Dryomov [Sun, 4 Sep 2022 15:52:51 +0000 (17:52 +0200)]

librbd: retry ENOENT in V2_REFRESH_PARENT as well

With auto-deletion of trashed snapshots, it is relatively easy to lose
a race to "rbd flatten" as follows:

- when V2_GET_PARENT runs, the image is technically still a clone
- when V2_REFRESH_PARENT runs, the image is fully flattened and the
  snapshot in the parent image is deleted

This results in a spurious ENOENT error, mainly when trying to open the
image (e.g. for "rbd info"). This race condition has always been there
but auto-deletion of trashed snapshots makes it much worse.

Retry ENOENT in V2_REFRESH_PARENT the same way as in V2_GET_SNAPSHOTS.

Fixes: https://tracker.ceph.com/issues/52810
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit bd885d75b2e4d728086f744e0d10e7cd12d3f15b)

Ilya Dryomov [Sun, 4 Sep 2022 10:40:36 +0000 (12:40 +0200)]

librbd: limit the number of ENOENT retries in RefreshRequest

If the image header is corrupt, the ENOENT error may be persistent. Avoid
an infinite loop in that case.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8570194b133462db6b7d4ab108383db0967b1cb9)
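
The bounded retry described in this commit can be sketched in Python as follows. This is an illustration only, not the librbd code: the function name, the `step` callable, and the limit of 10 are assumptions, since the commit message does not state the real retry count.

```python
import errno

MAX_ENOENT_RETRIES = 10  # hypothetical limit; the commit does not give the real value


def refresh_with_bounded_retries(step, max_retries=MAX_ENOENT_RETRIES):
    """Call step() again while it returns -ENOENT, at most max_retries extra times.

    step is a hypothetical callable returning 0 on success or a negative errno.
    """
    retries = 0
    while True:
        result = step()
        if result != -errno.ENOENT:
            return result  # success, or an error that is not retried
        if retries >= max_retries:
            # A persistent ENOENT (e.g. a corrupt header) ends the loop here
            # instead of spinning forever.
            return result
        retries += 1
```

With a step that always fails with ENOENT and `max_retries=3`, the step runs four times (one initial attempt plus three retries) and the error is then returned to the caller.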

Ilya Dryomov [Fri, 2 Sep 2022 14:58:36 +0000 (16:58 +0200)]

librbd: fix a bunch of issues with restarting RefreshRequest

Make RefreshRequest properly restartable, at least up to and including
the V2_REFRESH_PARENT step:

- clear m_migration_spec when skipping GET_MIGRATION_HEADER
- don't rely on potentially stale m_incomplete_update on retry
- reset m_legacy_parent when retrying more than just V2_GET_PARENT
- don't rely on potentially stale m_parent_md.overlap and
  m_head_parent_overlap on retry
- clear m_metadata before fetching image metadata (but not before
  fetching pool metadata)
- clear m_op_features when skipping V2_GET_OP_FEATURES
- clear m_group_spec on EOPNOTSUPP error in V2_GET_GROUP
- reset m_legacy_snapshot when retrying more than just V2_GET_SNAPSHOTS
- don't rely on potentially stale m_snap_parents on retry

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6bd89ea119520cf5a45ac93b0e16edf35ddd4e57)

Ilya Dryomov [Tue, 30 Aug 2022 19:33:04 +0000 (21:33 +0200)]

librbd: check *result consistently in RefreshRequest

Stick to *result >= 0 checks everywhere and add missing checks for
op_features_get_finish() and image_group_get_finish() errors.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ce6dff89c0f005c1ae1dc71cadfbef9f82df37a4)

Ilya Dryomov [Tue, 30 Aug 2022 18:38:10 +0000 (20:38 +0200)]

librbd: reflect V2_GET_SNAPSHOTS ENOENT retry in state diagram

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ca36ffa347f0c68115a7d6b54ebb47ac5e82698d)

Soumya Koduri [Wed, 24 Aug 2022 05:38:38 +0000 (11:08 +0530)]

radosgw-admin: fix crash with 'sync flow create/remove' cmd

Avoid dereferencing an empty optional "flow-type" (if not specified).

Fixes: https://tracker.ceph.com/issues/57275
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 539c5b87a2965ce43002430790abd586b98f620d)

Soumya Koduri [Thu, 26 May 2022 16:55:06 +0000 (22:25 +0530)]

rgw: Avoid dereferencing nullptr while configuring bucket sync policy

While configuring a bucket sync policy, in "rgw_sync_bucket_entities::set_bucket()",
there can be a case where bucket does not contain a value but is still
dereferenced. This commit fixes that.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
(cherry picked from commit 86cf8275224536a7ca77eaf8a6e59951b3f25261)

Yuri Weinstein [Tue, 6 Sep 2022 16:07:09 +0000 (09:07 -0700)]

Merge pull request #47940 from idryomov/wip-56703-quincy

quincy: librbd/cache/pwl: narrow the scope of m_lock in write_image_cache_state()

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Yuri Weinstein [Tue, 6 Sep 2022 15:09:10 +0000 (08:09 -0700)]

Merge pull request #47235 from cfsnyder/wip-55714-quincy

quincy: rgw_rest_user_policy: Fix GetUserPolicy & ListUserPolicies responses

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Yuri Weinstein [Tue, 6 Sep 2022 15:08:36 +0000 (08:08 -0700)]

Merge pull request #46107 from BenoitKnecht/wip-55499-quincy

quincy: rgw: Avoid segfault when OPA authz is enabled

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Yuri Weinstein [Tue, 6 Sep 2022 15:07:45 +0000 (08:07 -0700)]

Merge pull request #45714 from cbodley/wip-55136

quincy: rgw: data sync uses yield_spawn_window()

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Ilya Dryomov [Tue, 6 Sep 2022 09:21:54 +0000 (11:21 +0200)]

Merge pull request #47980 from tchaikov/quincy-pr-47962

quincy: test/{librbd, rgw}: retry when bind fail with port 0

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

Kefu Chai [Sun, 4 Sep 2022 12:37:32 +0000 (20:37 +0800)]

test/{librbd, rgw}: retry when bind fail with port 0

There is a chance that the bind() call fails if another test happens to
pick the same free port that the operating system picked for us. In this
case, we just retry up to 42 times.

In theory, this change does not fully address the race, but it should
help to alleviate the issue.

See-also: https://tracker.ceph.com/issues/57116
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit aa7885f7cc41390fcc8eeb82bc7142c3ff6a53f9)
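
The retry described in this commit can be sketched in Python as follows. The sketch assumes a probe-then-bind pattern (ask the OS for a free port by binding to port 0, release it, then bind the real server to that port), which is where the race comes from. The function names are hypothetical and this is not the Ceph test code; only the count of 42 comes from the commit message.

```python
import socket

MAX_BIND_ATTEMPTS = 42  # the retry count named in the commit message


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently free port by binding to port 0, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))
        return probe.getsockname()[1]


def bind_with_retry(host="127.0.0.1", max_attempts=MAX_BIND_ATTEMPTS):
    """Return a listening socket, retrying with a fresh port if bind() fails."""
    last_error = None
    for _ in range(max_attempts):
        port = pick_free_port(host)
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Another process may have taken the port since the probe closed.
            server.bind((host, port))
        except OSError as exc:
            server.close()
            last_error = exc
            continue
        server.listen()
        return server
    raise RuntimeError(f"bind failed after {max_attempts} attempts") from last_error
```

As the commit message notes, retrying narrows the window but does not remove the race: each attempt can still lose the port between the probe and the real bind.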

Yuri Weinstein [Mon, 5 Sep 2022 14:09:51 +0000 (07:09 -0700)]

Merge pull request #47765 from rzarzynski/wip-get_or_fail-debug-louder-quincy

quincy: msg: Log at higher level when Throttle::get_or_fail() fails

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>

Yuri Weinstein [Mon, 5 Sep 2022 14:07:56 +0000 (07:07 -0700)]

Merge pull request #47619 from tchaikov/quincy-pr-47449

quincy: cmake: disable LTO when building pmdk

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Yuri Weinstein [Mon, 5 Sep 2022 14:07:01 +0000 (07:07 -0700)]

Merge pull request #47302 from petrutlucian94/wip-56728-quincy

quincy: msg: Fix Windows IPv6 support

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Yuri Weinstein [Mon, 5 Sep 2022 14:02:33 +0000 (07:02 -0700)]

Merge pull request #47909 from Matan-B/wip-57372-quincy

quincy: SimpleRADOSStriper: Avoid moving bufferlists by using deque in read()

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

zdover23 [Mon, 5 Sep 2022 06:23:41 +0000 (16:23 +1000)]

Merge pull request #47955 from zdover23/wip-doc-2022-09-04-backport-47841-to-quincy

quincy: doc/start: update documenting-ceph branch names

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Yuri Weinstein [Sun, 4 Sep 2022 15:02:07 +0000 (08:02 -0700)]

Merge pull request #47914 from idryomov/wip-56154-quincy

quincy: rbd-mirror: resume pending shutdown on error in snapshot replayer

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
Reviewed-by: Mykola Golub <mgolub@suse.com>

Yuri Weinstein [Sun, 4 Sep 2022 15:01:00 +0000 (08:01 -0700)]

Merge pull request #47912 from idryomov/wip-57317-quincy

quincy: librbd: use actual monitor addresses when creating a peer bootstrap token

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Kefu Chai [Sun, 4 Sep 2022 10:12:50 +0000 (18:12 +0800)]

Merge pull request #47694 from SUSE/wip-quincy-include-memory

include/buffer: include <memory>

Reviewed-by: Kefu Chai <tchaikov@gmail.com>

Zac Dover [Tue, 30 Aug 2022 11:48:08 +0000 (21:48 +1000)]

doc/start: update documenting-ceph branch names

This PR updates the branch names in the
documenting-ceph.rst file. It gets rid of all references
to the "master" branch, and updates the language to
reflect the state of play in 2022.

inb4: This PR merely removes the most egregious inaccuracies,
the ones that were most readily evident on a cursory perusal.
The full text remains to be carefully read and fitted together
with care.

I had to start somewhere.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 7bc6262547c82dd6519e4099bfc4f082f14343ac)

Adam King [Sat, 3 Sep 2022 19:45:19 +0000 (15:45 -0400)]

Merge pull request #47947 from adk3798/wip-57413-quincy

quincy: doc/cephadm/services: fix example for specifying rgw placement

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

Adam King [Wed, 17 Aug 2022 23:03:18 +0000 (19:03 -0400)]

mgr/cephadm: don't try to write client/os tuning profiles to known offline hosts

Fixes: https://tracker.ceph.com/issues/57175
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit df3557200dcae2ab1b55acf616b13adfc77fc771)

Adam King [Wed, 17 Aug 2022 20:54:54 +0000 (16:54 -0400)]

cephadm: return nonzero exit code when applying spec fails in bootstrap

This is mostly useful for test automation. Right now, if applying the
spec provided with --apply-spec fails, the return code remains zero. We don't
want to error out entirely in that case, as we still want to print the remaining
output (e.g. the dashboard password). Continuing onward and then returning a
nonzero code strikes a balance: we still give all the output, but those
writing automation around bootstrap have something to check.

Fixes: https://tracker.ceph.com/issues/57173
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit be17f1d4b30e19aa6039fa5d6a694129cb5f3583)
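
The "continue, then return nonzero" behaviour described in this commit can be sketched in Python as follows. The `bootstrap` function and its two callables are hypothetical stand-ins, not the cephadm implementation.

```python
import sys


def bootstrap(apply_spec, print_summary):
    """Run a bootstrap-like flow and return the process exit code.

    apply_spec and print_summary are hypothetical callables standing in for
    the real steps; apply_spec raises an exception on failure.
    """
    exit_code = 0
    try:
        apply_spec()
    except Exception as exc:
        # Remember the failure but keep going so the remaining output is printed.
        print(f"Applying spec failed: {exc}", file=sys.stderr)
        exit_code = 1
    print_summary()
    return exit_code
```

A caller would pass the result to `sys.exit()`, so automation can check the exit status while a human still sees the full output.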

Redouane Kachach [Fri, 26 Aug 2022 12:00:05 +0000 (14:00 +0200)]

mgr/cephadm: allow binding to loopback for rgw daemons
Fixes: https://tracker.ceph.com/issues/57304
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1a722f35c1f40353dbdf06fdb63364822ebaeffc)

Redouane Kachach [Thu, 14 Jul 2022 11:36:32 +0000 (13:36 +0200)]

mgr/cephadm: Adding logic to store grafana cert/key per node
Fixes: https://tracker.ceph.com/issues/56508
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3c990f974e3beac0fc03f58c4c47f26f9d5afe56)

Redouane Kachach [Fri, 2 Sep 2022 09:57:43 +0000 (11:57 +0200)]

doc/cephadm/services: fix example for specifying rgw placement
Fixes: https://tracker.ceph.com/issues/56953

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1ed4c30876262c8890247325eb84ff46621d34fe)

Redouane Kachach [Wed, 31 Aug 2022 11:49:37 +0000 (13:49 +0200)]

mgr/cephadm: Fix how we check if a host belongs to public network
Fixes: https://tracker.ceph.com/issues/57060
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1c8833feaf42fd518e19c9a347c6c5781943862a)

Redouane Kachach [Fri, 26 Aug 2022 10:31:45 +0000 (12:31 +0200)]

cephadm/mgr: adding logic to handle --no-overwrite for tuned profiles
Fixes: https://tracker.ceph.com/issues/57032
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 028cb031ddb72c1f37048c8568ecdf43f5b77b50)

Yuri Weinstein [Sat, 3 Sep 2022 14:51:02 +0000 (07:51 -0700)]

Merge pull request #47861 from lxbsz/wip-57253

quincy: libcephfs: define AT_NO_ATTR_SYNC back for backward compatibility

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>

Yuri Weinstein [Sat, 3 Sep 2022 14:50:07 +0000 (07:50 -0700)]

Merge pull request #47768 from neesingh-rh/wip-57264-quincy

quincy: mgr/volumes: Add volume info command

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Ilya Dryomov [Thu, 18 Aug 2022 16:48:39 +0000 (18:48 +0200)]

librbd/cache/pwl: generate image cache state json under m_lock

The previous commit moved the entirety of write_image_cache_state()
from under m_lock. This was a step too far because the generated image
cache state json is no longer guaranteed to be consistent.

Arrange for m_lock to still be held during image cache json generation
but released before owner_lock is grabbed.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ad504b10f60290cc7461ea96eaada1fb3f7639d7)
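
The locking arrangement described in this commit can be sketched in Python as follows: the state is serialized while m_lock is held, and m_lock is released before owner_lock is taken. The lock names follow the commit message; everything else (the `cache_state` dict, the `stored` list) is a hypothetical stand-in, and the real code is C++ in librbd.

```python
import json
import threading

m_lock = threading.Lock()      # guards the in-memory cache state
owner_lock = threading.Lock()  # guards the write to the backing store

cache_state = {"present": True, "clean": False}  # hypothetical state fields
stored = []  # stand-in for the state persisted in the cluster


def write_image_cache_state():
    # Serialize under m_lock so the JSON is a consistent snapshot.
    with m_lock:
        payload = json.dumps(cache_state, sort_keys=True)
    # m_lock is released here, before owner_lock is taken, so this thread
    # never holds both locks at once.
    with owner_lock:
        stored.append(payload)
    return payload
```

Because the two locks are never held together on this path, it cannot take part in a lock-order inversion with code that takes owner_lock first and m_lock second.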

Yin Congmin [Thu, 28 Jul 2022 05:43:07 +0000 (13:43 +0800)]

librbd/cache/pwl: move write_image_cache_state() out of m_lock

periodic_stats() takes m_lock and then owner_lock. This is the opposite
of the lock order in SnapshotCreateRequest::handle_notify_quiesce().
Move write_image_cache_state() out of the m_lock scope: call
update_image_cache_state() under m_lock and, once m_lock is automatically
released, call write_image_cache_state() to update the state in the OSDs.

Fixes: https://tracker.ceph.com/issues/56703
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit a0e2868d9473a9120c4c5d478f6a592859ce1aec)

Yuri Weinstein [Fri, 2 Sep 2022 22:10:36 +0000 (15:10 -0700)]

Merge pull request #47902 from vshankar/tr-57370

quincy: mon/MDSMonitor: fix standby-replay mds being removed from MDSMap unexpectedly

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Yuri Weinstein [Fri, 2 Sep 2022 20:05:13 +0000 (13:05 -0700)]

Merge pull request #47910 from adk3798/wip-57314-quincy

quincy: qa/cephadm: specify using container host distros for workunits

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Yuri Weinstein [Fri, 2 Sep 2022 20:01:37 +0000 (13:01 -0700)]

Merge pull request #47826 from ceph/wip-telemetry-memory-stats-quincy

quincy: mgr/telemetry: add `perf_memory_metrics` collection to telemetry

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>

Yuri Weinstein [Fri, 2 Sep 2022 16:17:54 +0000 (09:17 -0700)]

Merge pull request #47825 from ceph/wip-bug-57119-quincy

quincy: osd, mds: fix the "heap" admin cmd printing always to error stream

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>

Yuri Weinstein [Fri, 2 Sep 2022 15:26:25 +0000 (08:26 -0700)]

Merge pull request #47648 from joscollin/wip-57156-quincy

quincy: cephfs-top: fix the rsp/wsp display

Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Radoslaw Zarzynski [Fri, 2 Sep 2022 15:22:39 +0000 (17:22 +0200)]

Merge pull request #47621 from pdvian/wip-56134-quincy

quincy: osd/scrub: Reintroduce scrub starts message

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>

Nizamudeen A [Fri, 2 Sep 2022 10:19:31 +0000 (15:49 +0530)]

Merge pull request #47867 from MrFreezeex/quincy-ceph-mixin-backports

quincy: monitoring: ceph mixin backports

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>

Nizamudeen A [Fri, 2 Sep 2022 05:31:59 +0000 (11:01 +0530)]

Merge pull request #47387 from s0nea/wip-56991-quincy

quincy: monitoring/ceph-mixin: OSD overview typo fix

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>

Yuri Weinstein [Thu, 1 Sep 2022 23:05:03 +0000 (16:05 -0700)]

Merge pull request #47057 from lxbsz/wip-56448

quincy: mds: notify the xattr_version to replica MDSes

Reviewed-by: Kotresh HR <khiremat@redhat.com>

zdover23 [Thu, 1 Sep 2022 20:16:58 +0000 (06:16 +1000)]

Merge pull request #47822 from zdover23/wip-doc-2022-08-27-backport-47810-to-quincy

quincy: doc/mgr: add prompt directives to dashboard.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

zdover23 [Thu, 1 Sep 2022 20:13:11 +0000 (06:13 +1000)]

Merge pull request #47869 from zdover23/wip-doc-2022-08-30-backport-447843-to-quincy

quincy: doc/mgr: update prompts in dboard.rst includes

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Ilya Dryomov [Tue, 30 Aug 2022 09:45:44 +0000 (11:45 +0200)]

rbd-mirror: skip setting error code on snapshot replayer shutdown

This is regarding failures in unregister_remote_update_watcher() and
unregister_local_update_watcher().  handle_replay_complete() can't be
called in these cases anymore as it would blindly attempt to unregister
watchers from scratch again.  Dropping handle_replay_complete() calls
there means that these failures would only be logged and would not be
surfaced by snapshot replayer.  But the only caller ignores them
anyway:

  void ImageReplayer<I>::shut_down(int r) {
    ...
    // close the replayer
    if (m_replayer != nullptr) {
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->destroy();
        m_replayer = nullptr;
        ctx->complete(0);             <------
      });
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->shut_down(ctx);
      });
    }

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ced071f0de57af8cddffebca24baeb27f2a211d8)

Ilya Dryomov [Wed, 24 Aug 2022 10:56:31 +0000 (12:56 +0200)]

rbd-mirror: resume pending shutdown on error in snapshot replayer

If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.

Fixes: https://tracker.ceph.com/issues/56154
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fc4cc575bc53f62f88ee3faf0daba8906bc1c6c1)

Ilya Dryomov [Sat, 27 Aug 2022 09:09:00 +0000 (11:09 +0200)]

librbd: use actual monitor addresses when creating a peer bootstrap token

Relying on mon_host config option is fragile, as the user may confuse
v1 and v2 addresses, group them incorrectly, etc. Get mon_host value
only as a fallback.

Fixes: https://tracker.ceph.com/issues/57317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit de0ba80b37bf3df22bb2976871332344a4fb141e)

Nizamudeen A [Thu, 1 Sep 2022 17:29:07 +0000 (22:59 +0530)]

Merge pull request #47892 from rhcs-dashboard/value-error-quincy

quincy: install-deps: script exit on /ValueError: in centos_stream8

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

Adam King [Thu, 18 Aug 2022 12:49:57 +0000 (08:49 -0400)]

qa/cephadm: specify using container host distros for workunits

Right now, the OS Type and OS Version for these workunit tests are
left blank on pulpito, and the tests appear to be trying to run on
Ubuntu Jammy, which is causing failures. We should specify which
distros the tests run on, and then explicitly add new distros once
we can get the tests to pass on them.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 1c0bf2f9f6bce08949fd5281fac1afbae1788fe7)

Matan Breizman [Thu, 1 Sep 2022 08:16:03 +0000 (08:16 +0000)]

test/librados/aio_cxx: add multithreaded aio_read test

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit c4a2c380ea319ddc1e9997be4130a365f483cf0e)

Matan Breizman [Wed, 31 Aug 2022 08:08:27 +0000 (08:08 +0000)]

SimpleRADOSStriper: Avoid moving bufferlists by using deque

Fixes: https://tracker.ceph.com/issues/57152
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 410f8c74f43caee179379a4ba02b475c51ed6af4)

胡玮文 [Sun, 9 Jan 2022 15:17:38 +0000 (23:17 +0800)]

mon/MDSMonitor: remove redundant state change check

There are two sets of state change checks in prepare_beacon.
Since the last commit, many of these checks are covered by
`MDSMap::state_transition_valid`, so merge them.

This fixes the bug where standby-replay is evicted unexpectedly.
The bug was introduced in
794d13c9ff4 (mon/MDSMonitor: reject illegal want_states from MDS)
but only revealed itself after
20509bb6c82 (MDSMonitor: handle damaged from standby-replay)

Fixes: https://tracker.ceph.com/issues/53811
Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit bf75a9ea08084afe4a02083473a7146cb91dae3b)

胡玮文 [Fri, 7 Jan 2022 17:14:00 +0000 (01:14 +0800)]

osd/PeeringState: fix missed `recheck_readable` from laggy

Previously, the first `pg_lease_ack_t` after becoming laggy would not
trigger `recheck_readable`. However, every other ack would trigger it.
The logic is inverted, causing unnecessarily long laggy PG state.

Fixes: 3bb8a7210a6 (osd: requeue ops when PG is no longer laggy)
Fixes: https://tracker.ceph.com/issues/53806
Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit caeca396e8b149cfa09ed99eda4f7a7186b005b4)

胡玮文 [Sun, 9 Jan 2022 14:52:16 +0000 (22:52 +0800)]

mds/FSMap: stricter state_transition_valid

Reject any unknown transitions.

Initialize MDSRank::state to standby and assert that no update is missed.

Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit 0601552b91a1c91314bc6799514f972098b02f30)

胡玮文 [Fri, 7 Jan 2022 17:10:35 +0000 (01:10 +0800)]

osd/PeeringState: proc_lease_ack break once found from OSD

`acting` should not contain duplicate OSD IDs, so a match can occur
at most once anyway.

Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit 9337fa6704180de90e1c18e314849566ff628818)

胡玮文 [Sun, 9 Jan 2022 13:53:40 +0000 (21:53 +0800)]

doc: complete MDS state diagram

Add the missing rejoin -> stopped transition.
An MDS can transition from standby-replay to damaged since 20509bb6c82.

Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit 969061e8d5f6d965150c4ac9d1b804f24b84dc4b)

胡玮文 [Sun, 9 Jan 2022 13:45:34 +0000 (21:45 +0800)]

mds: remove reference to mds-state-diagram.svg

We no longer generate that file since c783ae10aa4

Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit b4cc22a57f79b531401515badf39a06e613ef0c9)

Rishabh Dave [Fri, 6 May 2022 16:06:28 +0000 (21:36 +0530)]

qa/cephfs: omit_sudo must be passed to underlying method...

so that it can have its intended effect.

Fixes: https://tracker.ceph.com/issues/55572
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 02f0a3f136f85e472f3657a4af2b94e8af33c46b)

Conflicts:
qa/tasks/cephfs/mount.py: timeout change wasn't backported

Kotresh HR [Thu, 7 Jul 2022 08:00:56 +0000 (13:30 +0530)]

qa: Validate cleaning of the stale snapshot metadata

Fixes: https://tracker.ceph.com/issues/55976
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit aece3b9b28fccb6cad77e81fd3e4b84c333f1609)

Kotresh HR [Tue, 16 Aug 2022 11:41:33 +0000 (17:11 +0530)]

mgr/volumes: Remove stale snapshot user metadata

This patch adds the capability to remove stale snapshot user
metadata, if present, while loading the subvolume. It can't
be done in 'SubvolumeBase.discover' since v1 and v2 snapshot paths
are different. It is done just after discovery, before returning
the version-specific object.

Fixes: https://tracker.ceph.com/issues/55976
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 65af2d123a1f1ef9c4b370e908ece588eec19a1f)

Kotresh HR [Wed, 6 Jul 2022 11:59:39 +0000 (17:29 +0530)]

mgr/volumes: Allow forceful snapshot removal on osd full

When the OSD is full, a snapshot that has metadata set can't be
removed, because user metadata can't be removed while the OSD
is full. This patch provides a way to remove the snapshot
with the 'force' option while keeping the corresponding metadata,
which gets removed on subvolume discovery once space is available.

Fixes: https://tracker.ceph.com/issues/55976
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 0687f78650dd348619b06e20c299f82f2a0c1bf5)

Kotresh HR [Wed, 15 Jun 2022 10:35:40 +0000 (16:05 +0530)]

qa: Add subvolume clone and snapshot rm tests when osd is full

Fixes: https://tracker.ceph.com/issues/55976
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit a64f049614454e98920e2abcb685ca61fa49a148)

Kotresh HR [Tue, 16 Aug 2022 11:38:16 +0000 (17:08 +0530)]

mgr/volumes: Better handle config file on osd full scenario

'metadata_mgr.flush()' used to truncate the config file
before flushing the new config data. This could lead to an
empty config file when there is no space to write the new config
data. This patch handles this scenario by writing the data to a
temporary file and renaming it to the config file, so the
existing config file is never truncated.

Also, a number of places weren't handling
'MetadataMgrException' because of this. Fixed those.

Fixes: https://tracker.ceph.com/issues/55976
Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit ec57215d508e6cb5b3a4d84fd6a3a5b0c9b96c71)
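
The write-to-temporary-file-then-rename technique described in this commit can be sketched in Python as follows. This is a local-filesystem illustration under assumed names (`flush_config`, the `.tmp-config-` prefix); the actual mgr/volumes code operates on CephFS and is not reproduced here.

```python
import os
import tempfile


def flush_config(path, data):
    """Replace the file at path with data without truncating it first."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-config-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic on POSIX filesystems
    except BaseException:
        # On any failure (e.g. no space left) the original file is untouched;
        # only the temporary file is cleaned up.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the write fails partway, readers still see the old, complete config file rather than an empty one, which is the failure mode the commit describes.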

Nikhilkumar Shelke [Thu, 5 May 2022 07:02:31 +0000 (12:32 +0530)]

qa: display in-progress clones for a snapshot

If any clone is in pending or in-progress state then
show these clones in 'fs subvolume snapshot info'
command output.

Fixes: https://tracker.ceph.com/issues/55041
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit f4c6bdb3c6418efbf261bdce6e7f1b5753a61d7c)

Nikhilkumar Shelke [Thu, 5 May 2022 08:01:24 +0000 (13:31 +0530)]

docs: display in-progress clones for a snapshot

If any clone is in pending or in-progress state then
show these clones in 'fs subvolume snapshot info'
command output. This field only exists if clones are
in pending or in progress state.

Fixes: https://tracker.ceph.com/issues/55041
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit a8b819da71804868d83bd9775c41ede39b1b65a7)

commit | commitdiff | tree

Nikhilkumar Shelke [Thu, 5 May 2022 06:56:03 +0000 (12:26 +0530)]

mgr/volumes: display in-progress clones for a snapshot

If any clone is in pending or in-progress state then
show these clones in 'fs subvolume snapshot info'
command output.

Fixes: https://tracker.ceph.com/issues/55041
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit 099efb424977f86597826e3f56734b3deddfd0dc)

commit | commitdiff | tree

Nizamudeen A [Thu, 1 Sep 2022 06:23:11 +0000 (11:53 +0530)]

Merge pull request #47887 from rhcs-dashboard/wip-57356-quincy

quincy: mgr/dashboard: ensure limit 0 returns 0 images

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

Nizamudeen A [Thu, 1 Sep 2022 06:21:56 +0000 (11:51 +0530)]

Merge pull request #47409 from rhcs-dashboard/wip-56567-quincy

quincy: mgr/dashboard: rbd striping setting pre-population and pop-over

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: vrushch <NOT@FOUND>

commit | commitdiff | tree

Nizamudeen A [Tue, 16 Aug 2022 15:39:25 +0000 (21:09 +0530)]

install-deps: script exit on /ValueError: in centos_stream8

This is happening locally as well as in our ceph-dev runs: https://github.com/rhcs-dashboard/ceph-dev/runs/7850564011

Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit b73d7d22d4dad5188d06fdec4892148af0757dc5)

commit | commitdiff | tree

Pere Diaz Bou [Thu, 18 Aug 2022 11:34:15 +0000 (13:34 +0200)]

mgr/dashboard: ensure limit 0 returns 0 images

Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
(cherry picked from commit a80c058ad21127bee09b4a67886745d880799a10)

commit | commitdiff | tree

Yuri Weinstein [Wed, 31 Aug 2022 14:28:49 +0000 (07:28 -0700)]

Merge pull request #47747 from kotreshhr/wip-57112-quincy

quincy: mgr/volumes: prevent intermittent ParsingError failure in "clone cancel"

Reviewed-by: Venky Shankar <vshankar@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Wed, 31 Aug 2022 14:27:57 +0000 (07:27 -0700)]

Merge pull request #47734 from neesingh-rh/wip-57200-quincy

quincy: mgr/snap_schedule: replace .snap with the client configured snap dir name

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

commit | commitdiff | tree

Adam King [Tue, 30 Aug 2022 13:04:15 +0000 (09:04 -0400)]

Merge pull request #47858 from adk3798/quincy-fix-tox-mgr

quincy: mgr/orchestrator/tests: don't match exact whitespace in table output

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>

commit | commitdiff | tree

Zac Dover [Mon, 29 Aug 2022 00:39:51 +0000 (10:39 +1000)]

doc/mgr: update prompts in dboard.rst includes

This PR adds unselectable prompts to three files that are
transcluded in the doc/mgr/dashboard.rst file. These three
files are:

1. debug.inc.rst
2. feature_toggles.inc.rst
3. motd.inc.rst

The addition of unselectable prompts to these three files
completes the work begun in PR#47810 (d8064b4), which sought
to bring dashboard.rst into line with the unselectable prompt
standard introduced by Kefu Chai in 2020.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit fc70ccde758cb1f7c03f208115b88a4ef325aed7)

commit | commitdiff | tree

Nizamudeen A [Tue, 30 Aug 2022 11:05:05 +0000 (16:35 +0530)]

Merge pull request #47623 from aaSharma14/wip-57137-quincy

quincy: mgr/dashboard: add flag to automatically deploy loki/promtail service at bootstrap

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

Aswin Toni [Tue, 23 Aug 2022 08:30:12 +0000 (10:30 +0200)]

ceph-mixin: fix CephNodeNetworkPacket alerts

Signed-off-by: Aswin Toni <aswin.toni@cern.ch>
(cherry picked from commit 351e1ac63950164ea5f08a6bfc7c14af586bb208)

commit | commitdiff | tree

Aswin Toni [Thu, 18 Aug 2022 14:21:36 +0000 (16:21 +0200)]

ceph-mixin: fix config inheritance

Signed-off-by: Aswin Toni <aswin.toni@cern.ch>
(cherry picked from commit 35183140f60fe445de8d256fa08639b288b6e768)

commit | commitdiff | tree

Arthur Outhenin-Chalandre [Thu, 18 Aug 2022 11:37:31 +0000 (13:37 +0200)]

ceph-mixin: fix PATH issues with jsonnet-bundler

In 4a3afcf, the $PATH is set for the test, but we cannot set multiple
properties with a single `set_property()` cmake command. We fix that by
adding the installation path of jsonnet-bundler
(CMAKE_CURRENT_BINARY_DIR) to the $PATH used for every tox test.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
Co-Authored-By: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d46e14c71bffda1381dac7da244ab8347d035769)

commit | commitdiff | tree

Aswin Toni [Tue, 16 Aug 2022 14:17:21 +0000 (16:17 +0200)]

ceph-mixin: Remove jsonnet building

Signed-off-by: Aswin Toni <aswin.toni@cern.ch>
(cherry picked from commit 2e0e684fc20cbf6c2e48215b431419c8573b3863)

commit | commitdiff | tree

Aswin Toni [Tue, 16 Aug 2022 13:38:18 +0000 (15:38 +0200)]

prometheus: add multicluster support to alerts

Signed-off-by: Aswin Toni <aswin.toni@cern.ch>
(cherry picked from commit 5cdc1c62c5de52a1f777f3d83fc85c3fc144db38)

commit | commitdiff | tree

Anthony D'Atri [Tue, 26 Jul 2022 16:06:27 +0000 (09:06 -0700)]

monitoring/ceph-mixin: clean up prometheus_alerts.yml

Signed-off-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
(cherry picked from commit 9b6597446814ebced6ee6d963af18ce1a915e0bf)

commit | commitdiff | tree

Tatjana Dehler [Thu, 28 Jul 2022 13:15:32 +0000 (15:15 +0200)]

monitoring/ceph-mixin: OSD overview typo fix

Correct a wrongly set bracket on ceph-dashboard -> OSD Overview ->
OSD Objectstore Types resulting in a parser error.

Fixes: https://tracker.ceph.com/issues/56948
Signed-off-by: Tatjana Dehler <tdehler@suse.com>
(cherry picked from commit 8faaca2082eeab09eaacfbe3180196c6ce065916)
