git.apps.os.sepia.ceph.com Git - ceph-ci.git/log
3 years agomds: notify clients if the session has already opened
Xiubo Li [Wed, 9 Mar 2022 07:42:56 +0000 (15:42 +0800)]
mds: notify clients if the session has already opened

If the connection is accidentally closed due to a socket issue or
something else, the client will try to reopen sessions that are already
open. For now the MDS just discards such a session open request.

But the client will keep waiting for the reply from the MDS forever.

We need to tell the clients what has happened instead of discarding the
request directly. When the client gets the session open reply, it can
do whatever is needed.

Fixes: https://tracker.ceph.com/issues/53911
Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agomds: remove the useless session seq for stale MClientSession
Xiubo Li [Mon, 21 Mar 2022 02:26:05 +0000 (10:26 +0800)]
mds: remove the useless session seq for stale MClientSession

Client side never uses this seq.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agoclient: skip reopening the opened or is under opening sessions
Xiubo Li [Wed, 9 Mar 2022 08:07:50 +0000 (16:07 +0800)]
client: skip reopening the opened or is under opening sessions

Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agoMerge pull request #45114 from lxbsz/wip-54362
Venky Shankar [Sun, 17 Apr 2022 09:54:57 +0000 (15:24 +0530)]
Merge pull request #45114 from lxbsz/wip-54362

client: do not release the global snaprealm until unmounting

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45688 from lxbsz/fwd
Venky Shankar [Sun, 17 Apr 2022 09:53:15 +0000 (15:23 +0530)]
Merge pull request #45688 from lxbsz/fwd

client: stop forwarding the request when exceeding 256 times

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45107 from lxbsz/wip-54345
Venky Shankar [Sat, 16 Apr 2022 15:20:17 +0000 (20:50 +0530)]
Merge pull request #45107 from lxbsz/wip-54345

mds: reset heartbeat when fetching or committing dentries

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45915 from ljflores/wip-dashboard-cypress-password
Laura Flores [Sat, 16 Apr 2022 04:58:58 +0000 (23:58 -0500)]
Merge pull request #45915 from ljflores/wip-dashboard-cypress-password

3 years agomgr/dashboard/frontend: fix cypress env password
Laura Flores [Thu, 14 Apr 2022 20:42:05 +0000 (20:42 +0000)]
mgr/dashboard/frontend: fix cypress env password

"LOGIN_PASSWORD" should be "LOGIN_PWD". Bug introduced
in e9128c4.

Fixes: https://tracker.ceph.com/issues/55323
Signed-off-by: Laura Flores <lflores@redhat.com>
3 years agoMerge pull request #45765 from m-ildefons/1196785-cephadm-status-trace
Adam King [Fri, 15 Apr 2022 15:05:05 +0000 (11:05 -0400)]
Merge pull request #45765 from m-ildefons/1196785-cephadm-status-trace

cephadm: avoid crashing on expected non-zero exit

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Michael Fritch <mfritch@suse.com>
3 years agoMerge pull request #45863 from adk3798/mgr-fail-retry
Adam King [Fri, 15 Apr 2022 15:02:55 +0000 (11:02 -0400)]
Merge pull request #45863 from adk3798/mgr-fail-retry

mgr/cephadm: retry mgr fail over in case of transient failure

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>
3 years agoMerge pull request #45898 from idryomov/wip-resurrect-mutex-debug
Ilya Dryomov [Fri, 15 Apr 2022 09:09:42 +0000 (11:09 +0200)]
Merge pull request #45898 from idryomov/wip-resurrect-mutex-debug

cmake: resurrect mutex debugging in all Debug builds

Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Kefu Chai <tchaikov@gmail.com>
3 years agoclient: do not release the global snaprealm until unmounting
Xiubo Li [Tue, 22 Feb 2022 03:46:44 +0000 (11:46 +0800)]
client: do not release the global snaprealm until unmounting

The global snaprealm was created and then destroyed immediately
every time it was updated.

Fixes: https://tracker.ceph.com/issues/54362
Signed-off-by: Xiubo Li <xiubli@redhat.com>
3 years agoMerge pull request #45394 from iqbalredkhan/amrojiqbal
Ali Maredia [Thu, 14 Apr 2022 13:35:00 +0000 (09:35 -0400)]
Merge pull request #45394 from iqbalredkhan/amrojiqbal

cls/rgw : Add missing classes in < #include "cls/rgw/cls_rgw_types.h">

Reviewed-by: Ali Maredia <amaredia@redhat.com>
3 years agoclient: stop forwarding the request when exceeding 256 times
Xiubo Li [Tue, 29 Mar 2022 08:45:12 +0000 (16:45 +0800)]
client: stop forwarding the request when exceeding 256 times

The type of 'num_fwd' in ceph 'MClientRequestForward' is 'int32_t',
while in 'ceph_mds_request_head' the type is '__u8'. So if the
request bounces between MDSes more than 256 times, the client
will get stuck.

In this case it's usually a bug in the MDS, and continuing to bounce the
request makes no sense.

Fixes: https://tracker.ceph.com/issues/55129
Signed-off-by: Xiubo Li <xiubli@redhat.com>
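
The width mismatch described in this commit message can be illustrated with a small Python sketch. It only models how a wider counter wraps when stored in an 8-bit field; the helper names and the exact cut-off are assumptions for illustration and are not taken from the Ceph client code.

```python
# Hypothetical model of an 8-bit forward counter; all names are illustrative.
U8_MAX = 0xFF  # largest value an unsigned 8-bit field such as __u8 can hold


def store_as_u8(num_fwd: int) -> int:
    """Truncate a wider counter the way an 8-bit field would store it."""
    return num_fwd & U8_MAX


def should_stop_forwarding(num_fwd: int) -> bool:
    """Assumed policy: give up once the counter no longer fits in 8 bits."""
    return num_fwd > U8_MAX


# After 256 forwards the truncated value is back at zero, so the two sides
# would no longer agree on how often the request has bounced.
wrapped = store_as_u8(256)
```

Stopping at the boundary, rather than letting the value wrap, is what keeps both sides in agreement in this model.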
3 years agoMerge pull request #45549 from nmshelke/fuse-linux-only
Venky Shankar [Thu, 14 Apr 2022 12:08:20 +0000 (17:38 +0530)]
Merge pull request #45549 from nmshelke/fuse-linux-only

ceph-fuse: restrict already_fuse_mounted function only for linux

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45636 from joscollin/wip-B54971-rank0-stale-perf-stats-assertion...
Venky Shankar [Thu, 14 Apr 2022 12:06:42 +0000 (17:36 +0530)]
Merge pull request #45636 from joscollin/wip-B54971-rank0-stale-perf-stats-assertion-error

qa: make test_perf_stats_stale_metrics check only the clients created for the tests

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45663 from lxbsz/client_cleanup_dl
Venky Shankar [Thu, 14 Apr 2022 12:04:31 +0000 (17:34 +0530)]
Merge pull request #45663 from lxbsz/client_cleanup_dl

client: remove expect_null and cleanup the code get_or_create()

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45683 from kotreshhr/subvolume-retainsnap-rm-fix
Venky Shankar [Thu, 14 Apr 2022 12:01:46 +0000 (17:31 +0530)]
Merge pull request #45683 from kotreshhr/subvolume-retainsnap-rm-fix

mgr/volumes: Fix idempotent subvolume rm

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45897 from idryomov/wip-rbd-mirror-test-timer-lock
Ilya Dryomov [Thu, 14 Apr 2022 05:50:15 +0000 (07:50 +0200)]
Merge pull request #45897 from idryomov/wip-rbd-mirror-test-timer-lock

test/rbd_mirror: grab timer lock before calling add_event_after()

Reviewed-by: Christopher Hoffman <choffman@redhat.com>
3 years agoMerge pull request #45571 from rzarzynski/wip-doc-mempool-acct
Anthony D'Atri [Thu, 14 Apr 2022 02:18:26 +0000 (19:18 -0700)]
Merge pull request #45571 from rzarzynski/wip-doc-mempool-acct

doc/dev: Define what mempools we use in BlueStore

3 years agoMerge pull request #45884 from markhpc/wip-bs-avl-cursor-fix
Yuri Weinstein [Wed, 13 Apr 2022 23:18:47 +0000 (16:18 -0700)]
Merge pull request #45884 from markhpc/wip-bs-avl-cursor-fix

os/bluestore: Always update the cursor position in AVL near-fit search.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
3 years agomgr/cephadm: retry mgr fail over in case of transient failure
Adam King [Mon, 11 Apr 2022 20:57:51 +0000 (16:57 -0400)]
mgr/cephadm: retry mgr fail over in case of transient failure

Fixes: https://tracker.ceph.com/issues/55279
Signed-off-by: Adam King <adking@redhat.com>
3 years agodoc/dev: define what mempools we use in bluestore
Anthony D'Atri [Wed, 13 Apr 2022 17:35:22 +0000 (10:35 -0700)]
doc/dev: define what mempools we use in bluestore

doc/dev: define what mempools we use in bluestore

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
3 years agoMerge pull request #45851 from rkachach/fix_issue_53528
Adam King [Wed, 13 Apr 2022 18:34:41 +0000 (14:34 -0400)]
Merge pull request #45851 from rkachach/fix_issue_53528

mgr/cephadm: skip loopback devices when gathering facts

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45796 from asm0deuz/issue_54618_ssh_config
Adam King [Wed, 13 Apr 2022 18:34:06 +0000 (14:34 -0400)]
Merge pull request #45796 from asm0deuz/issue_54618_ssh_config

mgr/cephadm: ceph cephadm set-user does not reflect the user change in ssh-config

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45768 from rkachach/fix_issue_55174
Adam King [Wed, 13 Apr 2022 18:33:21 +0000 (14:33 -0400)]
Merge pull request #45768 from rkachach/fix_issue_55174

mgr/cephadm: Adding cephadm networking configuration checks + refactoring

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #43796 from windgmbh/cephadm-sysctl-fhs-fix
Adam King [Wed, 13 Apr 2022 18:24:36 +0000 (14:24 -0400)]
Merge pull request #43796 from windgmbh/cephadm-sysctl-fhs-fix

cephadm: Fix sysctl.d location

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
3 years agocephadm: avoid crashing on expected non-zero exit
Moritz Röhrich [Mon, 21 Mar 2022 16:32:25 +0000 (17:32 +0100)]
cephadm: avoid crashing on expected non-zero exit

- Avoid crashing when a call out to an external program expectedly does
  not return exit status zero.

There are programs that communicate information other than error/no
error through exit status. E.g. `systemctl status` will return different
exit codes depending on the actual status of the units in question.
In cases where this is expected, crashing with a RuntimeError exception
is inappropriate and should be avoided.

Fixes: https://tracker.ceph.com/issues/55117
Signed-off-by: Moritz Röhrich <moritz.rohrich@suse.com>
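
One way to picture the behaviour described above is a wrapper that lets the caller declare which exit statuses are expected. This is a minimal sketch and not cephadm's actual helper: the function name `call_allowing` and its `ok_codes` parameter are hypothetical, and a short Python one-liner stands in for a status-style tool.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def call_allowing(cmd: Sequence[str],
                  ok_codes: Sequence[int] = (0,)) -> Tuple[str, int]:
    """Run cmd and raise only if the exit status is not an expected one."""
    proc = subprocess.run(list(cmd), capture_output=True, text=True)
    if proc.returncode not in ok_codes:
        raise RuntimeError(
            f"{cmd[0]} failed with exit status {proc.returncode}")
    return proc.stdout, proc.returncode


# Stand-in for a tool that reports state through a non-zero exit status.
status_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
out, code = call_allowing(status_cmd, ok_codes=(0, 3))
```

With the default `ok_codes=(0,)` the same command would still raise, so callers opt in to non-zero statuses explicitly.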
3 years agocmake: resurrect mutex debugging in all Debug builds
Ilya Dryomov [Wed, 13 Apr 2022 13:42:21 +0000 (15:42 +0200)]
cmake: resurrect mutex debugging in all Debug builds

Commit 403f1ec2888a ("cmake: make "WITH_CEPH_DEBUG_MUTEX" depend on
CMAKE_BUILD_TYPE") made WITH_CEPH_DEBUG_MUTEX depend on build type
being set to Debug, in CMakeLists.txt.  However, if CMAKE_BUILD_TYPE
isn't specified by the user, we may still set it to Debug later, in
src/CMakeLists.txt, and in that case WITH_CEPH_DEBUG_MUTEX doesn't
get enabled.  The result is that

  $ do_cmake.sh -DCMAKE_BUILD_TYPE=Debug ...

debug builds have mutex debugging enabled, while

  $ do_cmake.sh ...

builds, which are supposed to be the same, don't.  Jenkins builders
don't pass -DCMAKE_BUILD_TYPE=Debug so that commit effectively turned
off all ceph_mutex_is_locked* asserts in "make check".

Fixes: https://tracker.ceph.com/issues/55318
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agotest/rbd_mirror: grab timer lock before calling add_event_after()
Ilya Dryomov [Wed, 13 Apr 2022 13:24:04 +0000 (15:24 +0200)]
test/rbd_mirror: grab timer lock before calling add_event_after()

add_event_after() expects an externally provided mutex to be held
for the call.  This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").

Fixes: https://tracker.ceph.com/issues/55317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #45859 from rhcs-dashboard/add-cypress-env
Ernesto Puerta [Wed, 13 Apr 2022 12:09:21 +0000 (14:09 +0200)]
Merge pull request #45859 from rhcs-dashboard/add-cypress-env

mgr/dashboard: Add cypress env for login credentials

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agoMerge pull request #44236 from CongMinYin/fix-pwl-cache-lose
Ilya Dryomov [Wed, 13 Apr 2022 10:12:51 +0000 (12:12 +0200)]
Merge pull request #44236 from CongMinYin/fix-pwl-cache-lose

rbd: add persistent-cache command group

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #45059 from ceph/wip-merge_message_browser-master
Ernesto Puerta [Wed, 13 Apr 2022 08:37:30 +0000 (10:37 +0200)]
Merge pull request #45059 from ceph/wip-merge_message_browser-master

doc: browser extension for merge message

Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #45081 from ceph/epuertat-patch-2
Ernesto Puerta [Wed, 13 Apr 2022 08:37:18 +0000 (10:37 +0200)]
Merge pull request #45081 from ceph/epuertat-patch-2

doc: fix format issues

Reviewed-by: anthonyeleven <NOT@FOUND>
Reviewed-by: Guillaume Abrioux <gabrioux@redhat.com>
3 years agoMerge pull request #45083 from ceph/epuertat-patch-4
Ernesto Puerta [Wed, 13 Apr 2022 08:37:01 +0000 (10:37 +0200)]
Merge pull request #45083 from ceph/epuertat-patch-4

doc: fix config option links

Reviewed-by: anthonyeleven <NOT@FOUND>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
3 years agoMerge pull request #45574 from cyx1231st/wip-crimson-refactor-with-device
Samuel Just [Wed, 13 Apr 2022 05:05:43 +0000 (22:05 -0700)]
Merge pull request #45574 from cyx1231st/wip-crimson-refactor-with-device

crimson/os/seastore: introduce the generic Device class

Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agoMerge pull request #45775 from liu-chunmei/seastore-zero
Samuel Just [Wed, 13 Apr 2022 03:20:54 +0000 (20:20 -0700)]
Merge pull request #45775 from liu-chunmei/seastore-zero

crimson: seastore add OP_ZERO support

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
3 years agoos/bluestore: Always update the cursor position in AVL near-fit search.
Mark Nelson [Wed, 13 Apr 2022 00:53:56 +0000 (00:53 +0000)]
os/bluestore: Always update the cursor position in AVL near-fit search.

Signed-off-by: Mark Nelson <mnelson@redhat.com>
3 years agocrimson: Implement ObjectDataHandler::zero using hole punching
Samuel Just [Thu, 7 Apr 2022 21:30:32 +0000 (21:30 +0000)]
crimson: Implement ObjectDataHandler::zero using hole punching

Trim already treats Reserved regions as zero, let's use that
for zero as well.

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson/os/seastore/object_data_handler: don't return empty buffers from split_pin*
Samuel Just [Fri, 8 Apr 2022 09:20:49 +0000 (02:20 -0700)]
crimson/os/seastore/object_data_handler: don't return empty buffers from split_pin*

Always return std::nullopt rather than an empty buffer -- this way users
can rely on this as an invariant.

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agotest/crimson/seastore: improve test_seastore zero() coverage
Samuel Just [Thu, 7 Apr 2022 20:48:38 +0000 (20:48 +0000)]
test/crimson/seastore: improve test_seastore zero() coverage

Signed-off-by: Samuel Just <sjust@redhat.com>
3 years agocrimson: add seastore::zero unit test
chunmei-liu [Wed, 6 Apr 2022 23:37:23 +0000 (16:37 -0700)]
crimson: add seastore::zero unit test

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
3 years agocrimson: seastore add OP_ZERO support
chunmei-liu [Sat, 2 Apr 2022 03:39:15 +0000 (20:39 -0700)]
crimson: seastore add OP_ZERO support

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
3 years agoMerge pull request #45756 from rzarzynski/wip-common-no-cpp17-second_round
Yuri Weinstein [Tue, 12 Apr 2022 20:51:58 +0000 (13:51 -0700)]
Merge pull request #45756 from rzarzynski/wip-common-no-cpp17-second_round

common/bl: fix FTBFS on C++11 due to C++17's if-with-initializer

Reviewed-by: Kefu Chai <kchai@redhat.com>
3 years agoMerge pull request #45819 from ljflores/wip-anonymize-telemetry-host-names
Yuri Weinstein [Tue, 12 Apr 2022 19:12:26 +0000 (12:12 -0700)]
Merge pull request #45819 from ljflores/wip-anonymize-telemetry-host-names

mgr/telemetry: anonymize daemons in telemetry `perf_counters`

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>
3 years agoMerge pull request #45802 from ljflores/wip-config-dump-yaml
Yuri Weinstein [Tue, 12 Apr 2022 19:10:48 +0000 (12:10 -0700)]
Merge pull request #45802 from ljflores/wip-config-dump-yaml

ceph.in: clarify the usage of `--format` in the ceph command

Reviewed-by: Vikhyat Umrao <vikhyat@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
3 years agoMerge pull request #45531 from ifed01/wip-ifed-alloc-replay-with-bin
Neha Ojha [Tue, 12 Apr 2022 18:06:19 +0000 (11:06 -0700)]
Merge pull request #45531 from ifed01/wip-ifed-alloc-replay-with-bin

os/bluestore: proper locking for Allocators' dump

Reviewed-by: Adam Kupczyk <akucpzyk@redhat.com>
3 years agolibrbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest
Ilya Dryomov [Sun, 10 Apr 2022 16:13:48 +0000 (18:13 +0200)]
librbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest

"m_image_ctx.features && RBD_FEATURE_DIRTY_CACHE" is obviously wrong
because it would pretty much always be true.  However, even if bitwise
AND was used, this check would still be dead because DiscardRequest is
only invoked if RBD_FEATURE_DIRTY_CACHE is enabled:

  int invalidate_cache(ImageCtx *ictx)
  {
    ...
    // Delete writeback cache if it is not initialized
    if ((!ictx->exclusive_lock ||
         !ictx->exclusive_lock->is_lock_owner()) &&
        ictx->test_features(RBD_FEATURE_DIRTY_CACHE)) {
      C_SaferCond ctx3;
      ictx->plugin_registry->discard(&ctx3);
      r = ctx3.wait();
    }
    ...
  }

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
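
The difference between the logical and the bitwise check can be shown with a short Python analogy. The bit position used here is a placeholder chosen for the sketch, not the real value of RBD_FEATURE_DIRTY_CACHE, and both function names are hypothetical.

```python
DIRTY_CACHE_BIT = 1 << 3  # placeholder bit position, not the real constant


def logical_check(features: int) -> bool:
    # Mirrors `features && FLAG`: true whenever both operands are non-zero.
    return bool(features) and bool(DIRTY_CACHE_BIT)


def bitwise_check(features: int) -> bool:
    # Mirrors `features & FLAG`: true only when the flag's bit is set.
    return (features & DIRTY_CACHE_BIT) != 0


# An image with some other feature enabled but not the dirty-cache bit.
features = 0b0001
logical_result = logical_check(features)
bitwise_result = bitwise_check(features)
```

In this model the logical form passes for any non-zero feature mask, which matches the "pretty much always true" observation in the message.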
3 years agolibrbd/cache/pwl: don't crash if cache file removal fails
Ilya Dryomov [Sun, 10 Apr 2022 14:57:24 +0000 (16:57 +0200)]
librbd/cache/pwl: don't crash if cache file removal fails

The non-ec overload will throw fs::filesystem_error on any error
(e.g. EPERM due to unprivileged "rbd persistent-cache invalidate"
being brought up against a privileged workload).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
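
The commit concerns the C++ filesystem API, where the overload without an error-code argument throws. As a rough Python analogy only, the sketch below turns a failed removal into a negative errno return value instead of letting the exception propagate; `remove_cache_file` is a hypothetical name and the return convention is an assumption.

```python
import errno
import os
import tempfile


def remove_cache_file(path: str) -> int:
    """Remove path; report failure as a negative errno instead of raising."""
    try:
        os.remove(path)
    except OSError as e:
        return -e.errno
    return 0


# Removing a file that does not exist reports an error code, not a crash.
with tempfile.TemporaryDirectory() as d:
    result = remove_cache_file(os.path.join(d, "missing.pool"))
```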
3 years agorbd: add persistent-cache flush command
Yin Congmin [Mon, 27 Dec 2021 07:06:49 +0000 (15:06 +0800)]
rbd: add persistent-cache flush command

Add a flush command so that users can manually flush the cache.

[ idryomov: error messages, incorporate doc and help.t hunks, drop
  do_persistent_cache_flush() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agorbd: rename image-cache invalidate command
Yin Congmin [Mon, 27 Dec 2021 03:50:18 +0000 (11:50 +0800)]
rbd: rename image-cache invalidate command

Rename the image-cache command to persistent-cache and refactor the code
of the invalidate command.

[ idryomov: error message, incorporate doc and help.t hunks, drop
  do_persistent_cache_invalidate() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agolibrbd/cache/pwl: rename persistent cache key
Yin Congmin [Wed, 22 Dec 2021 07:07:11 +0000 (15:07 +0800)]
librbd/cache/pwl: rename persistent cache key

librbd "internal" metadata keys were changed to use the ".rbd" prefix. Change
the persistent cache key to ".rbd" too.
The name of the persistent cache key is IMAGE_CACHE_STATE. Since
this key is planned to be used outside the pwl directory, it seems
more appropriate to rename it to the clearer PERSISTENT_CACHE_STATE.

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
3 years agoMerge pull request #45684 from CongMinYin/pwl-add-stats
Ilya Dryomov [Tue, 12 Apr 2022 14:48:54 +0000 (16:48 +0200)]
Merge pull request #45684 from CongMinYin/pwl-add-stats

librbd/cache/pwl: add pwl metrics in "rbd status" display

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agocrimson/os/seastore/EPM: use DEVICE_ID_GLOBAL_MAX for devices
Yingxin Cheng [Tue, 12 Apr 2022 13:48:14 +0000 (21:48 +0800)]
crimson/os/seastore/EPM: use DEVICE_ID_GLOBAL_MAX for devices

Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
3 years agoMerge pull request #45357 from cbodley/wip-54531
Casey Bodley [Tue, 12 Apr 2022 13:34:58 +0000 (09:34 -0400)]
Merge pull request #45357 from cbodley/wip-54531

rgw: disable RGWDataChangesLog::add_entry() when log_data is off

Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
3 years agomgr/cephadm: Adding cephadm networking configuration checks+refactoring
Redouane Kachach [Fri, 1 Apr 2022 16:03:42 +0000 (18:03 +0200)]
mgr/cephadm: Adding cephadm networking configuration checks+refactoring
Fixes: https://tracker.ceph.com/issues/55174
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
3 years agoos/bluestore: proper locking for Allocators' dump methods
Igor Fedotov [Mon, 21 Mar 2022 11:58:18 +0000 (14:58 +0300)]
os/bluestore: proper locking for Allocators' dump methods

Plus renaming parametrized dump to foreach()
Fixes: https://tracker.ceph.com/issues/54973
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
3 years agotest/allocator_replay_test: introduce check for duplicates
Igor Fedotov [Fri, 18 Mar 2022 19:13:12 +0000 (22:13 +0300)]
test/allocator_replay_test: introduce check for duplicates

This performs a check for duplicates using the free dump.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
3 years agotest/allocator_replay_test: introduce binary format for free list dump
Igor Fedotov [Fri, 18 Mar 2022 11:35:16 +0000 (14:35 +0300)]
test/allocator_replay_test: introduce binary format for free list dump

Add a new command to export the free dump in binary format, plus the
capability to use the new format for replay.
This dramatically speeds up loading of large dumps.

Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
3 years agoMerge pull request #45677 from ideepika/wip-ninja-default
Ilya Dryomov [Tue, 12 Apr 2022 08:57:30 +0000 (10:57 +0200)]
Merge pull request #45677 from ideepika/wip-ninja-default

ceph.spec: make ninja-build package install always

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Tim Serong <tserong@suse.com>
3 years agoMerge pull request #45820 from liu-chunmei/crimson-do_osd_ops_params_t
Liu-Chunmei [Tue, 12 Apr 2022 04:57:36 +0000 (21:57 -0700)]
Merge pull request #45820 from liu-chunmei/crimson-do_osd_ops_params_t

crimson: keep do_osd_ops_params_t alive when call do_osd_ops

Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #45137 from myoungwon/wip-51627-2
Samuel Just [Mon, 11 Apr 2022 23:31:03 +0000 (16:31 -0700)]
Merge pull request #45137 from myoungwon/wip-51627-2

osd: return appropriate error if the object is not manifest

Reviewed-by: Samuel Just <sjust@redhat.com>
3 years agocrimson: keep do_osd_ops_params_t alive when call do_osd_ops
chunmei-liu [Fri, 8 Apr 2022 00:35:37 +0000 (17:35 -0700)]
crimson: keep do_osd_ops_params_t alive when call do_osd_ops

otherwise stack-under-overflow

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
3 years agoMerge pull request #45744 from idryomov/wip-stretch-last-force-resend
Yuri Weinstein [Mon, 11 Apr 2022 22:58:23 +0000 (15:58 -0700)]
Merge pull request #45744 from idryomov/wip-stretch-last-force-resend

mon/OSDMonitor: properly set last_force_op_resend in stretch mode

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
3 years agoMerge pull request #45670 from pdvian/wip-fix-mgr-daemon-state
Yuri Weinstein [Mon, 11 Apr 2022 22:57:36 +0000 (15:57 -0700)]
Merge pull request #45670 from pdvian/wip-fix-mgr-daemon-state

mgr, mon: Keep upto date metadata with mgr for MONs

Reviewed-by: Laura Flores <lflores@redhat.com>
3 years agoMerge pull request #45599 from amathuria/amathuri-54994-fix
Yuri Weinstein [Mon, 11 Apr 2022 22:56:54 +0000 (15:56 -0700)]
Merge pull request #45599 from amathuria/amathuri-54994-fix

osd: add scrub duration for scrubs after recovery

Reviewed-by: Ronen Friedman <rfriedma@redhat.com>
3 years agoMerge pull request #45547 from nkshirsagar/master
Yuri Weinstein [Mon, 11 Apr 2022 22:55:43 +0000 (15:55 -0700)]
Merge pull request #45547 from nkshirsagar/master

Catch exception if thrown by __generate_command_map()

Reviewed-by: Laura Flores <lflores@redhat.com>
3 years agoMerge pull request #45505 from pdvian/wip-fix-daemon-version
Yuri Weinstein [Mon, 11 Apr 2022 22:51:26 +0000 (15:51 -0700)]
Merge pull request #45505 from pdvian/wip-fix-daemon-version

mgr, mgr/prometheus: Fix regression with prometheus metrics

Reviewed-by: Laura Flores <lflores@redhat.com>
3 years agoMerge pull request #45822 from liu-chunmei/crimson-enametoolong
Samuel Just [Mon, 11 Apr 2022 21:35:06 +0000 (14:35 -0700)]
Merge pull request #45822 from liu-chunmei/crimson-enametoolong

crimson: check -ENAMETOOLONG for Name, Locator, NameSpace

Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
3 years agoMerge pull request #45584 from votdev/issue_54983_default_rgw_daemon
Ernesto Puerta [Mon, 11 Apr 2022 18:50:56 +0000 (20:50 +0200)]
Merge pull request #45584 from votdev/issue_54983_default_rgw_daemon

mgr/dashboard: RGW users and buckets tables are empty if the selected gateway is down

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
3 years agoMerge pull request #45790 from rhcs-dashboard/host-toggle-column-fix
Ernesto Puerta [Mon, 11 Apr 2022 18:48:48 +0000 (20:48 +0200)]
Merge pull request #45790 from rhcs-dashboard/host-toggle-column-fix

mgr/dashboard: datatable in Cluster Host page hides wrong column on selection

Reviewed-by: Sarthak0702 <NOT@FOUND>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agoMerge pull request #45007 from VallariAg/dashboard-complexity-cleanup
Ernesto Puerta [Mon, 11 Apr 2022 18:45:58 +0000 (20:45 +0200)]
Merge pull request #45007 from VallariAg/dashboard-complexity-cleanup

mgr/dashboard: reduce method (cyclomatic) complexity

Reviewed-by: VallariAg <NOT@FOUND>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
3 years agomgr/dashboard:Add cypress env for login credentials
Sarthak0702 [Mon, 11 Apr 2022 18:35:34 +0000 (00:05 +0530)]
mgr/dashboard:Add cypress env for login credentials

Fixes: https://tracker.ceph.com/issues/55270
Signed-off-by: Sarthak0702 <sarthak.dev.0702@gmail.com>
3 years agoApply sysctl.d migration from /usr/lib to /etc
windgmbh [Fri, 12 Nov 2021 15:51:03 +0000 (16:51 +0100)]
Apply sysctl.d migration from /usr/lib to /etc
A fix regarding the SYSCTL_DIR location (#53130) requires migrating
sysctl.d/*.conf files from /usr/lib to /etc.
Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
3 years agoMerge pull request #45070 from rkachach/fix_issue_52042
Adam King [Mon, 11 Apr 2022 17:18:36 +0000 (13:18 -0400)]
Merge pull request #45070 from rkachach/fix_issue_52042

mgr/cephadm: Making default cephadm shell cmd easier

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45056 from ktdreyer/explain-cephadm-tox
Adam King [Mon, 11 Apr 2022 16:51:59 +0000 (12:51 -0400)]
Merge pull request #45056 from ktdreyer/explain-cephadm-tox

cephadm: add comment explaining docker.io grep test

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45815 from cbodley/wip-55232
Casey Bodley [Mon, 11 Apr 2022 16:42:51 +0000 (12:42 -0400)]
Merge pull request #45815 from cbodley/wip-55232

test/rgw: use mock OpsLogSink instead of OpsLogSocket

Reviewed-by: Yuval Lifshitz <ylifshit@redhat.com>
3 years agoMerge pull request #45770 from yuvalif/wip-yuval-fix-54416
Casey Bodley [Mon, 11 Apr 2022 16:40:51 +0000 (12:40 -0400)]
Merge pull request #45770 from yuvalif/wip-yuval-fix-54416

test/multisite: dont use path when mrun outside of src tree

Reviewed-by: Casey Bodley <cbodley@redhat.com>
3 years agoMerge pull request #45347 from mgfritch/cephadm-config-noreplace
Adam King [Mon, 11 Apr 2022 14:10:55 +0000 (10:10 -0400)]
Merge pull request #45347 from mgfritch/cephadm-config-noreplace

cephadm: preserve `authorized_keys` file during upgrade

Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Tim Serong <tserong@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
3 years agoMerge pull request #45589 from melissa-kun-li/bootstrap_registry_warning
Adam King [Mon, 11 Apr 2022 14:07:50 +0000 (10:07 -0400)]
Merge pull request #45589 from melissa-kun-li/bootstrap_registry_warning

cephadm: show error during bootstrap if private registry cred not provided

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45685 from rkachach/fix_issue_47905
Adam King [Mon, 11 Apr 2022 14:07:07 +0000 (10:07 -0400)]
Merge pull request #45685 from rkachach/fix_issue_47905

mgr/cephadm: improving logging to send errors to stderr

Reviewed-by: Adam King <adking@redhat.com>
3 years agoMerge pull request #45798 from adk3798/iscsi-only-pid-limit
Ilya Dryomov [Mon, 11 Apr 2022 14:02:50 +0000 (16:02 +0200)]
Merge pull request #45798 from adk3798/iscsi-only-pid-limit

cephadm: only apply unlimited pids-limit to iscsi and rgw

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #45689 from vshankar/wip-55110
Venky Shankar [Mon, 11 Apr 2022 13:49:23 +0000 (19:19 +0530)]
Merge pull request #45689 from vshankar/wip-55110

mount.ceph: remove `ms_mode' mount option when switching to old-syntax

Reviewed-by: Venky Shankar <vshankar@redhat.com>
3 years agoceph cephadm set-user does not reflect the user change in ssh-config
Teoman ONAY [Wed, 6 Apr 2022 09:32:17 +0000 (11:32 +0200)]
ceph cephadm set-user does not reflect the user change in ssh-config

Fixes: https://tracker.ceph.com/issues/54618
Signed-off-by: Teoman ONAY <tonay@redhat.com>
3 years agoqa: test_iscsi_pids_limit.sh: increase sleep time
Ilya Dryomov [Mon, 11 Apr 2022 10:45:02 +0000 (12:45 +0200)]
qa: test_iscsi_pids_limit.sh: increase sleep time

It could take longer than 30 seconds to fork off 40000 processes on
a busy system.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agomgr/cephadm: skip loopback devices when gathering facts
Redouane Kachach [Mon, 11 Apr 2022 11:04:13 +0000 (13:04 +0200)]
mgr/cephadm: skip loopback devices when gathering facts
Fixes: https://tracker.ceph.com/issues/53528
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
3 years agorbd: include persistent cache metrics in "rbd status" report
Ilya Dryomov [Sat, 9 Apr 2022 15:48:17 +0000 (17:48 +0200)]
rbd: include persistent cache metrics in "rbd status" report

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agomount.ceph: remove `ms_mode' mount option when switching to old-syntax
Venky Shankar [Tue, 29 Mar 2022 13:18:06 +0000 (09:18 -0400)]
mount.ceph: remove `ms_mode' mount option when switching to old-syntax

... and switch to using v1 addresses (if users haven't specified those
explicitly). Kernel versions <5.11 do not understand the `ms_mode' mount
option, which would result in a mount failure.

Fixes: http://tracker.ceph.com/issues/55110
Signed-off-by: Venky Shankar <vshankar@redhat.com>
3 years agoMerge pull request #45791 from rhcs-dashboard/rm-true
Ernesto Puerta [Mon, 11 Apr 2022 08:26:47 +0000 (10:26 +0200)]
Merge pull request #45791 from rhcs-dashboard/rm-true

build: install-deps failing in docker build

Reviewed-by: David Galloway <dgallowa@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
3 years agorbd: factor out get_percentage() helper
Ilya Dryomov [Sat, 9 Apr 2022 09:06:32 +0000 (11:06 +0200)]
rbd: factor out get_percentage() helper

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agolibrbd/cache/pwl: no need to set clean and empty in remove_pool_file()
Ilya Dryomov [Fri, 8 Apr 2022 13:53:38 +0000 (15:53 +0200)]
librbd/cache/pwl: no need to set clean and empty in remove_pool_file()

It is redundant -- the only caller sets both since commit 6593e31fff18
("librbd/cache/pwl: correct cache state").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agolibrbd/cache/pwl: avoid inconsistencies in ImageCacheState
Ilya Dryomov [Thu, 7 Apr 2022 16:49:46 +0000 (18:49 +0200)]
librbd/cache/pwl: avoid inconsistencies in ImageCacheState

When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconsistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats().  Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.
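
The idea of refreshing the flags and the counters in one step under a
lock can be sketched as follows. This is a minimal Python analogy, not
the librbd C++ code: the class name CacheStateSketch and its methods are
hypothetical, and only the field names clean, empty and dirty_bytes come
from the text above.

```python
import threading


class CacheStateSketch:
    """Hypothetical stand-in for a cache state guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.clean = True
        self.empty = True
        self.dirty_bytes = 0

    def update(self, clean: bool, empty: bool, dirty_bytes: int) -> None:
        # Flags and counters change together while the lock is held, so
        # a reader never sees clean == True with stale dirty_bytes > 0.
        with self._lock:
            self.clean = clean
            self.empty = empty
            self.dirty_bytes = dirty_bytes

    def snapshot(self) -> dict:
        # Readers take the same lock and get a consistent copy.
        with self._lock:
            return {
                "clean": self.clean,
                "empty": self.empty,
                "dirty_bytes": self.dirty_bytes,
            }


if __name__ == "__main__":
    state = CacheStateSketch()
    state.update(clean=False, empty=False, dirty_bytes=4096)
    print(state.snapshot())
```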

update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway.  The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.

Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agolibrbd/cache/pwl: handle invalid ImageCacheState json
Ilya Dryomov [Thu, 7 Apr 2022 14:02:46 +0000 (16:02 +0200)]
librbd/cache/pwl: handle invalid ImageCacheState json

get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception.  Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized.  Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.
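
The pattern of keeping parsing out of the constructor and reporting
failure from a separate init method might look like this in Python. It
is a sketch under assumptions: the class name, the JSON layout and the
boolean return convention are hypothetical, and only the method name
init_from_metadata and the clean/empty fields are taken from the text.

```python
import json


class ImageCacheStateSketch:
    """Hypothetical analogue: the constructor only sets safe defaults."""

    def __init__(self) -> None:
        self.clean = True
        self.empty = True

    def init_from_metadata(self, raw: str) -> bool:
        """Parse raw JSON; return False and leave defaults on any error."""
        try:
            doc = json.loads(raw)
            clean, empty = doc["clean"], doc["empty"]
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, a missing key, or a non-object document.
            return False
        # Enforce types explicitly: a "true" string is not a bool.
        if not isinstance(clean, bool) or not isinstance(empty, bool):
            return False
        self.clean, self.empty = clean, empty
        return True


if __name__ == "__main__":
    state = ImageCacheStateSketch()
    print(state.init_from_metadata('{"clean": false, "empty": false}'))
    print(state.init_from_metadata('{"clean": "false"}'))
```

Because the fields are assigned only after every check has passed, a
failed call cannot leave the object half-initialized.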

While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agolibrbd/cache/pwl: add basic metrics to ImageCacheState
Yin Congmin [Tue, 29 Mar 2022 08:59:05 +0000 (16:59 +0800)]
librbd/cache/pwl: add basic metrics to ImageCacheState

Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.

Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.

Switch to json_spirit for encoding because encode_json encodes a bool
as a "true"/"false" string.
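
Why a string-encoded bool is a problem can be shown with a small Python
sketch. It uses the standard json module rather than json_spirit or
encode_json, and the sample values are made up; only the field names
come from this log.

```python
import json

# Sample state; the values are invented for the example.
state = {"clean": True, "empty": False, "dirty_bytes": 4096}

# Typed encoding: bools stay JSON booleans.
typed = json.dumps(state, sort_keys=True)

# String encoding: bools are turned into "true"/"false" strings first.
stringly = json.dumps(
    {k: (str(v).lower() if isinstance(v, bool) else v) for k, v in state.items()},
    sort_keys=True,
)

if __name__ == "__main__":
    print(typed)
    print(stringly)
    # A consumer that tests truthiness misreads the string form:
    # the non-empty string "false" is truthy.
    print(bool(json.loads(typed)["empty"]), bool(json.loads(stringly)["empty"]))
```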

Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.

[ idryomov: add cached_bytes and hits_partial; report misses and
  miss_bytes instead of respective totals; naming ]

Fixes: https://tracker.ceph.com/issues/50614
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
3 years agoMerge pull request #45832 from anthonyeleven/anthonyeleven/min-alloc-size-53752
zdover23 [Sun, 10 Apr 2022 18:35:54 +0000 (04:35 +1000)]
Merge pull request #45832 from anthonyeleven/anthonyeleven/min-alloc-size-53752

doc/rados/configuration: document min_alloc_size values and space amplification

Reviewed-by: Zac Dover <zac.dover@gmail.com>
3 years agodoc/rados/configuration: document min_alloc_size values and space amplification
Anthony D'Atri [Sat, 9 Apr 2022 03:59:12 +0000 (20:59 -0700)]
doc/rados/configuration: document min_alloc_size values and space amplification

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
3 years agomgr/telemetry: fix daemon anonymization in perf_counters
Yaarit Hatuka [Sat, 9 Apr 2022 01:12:48 +0000 (21:12 -0400)]
mgr/telemetry: fix daemon anonymization in perf_counters

Anonymized daemons now appear with a SHA1 digest instead of their
original identifier, e.g.:

    "perf_counters": {
        "mon.1b1b829ba9298527f4934053a4742a1710937007": {
            "mon": {
                "election_call": {
                    "value": 1
                },
                ...
                "session_trim": {
                    "value": 0
                }
            },
        ...
        }
    ...
    }

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agomgr/telemetry: add anonymize_entity_name function
Yaarit Hatuka [Sat, 9 Apr 2022 00:33:04 +0000 (20:33 -0400)]
mgr/telemetry: add anonymize_entity_name function

The ability to anonymize entity names should have its own function
to prevent duplicate code.
Will clean up in a separate commit.

Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
3 years agocrimson: check -ENAMETOOLONG for Name, Locator, NameSpace
chunmei-liu [Fri, 8 Apr 2022 07:07:53 +0000 (00:07 -0700)]
crimson: check -ENAMETOOLONG for Name, Locator, NameSpace

Signed-off-by: chunmei-liu <chunmei.liu@intel.com>
3 years agoMerge pull request #45749 from josephsawaya/fix-rook-tests
Laura Flores [Fri, 8 Apr 2022 21:36:38 +0000 (16:36 -0500)]
Merge pull request #45749 from josephsawaya/fix-rook-tests

Remove orchestrator from rook task and suite