git.apps.os.sepia.ceph.com Git - ceph.git/log


commit | commitdiff | tree

Cory Snyder [Fri, 15 Apr 2022 00:54:15 +0000 (20:54 -0400)]

bluestore: set upper and lower bounds on rocksdb omap iterators

Limits RocksDB omap Seek operations to the relevant key range of the object's omap.
This prevents RocksDB from unnecessarily iterating over delete range tombstones in
irrelevant omap CF shards. Avoids extreme performance degradation commonly caused
by tombstones generated from RGW bucket resharding cleanup. Also prefer CFIteratorImpl
over ShardMergeIteratorImpl when we can determine that all keys within specified
IteratorBounds must be in a single CF.

Fixes: https://tracker.ceph.com/issues/55324
Signed-off-by: Cory Snyder <csnyder@iland.com>
(cherry picked from commit 850c16c2468c3200a340493c12930543f326b0e1)
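
The effect of an upper bound can be sketched with a toy model. This is not the RocksDB API and not the BlueStore code: it assumes a sorted list of (key, is_tombstone) pairs standing in for the keyspace, uses per-key tombstones instead of range tombstones, and the names `seek_next_live`, `entries` and the `obj1.`/`obj2.` key layout are hypothetical.

```python
import bisect


def seek_next_live(entries, start, upper_bound=None):
    """Return (first live key >= start or None, number of entries examined).

    `entries` is a sorted list of (key, is_tombstone) pairs. Toy model only.
    """
    keys = [key for key, _ in entries]
    i = bisect.bisect_left(keys, start)
    examined = 0
    while i < len(entries):
        key, is_tombstone = entries[i]
        if upper_bound is not None and key >= upper_bound:
            break  # past the object's key range: stop without scanning further
        examined += 1
        if not is_tombstone:
            return key, examined
        i += 1
    return None, examined


# Two live omap keys for one object, then a long run of tombstones that
# belong to another (deleted) object, then an unrelated live key.
entries = [("obj1.a", False), ("obj1.b", False)]
entries += [("obj2.%04d" % n, True) for n in range(1000)]
entries.append(("obj3.a", False))
```

In this model, seeking past the last `obj1.` key without a bound walks every tombstone before reaching `obj3.a`, while the same seek with `upper_bound="obj1/"` examines nothing outside the object's range.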

commit | commitdiff | tree

Yuri Weinstein [Tue, 19 Apr 2022 17:53:16 +0000 (10:53 -0700)]

Merge pull request #45936 from adk3798/pacific-rerevert-pids-limit

pacific: cephadm: revert pids limit

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Tue, 19 Apr 2022 17:52:29 +0000 (10:52 -0700)]

Merge pull request #45895 from idryomov/wip-persistent-cache-status-pacific

pacific: rbd persistent cache UX improvements (status report, metrics, flush command)

Reviewed-by: Deepika Upadhyay <dupadhya@redhat.com>

commit | commitdiff | tree

Adam King [Mon, 18 Apr 2022 19:27:45 +0000 (15:27 -0400)]

Revert "cephadm: remove containers pids-limit"

This reverts commit db74cd951b14213c71b5715d8b123c2d9b27022e.

Signed-off-by: Adam King <adking@redhat.com>

commit | commitdiff | tree

Adam King [Mon, 18 Apr 2022 19:27:31 +0000 (15:27 -0400)]

Revert "qa/suites/orch/cephadm: restrict test_iscsi_pids_limit to CentOS"

This reverts commit 8b780ebf629082aadc68a86bc2ce72adffc8181a.

Signed-off-by: Adam King <adking@redhat.com>

commit | commitdiff | tree

Adam King [Mon, 18 Apr 2022 16:09:59 +0000 (12:09 -0400)]

Merge pull request #45919 from adk3798/pacific-april-batch1

Cephadm Pacific Batch Backport April

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Mon, 18 Apr 2022 15:58:34 +0000 (08:58 -0700)]

Merge pull request #45906 from vshankar/wip-snap-sched-backports-1

pacific: mgr/snap_schedule: backports

Reviewed-by: Venky Shankar <vshankar@redhat.com>

commit | commitdiff | tree

Ilya Dryomov [Mon, 18 Apr 2022 10:23:55 +0000 (12:23 +0200)]

Merge pull request #45184 from ideepika/wip-54378-pacific

pacific: rbd-mirror: synchronize with in-flight stop in ImageReplayer::stop()

Reviewed-by: Ilya Dryomov <idryomov@gmail.com>

commit | commitdiff | tree

Venky Shankar [Sat, 16 Apr 2022 15:24:44 +0000 (20:54 +0530)]

qa: adjust for old snapshot counts during comparison

This is a pacific-only commit since in master, the snap-schedule module
uses vfs-ceph backed libcephsqlite which seems to preserve the
snapshots stats (created_count, etc..) on ceph-mgr restarts. Pacific
uses in-memory db (serialized to a RADOS object) which seems to
reset these stats when ceph-mgr is restarted.

Also, remove `db_count' assert check as it doesn't make sense.

Signed-off-by: Venky Shankar <vshankar@redhat.com>

commit | commitdiff | tree

Adam King [Sun, 17 Apr 2022 16:21:44 +0000 (12:21 -0400)]

qa/suites/orch/cephadm: stop upgrade tests if failures are seen

Otherwise the tests may run forever. This was already done for the
mds upgrade sequence; just adding it in the other two places here.

Related to: https://tracker.ceph.com/issues/53939

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 017aa9cfe8362e8512a581e39850ce70bd1ce82f)

commit | commitdiff | tree

Adam King [Wed, 6 Apr 2022 14:32:22 +0000 (10:32 -0400)]

mgr/cephadm: allow setting insecure_skip_verify for alertmanager

Add a "secure" parameter to alertmanager spec that will cause it
to deploy alertmanagers with insecure_skip_verify as true or false
depending on the value given for "secure".

NOTE: alertmanager must still be reconfigured after applying a yaml
with this option changed.

Fixes: https://tracker.ceph.com/issues/55272
Fixes: https://tracker.ceph.com/issues/55333
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit e583d4ef1ac23a7473d50d253e0edf70580542ae)

commit | commitdiff | tree

Adam King [Mon, 11 Apr 2022 20:57:51 +0000 (16:57 -0400)]

mgr/cephadm: retry mgr fail over in case of transient failure

Fixes: https://tracker.ceph.com/issues/55279
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 3fe2d7d553d475f1fe3840c98ee31d71f6188a1a)

commit | commitdiff | tree

Teoman ONAY [Wed, 6 Apr 2022 09:32:17 +0000 (11:32 +0200)]

ceph cephadm set-user does not reflect the user change in ssh-config

Fixes: https://tracker.ceph.com/issues/54618
Signed-off-by: Teoman ONAY <tonay@redhat.com>
(cherry picked from commit 071f72a734ce207e5cb2ff6d3d996e45396f5c7a)

commit | commitdiff | tree

Redouane Kachach [Fri, 1 Apr 2022 16:03:42 +0000 (18:03 +0200)]

mgr/cephadm: Adding cephadm networking configuration checks+refactoring
Fixes: https://tracker.ceph.com/issues/55174
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit e0bafe6b1da104782b29edf7035d7bc93f89e12f)

Conflicts:
src/cephadm/cephadm
src/cephadm/tests/test_cephadm.py

commit | commitdiff | tree

windgmbh [Fri, 12 Nov 2021 15:51:03 +0000 (16:51 +0100)]

Apply sysctl.d migration from /usr/lib to /etc
A fix regarding the SYSCTL_DIR location (#53130) requires migrating
sysctl.d/*.conf files from /usr/lib to /etc.
Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit a167a27f30536958e0f2c513d351642e81ba06d5)

commit | commitdiff | tree

windgmbh [Wed, 3 Nov 2021 17:16:53 +0000 (18:16 +0100)]

Fix sysctl.d location FHS compliance
This fixes #53130
Containers should not write to '/usr/lib'.
That location could be read-only or overwritten.
Signed-off-by: Lukas Mayer <lmayer@wind.gmbh>
(cherry picked from commit 77afa812ea8b7e1e802246e4aa3a31e7b644a502)

commit | commitdiff | tree

Redouane Kachach [Thu, 17 Feb 2022 12:48:08 +0000 (13:48 +0100)]

mgr/cephadm: Making default cephadm shell cmd easier
Fixes: https://tracker.ceph.com/issues/52042
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit dc201197639dcab471611ac3c4fefda74a74a94f)

commit | commitdiff | tree

Melissa Li [Wed, 23 Mar 2022 15:38:37 +0000 (11:38 -0400)]

cephadm: show error message if private registry credentials not provided

Raise UnauthorizedRegistryError in `_pull_image` if the user tries to pull from a private registry without authentication, and handle the error in `command_bootstrap`, `command_adopt`, and `command_pull`.

Fixes: https://tracker.ceph.com/issues/55015
Signed-off-by: Melissa Li <melissali@redhat.com>
(cherry picked from commit 4de0803ba893abf341ab634d1382208370de7c98)
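
A minimal sketch of the raise-then-handle pattern described above. It is not the cephadm code: `pull_image`, `run_pull_command`, the injected `run_pull` callable, the "unauthorized" substring check and the message text are all assumptions made for illustration.

```python
class UnauthorizedRegistryError(Exception):
    """Raised when a registry rejects an unauthenticated pull."""


def pull_image(image, run_pull):
    """Pull `image` using `run_pull`, a hypothetical callable that returns
    (exit_code, stderr_text)."""
    code, err = run_pull(image)
    if code == 0:
        return
    # Assumption: the container engine reports the failure with this word.
    if "unauthorized" in err.lower():
        raise UnauthorizedRegistryError(err)
    raise RuntimeError("Failed to pull %s: %s" % (image, err))


def run_pull_command(image, run_pull):
    """Command-level handler: turn the specific error into a readable
    message instead of a traceback. Returns (exit_code, message)."""
    try:
        pull_image(image, run_pull)
    except UnauthorizedRegistryError:
        return 1, "%s requires registry credentials that were not provided" % image
    return 0, "pulled %s" % image
```

Other failures still propagate as `RuntimeError`, so only the missing-credentials case gets the dedicated message.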

commit | commitdiff | tree

Adam King [Thu, 24 Mar 2022 13:59:10 +0000 (09:59 -0400)]

cephadm: pass "--security-opt label=disable" to node-exporter container

in order to support setting '--path.procfs=/host/proc','--path.sysfs=/host/sys',
'--path.rootfs=/rootfs' for node-exporter we need to disable selinux separation
between the node-exporter container and the host to avoid selinux denials

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 6d4591723ba89dada9814118e2c14e08d4e4179a)

commit | commitdiff | tree

Adam King [Wed, 23 Mar 2022 17:22:51 +0000 (13:22 -0400)]

cephadm: Specify proc/sys path for node-exporter to use

Fixes: https://tracker.ceph.com/issues/55023
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 97373de71e080423a2321e2c889e6681b47bfc74)

Conflicts:
src/cephadm/cephadm

commit | commitdiff | tree

Redouane Kachach [Wed, 30 Mar 2022 13:48:40 +0000 (15:48 +0200)]

mgr/cephadm: fixing public network conf parsing
Fixes: https://tracker.ceph.com/issues/55132
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3ef6341e8ef5fe6a01f15c847f6bc9e2205d4d97)

commit | commitdiff | tree

Redouane Kachach [Fri, 4 Feb 2022 12:28:51 +0000 (13:28 +0100)]

mgr/cephadm: Adding AGE field to device ls cmd
Fixes: https://tracker.ceph.com/issues/53540
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1c5b3e86f9b8ae0ca3ae41798dfa18e9ffe9fcb7)

commit | commitdiff | tree

Milind Changire [Thu, 24 Feb 2022 06:20:18 +0000 (11:50 +0530)]

qa: test snap_schedule with mgr restart

Scheduled snaps should follow the created schedule even across mgr
restart.

Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit ac6c7240d3b69de128ae2c5f4c172f12e313fd27)

commit | commitdiff | tree

Milind Changire [Mon, 28 Feb 2022 06:26:09 +0000 (11:56 +0530)]

mgr/snap_schedule: restart old schedules

Old schedules were not picked up from database when mgr was restarted.
Restart old schedules on mgr restart.

Fixes: https://tracker.ceph.com/issues/54052
Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit dca7fdb600932d712280dd91a4eb63a17a8800e3)

commit | commitdiff | tree

Milind Changire [Mon, 28 Feb 2022 06:22:26 +0000 (11:52 +0530)]

mgr/util: add function to list all fs names

Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit 24915c8ee926c27e335f6e94341770ee8088e721)

commit | commitdiff | tree

Milind Changire [Wed, 24 Nov 2021 08:06:30 +0000 (13:36 +0530)]

qa: add test for concurrent snap creates

Test if the number of snaps on the file-system and the stats on created
snaps in the DB match.

NOTE:
Since it is difficult to get the snapshot created on the exact second,
the timestamp comparison has been limited up to the last 'minute' as the
comparison granularity.

Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit e2e4635c188f05e37b710b38d4173dbd4ebf0257)
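
The minute-granularity comparison mentioned in the NOTE can be sketched as follows. This is only an illustration, not the qa test itself; the helper name `same_minute` is hypothetical.

```python
from datetime import datetime


def same_minute(a, b):
    """Compare two datetimes at minute granularity, ignoring seconds
    and microseconds."""
    return (a.replace(second=0, microsecond=0)
            == b.replace(second=0, microsecond=0))
```

Two timestamps taken at 08:06:01 and 08:06:59 compare equal, while 08:06:59 and 08:07:00 do not, so a snapshot created right at a minute boundary can still mismatch.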

commit | commitdiff | tree

Milind Changire [Wed, 24 Nov 2021 05:13:11 +0000 (10:43 +0530)]

mgr/snap_schedule: fix db connection concurrent usage

Serialize access to DB connection to avoid transaction aborts due to
concurrent use.

Some flake8-3.9 and mypy parsing error cleanups to keep 'make check' happy.

Fixes: https://tracker.ceph.com/issues/52642
Signed-off-by: Milind Changire <mchangir@redhat.com>
(cherry picked from commit 707543779e24c6bc1489c07f5fa1a239d110d9fb)

Conflicts:
src/pybind/mgr/snap_schedule/fs/schedule.py
src/pybind/mgr/snap_schedule/fs/schedule_client.py
- changes related to DBConnectionManager to serialize
db interactions
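
Serializing access to a shared DB connection can be sketched with a lock around each transaction. This is a hedged illustration using the standard `sqlite3` and `threading` modules, not the snap_schedule implementation; the class name `SerializedDB`, the `snaps` table and the `demo` helper are hypothetical.

```python
import sqlite3
import threading
from contextlib import contextmanager


class SerializedDB:
    def __init__(self, path=":memory:"):
        self._lock = threading.Lock()
        # check_same_thread=False lets several threads share the connection;
        # the lock is what makes that sharing safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)

    @contextmanager
    def transaction(self):
        """Hold the lock for the whole transaction, then commit or roll back."""
        with self._lock:
            try:
                yield self._conn
                self._conn.commit()
            except Exception:
                self._conn.rollback()
                raise


def demo(workers=8, per_worker=50):
    """Insert rows from several threads and return the final row count."""
    db = SerializedDB()
    with db.transaction() as conn:
        conn.execute("CREATE TABLE snaps (n INTEGER)")

    def work():
        for i in range(per_worker):
            with db.transaction() as conn:
                conn.execute("INSERT INTO snaps VALUES (?)", (i,))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with db.transaction() as conn:
        return conn.execute("SELECT COUNT(*) FROM snaps").fetchone()[0]
```

Because every transaction runs entirely under the lock, no thread can interleave statements into another thread's open transaction.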

commit | commitdiff | tree

Ilya Dryomov [Wed, 13 Apr 2022 13:24:04 +0000 (15:24 +0200)]

test/rbd_mirror: grab timer lock before calling add_event_after()

add_event_after() expects an externally provided mutex to be held
for the call. This was missed in commit 8965a0f2a6f7 ("rbd-mirror:
synchronize with in-flight stop in ImageReplayer::stop()").

Fixes: https://tracker.ceph.com/issues/55317
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 60e16106837e0d23366709f70f39c4f1ae7a2a45)

commit | commitdiff | tree

Ilya Dryomov [Sun, 10 Apr 2022 16:13:48 +0000 (18:13 +0200)]

librbd/cache/pwl: remove RBD_FEATURE_DIRTY_CACHE check in DiscardRequest

"m_image_ctx.features && RBD_FEATURE_DIRTY_CACHE" is obviously wrong
because it would pretty much always be true.  However, even if bitwise
AND was used, this check would still be dead because DiscardRequest is
only invoked if RBD_FEATURE_DIRTY_CACHE is enabled:

  int invalidate_cache(ImageCtx *ictx)
  {
    ...
    // Delete writeback cache if it is not initialized
    if ((!ictx->exclusive_lock ||
         !ictx->exclusive_lock->is_lock_owner()) &&
        ictx->test_features(RBD_FEATURE_DIRTY_CACHE)) {
      C_SaferCond ctx3;
      ictx->plugin_registry->discard(&ctx3);
      r = ctx3.wait();
    }

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit aee78bbb9d7edd606a8a235c57b2b704d7b94e4c)
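
The difference between the logical and the bitwise test can be sketched in Python, where `and` plays the role of C++ `&&`. The flag names and bit positions below are illustrative assumptions, not librbd's actual values.

```python
# Illustrative bit positions only; not librbd's real feature values.
FEATURE_LAYERING = 1 << 0
FEATURE_DIRTY_CACHE = 1 << 8


def has_dirty_cache_logical(features):
    # Mirrors the buggy logical test: true whenever both operands are
    # non-zero, regardless of which feature bits are actually set.
    return bool(features and FEATURE_DIRTY_CACHE)


def has_dirty_cache_bitwise(features):
    # Bitwise AND tests the specific bit.
    return bool(features & FEATURE_DIRTY_CACHE)
```

With only `FEATURE_LAYERING` set, the logical version reports the dirty-cache feature as enabled while the bitwise version does not, which is why the logical form is "pretty much always true".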

commit | commitdiff | tree

Ilya Dryomov [Sun, 10 Apr 2022 14:57:24 +0000 (16:57 +0200)]

librbd/cache/pwl: don't crash if cache file removal fails

The non-ec overload will throw fs::filesystem_error on any error
(e.g. EPERM due to unprivileged "rbd persistent-cache invalidate"
being brought up against a privileged workload).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 63197ff7003fa9e595527a7431f9f3f6790f7d57)

commit | commitdiff | tree

Yin Congmin [Mon, 27 Dec 2021 07:06:49 +0000 (15:06 +0800)]

rbd: add persistent-cache flush command

Add a flush command so that users can manually flush cache.

[ idryomov: error messages, incorporate doc and help.t hunks, drop
do_persistent_cache_flush() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 644fbc9fcc8f12eb93d5cc20054cd8598ab001b7)

commit | commitdiff | tree

Yin Congmin [Mon, 27 Dec 2021 03:50:18 +0000 (11:50 +0800)]

rbd: rename image-cache invalidate command

Rename command image-cache to persistent-cache. Refactoring the code
of invalidate command.

[ idryomov: error message, incorporate doc and help.t hunks, drop
do_persistent_cache_invalidate() ]

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 05bfe10ad9fde533aa728f9aa0cc8a8f155c03c5)

commit | commitdiff | tree

Yin Congmin [Wed, 22 Dec 2021 07:07:11 +0000 (15:07 +0800)]

librbd/cache/pwl: rename persistent cache key

librbd "internal" metadata keys were changed to use the ".rbd" prefix.
Change the persistent cache key to use the ".rbd" prefix too.
The persistent cache key is currently named IMAGE_CACHE_STATE. Since
this key is planned to be used outside the pwl directory, it seems
more appropriate to give it a clearer name, PERSISTENT_CACHE_STATE.

Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit bd66fdda910f02ffe91bb026f82a85f28a6ff225)

commit | commitdiff | tree

Ilya Dryomov [Sat, 9 Apr 2022 15:48:17 +0000 (17:48 +0200)]

rbd: include persistent cache metrics in "rbd status" report

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e996fd80601ec8c309c1517f33171e88a2f31cad)

commit | commitdiff | tree

Ilya Dryomov [Sat, 9 Apr 2022 09:06:32 +0000 (11:06 +0200)]

rbd: factor out get_percentage() helper

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 9324ab94711dbe9a1265643adcc79ae0a3cba812)

commit | commitdiff | tree

Ilya Dryomov [Fri, 8 Apr 2022 13:53:38 +0000 (15:53 +0200)]

librbd/cache/pwl: no need to set clean and empty in remove_pool_file()

It is redundant -- the only caller sets both since commit 6593e31fff18
("librbd/cache/pwl: correct cache state").

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit d64a3ae265897806809d9fa08ac72c549b4bca4f)

commit | commitdiff | tree

Ilya Dryomov [Thu, 7 Apr 2022 16:49:46 +0000 (18:49 +0200)]

librbd/cache/pwl: avoid inconsistencies in ImageCacheState

When empty and/or clean bools are updated in I/O handling code paths,
ImageCacheState becomes inconsistent for a short while: e.g. with clean
transitioned to true, dirty_bytes counter could still be positive
because the counters are updated only in periodic_stats(). Move to
updating the counters in update_image_cache_state(Context*) to avoid
this.

update_image_cache_state(Context*) now requires m_lock -- most call
sites already hold it anyway. The only problematic call site was
AbstractWriteLog::shut_down() callback chain: perf_stop() needed to
be moved to the very end since perf counters must be alive now for
update_image_cache_state() to work.

Don't override expect_op_work_queue() in unit tests: completing
context in the same thread now results in a deadlock on m_lock in
all test cases that call AbstractWriteLog::init().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 016882925a63f4f03a9c445d008b2325d479bc30)

commit | commitdiff | tree

Ilya Dryomov [Thu, 7 Apr 2022 14:02:46 +0000 (16:02 +0200)]

librbd/cache/pwl: handle invalid ImageCacheState json

get_json_format() and create_image_cache_state() attempt to get
particular keys which could result in an unhandled std::runtime_error
exception.  Conversely, ImageCacheState constructor just swallows that
exception which could leave the newly constructed object incorrectly
initialized.  Avoid doing parsing in the constructor and introduce
init_from_config() and init_from_metadata() methods instead.

While at it, move everything out from under "persistent_cache" key.
Also fix init_state_json_write test case which stopped working now
that types are enforced by json_spirit.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 7678ee2490965a8a73c02a47283adaa5036dbcab)

Conflicts:
src/librbd/cache/pwl/ImageCacheState.cc [ commit
  6eb14774fec0 ("librbd: build without "using namespace std"")
  not in pacific ]
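
The design choice of moving parsing out of the constructor can be sketched as below. This is a Python analogue under stated assumptions, not the librbd C++ code: the class name `CacheState`, the fields `present`, `clean` and `size`, and the boolean return convention are hypothetical.

```python
import json


class CacheState:
    def __init__(self):
        # The constructor only sets defaults; it never parses input, so it
        # cannot leave a half-initialized object behind.
        self.present = False
        self.clean = True
        self.size = 0

    def init_from_metadata(self, text):
        """Parse `text`; return True on success, False on malformed input.

        On failure the object keeps its previous values.
        """
        try:
            doc = json.loads(text)
            present, clean, size = doc["present"], doc["clean"], doc["size"]
        except (ValueError, KeyError, TypeError):
            return False
        # Enforce types explicitly: the string "true" is not a boolean.
        if not isinstance(present, bool) or not isinstance(clean, bool):
            return False
        if isinstance(size, bool) or not isinstance(size, int):
            return False
        self.present, self.clean, self.size = present, clean, size
        return True
```

The caller sees every parse failure as a return value and decides what to do, rather than receiving an object whose constructor silently gave up.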

commit | commitdiff | tree

Yin Congmin [Tue, 29 Mar 2022 08:59:05 +0000 (16:59 +0800)]

librbd/cache/pwl: add basic metrics to ImageCacheState

Add basic metrics to ImageCacheState and persist them, including
allocated_bytes, cached_bytes, dirty_bytes, free_bytes and hit/miss
info.

Leverage periodic_stats() timer to call update_image_cache_state.
In order to avoid outputting too much debug information, the original
statistics output log level is changed to 5.

Switch to json_spirit for encoding because encode_json encodes bool as
"true"/"false" string.

Remove rbd_persistent_cache_log_periodic_stats option because we need
to always update cache state.

[ idryomov: add cached_bytes and hits_partial; report misses and
miss_bytes instead of respective totals; naming ]

Fixes: https://tracker.ceph.com/issues/50614
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 769f3a06ecf85249c1473cbb6bab7503beb1ba78)

Conflicts:
src/common/options/rbd.yaml.in [ options are defined in
src/common/options.cc in pacific ]
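
Why a boolean encoded as a "true"/"false" string is a problem can be sketched with Python's `json` module standing in for the C++ encoders. The keys `present` and `clean` are hypothetical.

```python
import json

state_typed = {"present": True, "clean": False}
state_stringly = {"present": "true", "clean": "false"}

encoded_typed = json.dumps(state_typed)        # booleans stay JSON booleans
encoded_stringly = json.dumps(state_stringly)  # booleans become JSON strings

decoded_typed = json.loads(encoded_typed)
decoded_stringly = json.loads(encoded_stringly)

# A reader that treats the decoded value as a truth value gets the wrong
# answer for the string form, because any non-empty string is truthy.
clean_typed = bool(decoded_typed["clean"])
clean_stringly = bool(decoded_stringly["clean"])
```

Here `clean_typed` is `False` as intended, while `clean_stringly` is `True` even though the stored text says "false".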

commit | commitdiff | tree

Yin Congmin [Tue, 28 Dec 2021 06:10:35 +0000 (14:10 +0800)]

librbd/cache/pwl: correct cache state

update cache state after the dirty_entries or log_entries list is updated.

Fixes: https://tracker.ceph.com/issues/50614
Signed-off-by: Yin Congmin <congmin.yin@intel.com>
(cherry picked from commit 6593e31fff180ec4123e37107c88eb39f7d10fdf)

commit | commitdiff | tree

Ernesto Puerta [Wed, 13 Apr 2022 08:34:38 +0000 (10:34 +0200)]

Merge pull request #45849 from rhcs-dashboard/fix-install_deps-pacific

pacific: build: install-deps failing in docker build

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: David Galloway <dgallowa@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

David Galloway [Tue, 12 Apr 2022 20:02:25 +0000 (16:02 -0400)]

Merge pull request #45876 from ceph/pacific-sphinx

pacific: admin/doc-requirements: bump sphinx to 4.4.0

commit | commitdiff | tree

Ernesto Puerta [Tue, 12 Apr 2022 17:50:58 +0000 (19:50 +0200)]

Merge pull request #45880 from rhcs-dashboard/wip-55119-pacific

pacific: mgr/dashboard: fix api test issue with pip

Reviewed-by: David Galloway <dgallowa@redhat.com>

commit | commitdiff | tree

Ernesto Puerta [Fri, 25 Mar 2022 15:26:48 +0000 (16:26 +0100)]

mgr/dashboard: fix api test issue with pip

Fix
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
apache-libcloud 3.5.0 requires requests>=2.26.0, but you have requests 2.25.1 which is incompatible.
Successfully installed CherryPy-13.1.0 PyJWT-2.0.1 Routes-2.4.1 bcrypt-3.1.4 ceph-1.0.0 chardet-4.0.0 cheroot-8.6.0 idna-2.10 jaraco.functools-3.5.0 more-itertools-4.1.0 natsort-8.1.0 portend-3.1.0 pyopenssl-22.0.0 pytz-2022.1 repoze.lru-0.7 requests-2.25.1 tempora-5.0.1
```

Fixes: https://tracker.ceph.com/issues/55060
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
(cherry picked from commit 2289ad2bc327b0d86916a1c96f4af2967a80c1b9)

Conflicts:
src/pybind/mgr/dashboard/constraints.txt
- keep requests 2.26

commit | commitdiff | tree

Kefu Chai [Sat, 5 Mar 2022 17:44:30 +0000 (01:44 +0800)]

admin/doc-requirements: bump sphinx to 4.4.0

bump sphinx to latest stable, to address the following build failures:

ERROR: sphinx-autodoc-typehints 1.17.0 has requirement Sphinx>=4, but you'll have sphinx 3.5.4 which is incompatible.
ERROR: sphinx-substitution-extensions 2022.2.16 has requirement sphinx>=4.0.0, but you'll have sphinx 3.5.4 which is incompatible.

also bump sphinx-rtd-theme, otherwise we'd have the following
build failure:

ERROR: sphinx-rtd-theme 0.5.2 has requirement docutils<0.17, but you'll have docutils 0.17.1 which is incompatible.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 0a5fab53b3804be5ef1377a2f35006b8df857d39)

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 06:05:07 +0000 (14:05 +0800)]

mgr/cephadm: set docstring for shim() methods

this allows the "rpc"ized methods of OrchestratorClientMixin to
have the docstring defined by the original methods.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d0db2ae4f946e1a985402640ef8f1733b40e91ef)
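
Copying a docstring onto a generated forwarding method can be sketched as follows. This is not the orchestrator code: `Orchestrator`, `Client`, `make_shim`, `remove_host` and the `_backend` attribute are hypothetical stand-ins.

```python
class Orchestrator:
    def remove_host(self, host):
        """Remove a host from the cluster."""
        return "removed %s" % host


def make_shim(method_name):
    """Build a forwarding method that carries the original's docstring."""
    original = getattr(Orchestrator, method_name)

    def shim(self, *args, **kwargs):
        # Forward the call to the backend object held by the client.
        return getattr(self._backend, method_name)(*args, **kwargs)

    shim.__doc__ = original.__doc__
    shim.__name__ = method_name
    return shim


class Client:
    def __init__(self, backend):
        self._backend = backend

    remove_host = make_shim("remove_host")
```

Documentation tools that read `__doc__` then show the original text for `Client.remove_host` instead of an empty entry.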

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 06:23:42 +0000 (14:23 +0800)]

mgr/cephadm: add empty line after param list in docstring

this helps to silence the warning from sphinx, like

src/pybind/mgr/orchestrator/_interface.py:docstring of orchestrator._interface.Orchestrator.remove_osds:9: WARNING: Field list ends without a blank line; unexpected unindent.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit d9b8e38e3dfe8e6eec6d56ee934c4632de46fc68)
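
A docstring laid out the way this change describes might look like the sketch below. The function `remove_items` and its parameters are hypothetical; only the blank line after the `:param` field list is the point.

```python
def remove_items(item_ids, force=False):
    """Remove the given items.

    :param item_ids: list of ids to remove
    :param force: skip safety checks

    The blank line above ends the field list, so this paragraph is parsed
    as ordinary text rather than as an unexpected unindent.
    """
    return list(item_ids)
```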

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 06:27:50 +0000 (14:27 +0800)]

doc/conf.py: silence warnings from breathe

breathe calls doxygen to extract/generate docs from code, and
doxygen complains when it sees undocumented fields/functions. these
warnings could fail the sphinx-build command if it treats warnings
as errors.

in this change, these warnings are silenced.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 8891d653198c30f9578499126e1ee9ee67eca04a)

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 07:04:21 +0000 (15:04 +0800)]

mgr/cephadm: document notes using "note::" directive

so it can be rendered by sphinx in a better way.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit ba3ccee01b31ef9e39a5016a0ffda18628ec3bc2)

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 07:20:14 +0000 (15:20 +0800)]

mgr/cephadm: improve the formatting of docstring

adding an empty line before a doctest block helps
sphinx tell where the session starts.

see also https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#doctest-blocks

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 8685fffdf20eeb4e2068c421e351aa02c48ff860)
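
A docstring with the blank line in place can be sketched as below. The function `add` and the helper `run_examples` are hypothetical and only illustrate the layout.

```python
import doctest


def add(a, b):
    """Return the sum of two numbers.

    The blank line below separates this paragraph from the session:

    >>> add(2, 3)
    5
    """
    return a + b


def run_examples(func):
    """Run the doctest examples in func's docstring.

    Returns (failed, attempted).
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted
```

Note that Python's `doctest` module finds the example with or without the blank line; the blank line matters to the reStructuredText parser used by sphinx, which otherwise folds the `>>>` line into the preceding paragraph.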

commit | commitdiff | tree

Kefu Chai [Sun, 6 Mar 2022 07:28:16 +0000 (15:28 +0800)]

mgr/cephadm: use block quote for "typical use"

otherwise sphinx takes "Typical use" and the following line as a
field. see also

https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#field-lists

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05798f0cae9afda598f5a154c62fdd24bab9ca30)

commit | commitdiff | tree

Ernesto Puerta [Mon, 11 Apr 2022 19:20:16 +0000 (21:20 +0200)]

Merge pull request #45678 from rhcs-dashboard/wip-54586-pacific

pacific: mgr/dashboard: highlight the search text in cluster logs

Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

commit | commitdiff | tree

Nizamudeen A [Wed, 6 Apr 2022 07:39:26 +0000 (13:09 +0530)]

build: install-deps failing in docker build

install-deps.sh was failing in our docker build due to the recent change in
the script. Failure can be seen here: https://github.com/rhcs-dashboard/ceph-dev/runs/5844502455?check_suite_focus=true#step:3:2586

This seems to fix the issue.

Signed-off-by: Nizamudeen A <nia@redhat.com>
(cherry picked from commit 72841fdcbe5445b5f5ada5d244d497f0b3f04e4f)
Signed-off-by: Ernesto Puerta <epuertat@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 8 Apr 2022 14:33:42 +0000 (07:33 -0700)]

Merge pull request #45716 from adk3798/pacific-backport-march

Cephadm Pacific Batch Backport March

Reviewed-by: Michael Fritch <mfritch@suse.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 8 Apr 2022 14:30:34 +0000 (07:30 -0700)]

Merge pull request #45632 from adk3798/pacific-ssh-offline

pacific: mgr/cephadm: add keep-alive requests to ssh connections

Reviewed-by: Michael Fritch <mfritch@suse.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 7 Apr 2022 21:26:58 +0000 (14:26 -0700)]

Merge pull request #45785 from ronen-fr/wip-rf-45640-pacific

pacific: osd/scrub: restart snap trimming only after scrubbing is done

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 7 Apr 2022 20:55:11 +0000 (13:55 -0700)]

Merge pull request #45773 from ljflores/wip-53605-pacific

pacific: mgr/telemetry: fix waiting for mgr to warm up

Reviewed-by: Yaarit Hatuka <yaarithatuka@gmail.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 7 Apr 2022 20:54:25 +0000 (13:54 -0700)]

Merge pull request #45731 from ronen-fr/wip-rf-42951-pacific

pacific: osd/scrub: destruct the scrubber shortly before the PG is destructed

Reviewed-by: Aishwarya Mathuria <amathuri@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 7 Apr 2022 20:50:33 +0000 (13:50 -0700)]

Merge pull request #45729 from ronen-fr/wip-rf-42479-pacific

pacific: osd/scrub: remove reliance of Scrubber objects' logging on the PG

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Adam King [Thu, 7 Apr 2022 20:28:52 +0000 (16:28 -0400)]

Merge pull request #45803 from ljflores/wip-telemetry-cephadm-link

pacific: cephadm: fix broken telemetry documentation link

Reviewed-by: Adam King <adking@redhat.com>

commit | commitdiff | tree

Laura Flores [Wed, 6 Apr 2022 18:03:04 +0000 (13:03 -0500)]

cephadm: fix broken telemetry documentation link

Signed-off-by: Laura Flores <lflores@redhat.com>

commit | commitdiff | tree

Josh Durgin [Wed, 6 Apr 2022 04:46:07 +0000 (21:46 -0700)]

Merge pull request #45789 from zdover23/wip-doc-2022-04-06-backport-to-pacific-basic-workflow

doc/dev: s/repostory/repository/ (really)

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

commit | commitdiff | tree

Zac Dover [Tue, 8 Jun 2021 15:57:13 +0000 (01:57 +1000)]

doc/dev: s/reposotory/repository/ (really)

This corrects the heinous misspelling described in the
substitution expression in the title. This misspelling is
all the more egregious because it appears in a title, and
therefore would be used to create links if it had not been
caught.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 298b446c35d19ce43ede513a802d0655bcbdf82f)

commit | commitdiff | tree

Adam King [Fri, 11 Mar 2022 20:25:36 +0000 (15:25 -0500)]

qa/suites/fs: stop looping in mds upgrade test if upgrade failed

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 37019aad80aec15f9a34301c6051f065eb913e29)

commit | commitdiff | tree

Adam King [Wed, 2 Mar 2022 05:23:52 +0000 (00:23 -0500)]

mgr/cephadm: fixing prometheus port handling
Fixes: https://tracker.ceph.com/issues/51072
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 8eb1397d77dace25f387e88137a1807993a0796d)

Conflicts:
src/pybind/mgr/prometheus/module.py

commit | commitdiff | tree

Adam King [Tue, 15 Mar 2022 18:33:52 +0000 (14:33 -0400)]

cephadm: respect --skip-firewalld flag

Fixes: https://tracker.ceph.com/issues/54137
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit d97057f8d7263cce8efc0857e3fe4a10faee30c8)

commit | commitdiff | tree

Matan Breizman [Tue, 15 Feb 2022 08:55:14 +0000 (08:55 +0000)]

qa/tasks/cephfs: increase timeout in test_nfs.py

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
(cherry picked from commit 44ad552093b4f0dc21563dd9f804974ade239440)

commit | commitdiff | tree

Adam King [Mon, 21 Mar 2022 01:44:28 +0000 (21:44 -0400)]

python-common/drive_group: add extra_container_args to supported features

Should have been added when extending extra container args
to all the services but was missed

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit f036bdaf5a1e5f6b18a9591949be878fea8bb70d)

Conflicts:
src/python-common/ceph/deployment/drive_group.py

commit | commitdiff | tree

wangyunqing [Wed, 30 Mar 2022 03:53:57 +0000 (11:53 +0800)]

doc/cephadm/operations.rst: fix typos

Signed-off-by: wangyunqing <wangyunqing@inspur.com>
(cherry picked from commit 92eb799a952db4f2fe2290aef56d2f66b8f64802)

commit | commitdiff | tree

Redouane Kachach [Wed, 2 Mar 2022 11:38:42 +0000 (12:38 +0100)]

mgr/cephadm: check spec host when adding osd
Fixes: https://tracker.ceph.com/issues/47872
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit b87c966697d36ef51f1e62425d77200667e651ae)

Conflicts:
src/pybind/mgr/orchestrator/module.py

commit | commitdiff | tree

Adam King [Fri, 4 Mar 2022 02:47:47 +0000 (21:47 -0500)]

mgr/cephadm: offline host watcher

Allows detecting more quickly that certain hosts have gone
offline. Could be useful for the NFS HA feature, as this
requires moving nfs daemons from offline hosts within
90 seconds.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit bd9eb596570cfcc7fea793c2b380bc66dd719439)

Conflicts:
src/pybind/mgr/cephadm/module.py
src/pybind/mgr/cephadm/ssh.py
src/pybind/mgr/cephadm/tests/fixtures.py
src/pybind/mgr/cephadm/utils.py

commit | commitdiff | tree

Adam King [Tue, 22 Mar 2022 22:57:21 +0000 (18:57 -0400)]

mgr/cephadm: Reschedule nfs daemons from offline hosts

In order to improve nfs availability, if there are other
hosts we can place an nfs daemon on or if there is a host
with a lower rank nfs daemon when a higher rank one is on
an offline host, we should reschedule the nfs daemons

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 9febc21c14b7ad26e6d811444e7daf0b0a292afb)

Conflicts:
src/pybind/mgr/cephadm/utils.py

commit | commitdiff | tree

Redouane Kachach [Wed, 9 Mar 2022 13:19:02 +0000 (14:19 +0100)]

mgr/cephadm: checking service name before removal
Fixes: https://tracker.ceph.com/issues/54503
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit b26c114c8456941d6cccf7d4355445f21cb373a7)

commit | commitdiff | tree

Adam King [Tue, 15 Mar 2022 20:41:15 +0000 (16:41 -0400)]

cephadm: verify config file exists when inferring it

Fixes: https://tracker.ceph.com/issues/54571
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 1568875a281d56b413e75b244c9c75311cf353a0)

commit | commitdiff | tree

Redouane Kachach [Mon, 7 Mar 2022 13:03:07 +0000 (14:03 +0100)]

mgr/cephadm: adding HostSpec validation
Fixes: https://tracker.ceph.com/issues/54342
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 15ba147a2a4cae8ca69437382136d328a1f416f2)

commit | commitdiff | tree

wangyunqing [Wed, 9 Mar 2022 08:55:13 +0000 (16:55 +0800)]

doc/cephadm/adoption.rst: fix typos

Signed-off-by: wangyunqing <wangyunqing@inspur.com>
(cherry picked from commit e4db28f6b294909e0f177e82dbda8cfcc8129846)

commit | commitdiff | tree

Adam King [Mon, 21 Feb 2022 21:34:47 +0000 (16:34 -0500)]

cephadm: still set container_image when --no-assimilate-config is provided

Fixes: https://tracker.ceph.com/issues/54141
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 59d004cb901eb6d84fb6907cb88314fd31b87904)

Adam King [Thu, 10 Feb 2022 01:42:42 +0000 (20:42 -0500)]

qa/tasks/cephadm_cases: increase timeouts in test_cli.py

These tests fail intermittently. In my testing, the expected
events sometimes happen a few seconds after we hit the timeout.
Increase the timeouts to see whether this makes the tests
more consistent. There is no need to mark a test as failed
if we report something up in 34 seconds vs 25, especially
since cephadm works on a cyclic daemon refresh.

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 46f939f057bd05a885eaf750663310375f9dd929)

Conflicts:
qa/tasks/cephadm_cases/test_cli.py

Ronen Friedman [Fri, 25 Mar 2022 10:45:47 +0000 (10:45 +0000)]

pacific: osd/scrub: restart snap trimming only after scrubbing is done

Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR #38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d3266c67bf896d1c20472977b849178d233d3)

Conflicts:
src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.

Yaarit Hatuka [Tue, 9 Nov 2021 18:31:11 +0000 (18:31 +0000)]

mgr/telemetry: fix waiting for mgr to warm up

1. The implementation of config_notify() in telemetry module sets the
flag for event, which is supposed to wake up the 'serve' thread whenever
a config option is changed. The problem is that we call config_notify()
at the beginning of serve(), before we enter its 'run' loop. This call
sets the event, which cancels the 10-second wait for the mgr to warm up.
To fix this, we extract the logic of updating the config options to a
separate function (config_update_module_option()), and call it on
__init__, instead of calling config_notify() in serve().

2. We should always wait for the mgr to warm up here (10 seconds). In
case of a sporadic event (e.g. a config option change via CLI) the event
will be set, and wait will return immediately. We enforce this wait by
using time.sleep(10) instead of event.wait(10).

Fixes: https://tracker.ceph.com/issues/53204
Signed-off-by: Yaarit Hatuka <yaarit@redhat.com>
(cherry picked from commit fa5cc0ca081ca3cce552e0cb21a1e17273cf3482)

Conflicts:
src/pybind/mgr/telemetry/module.py

- Several options under __init__ had to be removed that were not present
in Pacific
- No type checking in Pacific
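
The difference between event.wait(10) and time.sleep(10) described in the
telemetry fix above can be illustrated with a minimal sketch. This is not the
telemetry module's code: the helper names and the shortened delay are
assumptions made for illustration, and only threading.Event and time.sleep
from the Python standard library are used.

```python
import threading
import time

# Hypothetical short delay so the sketch runs quickly; the commit message
# describes a 10 second warm-up.
WARMUP_SECONDS = 0.2


def warm_up_with_event(event: threading.Event, delay: float) -> float:
    """Wait on the event for up to `delay` seconds and return the elapsed time.

    Event.wait() returns as soon as the event is set, so an event that was
    set earlier cuts the warm-up short.
    """
    start = time.monotonic()
    event.wait(delay)
    return time.monotonic() - start


def warm_up_with_sleep(delay: float) -> float:
    """Sleep unconditionally for `delay` seconds and return the elapsed time."""
    start = time.monotonic()
    time.sleep(delay)
    return time.monotonic() - start


event = threading.Event()
# Stands in for a notification (e.g. a config change) setting the event
# before the warm-up wait begins.
event.set()

elapsed_event = warm_up_with_event(event, WARMUP_SECONDS)
elapsed_sleep = warm_up_with_sleep(WARMUP_SECONDS)

print(f"event.wait elapsed: {elapsed_event:.3f}s")
print(f"time.sleep elapsed: {elapsed_sleep:.3f}s")
```

With the event already set, the event-based wait should return almost
immediately, while the sleep-based wait takes the full delay regardless of
the event's state.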

Yuri Weinstein [Mon, 4 Apr 2022 21:50:43 +0000 (14:50 -0700)]

Merge pull request #45654 from ljflores/wip-pacific-fast-shutdown-backports

Pacific fast shutdown backports

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Sridhar Seshasayee <sseshasa@redhat.com>
Reviewed-by: Nitzan Mordechai <nmordech@redhat.com>

Yuri Weinstein [Mon, 4 Apr 2022 21:47:26 +0000 (14:47 -0700)]

Merge pull request #45586 from idryomov/wip-pool-reverse-lookup-osdmap-pacific

pacific: librados: check latest osdmap on ENOENT in pool_reverse_lookup()

Reviewed-by: Neha Ojha <nojha@redhat.com>

David Galloway [Mon, 4 Apr 2022 21:33:06 +0000 (17:33 -0400)]

Merge pull request #45753 from ceph/wip-pacific-debug

build: Add some debugging messages

Yuri Weinstein [Mon, 4 Apr 2022 15:59:41 +0000 (08:59 -0700)]

Merge pull request #45638 from idryomov/wip-diff-iterate-striping-fix-pacific

pacific: librbd: make diff-iterate in fast-diff mode sort and merge reported extents

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

David Galloway [Fri, 25 Mar 2022 21:29:44 +0000 (17:29 -0400)]

build: Add some debugging messages

Having a unique string like "CI_DEBUG" will help me know where we are in the build process in Jenkins logs.

Signed-off-by: David Galloway <dgallowa@redhat.com>
(cherry picked from commit 57edb76ea46893294a70aa080916bc723fb35f9e)

Ronen Friedman [Thu, 26 Aug 2021 12:30:38 +0000 (12:30 +0000)]

osd/scrub: destruct the scrubber shortly before the PG is destructed

By destructing the scrubber when the PG is still intact, we guarantee that
Scrubber's code can refer to the PG object - especially in dout()s.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit bcd13e134c1f335506e425800170d55cd8a2af1b)

Ronen Friedman [Sun, 25 Jul 2021 11:58:51 +0000 (14:58 +0300)]

osd/scrub: remove reliance of Scrubber objects' logging on the PG

Modify the Scrubber's sub-objects to use their own gen_prefix()
functions, instead of using PG::gen_prefix().

Fixes: https://tracker.ceph.com/issues/51843
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 2aeb9263d643b19d59219e8e187e1a0fa0292693)

Conflicts:
        src/osd/PG.h
        src/osd/PrimaryLogScrub.cc
        src/osd/pg_scrubber.cc
        src/osd/pg_scrubber.h

Conflict resolution:
- manually removing some scrub scheduling changes from
  PR #40984
- pg_scrubber.h: removing some irrelevant lines that were dragged
  in.
- PG.h: restoring lines removed by the merge.

Sarthak0702 [Mon, 21 Mar 2022 18:29:08 +0000 (23:59 +0530)]

mgr/dashboard: Remove padding in search highlighted text

Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>

Yuri Weinstein [Tue, 29 Mar 2022 20:20:56 +0000 (13:20 -0700)]

Merge pull request #45620 from s0nea/wip-55036-pacific

pacific: mgr/cephadm: try to get FQDN for configuration files

Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Adam King <adking@redhat.com>

Sarthak0702 [Tue, 1 Mar 2022 18:07:38 +0000 (23:37 +0530)]

mgr/dashboard: highlight the search text in cluster logs

Fixes: https://tracker.ceph.com/issues/54445
Signed-off-by: Sarthak0702 <sarthak.0702@gmail.com>
(cherry picked from commit a878c7442059d11ac14edd226d71abbabda9a3c4)

Yuri Weinstein [Mon, 28 Mar 2022 21:54:43 +0000 (14:54 -0700)]

Merge pull request #45374 from ronen-fr/wip-rf-42684-pacific

pacific: osd/scrub: tag replica scrub messages to identify stale events

Reviewed-by: Neha Ojha <nojha@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 21:53:37 +0000 (14:53 -0700)]

Merge pull request #45355 from mgfritch/backport-45347-pacific

pacific: cephadm: preserve `authorized_keys` file during upgrade

Reviewed-by: Adam King <adking@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 15:51:51 +0000 (08:51 -0700)]

Merge pull request #45591 from vumrao/wip-vumrao-55020

pacific: osd/PrimaryLogPG.cc: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Ronen Friedman <rfriedma@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 15:51:01 +0000 (08:51 -0700)]

Merge pull request #45203 from rhcs-dashboard/wip-54113-pacific

pacific: mgr/dashboard: perform daemon actions

Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Reviewed-by: Aashish Sharma <aasharma@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 15:47:56 +0000 (08:47 -0700)]

Merge pull request #45173 from kamoltat/wip-ksirivad-backport-pacific-44054

pacific: osd: add pg_num_max value & pg_num_max reordering

Reviewed-by: Neha Ojha <nojha@redhat.com>

Laura Flores [Mon, 28 Mar 2022 15:39:57 +0000 (10:39 -0500)]

Merge pull request #45588 from ljflores/wip-pacific-perfcounter-fix

pacific: common: fix missing name in PriorityCache perf counters

Yuri Weinstein [Mon, 28 Mar 2022 14:23:30 +0000 (07:23 -0700)]

Merge pull request #45561 from idryomov/wip-readv-writev-overflow-pacific

pacific: librbd: readv/writev fix iovecs length computation overflow

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 14:19:58 +0000 (07:19 -0700)]

Merge pull request #45474 from nmshelke/wip-54573-pacific

pacific: mgr/volumes: the 'mode' should honor idempotent subvolume creation

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 14:19:13 +0000 (07:19 -0700)]

Merge pull request #45464 from cfsnyder/wip-53471-pacific

pacific: common: avoid pthread_mutex_unlock twice

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Yuri Weinstein [Mon, 28 Mar 2022 14:17:47 +0000 (07:17 -0700)]

Merge pull request #45436 from cfsnyder/wip-51783-pacific

pacific: qa/rgw: add failing tempest test to blocklist

Reviewed-by: Casey Bodley <cbodley@redhat.com>
