
mgr/dashboard: enable addition of custom Prometheus alerts

Fixes: https://tracker.ceph.com/issues/57294
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit 3551d7f8b36d883a72b85f0bd5568a33ac00e62c)

Conflicts:
doc/cephadm/services/monitoring.rst
src/pybind/mgr/cephadm/services/monitoring.py
src/pybind/mgr/cephadm/tests/test_services.py

Merge pull request #48098 from adk3798/wip-57424-pacific

pacific: cephadm: Fix disk size calculation

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #48100 from adk3798/wip-57427-pacific

pacific: mgr/cephadm: allow setting prometheus retention time

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

Merge pull request #48102 from adk3798/wip-57379-pacific

pacific: cephadm: return nonzero exit code when applying spec fails in bootstrap

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #48103 from adk3798/wip-57384-pacific

pacific: mgr/cephadm: Adding logic to store grafana cert/key per node

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #47888 from rhcs-dashboard/wip-57357-pacific

pacific: mgr/dashboard: ensure limit 0 returns 0 images

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #47636 from rhcs-dashboard/wip-57143-pacific

pacific: mgr/dashboard: fix _rbd_image_refs caching

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #47978 from neesingh-rh/wip-57439-pacific

pacific: cephfs-top: display average read/write/metadata latency

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #48045 from rhcs-dashboard/wip-57493-pacific

pacific: mgr/dashboard: fix openapi-check

Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #47995 from idryomov/wip-52810-pacific

pacific: librbd: retry ENOENT in V2_REFRESH_PARENT as well

Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Merge pull request #47866 from neesingh-rh/wip-57274-pacific

pacific: mgr/stats: missing clients in perf stats command output.

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>
Reviewed-by: Jos Collin <jcollin@redhat.com>

Merge pull request #47769 from neesingh-rh/wip-57263-pacific

pacific: mgr/volumes: Add volume info command

Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #47647 from joscollin/wip-57155-pacific

pacific: cephfs-top: fix the rsp/wsp display

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>

Merge pull request #47386 from s0nea/wip-56990-pacific

pacific: monitoring/ceph-mixin: OSD overview typo fix

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #47528 from lxbsz/wip-57083

pacific: qa/import-legacy: install python3 package for nautilus ceph

Reviewed-by: Kotresh HR <khiremat@redhat.com>

doc: include read, write, metadata average latencies in doc/man.

Also, the sample cephfs-top image in the doc is outdated. Update that!

Fixes: http://tracker.ceph.com/issues/48619
Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit aa04f3faedb6edcb0897e802a8390904deb6f936)

cephfs-top: display latency in milliseconds

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit bf261f2a07111361ec8db36a7a4b13b54ff5d891)

cephfs-top: switch to displaying average latencies and stdev

Do away with cumulative latencies -- those are not very useful.
However, these types need to be maintained since `perf stats`
command (via mgr/stats plugin) includes them. So, maintain a
legacy metrics list which is ignored when choosing metrics to
display.
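
A minimal sketch of the legacy-list idea described above. The metric names
and the displayable_metrics() helper are hypothetical and chosen only for
illustration; they are not taken from cephfs-top.

```python
# Hypothetical metric names, for illustration only.
ALL_METRICS = ["cap_hit", "read_latency", "avg_read_latency", "stdev_read_latency"]
# Cumulative metrics still reported by `perf stats` but no longer displayed.
LEGACY_METRICS = {"read_latency"}


def displayable_metrics(reported, legacy=LEGACY_METRICS):
    """Return the reported metrics in order, skipping legacy ones."""
    return [name for name in reported if name not in legacy]


print(displayable_metrics(ALL_METRICS))
```

The legacy names stay in the reported list, so consumers of the raw dump
keep working, while the display code never selects them.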

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 60f33a8ca3055ec5ae5c8d67fd03f571bcec8892)

mgr/stats: include average latencies and stdev in `perf stats` dump

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit b2bc45223b02ded7a5cc921980b3961c5e1d5893)

mgr/stats: auto generate metrics names from configured metrics

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit aaeec93efd2ae03d740299a5b22bb9203fbd7b8d)

client: forward read, write, metadata average latency and stdev

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 9b4f210b58571c0d88e5e01b90e6106cd894c3be)

Conflicts:
src/client/Client.cc: Added 'if' condition in read, write and
metadata latencies in 'Client::collect_and_send_global_metrics()'

mds, mgr: plumb in new client metrics

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit f1083c0b3d41e6691945e8b40df8aa707a261217)

Conflicts:
src/include/cephfs/metrics/Types.h: used ostream instead of
std::ostream, as is done in other places for latencies.

client: track average read, write and metadata IO latencies

And also standard deviation for each to measure the variance
(volatility) of latencies.
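
One way to track a running average and standard deviation without storing
every sample is Welford's online algorithm. The sketch below is an
illustration only: the commit does not say which method the client uses,
and the LatencyTracker class is hypothetical.

```python
import math


class LatencyTracker:
    """Running mean and standard deviation (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared differences from the running mean

    def add(self, latency_ms):
        self.count += 1
        delta = latency_ms - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (latency_ms - self.mean)

    def stdev(self):
        # Sample standard deviation; zero until there are two samples.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


tracker = LatencyTracker()
for sample in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
    tracker.add(sample)
print(tracker.mean, tracker.stdev())
```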

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit 967e24fe5c0efd9d7eb870494610fd1b4412f1d6)

qa: add test_perf_stats_stale_metrics_with_multiple_filesystem

Fixes: https://tracker.ceph.com/issues/56483
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
(cherry picked from commit e717e216ea956de91cf41986a9f8e1b8b4ddf09f)

Conflicts:
qa/tasks/cephfs/test_mds_metrics.py

mgr/stats: missing clients in perf stats command output.

Once the perf stats command has been run with the existing filesystems,
it does not report client info for filesystems created afterwards,
including filesystems created after failing another filesystem.

Fixes: https://tracker.ceph.com/issues/56483
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
(cherry picked from commit 584394fb243416ca50c2b5e05de5d20dd46be114)

Merge pull request #47972 from vshankar/tr-55931

pacific: client: allow overwrites to file with size greater than the max_file_size

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #47923 from petrutlucian94/wip-57403-pacific

pacific: include: fix IS_ERR on Windows

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #47862 from lxbsz/wip-57252

pacific: libcephfs: define AT_NO_ATTR_SYNC back for backward compatibility

Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Merge pull request #47851 from joscollin/wip-57279-pacific

pacific: mgr/stats: change in structure of perf_stats o/p

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>
Reviewed-by: Neeraj Pratap Singh <neesingh@redhat.com>

mgr/cephadm: Adding logic to store grafana cert/key per node

Fixes: https://tracker.ceph.com/issues/56508
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 3c990f974e3beac0fc03f58c4c47f26f9d5afe56)

Conflicts:
src/pybind/mgr/cephadm/tests/test_services.py

cephadm: return nonzero exit code when applying spec fails in bootstrap

This is mostly useful for testing automation, but right now if applying the
spec provided with --apply-spec fails, the return code remains zero. We don't
want to error out entirely in that case as we still want to print the remaining
output (e.g. the dashboard password). Continuing onward and then returning a
nonzero code could provide a balance where we still give all the output but
still have something to make it easier for those writing automation around bootstrap.
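
The "continue, then return nonzero" behaviour can be sketched as follows.
The bootstrap() function and its two callbacks are hypothetical stand-ins,
not the cephadm implementation.

```python
def bootstrap(apply_spec, print_summary):
    """Run the (hypothetical) bootstrap steps and return a process exit code."""
    exit_code = 0
    try:
        apply_spec()
    except Exception as err:
        # Remember the failure but keep going so the summary still prints.
        print(f"Applying spec failed: {err}")
        exit_code = 1
    print_summary()
    return exit_code


def failing_spec():
    raise RuntimeError("bad spec")


printed = []
code = bootstrap(failing_spec, lambda: printed.append("dashboard password: ..."))
print(code, printed)
```

Automation can then check the exit code while a human operator still sees
the full output.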

Fixes: https://tracker.ceph.com/issues/57173
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit be17f1d4b30e19aa6039fa5d6a694129cb5f3583)

doc/cephadm: documentation for setting prometheus retention time

Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 11fc0ef194dc347de075cde1274933ec83164404)

mgr/cephadm: allow setting prometheus retention time

When we deploy Prometheus server, we don't provide any
ability to define the tsdb retention time - so it defaults to 15d.

This change adds a field that can be passed in a prometheus service
spec that will be passed as an arg to the --storage.tsdb.retention.time
parameter for the prometheus daemon.
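
As a rough illustration, a spec field could be turned into the daemon
argument like this. The field name retention_time and the dictionary-based
spec are assumptions made for the sketch; only the
--storage.tsdb.retention.time parameter and the 15d default come from the
text above.

```python
def prometheus_args(spec):
    """Build extra daemon arguments from a spec dictionary."""
    # Fall back to the 15d default when the field is absent.
    retention = spec.get("retention_time", "15d")
    return [f"--storage.tsdb.retention.time={retention}"]


print(prometheus_args({"retention_time": "1y"}))
print(prometheus_args({}))
```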

Fixes: https://tracker.ceph.com/issues/54308
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 91dd03fd648d25773a83fdad311b62b781619fc4)

Conflicts:
src/pybind/mgr/cephadm/services/monitoring.py
src/pybind/mgr/cephadm/tests/test_services.py
src/python-common/ceph/deployment/service_spec.py

cephadm: Fix disk size calculation

With native 4k sectors, the logical blocksize is set to
4096, which yields a disk size 8x the size of the actual
device. According to kernel source, device size only
uses 512 byte sectors, so the use of logical blocksize
is unnecessary.
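
The arithmetic behind the 8x error, as a small sketch. The sector count is
a made-up value and disk_size_bytes() is a hypothetical helper, not the
cephadm function.

```python
SECTOR_BYTES = 512  # the kernel reports device size in 512-byte units


def disk_size_bytes(sector_count):
    """Convert a sector count (e.g. read from /sys/block/<dev>/size) to bytes."""
    return sector_count * SECTOR_BYTES


sectors = 1_000_000  # hypothetical device
correct = disk_size_bytes(sectors)
wrong = sectors * 4096  # multiplying by a 4K logical block size instead
print(correct, wrong // correct)
```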

Fixes: https://tracker.ceph.com/issues/57335
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit a6f10ebd572cbf95c94614a94f981ca3550fca25)

Merge pull request #48060 from zdover23/wip-doc-2022-09-13-backport-47575-to-pacific

pacific: doc/rados: add prompts to pools.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #48062 from zdover23/wip-doc-2022-09-13-backport-47305-to-pacific

pacific: doc/monitoring: add min vers of apps in mon stack

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/monitoring: add min vers of apps in mon stack

https://tracker.ceph.com/issues/45447

This PR adds recommended versions of grafana and
prometheus and alert manager.

This PR is a second attempt at getting the information
in the following PR into the docs:
https://github.com/ceph/ceph/pull/46000/files

Himadri Maheshwari deserves the credit for the work
in this commit.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
Signed-off-by: Himadri Maheshwari <himadri.maheshwari7915@gmail.com>
(cherry picked from commit 367695f5b09f75ee723d53116e2f4a6e45dd795d)

doc/rados: add prompts to pools.rst

This commit adds ".. prompt:: bash $"-style prompts to pools.rst.
This brings this file up to the standard established in 2020 when
Kefu added support for the ".. prompt::" directive.

This commit is a part of an initiative to modernize the presentation
of all BASH commands in the RADOS documentation.

The progress of this project can be tracked here:
https://tracker.ceph.com/issues/57108

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 1bd64192568242b141d8e30fef6758bf162ec350)

Merge pull request #47823 from zdover23/wip-doc-2022-08-27-backport-47810-to-pacific

pacific: doc/mgr: add prompt directives to dashboard.rst

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

mgr/dashboard: docs gen tags sort

When generating tags the order of endpoints wasn't taken into account.
Two endpoints with the same url prefix, for example `/api/cluster/` and
`/api/cluster/user`, have different docs, and the tag is generated from
the doc of one of the two. Since the order of these endpoints might
vary, it is imperative to sort them to have a deterministic output.
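
A sketch of why sorting makes the output deterministic. The tuple layout,
the group name and the doc strings are hypothetical and do not mirror the
dashboard's data model.

```python
def build_tags(endpoints):
    """Map each URL group to a tag description, deterministically."""
    tags = {}
    # Sort by URL so the same endpoint always supplies a group's doc.
    for url, group, doc in sorted(endpoints):
        tags.setdefault(group, doc)
    return tags


a = [("/api/cluster/user", "Cluster", "Manage users"),
     ("/api/cluster", "Cluster", "Manage cluster")]
print(build_tags(a) == build_tags(list(reversed(a))))
```

Without the sorted() call, whichever endpoint happened to come first would
decide the tag text.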

Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
(cherry picked from commit 9673ed18699cdca3f032fd131d4248b010750ef6)

Merge pull request #48024 from idryomov/wip-57116-cont-pacific

pacific: test/{librbd, rgw}: increase delay between and number of bind attempts

Reviewed-by: Laura Flores <lflores@redhat.com>

test/{librbd, rgw}: increase delay between and number of bind attempts

Commit aa7885f7cc41 ("test/{librbd, rgw}: retry when bind fail with
port 0") reduced the frequency of sporadic unit test failures caused
by EADDRINUSE a lot, but not entirely.

Currently, it yields a cumulative sleep of ~9 seconds. Let's increase
that to 1 minute.

Fixes: https://tracker.ceph.com/issues/57116
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 48016eaa1511ac8a39ed33084e0e230f3b1b5821)

test/{librbd, rgw}: retry when bind fail with port 0

there is a chance that the bind() call may fail if another test
happens to pick the free port picked by the operating system. in this
case, we just retry up to 42 times.

in theory, this change does not fully address the racing, but it should
help to alleviate this issue.
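
The retry pattern, sketched without real sockets. The bind_with_retry()
helper, the delay value and the flaky_bind() stand-in are hypothetical;
only the limit of 42 attempts comes from the text above.

```python
import errno
import time


def bind_with_retry(try_bind, attempts=42, delay=0.01):
    """Call try_bind() until it succeeds, retrying only on EADDRINUSE."""
    for attempt in range(1, attempts + 1):
        try:
            return try_bind()
        except OSError as err:
            # Give up on other errors, or once the attempts are used up.
            if err.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)


calls = {"n": 0}


def flaky_bind():
    # Stand-in for a real bind(): the port is taken on the first two tries.
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError(errno.EADDRINUSE, "Address already in use")
    return "bound"


print(bind_with_retry(flaky_bind), calls["n"])
```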

See-also: https://tracker.ceph.com/issues/57116
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit aa7885f7cc41390fcc8eeb82bc7142c3ff6a53f9)

Conflicts:
src/test/rgw/test_http_manager.cc [ commit f5019d2a8388 ("rgw:
Set CURLOPT_NOBODY for HEAD request") not in pacific ]

Merge pull request #47693 from pdvian/wip-55309-pacific

pacific: mgr, mgr/prometheus: Fix regression with prometheus metrics

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #47433 from lxbsz/wip-56462

pacific: mds: skip fetching the dirfrags if not a directory

Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #47056 from lxbsz/wip-56449

pacific: mds: notify the xattr_version to replica MDSes

Reviewed-by: Kotresh HR <khiremat@redhat.com>

librbd: make RefreshRequest tests compatible with clone v1

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 36f1d515ca92784631d29fa1c5d1465e957af2a7)

librbd: retry ENOENT in V2_REFRESH_PARENT as well

With auto-deletion of trashed snapshots, it is relatively easy to lose
a race to "rbd flatten" as follows:

- when V2_GET_PARENT runs, the image is technically still a clone
- when V2_REFRESH_PARENT runs, the image is fully flattened and the
snapshot in the parent image is deleted

This results in a spurious ENOENT error, mainly when trying to open the
image (e.g. for "rbd info"). This race condition has always been there
but auto-deletion of trashed snapshots makes it much worse.

Retry ENOENT in V2_REFRESH_PARENT the same way as in V2_GET_SNAPSHOTS.

Fixes: https://tracker.ceph.com/issues/52810
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit bd885d75b2e4d728086f744e0d10e7cd12d3f15b)

librbd: limit the number of ENOENT retries in RefreshRequest

If the image header is corrupt, ENOENT error may be persistent. Avoid
an infinite loop in that case.
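
A bounded retry loop of this kind might look like the sketch below. The
refresh() function and the cap of 10 are hypothetical and are not the
values or names used by librbd.

```python
import errno

MAX_ENOENT_RETRIES = 10  # hypothetical cap, not the value used by librbd


def refresh(step, max_retries=MAX_ENOENT_RETRIES):
    """Run step() and restart on -ENOENT, giving up after max_retries."""
    retries = 0
    while True:
        result = step()
        if result != -errno.ENOENT:
            return result
        if retries >= max_retries:
            return result  # persistent ENOENT, e.g. a corrupt header
        retries += 1


print(refresh(lambda: -errno.ENOENT))
```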

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 8570194b133462db6b7d4ab108383db0967b1cb9)

librbd: fix a bunch of issues with restarting RefreshRequest

Make RefreshRequest properly restartable, at least up until and including
V2_REFRESH_PARENT step:

- clear m_migration_spec when skipping GET_MIGRATION_HEADER
- don't rely on potentially stale m_incomplete_update on retry
- reset m_legacy_parent when retrying more than just V2_GET_PARENT
- don't rely on potentially stale m_parent_md.overlap and
m_head_parent_overlap on retry
- clear m_metadata before fetching image metadata (but not before
fetching pool metadata)
- clear m_op_features when skipping V2_GET_OP_FEATURES
- clear m_group_spec on EOPNOTSUPP error in V2_GET_GROUP
- reset m_legacy_snapshot when retrying more than just V2_GET_SNAPSHOTS
- don't rely on potentially stale m_snap_parents on retry

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 6bd89ea119520cf5a45ac93b0e16edf35ddd4e57)

librbd: check *result consistently in RefreshRequest

Stick to *result >= 0 checks everywhere and add missing checks for
op_features_get_finish() and image_group_get_finish() errors.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ce6dff89c0f005c1ae1dc71cadfbef9f82df37a4)

librbd: reflect V2_GET_SNAPSHOTS ENOENT retry in state diagram

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ca36ffa347f0c68115a7d6b54ebb47ac5e82698d)

Merge pull request #47556 from ifed01/wip-ifed-cleanup-onode-pin-pac

pacific: os/bluestore: get rid of fake onode nref increment for pinned entry

Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>

Merge pull request #47611 from adk3798/pacific-multiple-vips

pacific: Cephadm: Allow multiple virtual IP addresses for keepalived and haproxy

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #47512 from nmshelke/wip-57058-pacific

pacific: mgr/volumes: filter internal directories in 'subvolumegroup ls' command

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #47535 from guits/wip-57088-pacific

pacific: ceph-volume: system.get_mounts() refactor

Reviewed-by: Adam King <adking@redhat.com>

Merge pull request #47661 from adk3798/wip-57169-pacific

pacific: cephadm: support for Oracle Linux 8

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #47663 from adk3798/wip-57103-pacific

pacific: mgr/cephadm: recreate osd config when redeploy/reconfiguring

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #47662 from adk3798/wip-57148-pacific

pacific: mgr/cephadm: set dashboard grafana-api-password when user provides one

Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Redouane Kachach <rkachach@redhat.com>

mgr/cephadm: loop over all vips when trying to find ingress' interface

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit 1b9a6a0f58a9a7550e8b93573b3191816da5f900)

Split single and multiple vips test into 2 functions

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit f6d4ab9f25e5c9ee1872dbfd18bebbaf9a72a2d0)

mgr/cephadm: update haproxy/keepalive service test for newly generated files

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit a69a6fb4f5275af8a2757003f7fb5ca1f1ab9d2f)

mgr/cephadm: set explicit * bind for haproxy when using multiple vips

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit a11e181b98ffccff40939068d86254e7f8a98c06)

mgr/cephadm: update doc for multiple vips for ingress

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit 7b064e8b0eab0b577470122534e1b2647f5191cc)

mgr/cephadm: set test for multiple vips options for ingress service

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit 5915d2ecd08c1289da38d4fbeb646898f9c5dccf)

mgr/cephadm: allow for multiple vip configuration on ingress service

Signed-off-by: Luis Domingues <domingues.luis@protonmail.ch>
(cherry picked from commit 0193a6f73659f7aa4ac1d000cf11c6544ad6ab6d)

Merge pull request #47627 from guits/wip-57133-pacific

pacific: cephadm/ceph-volume: fix rm-cluster --zap

Reviewed-by: John Mulligan <jmulligan@redhat.com>

Merge pull request #47664 from adk3798/wip-57099-pacific

pacific: cephadm: support quotes around public/cluster network in config passed to bootstrap

Reviewed-by: Redouane Kachach <rkachach@redhat.com>

Merge pull request #47684 from batrick/i57183

pacific: crash: pthread_mutex_lock()

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #47522 from kamoltat/wip-ksirivad-backport-pacific-46242

pacific: pybind/mgr/pg_autoscaler: change overlapping roots to warning

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #47380 from NitzanMordhai/wip-55156-pacific

pacific: mon/ConfigMonitor: fix config get key with whitespaces

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #47692 from pdvian/wip-55308-pacific

pacific: mgr, mon: Keep up-to-date metadata with mgr for MONs

Reviewed-by: Laura Flores <lflores@redhat.com>

Merge pull request #47401 from tserong/wip-56977-pacific

pacific: cephfs-shell: move source to separate subdirectory

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>

Merge pull request #47282 from batrick/i56712

pacific: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #47920 from idryomov/wip-57343-pacific

pacific: test/cli-integration/rbd: iSCSI REST API responses aren't pretty-printed anymore

Reviewed-by: Kefu Chai <tchaikov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>

qa: filter internal directories in 'subvolumegroup ls' command

Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolumegroup ls' command.
2. Internal directories should not be accepted as a group name.

Fixes: https://tracker.ceph.com/issues/55762
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit 7b585d4db921112edeea3c879cb8bca0200c1b71)

mgr/volumes: filter internal directories in 'subvolumegroup ls' command

Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolumegroup ls' command.
2. Internal directories should not be accepted as a group name.
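
Both rules can be sketched in a few lines. The function names are
hypothetical; the directory names are the ones listed above.

```python
INTERNAL_DIRS = {"_nogroup", "_index", "_legacy", "_deleting"}


def list_groups(entries):
    """Return group names with internal directories filtered out."""
    return [name for name in entries if name not in INTERNAL_DIRS]


def validate_group_name(name):
    """Reject names reserved for internal directories."""
    if name in INTERNAL_DIRS:
        raise ValueError(f"'{name}' is reserved for internal use")
    return name


print(list_groups(["_nogroup", "group_a", "_deleting", "group_b"]))
```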

Fixes: https://tracker.ceph.com/issues/55762
Signed-off-by: Nikhilkumar Shelke <nshelke@redhat.com>
(cherry picked from commit ce3fa7f1bcd9ca8a9e9e80ca33a15d0746ce7110)

Merge pull request #47911 from idryomov/wip-57317-pacific

pacific: librbd: use actual monitor addresses when creating a peer bootstrap token

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Merge pull request #46949 from lxbsz/wip-56056

pacific: ceph-fuse: add dedicated snap stag map for each directory

Reviewed-by: Kotresh HR <khiremat@redhat.com>

Merge pull request #47583 from idryomov/wip-57107-pacific

pacific: rbd: find_action() should sort actions first

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

Merge pull request #47913 from idryomov/wip-56154-pacific

pacific: rbd-mirror: resume pending shutdown on error in snapshot replayer

Reviewed-by: Mykola Golub <mgolub@suse.com>
Reviewed-by: Christopher Hoffman <choffman@redhat.com>

Merge pull request #47460 from neesingh-rh/wip-57041-pacific

pacific: mgr/volumes: add interface to check the presence of subvolumegroups/subvolumes

Reviewed-by: Nikhilkumar Shelke <nshelke@redhat.com>
Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

Merge pull request #46077 from idryomov/wip-mrun-bashism-pacific

pacific: tooling: Change mrun to use bash

Reviewed-by: Adam C. Emerson <aemerson@redhat.com>

cephfs-shell: move source to separate subdirectory

This ensures the package discovery done by python setuptools >= 61
doesn't get confused when building cephfs-shell and cephfs-top.
This commit moves cephfs-shell to a separate "shell" subdirectory,
which is the same approach we've already got with the cephfs-top
tool being in a separate "top" subdirectory.

Fixes: https://tracker.ceph.com/issues/56658
Signed-off-by: Tim Serong <tserong@suse.com>
(cherry picked from commit dc69033763cc116c6ccdf1f97149a74248691042)

Merge pull request #47803 from tchaikov/wip-pacific-update-fio

pacific: Updates to fix `make check` failures

Reviewed-by: Laura Flores <lflores@redhat.com>

client: allow overwrites to files with size greater than the max_file_size cfg

Before this change, overwriting from file-offset >= max_file_size config
returned "File too large" (even though the data was being written).
This change allows overwrites, as the file size is not further increasing.
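
The intended rule can be sketched as a size check. The check_write()
function and its parameters are hypothetical; this is not the code in
Client.cc.

```python
import errno


def check_write(offset, length, file_size, max_file_size):
    """Return 0 if the write is allowed, else -EFBIG ("File too large")."""
    end = offset + length
    # An overwrite that stays within the current size never grows the file.
    if end <= file_size:
        return 0
    # Only writes that would grow the file past the limit are rejected.
    if end > max_file_size:
        return -errno.EFBIG
    return 0


print(check_write(offset=2000, length=100, file_size=4000, max_file_size=1000))
print(check_write(offset=900, length=200, file_size=500, max_file_size=1000))
```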

Fixes: https://tracker.ceph.com/issues/24894
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
(cherry picked from commit a451a3670b7bb783ca6dcb8b2a31a8e6ec396899)

Merge pull request #47956 from zdover23/wip-doc-2022-09-04-backport-47841-to-pacific

pacific: doc/start: update documenting-ceph branch names

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

doc/start: update documenting-ceph branch names

This PR updates the branch names in the
documenting-ceph.rst file. It gets rid of all references
to the "master" branch, and updates the language to
reflect the state of play in 2022.

inb4: This PR merely removes the most egregious inaccuracies,
the ones that were most readily evident on a cursory perusal.
The full text remains to be carefully read and fitted together
with care.

I had to start somewhere.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 7bc6262547c82dd6519e4099bfc4f082f14343ac)

Merge pull request #47948 from adk3798/wip-57412-pacific

pacific: doc/cephadm/services: fix example for specifying rgw placement

Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>

doc/cephadm/services: fix example for specifying rgw placement

Fixes: https://tracker.ceph.com/issues/56953

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
(cherry picked from commit 1ed4c30876262c8890247325eb84ff46621d34fe)

doc/mgr: remove section absent from pacific

This removes a section that is not in Pacific.

This is an alteration to the simple backport from
the main branch.

Signed-off-by: Zac Dover <zac.dover@gmail.com>

include: fix IS_ERR on Windows

The "long" type uses 32b on x64 Windows platforms, which means
it's not large enough to store a pointer. intptr_t or uintptr_t
should be used instead.
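
The effect of the narrow cast can be simulated with integer masks. This
sketch assumes IS_ERR compares the cast value against -MAX_ERRNO with
MAX_ERRNO = 4095, as in the Linux-style macro; the pointer value is made up.

```python
MAX_ERRNO = 4095  # assumed value, following the Linux-style macro


def is_err(pointer, bits):
    """Simulate IS_ERR() after casting the pointer to a `bits`-wide integer."""
    mask = (1 << bits) - 1
    return (pointer & mask) >= (-MAX_ERRNO & mask)


valid_pointer = 0x00000001FFFFFFF0  # hypothetical 64-bit address
print(is_err(valid_pointer, 32))  # cast through a 32-bit long
print(is_err(valid_pointer, 64))  # cast through a pointer-sized integer
```

With the 32-bit cast the upper half of the address is dropped, so a valid
pointer can be mistaken for an error value.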

This change fixes include/err.h, using the right types. There was
a previous patch on this topic but unfortunately it didn't address
all the type casts.

This issue was brought up by the unittest_crush test, which recently
started to fail as the CrushWrapper methods use IS_ERR.

Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
(cherry picked from commit c95b6b6c774da05e989cd09e23eee1eeaa9e6ec2)

cmake: link denc-mod-rgw against Boost::filesystem

to address the runtime link failure.

this change is not cherry-picked from main branch. as, in main branch,
the Boost::filesystem linkage is pulled in by rgw_common, which was
changed to a static library in 43d10b9e44ca50700e9076a47f2c38b360d1d632.
but this change is not included in pacific. so rgw added the linkage
via rgw_libs CMake variable. unfortunately, the lexical scope of this
variable does not include tools/ceph-dencoder/CMakeLists.txt, so
we have to add this linkage manually here.

Signed-off-by: Tim Serong <tserong@suse.com>
Signed-off-by: Kefu Chai <tchaikov@gmail.com>

ceph-dencoder: Add erasure_code to denc-mod-osd's target_link_libraries

Fixes: https://tracker.ceph.com/issues/57390
Signed-off-by: Tim Serong <tserong@suse.com>
(cherry picked from commit 690b9c6e8097666b1cda8f5a4fdd3d1d6903373f)

test/cli-integration/rbd: iSCSI REST API responses aren't pretty-printed anymore

See https://github.com/ceph/ceph-iscsi/pull/263 and
https://github.com/pallets/flask/pull/2193.  Flask stopped
pretty-printing by default in 1.0:

  Change the default for JSONIFY_PRETTYPRINT_REGULAR to False.
  json.jsonify returns a compact format by default, and an indented
  format in debug mode.
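
Why a test that compares raw response text breaks can be shown with the
standard json module alone (Flask is not used here, and the payload is a
made-up example):

```python
import json

payload = {"created": True, "targets": ["a", "b"]}  # hypothetical response body

compact = json.dumps(payload)
pretty = json.dumps(payload, indent=2)

# The text differs, so a test comparing raw output breaks...
print(compact == pretty)
# ...while comparing parsed data is independent of formatting.
print(json.loads(compact) == json.loads(pretty))
```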

Fixes: https://tracker.ceph.com/issues/57343
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 1cec9e83c02c366b5274739ae11297b6fca8584f)

tools/ceph-dencoder: include experimental/filesystem as an alternative

in case we use pre C++17 C++ compiler and standard library.

this change is not cherry-picked from main, as we are using new C++
standard library which is compliant with C++20. so no need to worry
about this.

Signed-off-by: Tim Serong <tserong@suse.com>
Signed-off-by: Kefu Chai <tchaikov@gmail.com>

Merge pull request #47870 from zdover23/wip-doc-2022-08-30-backport-447843-to-pacific

pacific: doc/mgr: update prompts in dboard.rst includes

Reviewed-by: Anthony D'Atri <anthony.datri@gmail.com>

tooling: Change mrun to use bash

Since mrun contains some bashisms, have it use bash explicitly.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 1a049489cc3d2f63284365d6a0af6ee55c7efffd)

rbd-mirror: skip setting error code on snapshot replayer shutdown

This is regarding failures in unregister_remote_update_watcher() and
unregister_local_update_watcher().  handle_replay_complete() can't be
called in these cases anymore as it would blindly attempt to unregister
watchers from scratch again.  Dropping handle_replay_complete() calls
there means that these failures would only be logged and would not be
surfaced by snapshot replayer.  But the only caller ignores them
anyway:

  void ImageReplayer<I>::shut_down(int r) {
    ...
    // close the replayer
    if (m_replayer != nullptr) {
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->destroy();
        m_replayer = nullptr;
        ctx->complete(0);             <------
      });
      ctx = new LambdaContext([this, ctx](int r) {
        m_replayer->shut_down(ctx);
      });
    }

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ced071f0de57af8cddffebca24baeb27f2a211d8)

rbd-mirror: resume pending shutdown on error in snapshot replayer

If a shutdown is requested, e.g. by update_pool_replayers() because
remote RADOS instance got blocklisted, and Replayer::shut_down() pends
it on completion of current snapshot sync, it gets stuck if replayer
encounters an error in the interim. This is particularly likely in the
blocklist case: a higher layer may detect that client got blocklisted
and request a shutdown first, and then when replayer sees EBLOCKLISTED
in turn, it calls handle_replay_complete() -- which does not resume
a pending shutdown. Because update_pool_replayers() blocks on shutdown
with Mirror::m_lock held, eventually the entire daemon hangs in
perpetuity.
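
A toy model of the fix, in Python rather than the daemon's C++. The
Replayer class below is hypothetical and only mirrors the names used in
the description: a shutdown parked during a sync must be resumed when the
replayer completes with an error.

```python
class Replayer:
    """Toy model of a replayer with a deferred shutdown."""

    def __init__(self):
        self.syncing = True
        self.pending_shutdown = None  # callback parked until the sync ends
        self.error = 0

    def shut_down(self, on_finish):
        if self.syncing:
            self.pending_shutdown = on_finish  # defer until sync completes
        else:
            on_finish()

    def handle_replay_complete(self, error):
        self.error = error
        self.syncing = False
        # Resume a parked shutdown instead of leaving its caller blocked.
        if self.pending_shutdown is not None:
            on_finish, self.pending_shutdown = self.pending_shutdown, None
            on_finish()


done = []
replayer = Replayer()
replayer.shut_down(lambda: done.append("shut down"))
replayer.handle_replay_complete(-1)  # any negative error code
print(done)
```

Without the last branch in handle_replay_complete(), the parked callback
would never run and the caller waiting on it would block forever.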

Fixes: https://tracker.ceph.com/issues/56154
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit fc4cc575bc53f62f88ee3faf0daba8906bc1c6c1)