
cephadm: `ceph-volume` should raise fsid mismatch

raise an fsid mismatch error when passed differing fsids via `--fsid` and `--config`:

```
self = <tests.test_cephadm.TestCephVolume object at 0x7f1c711961f0>, cephadm_fs = <pyfakefs.fake_filesystem.FakeFilesystem object at 0x7f1c713addc0>

    def test_fsid(self, cephadm_fs):
        cv_cmd = ['--', 'inventory', '--format', 'json']
        fsid = '00000000-0000-0000-0000-0000deadbeef'

        cmd = ['ceph-volume', '--fsid', fsid] + cv_cmd
        with with_cephadm_ctx(cmd) as ctx:
            cd.command_ceph_volume(ctx)
            assert ctx.fsid == fsid

        s = get_ceph_conf(fsid=fsid)
        f = cephadm_fs.create_file('ceph.conf', contents=s)

        cmd = ['ceph-volume', '--fsid', fsid, '--config', f.path] + cv_cmd
        with with_cephadm_ctx(cmd) as ctx:
            cd.command_ceph_volume(ctx)
            assert ctx.fsid == fsid

        cmd = ['ceph-volume', '--fsid', '10000000-0000-0000-0000-0000deadbeef', '--config', f.path] + cv_cmd
        with with_cephadm_ctx(cmd) as ctx:
            err = 'fsid does not match ceph.conf'
            with pytest.raises(cd.Error, match=err):
                cd.command_ceph_volume(ctx)
>               assert ctx.fsid == None
E               AssertionError: assert '10000000-0000-0000-0000-0000deadbeef' == None
E                +  where '10000000-0000-0000-0000-0000deadbeef' = <cephadm.CephadmContext object at 0x7f1c7121c1c0>.fsid
```
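The check this test expects could be sketched roughly as follows. This is a minimal illustration, not cephadm's actual code: `Error` is a stand-in for `cd.Error`, and `fsid_from_conf` and `resolve_fsid` are hypothetical helper names. Only the error text is taken from the test above.

```python
import configparser
from typing import Optional


class Error(Exception):
    """Stand-in for cephadm's Error type (assumption for this sketch)."""


def fsid_from_conf(conf_text: str) -> Optional[str]:
    """Return the fsid from the [global] section of a ceph.conf, if present."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if parser.has_option('global', 'fsid'):
        return parser.get('global', 'fsid')
    return None


def resolve_fsid(cli_fsid: Optional[str], conf_text: Optional[str]) -> Optional[str]:
    """Hypothetical helper: reconcile --fsid with the fsid found via --config."""
    conf_fsid = fsid_from_conf(conf_text) if conf_text else None
    if cli_fsid and conf_fsid and cli_fsid != conf_fsid:
        # Same message the test matches against.
        raise Error('fsid does not match ceph.conf')
    return cli_fsid or conf_fsid
```

With matching values the fsid is returned unchanged; with differing values the sketch raises instead of silently preferring one source.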

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit d9198d8668055fdc9e5c3c58668a6aeeda61df4a)

cephadm: add `ceph-volume` tests

add basic ceph-volume tests for `--fsid`, `--config`, and `--keyring`

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 2d2bb9f96d7c34fab3e8d25c77986ab1a319fa1b)

cephadm: remove `get_parm` mock

fixture does not need to patch the `get_parm` func

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit b589ae68fa63ddb303a60232fe14200443bca0cc)

cephadm: don't use ctx.fsid for clean_cgroup

The clean_cgroup method assumes that ctx.fsid is set. While this is
true for the bootstrap command, it isn't set for the adopt or deploy commands
(and maybe others).

This causes the adopt command to fail:

Traceback (most recent call last):
  File "/sbin/cephadm", line 8301, in <module>
    main()
  File "/sbin/cephadm", line 8289, in main
    r = ctx.func(ctx)
  File "/sbin/cephadm", line 1764, in _default_image
    return func(ctx)
  File "/sbin/cephadm", line 5091, in command_adopt
    command_adopt_ceph(ctx, daemon_type, daemon_id, fsid)
  File "/sbin/cephadm", line 5299, in command_adopt_ceph
    osd_fsid=osd_fsid)
  File "/sbin/cephadm", line 2884, in deploy_daemon_units
    clean_cgroup(ctx, unit_name)
  File "/sbin/cephadm", line 2724, in clean_cgroup
    if not ctx.fsid:
  File "/sbin/cephadm", line 155, in __getattr__
    return super().__getattribute__(name)
AttributeError: 'CephadmContext' object has no attribute 'fsid'

Since we already have the fsid value in deploy_daemon_units (which calls
clean_cgroup), we can pass the fsid value directly.

This fixes a regression introduced by 1fee255
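
The shape of the change can be sketched as below. This is a simplified illustration, not the cephadm source: `Ctx` is a stand-in for CephadmContext, the function bodies are placeholders, and the return values exist only so the behavior can be observed.

```python
class Ctx:
    """Minimal stand-in for CephadmContext; unknown attributes raise AttributeError."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def clean_cgroup_old(ctx, unit_name):
    # Pre-fix shape: reads ctx.fsid, which adopt/deploy never set.
    if not ctx.fsid:
        raise ValueError('fsid is required')
    return (ctx.fsid, unit_name)


def clean_cgroup(ctx, fsid, unit_name):
    # Post-fix shape: the caller hands over the fsid it already knows.
    if not fsid:
        raise ValueError('fsid is required')
    return (fsid, unit_name)


def deploy_daemon_units(ctx, fsid, unit_name):
    # deploy_daemon_units already receives fsid, so it simply forwards it.
    return clean_cgroup(ctx, fsid, unit_name)
```

Passing the value explicitly removes the hidden dependency on which command populated the context.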

Fixes: https://tracker.ceph.com/issues/51902
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 3907ce7d6e091f87c3bd4437d13951ee838dc02b)

cephadm: don't fail hard on SameFileError during shutil.copy
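
A minimal sketch of tolerating this case, assuming the copy is a plain `shutil.copy` call; the helper name `copy_tolerating_same_file` is hypothetical and not taken from cephadm.

```python
import shutil


def copy_tolerating_same_file(src: str, dst: str) -> bool:
    """Copy src to dst; return False (instead of raising) if both are the same file."""
    try:
        shutil.copy(src, dst)
    except shutil.SameFileError:
        # Source and destination already refer to the same file: nothing to do.
        return False
    return True
```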

Fixes: https://tracker.ceph.com/issues/51829
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 3909be178addb6f692d97651bfdaf9ae742497f1)

mgr/cephadm: ingress: fix typo in spec.virtual_interface_networks reference

When using virtual_interface_networks to identify the interface that should
carry the virtual IP, the code referenced spec.networks instead of
spec.virtual_interface_networks.

Fixes: https://tracker.ceph.com/issues/51721
Signed-off-by: Asbjørn Sannes <asbjorn.sannes@interhost.no>
(cherry picked from commit 024c6aba01362e18ff02d88004b501663dbfdeed)

mgr/cephadm: Don't allow stopping full mgr, mon or osd services

I can't think of any case where we would want to allow this

Fixes: https://tracker.ceph.com/issues/51298
Signed-off-by: Adam King <adking@redhat.com>
(cherry picked from commit 6402c587d006f8e179f01ec2007828bb5fbf5489)

mgr/cephadm/grafana: check if dashboard is enabled

When the grafana service is deployed but the mgr dashboard isn't enabled, the
`dashboard set-grafana-api-ssl-verify` command fails.

Closes: https://tracker.ceph.com/issues/51796
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 16bb5b8076a6df30b1e6323b406fee0ba6cc2b39)

mgr/cephadm/iscsi: simplify the dashboard check

We don't need to run an extra command (mgr module ls) to obtain the mgr
modules list since we already have this information in the mgr_map.
This workflow is already done for the monitoring stack or for configuring
the iscsi integration within the dashboard (during creation) via the
config_dashboard method.

The mgr_map is mocked in the tests with the dashboard module enabled so we
don't need _mon_command_mock_mgr_module_ls anymore.
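
A rough sketch of such a check, assuming the mgr_map is available as a dict whose `modules` key lists the enabled mgr modules (the key name and the helper `dashboard_enabled` are assumptions for illustration):

```python
from typing import Any, Dict


def dashboard_enabled(mgr_map: Dict[str, Any]) -> bool:
    """Return True if 'dashboard' is among the enabled modules in the mgr_map."""
    # Assumption: enabled modules are listed under the 'modules' key.
    return 'dashboard' in mgr_map.get('modules', [])
```

This avoids a separate `mgr module ls` round trip when the map is already in hand.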

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit a6808efca4535f10c5639ee0a6a517c110da3f44)

mgr/cephadm: Fix haproxy not being recognized as a proper daemon

Turns out daemon types != service types:

cephadm [WRN] Found unknown service type haproxy on host smithi019
cephadm [WRN] Found unknown service type keepalived on host smithi019

leading to `self.mgr.cache.get_daemons_by_service(spec.service_name())`
not returning any daemons.
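
The distinction can be illustrated with a small lookup sketch. The mapping of `haproxy` and `keepalived` to an `ingress` service type, and all names below, are assumptions for illustration rather than the module's actual tables.

```python
from typing import Dict, List, Tuple

# Assumption: most daemon types equal their service type; these two do not.
DAEMON_TO_SERVICE: Dict[str, str] = {
    'haproxy': 'ingress',
    'keepalived': 'ingress',
}


def daemon_type_to_service(daemon_type: str) -> str:
    """Map a daemon type to the service type that owns it."""
    return DAEMON_TO_SERVICE.get(daemon_type, daemon_type)


def daemons_by_service(daemons: List[Tuple[str, str]], service_type: str) -> List[str]:
    """daemons is a list of (daemon_type, daemon_name); filter by owning service."""
    return [name for dtype, name in daemons
            if daemon_type_to_service(dtype) == service_type]
```

Comparing raw daemon types against the service type would return nothing for `ingress`, which matches the symptom described above.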

Fixes: https://tracker.ceph.com/issues/51311
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit a8f1cf2edb0ef5b48632af8da9577c8a42a6ff60)

mgr/cephadm/templates: add jinja2 lint

This adds a jinja2 lint environment in tox for testing the cephadm jinja2
templates.

This patch also fixes some minor jinja2 syntax issues in the ganesha and
keepalived templates, even though the current templates work perfectly:
tags should have one (and only one) space.

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 9eac2fba90a3b179087455100e494d7da6b0910e)

cephadm: haproxy 2.4 defaults to a different container user.

An alternative would be to investigate a different setup
leveraging `--sysctl net.ipv4.ip_unprivileged_port_start=0`,
but that would be a larger PR.

Fixes: https://tracker.ceph.com/issues/51355
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 250064bdcbe778b3cc245df843d14dd19cbb8772)

cephadm: use pyfakefs during `test_create_daemon_dirs_prometheus`

convert test to use the `cephadm_fs` fixture

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit f853ce7e9a52b4abd0f6626ca13886bd0e2e36a6)

cephadm: use CephadmContext rather than MagicMock

MagicMock hides attribute errors:

```
self = <cephadm.CephadmContext object at 0x7f1121e62370>, name = 'config_json'

    def __getattr__(self, name: str) -> Any:
        if '_conf' in self.__dict__ and hasattr(self._conf, name):
            return getattr(self._conf, name)
        elif '_args' in self.__dict__ and hasattr(self._args, name):
            return getattr(self._args, name)
        else:
>           return super().__getattribute__(name)
E           AttributeError: 'CephadmContext' object has no attribute 'config_json'
```
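
The difference is easy to reproduce outside of cephadm. In this sketch `StrictContext` is a hypothetical stand-in for a real context object and `reads_cleanly` is a helper invented for the demonstration.

```python
from unittest.mock import MagicMock


class StrictContext:
    """Stand-in for a real context: only attributes that were set exist."""

    def __init__(self, **known):
        self.__dict__.update(known)


def reads_cleanly(ctx, name: str) -> bool:
    """Return True if reading ctx.<name> does not raise AttributeError."""
    try:
        getattr(ctx, name)
    except AttributeError:
        return False
    return True
```

A `MagicMock` fabricates any attribute on access, so a test built on it passes even when the code reads something the real context would not have.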

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 4a99b771a4a59671728e072bb27270bba8cb78c8)

cephadm: add `infer_config` unit test

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 53d07362ff8efa5846fd6067b469923457d5ac8d)

cephadm: add `shell` command tests

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit e6dca29ae7513c2555aacf953a5d4ec7fb7ea0bb)

cephadm: add `infer_fsid` unit test

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit c19fb2568ed06376c103bdea22816811bf30317e)

cephadm: infer fsid from ceph.conf

Fixes: https://tracker.ceph.com/issues/51328
Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit e35271adfcfd64ca19aee58b334eb1aaf3855ef4)

qa/workunits/test_cephadm: Also test stdin

Just to be sure

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 95c2d2c6fa9bb74efcfa168316bc91a38226275c)

doc/cephadm: add notes to `orch daemon add`

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit b53af54adcf833a79df8f191829f5868675f859e)

doc/cephadm: Add RGW ssl

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 0a77eee518147fb534fb8c27b42fb6069a226832)

doc/cephadm: add missing "ceph"

The word ceph is missing.

Signed-off-by: "Wang,Fei" <wf.ab@126.com>
(cherry picked from commit 7f476bea44a5b4b39e6c8fa85aa3747f93269cf9)

doc/cephadm: correct a transposed word error

The positions of two words are interchanged:
scans each cluster in the host ----> scans each host in the cluster

Signed-off-by: "Wang,Fei" <wf.ab@126.com>
(cherry picked from commit 0003abd49c239e7bab64860c403b1f6596e2ad7a)

mgr/cephadm: add help strings for 'orch client-keyring ...' commands

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 7cc8510266e0ccc015750e2572ff3f41215ab3a2)

doc/cephadm: operations.rst typo

s/any hosts that is/any host/

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 86b53cc1eb7fbc78d464b2e41e584af212dec5df)

Fetch the actually running selinux status.

The HostFacts should return the **actual** selinux mode in which the
kernel is running.

The actual mode can be different from the one in the configuration
if the server has not been rebooted or if the mode was changed
after boot using setenforce.

Instead of reading _selinux_path_list we should look at the output of
sestatus or getenforce.

The _selinux_path_list attribute is no longer needed.
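
A minimal sketch of reading the running mode from `getenforce`, assuming the tool prints one of Enforcing, Permissive or Disabled. The function names are hypothetical, and this is not the HostFacts implementation.

```python
import subprocess
from typing import Optional


def parse_getenforce(output: str) -> Optional[str]:
    """Normalize getenforce output to a lowercase mode, or None if unrecognized."""
    mode = output.strip().lower()
    return mode if mode in ('enforcing', 'permissive', 'disabled') else None


def running_selinux_mode() -> Optional[str]:
    """Ask the running kernel (via getenforce) rather than reading config files."""
    try:
        result = subprocess.run(['getenforce'], capture_output=True,
                                text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # Tool missing or failed: the mode cannot be determined this way.
        return None
    return parse_getenforce(result.stdout)
```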

Fixes: https://tracker.ceph.com/issues/51632
Signed-off-by: Javier Cacheiro <javier.cacheiro.lopez@cesga.es>
(cherry picked from commit c3c79fc44c34825384c59cbe962b9153e6b522b0)

mgr/cephadm/iscsi: check if dashboard is enabled

When the mgr dashboard module isn't enabled, the iSCSI service deletion
gets stuck and the cluster state goes to ERR.
The `ceph dashboard` commands aren't available when the mgr dashboard module
isn't enabled.

Closes: https://tracker.ceph.com/issues/51546
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 1b83aea8b0f5156f3e8bf0c3e33853fb4557f888)

doc/cephadm: rewrite troubleshooting 1 of x

This PR improves the readability and format
of the troubleshooting.rst file. This also
makes a change to the markup of one of the
sub-subsections so that it is made of tildes
(~) instead of carets (^), because that's
the RST standard.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit bd24b6d392f4c652d8721eccb14c12197d617ca6)

doc/cephadm: operations: Data location & ...

This (very long) PR does a few things:

- Rewrites the "Data Location" section of the Operations
  docs
- Rewrites the "Health Checks" section of the Operations
  docs
- Adds prompts to commands
- Adds console-output formatting to the places where it
  is appropriate
- Adds several section headers where appropriate, to
  signpost to the reader what is currently under discussion

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit c12af828caf3c5d529e85a3205bdc865d8266fcf)

cephadm: use CephadmContext rather than MagicMock

MagicMock hides attribute errors:

```
ctx = <cephadm.CephadmContext object at 0x7f0a12f58eb0>, container_id = 'container_id', daemon_type = 'node-exporter'

    @staticmethod
    def get_version(ctx, container_id, daemon_type):
        # type: (CephadmContext, str, str) -> str
        """
        :param: daemon_type Either "prometheus", "alertmanager" or "node-exporter"
        """
        assert daemon_type in ('prometheus', 'alertmanager', 'node-exporter')
        cmd = daemon_type.replace('-', '_')
        code = -1
        err = ''
        version = ''
        if daemon_type == 'alertmanager':
            for cmd in ['alertmanager', 'prometheus-alertmanager']:
                _, err, code = call(ctx, [
                    ctx.container_engine.path, 'exec', container_id, cmd,
                    '--version'
                ], verbosity=CallVerbosity.DEBUG)
                if code == 0:
                    break
            cmd = 'alertmanager'  # reset cmd for version extraction
        else:
            _, err, code = call(ctx, [
>               ctx.container_engine.path, 'exec', container_id, cmd, '--version'
            ], verbosity=CallVerbosity.DEBUG)
E           AttributeError: 'NoneType' object has no attribute 'path'
```

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 25d62794fc3cfb6496da12a265c97871a470fa4f)

Conflicts:
src/cephadm/tests/test_cephadm.py

cephadm: ensure sysctl_dir exist

The sysctl directory might not exist if no package that drops a custom
sysctl file is installed on the host.
Create the directory if it doesn't exist.
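
A small sketch of the idea, assuming the settings are written as a plain file under the sysctl directory; `write_sysctl_conf` and its parameters are hypothetical names.

```python
import os
from typing import List


def write_sysctl_conf(sysctl_dir: str, filename: str, lines: List[str]) -> str:
    """Write a sysctl drop-in, creating the directory first if it is missing."""
    # exist_ok=True makes this a no-op when the directory is already there.
    os.makedirs(sysctl_dir, exist_ok=True)
    path = os.path.join(sysctl_dir, filename)
    with open(path, 'w') as f:
        f.write('\n'.join(lines) + '\n')
    return path
```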

Closes: https://tracker.ceph.com/issues/51620
Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
(cherry picked from commit 71ba01f0180fa1ee7ff37e09adc20a5a0f4e896e)

doc/man/8/cephadm: add --log-to-file (and --single-host-defaults)

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 9a49a5819e70eaa348b1d220a32c8f5b13b517b1)

cephadm: add bootstrap --log-to-file option

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 7500b821aec731ac9d6cef17ad9651d5c6143c68)

doc/dev/cephadm: Define variables

Fixes: https://tracker.ceph.com/issues/47142
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 9d30b96f04e85d16452931f315f900af73190968)

doc/cephadm: improve "Potential Problems"

This PR makes some improvements to the "Potential
Problems" section of the "Upgrading Ceph" chapter
of the cephadm documentation.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 0e91091d1cd5ed61c65d0779091ed06cd578925c)

doc/cephadm: improving "Starting the Upgrade"

This PR (slightly) improves the text in the section "Starting
the Upgrade" in the "Upgrading Ceph" chapter of the cephadm
documentation.

This is a very minor update, and does little but bring the sentences
into agreement with many other sentences that I've already written.
This is done to give the reader an almost tabular sense of what to
expect when looking at our docs.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit d66a164ce721aafad26eebbd92029f0f816bcc85)

mgr/cephadm: avoid saving daemons of unknown type

Fixes: https://tracker.ceph.com/issues/51176
Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 8e15ed7263f84ad955c0172d62420dba75e70d4e)

mgr/cephadm: include addr in HOST_CHECK_FAILED alert detail

Signed-off-by: Sage Weil <sage@newdream.net>
(cherry picked from commit 929f2819735aa7799a11d2e60665d2b5173a4a52)

cephadm: workaround unit replace failure

This appears to be a bug in systemd: it fails to clean up cgroups when stopping
the unit. If we then start a new unit with the same name, the 'ExecStartPre'
command fails with status=219/CGROUP (only when the systemd unified cgroup
hierarchy is enabled), because cgroup v2 does not allow processes in non-leaf
groups. This should be fixed by systemd commit
e08dabfec7304dfa0d59997dc4219ffaf22af717.

For now, we just remove these leftover cgroups before starting the new unit.
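
The cleanup can be sketched as a bottom-up directory removal. This is an illustration only: the helper name is hypothetical, and the caller is assumed to pass the unit's cgroup directory (how cephadm derives that path is not shown here).

```python
import os


def remove_leftover_cgroups(unit_cgroup_dir: str) -> int:
    """Remove a unit's leftover cgroup directories, children first.

    Returns the number of directories removed. On cgroupfs, rmdir only
    succeeds for a cgroup with no processes and no child cgroups, so
    anything still in use is left alone.
    """
    if not os.path.isdir(unit_cgroup_dir):
        return 0
    removed = 0
    # topdown=False visits the deepest directories first.
    for dirpath, _dirnames, _filenames in os.walk(unit_cgroup_dir, topdown=False):
        try:
            os.rmdir(dirpath)
            removed += 1
        except OSError:
            # Still busy or not empty: skip it rather than fail the deploy.
            pass
    return removed
```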

Fixes: https://tracker.ceph.com/issues/50998
Signed-off-by: 胡玮文 <huww98@outlook.com>
(cherry picked from commit 1fee255ee4ceab99684c34e3e64532b2eb555a9e)

cephadm: shared folder: Mount the cephadm

When using shared_ceph_folder, also mount `cephadm`

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 3a46caf1214726e957319543a69c32bf055a5136)

cephadm: fix regexp to strip `v1:` or `v2:` prefix from IPv6 addr

regexp was stripping the first hextet of the IPv6 address:

```
FAILED tests/test_cephadm.py::TestBootstrap::test_mon_addrv[[0000:0000:0000:0000:0000:FFFF:C0A8:0101:1234]-list_networks5-None] - cephadm.Error: Cannot infer CIDR network for mon IP `0000:0000:0000:0000:FFFF:C0A8:0101`: pass --skip-mon-network to configure it later
```
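
A sketch of a pattern that strips only the messenger prefix. The helper name is hypothetical, and the over-broad pattern shown for contrast is illustrative, not necessarily the one that was replaced.

```python
import re


def strip_msgr_prefix(addr: str) -> str:
    """Remove a leading 'v1:' or 'v2:' and nothing else."""
    return re.sub(r'^v[12]:', '', addr)


def strip_too_greedy(addr: str) -> str:
    """Illustrative over-broad pattern: any leading word followed by ':'."""
    return re.sub(r'^\w+:', '', addr)
```

Applied to a bare IPv6 address, the over-broad variant removes the first hextet, which matches the failure shown above; the anchored `v[12]:` variant leaves the address intact.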

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 85dfc6f4a93dd238a0d56fa49f1b5bdd4417acb7)

cephadm: add `bootstrap --mon-addrv` test

Signed-off-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit e40b78d432edc2da84958d68cbba708e42c4e91d)

mgr/cephadm: add ceph orch host drain and limit host removal to empty hosts

`ceph orch host drain` removes all daemons from a host so that it can be safely removed.
`ceph orch host rm` will only remove hosts that are safe to remove.

Signed-off-by: Daniel Pivonka <dpivonka@redhat.com>
(cherry picked from commit cb8a612d83a5a67f388fe40fa3d8dcefc5accc00)

doc/cephadm: improve "Ceph Daemon Logs" (1 of x)

This PR turned out to be a 3-in-1:

(1) improves syntax and formatting of "Logging to stdout"
(2) improves syntax and formatting of "Logging to files"
(3) replaces all carets with tildes in 3rd-level section
    headers in operations.rst (./build-doc was crying
    about inconsistency when I fed it tildes, but tildes
    and not carets are the RST standard according to
    https://docutils.sourceforge.io/docs/user/rst/quickstart.html#sections,
    so the carets had to go).

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit 09b1dd0bb57ea14a459bb0bb17ae2258419cf5e9)

doc/cephadm: improve "Canceling an Upgrade"

This PR improves the section "Canceling an Upgrade"
in the "Upgrading Ceph" chapter of the cephadm
documentation.

I removed an extraneous prompt and rewrote a sentence
so that it was congruent with other sentences in similar
places elsewhere in the documentation.

Signed-off-by: Zac Dover <zac.dover@gmail.com>
(cherry picked from commit ee455f00ab26240e47d7e8588615f386c05ed271)

python-common: fix mypy errors

Fixes:

```
py3 run-test: commands[2] | mypy --config-file=../mypy.ini -p ceph
ceph/deployment/service_spec.py: note: In member "yaml_representer" of class "ServiceSpec":
ceph/deployment/service_spec.py:659: error: Argument 1 to "represent_dict" of "SafeRepresenter" has incompatible type "_OrderedDictItemsView[str, Any]"; expected "Mapping[Any, Any]"
```

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 73e7698958d1cfacf1abb35fcd36f5849b55fd15)

mgr/orch: fix mypy errors

Fixes:

```
orchestrator/__init__.py:6: note: In module imported here:
orchestrator/_interface.py: note: In member "yaml_representer" of class "DaemonDescription":
orchestrator/_interface.py:1039: error: Argument 1 to "represent_dict" of "SafeRepresenter" has incompatible type "ItemsView[Any, Any]"; expected "Mapping[Any, Any]"
orchestrator/_interface.py: note: In member "yaml_representer" of class "ServiceDescription":
orchestrator/_interface.py:1178: error: Argument 1 to "represent_dict" of "SafeRepresenter" has incompatible type "ItemsView[Any, Any]"; expected "Mapping[Any, Any]"
orchestrator/_interface.py: note: At top level:
orchestrator/_interface.py:1181: error: Argument 2 to "add_representer" has incompatible type "Callable[[SafeDumper, DaemonDescription], Any]"; expected "Callable[[SafeDumper, ServiceDescription], Node]"
Found 3 errors in 1 file (checked 29 source files)
```

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
(cherry picked from commit 90c9980e8ff2fb975e70c61d5eb7578385876065)

Merge pull request #42606 from s0nea/wip-dashboard-pacific-translations

mgr/dashboard: update translations for pacific

Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>

Merge pull request #42298 from callithea/wip-51636-pacific

pacific: monitoring: fix Physical Device Latency unit

Reviewed-by: Waad Alkhoury <walkhour@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: clwluvw <NOT@FOUND>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>

Merge pull request #42642 from cfsnyder/wip-51696-pacific

pacific: rgw: fail as expected when set/delete-bucket-website attempted on a non-exis…

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42566 from cfsnyder/wip-51514-pacific

pacific: rgw/notifications: support metadata filter in CompleteMultipartUploa…

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #41367 from cbodley/wip-50845

pacific: rgw: deprecate the civetweb frontend

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

Merge pull request #42637 from cfsnyder/wip-51779-pacific

pacific: rgw : add check for tenant provided in RGWCreateRole

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42635 from cfsnyder/wip-50645-pacific

pacific: rgw: allow rgw-orphan-list to process multiple data pools

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42634 from cfsnyder/wip-51014-pacific

pacific: rgw: remove quota soft threshold

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42633 from cfsnyder/wip-50679-pacific

pacific: rgw: fix segfault related to explicit object manifest handling

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42631 from cfsnyder/wip-50465-pacific

pacific: rgw/notifications: delete bucket notification object when empty

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42626 from cfsnyder/wip-52007-pacific

pacific: rgw: avoid occuring radosgw daemon crash when access a conditionally …

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42625 from cfsnyder/wip-51751-pacific

pacific: RGW Zipper - Make sure bucket list progresses

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42638 from cfsnyder/wip-50708-pacific

pacific: rgw: fix bucket object listing when marker matches prefix

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42649 from cfsnyder/wip-50379-pacific

pacific: rgw/amqp/test: fix mock prototype for librabbitmq-0.11.0

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42647 from cfsnyder/wip-49555-pacific

pacific: rgw/notification: add exception handling for persistent notification thread

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42646 from cfsnyder/wip-50463-pacific

pacific: rgw/multisite: return correct error code when op fails

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42645 from cfsnyder/wip-52051-pacific

pacific: rgw: when deleted obj removed in versioned bucket, extra del-marker added

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42644 from cfsnyder/wip-51804-pacific

pacific: rgw/http/notifications: support content type in HTTP POST messages

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42641 from cfsnyder/wip-50367-pacific

pacific: rgw: during reshard lock contention, adjust logging

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42640 from cfsnyder/wip-51801-pacific

pacific: rgw: radosgw-admin errors if marker not specified on data/mdlog trim

Reviewed-by: Adam Emerson <aemerson@redhat.com>

Merge pull request #42639 from cfsnyder/wip-51781-pacific

pacific: rgw : modify error XML for deleterole

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42656 from cfsnyder/wip-51785-pacific

pacific: rgw multisite: metadata sync treats all errors as 'transient' for retry

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42655 from cfsnyder/wip-50803-pacific

pacific: radosgw-admin: skip GC init on read-only admin ops

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42654 from cfsnyder/wip-50731-pacific

pacific: rgw/rgw_file: Fix the return value of read() and readlink()

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42653 from cfsnyder/wip-50728-pacific

pacific: rgw : add check empty for sync url

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42652 from cfsnyder/wip-50711-pacific

pacific: rgw: fix for mfa resync crash when supplied with only one totp_pin.

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42651 from cfsnyder/wip-51771-pacific

pacific: test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42650 from cfsnyder/wip-50642-pacific

pacific: rgw/sts: read_obj_policy() consults iam_user_policies on ENOENT

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #42648 from cfsnyder/wip-50094-pacific

pacific: librgw/notifications: initialize kafka and amqp

Reviewed-by: Casey Bodley <cbodley@redhat.com>

qa/rgw: run multisite tests with metadata sync error injection

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit cff6bf37e65964d2d987231598659f08a84b9410)

rgw: limit concurrency of metadata sync

limit the number of concurrent RGWMetaSyncSingleEntryCRs that each mdlog
shard is allowed to spawn. use META_SYNC_SPAWN_WINDOW=20 to match data-
and bucket sync
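
The windowing idea, independent of RGW's C++ coroutine framework, can be sketched in Python with a semaphore. Only the constant's name and value come from the message above; `sync_entries` and `sync_one` are hypothetical, and this is not how RGWMetaSyncShardCR is implemented.

```python
import asyncio

META_SYNC_SPAWN_WINDOW = 20


async def sync_entries(entries, sync_one, window=META_SYNC_SPAWN_WINDOW):
    """Run sync_one(entry) for every entry, with at most `window` in flight."""
    sem = asyncio.Semaphore(window)

    async def guarded(entry):
        # Each task waits for a free slot before it starts syncing.
        async with sem:
            return await sync_one(entry)

    # gather() preserves the order of the inputs in its result list.
    return await asyncio.gather(*(guarded(e) for e in entries))
```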

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a07add0a5408f0480e93ea5920bc013119ba6401)

rgw: metadata sync treats all errors as 'transient'

collect_children() had a special case for EAGAIN that it treated as
a 'transient' error, which set the can_adjust_marker = false to bail out
of RGWMetaSyncShardCR and retry from the previous marker

but the http client doesn't return EAGAIN - rgw_http_error_to_errno()
defaults to EIO - so this retry logic based on can_adjust_marker never
runs. on any other error, RGWMetaSyncSingleEntryCR would not call
marker_tracker->finish() to advance the sync status marker, and
RGWMetaSyncShardCR would continue on with full- or incremental sync
without ever attempting to retry the failed entries

a detailed comment in collect_children() describes a different strategy
for handling 'permanent' errors, but that was never fully elaborated.
i also don't think there's a reasonable way to differentiate between
transient and permanent errors, so this treats all errors as transient
to be retried

if an error really is permanent for a given metadata key, metadata sync
will get stuck there and require manual intervention

Fixes: https://tracker.ceph.com/issues/39657
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 866d66b8749b28ec626a8d0adba3d14fdd8abead)

radosgw-admin: skip GC init on read-only admin ops

Fixes: https://tracker.ceph.com/issues/50520
Signed-off-by: Mark Kogan <mkogan@redhat.com>
(cherry picked from commit 9ac1991fc798af7e0ba0fac18209b71b5ae3f02b)

Conflicts:
src/rgw/rgw_admin.cc
src/rgw/rgw_rados.h
src/rgw/rgw_sal.cc
src/rgw/rgw_sal.h

Cherry-pick notes:
- Conflicts due to the move from RGWStoreManager in rgw_sal_rados.h to StoreManager in rgw_sal.h

rgw/rgw_file: Fix the return value of read() and readlink()

Fixes: https://tracker.ceph.com/issues/49189
Signed-off-by: Dai zhiwei <daizhiwei3@huawei.com>
Signed-off-by: luo rixin <luorixin@huawei.com>
(cherry picked from commit bfd83e8fa142873a0bdf09a4d1ad1b04127f5885)

rgw : add check empty for sync url

Fixes: https://tracker.ceph.com/issues/50103
Signed-off-by: caolei <halei15848934852@163.com>
(cherry picked from commit 3a4e0b79310b21eeee37043d5419887bb41c0cf6)

rgw: fix for mfa resync crash when supplied with only one totp_pin.

The fix returns an appropriate error message.

Fixes: https://tracker.ceph.com/issues/50394

Signed-off-by: Pritha Srivastava <prsrivas@redhat.com>
(cherry picked from commit 1b943419c45a9b17be71668bd023aeb814171b2c)

test/rgw: use spawn library for test_rgw_dmclock_scheduler

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit a8e3589a2c875b6fadc853c75f20cb9256f294ca)

test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler

the AsyncScheduler uses an asio timer to dispatch work to its executor
with an optional delay. when no delay is requested, it waits on the
timer with an expiration time in the past (crimson::dmclock::TimeZero)

tests are failing here because poll() is returning without executing the
handlers of those expired timers

asio implements these timers with timerfd and epoll. debugging with
strace, i see that these timers armed with timerfd_settime() are not
always immediately ready according to epoll_wait():

  eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK)   = 3
  epoll_create1(EPOLL_CLOEXEC)            = 4
  timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
  epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=14164052, u64=14164052}}) = 0
  epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=14164064, u64=14164064}}) = 0
  timerfd_settime(5, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}}) = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164052, u64=14164052}}], 128, 0) = 1
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [], 128, 0)               = 0
  epoll_wait(4, [{events=EPOLLIN, data={u32=14164064, u64=14164064}}], 128, 0) = 1

in this example, it took 6 calls to context.poll() before it was ready
to execute the timer's handler

to work around this, replace calls to context.poll() with calls to
context.run_for() with a very short duration

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 21baed999e31c5e69c75f0cbb8757ef91585d917)

rgw: read_obj_policy() consults iam_user_policies on ENOENT

when the head object doesn't exist, read_obj_policy() has to decide
whether to return ENOENT or EACCES

when there's a bucket policy, we check whether it has s3ListBucket
permissions. when there's an assumed role, we also need to check
against the role's policies in s->iam_user_policies
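
The decision can be sketched as follows. This is a heavily simplified illustration, not RGW's logic: policies are modeled as callables that answer whether an action is allowed, explicit denies and ACLs are ignored, and all names are hypothetical.

```python
import errno
from typing import Callable, Iterable, Optional

# Assumption: a policy is modeled as "action name -> allowed?".
Policy = Callable[[str], bool]


def errno_for_missing_head(bucket_policy: Optional[Policy],
                           role_policies: Iterable[Policy] = ()) -> int:
    """Pick the error for a missing head object.

    If any applicable policy lets the caller list the bucket, admitting
    that the object does not exist (ENOENT) leaks nothing new; otherwise
    answer EACCES.
    """
    candidates = list(role_policies)
    if bucket_policy is not None:
        candidates.append(bucket_policy)
    if any(policy('s3:ListBucket') for policy in candidates):
        return -errno.ENOENT
    return -errno.EACCES
```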

Fixes: https://tracker.ceph.com/issues/49780
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 5dc9375fa1888242f388f8b502f445f3ddc891f7)

rgw/amqp/test: fix mock prototype for librabbitmq-0.11.0

also use extern "C" to get compilation errors when a
function prototype changes

Fixes: https://tracker.ceph.com/issues/50291
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit 2ba598ec4c294bd09d2df18ccd2096382e303d39)

librgw/notifications: initialize kafka and amqp

Fixes: https://tracker.ceph.com/issues/49738
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit 178f6bdac97b57300bbe0956633cf686a7e3ccee)

rgw/notification: add exception handling for persistent notification thread

Fixes: https://tracker.ceph.com/issues/49322
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit 915963ecb9effcb1f2d38f444c1bb9307f8ffbe1)

Conflicts:
src/rgw/rgw_notify.cc

rgw/multisite: return correct error code when op fails

when trying to disable/enable sync on a non-master zone

Fixes: https://tracker.ceph.com/issues/50201
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit 83e89dfa3358fe91597d6714483f96b21a234ae6)

rgw: when deleted obj removed in versioned bucket, extra del-marker added

After initial checks are complete, this will read the OLH earlier than
previously to check the delete-marker flag and under the bug's
conditions will return -ENOENT rather than create a spurious delete
marker.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 69d7589fb1305b7d202ffd126c3c835e7cd0dda3)

rgw/http/notifications: support content type in HTTP POST messages

Fixes: https://tracker.ceph.com/issues/51530
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
(cherry picked from commit 6a1688b57d1e329cecd0c26c494cb08c9a4c3079)

Merge pull request #42629 from rhcs-dashboard/wip-52049-pacific

pacific: mgr/dashboard: show perf. counters for rgw svc. on Cluster > Hosts

Reviewed-by: Aashish Sharma <aasharma@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

Merge pull request #42628 from rhcs-dashboard/wip-52048-pacific

pacific: mgr/dashboard: fix ssl cert validation for rgw service creation

Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>

rgw: fail as expected when set/delete-bucket-website attempted on a non-existent bucket, rgw should return HTTP 404 and NoSuchBucket.

Fixes: https://tracker.ceph.com/issues/51536
Signed-off-by: xiangrui meng <mengxr@chinatelecom.cn>
(cherry picked from commit c623aa45d35b269c6701a57e44ac05bb29a79dc8)

rgw: during reshard lock contention, adjust logging

When RGW fails to get a lock on a reshard log, we log it in such a way
that it looks like an error. Instead we'll make sure that the log
message is informational.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 6d3dee37791ad427a3435c493a1d7874ba075674)

rgw: remove quota soft threshold

Remove quota soft threshold, which causes expensive checks for sharded buckets

Fixes: 14eabd4aa7b8a2e2c0c43fe7f877ed2171277526
Signed-off-by: Zulai Wang <wangzl31@outlook.com>
(cherry picked from commit 32a39705765af0f87bec9101e5d337b797e05fea)

Conflicts:
src/common/options/rgw.yaml.in

Cherry-pick notes:
- Options defined in src/common/options.cc in Pacific vs src/common/options/rgw.yaml.in

rgw: radosgw-admin errors if marker not specified on data/mdlog trim

Check that a marker was specified and trim if we don't have one.

Also: In a world where we're parsing for generation, it doesn't really
make sense to have a 'no marker specified' as separate from a marker
that is just an empty string.

Also: Successful datalog trim returns zero, not
-ENODATA, and radosgw-admin should expect this.

Fixes: https://tracker.ceph.com/issues/51712
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 4cb6fcf7f4e9548a8a0dd017b49a6ea23bedffd6)

Conflicts:
src/rgw/rgw_admin.cc

Cherry-pick notes:
- static_cast for RadosStore not needed in Pacific

rgw : modify error XML for deleterole

Fixes: https://tracker.ceph.com/issues/51157
Signed-off-by: caolei <halei15848934852@163.com>
(cherry picked from commit c7ab6579c7655352d08a4c12fc3a6951217dbe6f)

rgw: fix bucket object listing when marker matches prefix

When an initial marker that ends with a delimiter is provided, it
prevents listing of that "subdirectory" due to new logic at the cls
level to make listing more efficient. The fix catches that situation.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
(cherry picked from commit 220ef4b22d1d1667eb4f2c300a0b788e87b9067d)