Kefu Chai [Sun, 8 Mar 2020 06:00:53 +0000 (14:00 +0800)]
qa/tasks/ceph_manager: use StringIO for capturing COT output
there are couple factors we should consider when choosing between
BytesIO and StringIO:
- if the producer is producing binary
- if we are expecting binary
- if the layers in between them are doing the decoding/encoding
automatically.
in our case, the producer is either the ChannelFile instances returned
by paramiko.SSHClient or subprocess.CompletedProcess insances returned
by subprocess.run(). the former are file-like objects opened in "r" mode,
but their contents are decoded with utf-8 when reading if
ChannelFile.FLAG_BINARY is not specified. that's why we always try to
add this flag in orchestra/run.py when collecting the stdout and stderr
from paramiko.SSHClient after executing a command.
back in python2, this works just fine. as we don't differentiate bytes
from str by then.
but in python3, we have to make a decision. in the case of
ceph-objectstore-tool (COT for short), it does not produce binary and
we don't check its output with binary, so, if neither Remote.run() nor
LocalRemote.run() decodes/encodes for us, it's fine.
so it boils down to `copy_to_log()`:
i think we we should respect the consumer's expectation, and only decode
the output if a StringIO is passed in as stdout or stderr.
as we always log the output with logging we could either set
`ChannelFile.FLAG_BINARY` depending on the type of `capture` or not.
if it's not set, paramiko will return str (bytes) on python2, and str on
python3. if it's not set paramiko will return str (bytes) on python2,
and bytes on python3.
if there is non-ASCII in the output, logging will bail fail with
`UnicodeDecodeError` exception. and paramiko throws the same exception
when trying to decode for us if `ChannelFile.FLAG_BINARY` is not
specified.
so to ensure that we always have logging messages no matter if the
producer follows the rule of "use StringIO if you only emit text" or
not, we have to use `ChannelFile.FLAG_BINARY`, and force paramiko
to send us the bytes. but we still have the luxury to use StringIO
and do the decode when the caller asks for str explicitly. that'd save
the pain of using `str.decode()` or `six.ensure_str()` everywhere
even if we can assure that the program does not write binary.
rbd: make common options override krbd-specific options
ceph-csi has added support for passing custom map and unmap options via
mapOptions and unmapOptions storage class parameters. However, it also
uses --read-only for implementing ROX (ReadOnlyMany) PVs. If the user
supplies "mapOptions: rw", they will get around the intended read-only
restriction (at least on the block device).
ceph-csi could be patched to use "-o ro", but it actually makes sense
for common options to win over device type-specific equivalents.
Conflicts:
src/tools/rbd/action/Kernel.cc [ snapshot quiesce support and
commit 34f539d8af33 ("rbd: delay parsing of default kernel map
options") not in octopus ]
Alfonso Martínez [Fri, 18 Sep 2020 15:16:34 +0000 (17:16 +0200)]
mgr/dashboard: fix performance issue when listing large amounts of buckets
Fixes: https://tracker.ceph.com/issues/47543 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 924368e1d0aebcb0d8f9747589d9048414d33080)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-details/rgw-bucket-details.component.spec.ts
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-details/rgw-bucket-details.component.ts
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.ts
- Adapted changes in these files to octopus code.
Matthew Oliver [Thu, 9 Jul 2020 06:13:05 +0000 (06:13 +0000)]
rgw: Swift API anonymous access should 401
There was a previous patch to fix this but turns out that only fixed it
for the Swift V1 auth. And it actaully broke keystone because it didn't
take into account the idiosyncrasies of multi tenancy. Which resulted in
the incorect behaviour for keystone. Worse, because it didn't take
tenants properly into account keystone ACLs where broken.
This patch reworks, and simplifies the original patch to work for both
auths. It even extends the ThirdPartyAccountApplier to check for an ANON
user and properly scope it to a tenant.
Fixes: https://tracker.ceph.com/issues/46295 Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit 67081098dc2dddd80d52d5acd166e68954cae618)
Casey Bodley [Mon, 31 Aug 2020 15:19:34 +0000 (11:19 -0400)]
radosgw-admin: period pull command is not always a raw_storage_op
if a --url is given, 'period pull' does not depend on any zone/period
configuration and can be a raw_storage_op. if we get a --remote instead,
we do need to initialize the zone/period configuration to find the
correct endpoint/access keys
mgr/dashboard: Fix many-to-many issue in host-details dashboard
The labels on one side do not match the labels of the other side, where
a label_replace is used. The fix uses the same label_replace on the
missing side.
Fixes: https://tracker.ceph.com/issues/47334 Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
(cherry picked from commit fe64b9d1763ec9dbe78fe73c403929524ab4e253)
`ceph-volume simple activate --all` relies on the presence of json files
in `/etc/ceph/osd` that was created with `ceph-volume simple scan`
command.
In a cluster lifecycle, it is very likely an OSD which was deployed with
ceph-disk at some point gets removed or replaced. It means the corresponding
json file in `/etc/ceph/osd` becomes unrelevant. It makes `ceph-volume
simple activate --all` fails because it tries to mount non existing
partitions.
The idea here is to simply warn the user that the osd described in the
json file doesn't exist anymore and exit properly instead of throwing an
error.
Patrick Donnelly [Wed, 16 Sep 2020 19:28:55 +0000 (12:28 -0700)]
mon: allow overriding the initial mon_host
This overrides what the CephContext believes to be the current quorum of
monitors (retrieved from other instances of the MonClient), introduced
by [1]. Tests need to be able to target a specific monitor for
exercising forwarding and other things.
mon: store mon updates in ceph context for future MonMap instantiation
MonMap builds initial mon list using provided sources, like
mon-host or monmap.
For future instantiations of MonClient, if mon addresses are
updated, stale information from the provided sources are used.
This commit retains mon updates that are processed by the
MonClient in CephContext, for use in MonMap instantiations
and hence uses updated information as required.
This is helpful in cases where librados or libcephfs
instantiate MonClient in the ceph-mgr deamon as required.
Jason Dillaman [Wed, 5 Aug 2020 16:36:26 +0000 (12:36 -0400)]
test/rbd-mirror: pool watcher registration error might result in race
The init finish context should be swapped out before it attempts to
re-register the watcher. This affects the test case which mocks the
timer to fire immediately instead of after 30 seconds.
Fixes: https://tracker.ceph.com/issues/46669 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c89d31ebf6c412d609123979c63ebc600b70e179)
RTD does not support installing system packages, the only ways to install
dependencies are setuptools and pip. while ditaa is a tool written in
Java. so we need to find a native python tool allowing us to render ditaa
images. plantweb is able to the web service for rendering the ditaa
diagram. so let's use it as a fallback if "ditaa" is not around.
also start a new line after the directive, otherwise planweb server will
return 500 at seeing the diagram.
doc/conf.py: exclude pybindings docs from build for RTD
because it'd difficult to prepare (dummy) librados,libcephfs and librbd for
their python bindings in the building environment offered by Read the Docs.
mgr/dashboard: increase Grafana iframe height to avoid scroll bar
Add Grafana component style four with height 1160px
Fixes : https://tracker.ceph.com/issues/44966 Signed-off-by: Ngwa Sedrick Meh <nsedrick101@gmail.com>
(cherry picked from commit 550762ed872871ca63c44c68fca66ea74bea0cc4)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/hosts/host-details/host-details.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-list/osd-list.component.html
src/pybind/mgr/dashboard/frontend/src/app/ceph/pool/pool-details/pool-details.component.html
- We are still using the tabset style in octopus, while we use the ngbNav directive in master;
Also there are some more tabs in master vs. octopus which are not related to this PR and thus
are not included
Jason Dillaman [Wed, 5 Aug 2020 13:12:41 +0000 (09:12 -0400)]
librbd: migration abort should revert data back to the original image
If the migration destination image was modified and then the migration
was aborted, we need to copy the data back to the source image to avoid
losing data. For simplicity we will only revert the HEAD revision state
and will not attempt to copy new snapshots on the destination image
back to the source.
Fixes: https://tracker.ceph.com/issues/41394 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 5bd15da8be09a4e7644d411a0b0c132e5b795393)
We want to prevent the destination image from being used while an
abort is in-progress. Test that the image has no watchers prior to
permitting the abort, switch the migration state to ABORTING, and
treat the image as read-only if the migration state is ABORTING.
mgr/dashboard: Fix bugs in a unit test and i18n translation
* SKIP: Unknown argument type: 'BinaryExpression' NodeObject { ... text: 'You can activate the module on the Telemetry configuration '
* WARNING: /ceph/src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-list/rgw-bucket-list.component.spec.ts:39:32 - get is deprecated: from v9.0.0 use TestBed.inject
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/shared/components/telemetry-notification/telemetry-notification.component.ts
Replaced `TestBed.inject` by `TestBed.get` as Angular 9 has not been backported.
Ilya Dryomov [Sat, 29 Aug 2020 10:02:30 +0000 (12:02 +0200)]
msg/async/ProtocolV2: allow rxbuf/txbuf get bigger in testing
We have a kernel client test case that constructs huge auth tickets
to exercise the three related code paths in the kernel. One of the
tickets is bigger than 1000000 bytes, as required for triggering the
third code path.
We haven't bumped into this assert earlier because the kernel client
is still on msgr v1. However, "rbd map" and "rbd unmap" commands
started connecting to the cluster in commit 96f05a7956b3 ("rbd: delay
determination of default pool name") and that happens via msgr v2.
mgr/cephadm: Add comments to secondary contaieners
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com> Co-authored-by: Michael Fritch <mfritch@suse.com>
(cherry picked from commit 98c0119833b9fc9648d667f002d84b1ccb75f334)
Matthew Oliver [Mon, 17 Aug 2020 01:08:56 +0000 (11:08 +1000)]
cephadm: auto wrap and unwrap ipv6 addresses
This patch attempts to simplify IPv6 support in cephadm by automatically
wrapping and unwrapping IPv6 addresses when required.
There are some asumptions though, if you are supplyings an IPv6 addrv
then it needs to be wrapped. But because you are specifiying, you should
know what your doing.
But in general, it means in bootstrap you should be able to supply ipv6
addresses wrapped or not so long as there isn't a post appended.
Fixes: https://tracker.ceph.com/issues/46922 Signed-off-by: Matthew Oliver <moliver@suse.com>
(cherry picked from commit 09eac4bef0f04f5db7118f94dd9679f3295bddf8)
Adam King [Thu, 27 Aug 2020 16:22:49 +0000 (12:22 -0400)]
mgr/cephadm: Verify non-empty list in get_active_daemon functions
The get_active_daemon functions for monitoring stack daemons
were just returning the first or last daemon in the given list
without checking the list actually contained any daemons