doc/start: edit first 50 lines of documenting-ceph
Edit the first 150 lines of doc/start/documenting-ceph.rst. This is part
of an initiative to harvest the fruits of Cephalocon 2023, at which
documentation proved to be in demand to a surprising degree.
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit dd37f94aa4f1de947b1eaf5d82cc529925f5823e)
Conrad Hoffmann [Wed, 22 Mar 2023 22:03:57 +0000 (23:03 +0100)]
doc: account for PG autoscaling being the default
The current documentation tries really hard to convince people to set
both `osd_pool_default_pg_num` and `osd_pool_default_pgp_num` in their
configs, but at least the latter has undesirable side effects on any
Ceph version that has PG autoscaling enabled by default (at least quincy
and beyond).
Assume a cluster with defaults of `64` for `pg_num` and `pgp_num`.
Starting `radosgw` will fail as it tries to create various pools without
providing values for `pg_num` or `pgp_num`. This triggers the following
in `OSDMonitor::prepare_new_pool()`:
- `pg_num` is set to `1`, because autoscaling is enabled
- `pgp_num` is set to `osd pool default pgp_num`, which we set to `64`
- This is an invalid setup, so the pool creation fails
Likewise, `ceph osd pool create mypool` (without providing values for
`pg_num` or `pgp_num`) does not work.
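The failure described above can be modelled in a few lines of Python. This is an illustrative sketch only, not the actual `OSDMonitor::prepare_new_pool()` code: the helper name `resolve_pool_params`, its parameters, the "follow `pg_num` when no default is configured" fallback, and the `pgp_num <= pg_num` validity check are all assumptions made for the example.

```python
from typing import Optional, Tuple


def resolve_pool_params(
    pg_num: Optional[int],
    pgp_num: Optional[int],
    autoscale_enabled: bool,
    default_pg_num: int,
    default_pgp_num: Optional[int],
) -> Tuple[int, int]:
    """Hypothetical model of how missing pg_num/pgp_num values get filled in."""
    if pg_num is None:
        # With autoscaling enabled the pool starts with a single PG.
        pg_num = 1 if autoscale_enabled else default_pg_num
    if pgp_num is None:
        # Assumption: with no configured default, pgp_num follows pg_num.
        pgp_num = default_pgp_num if default_pgp_num is not None else pg_num
    if pgp_num > pg_num:
        # Assumed validity rule: pgp_num may not exceed pg_num.
        raise ValueError(f"pgp_num {pgp_num} exceeds pg_num {pg_num}")
    return pg_num, pgp_num


# No pgp_num default in the config: pool creation succeeds.
print(resolve_pool_params(None, None, True, 64, None))

# osd_pool_default_pgp_num = 64 in the config: pool creation is rejected.
try:
    resolve_pool_params(None, None, True, 64, 64)
except ValueError as err:
    print("rejected:", err)
```

In this model the second call fails for the same reason as the `radosgw` pool creation above: `pg_num` ends up as `1` while `pgp_num` is taken from the configured default of `64`.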
Following this rationale:
- Not providing a default value for `pgp_num` will always do the right
thing, unless you use advanced features, in which case you can be
expected to set both values on pool creation
- Setting `osd_pool_default_pgp_num` in your config breaks pool creation
for various cases
This commit:
- Removes `osd_pool_default_pgp_num` from all example configs
- Adds mentions of the autoscaling and how it interacts with the default
values in various places
For each file that was touched, the following maintenance was also
performed:
- Change internal spaces to underscores for config values
- Remove mentions of filestore or any of its settings
- Fix minor inconsistencies, like indentation etc.
There is also a ticket which I think is very relevant and fixed by this,
though it only captures part of the broader issue addressed here:
qa/suites/rbd: install qemu-utils in addition to qemu-block-extra on Ubuntu
qemu-utils is usually pre-installed but, due to what appears to be
an Ubuntu packaging bug, it's not upgraded when qemu-block-extra is
installed:
    The following NEW packages will be installed:
      qemu-block-extra
    The following packages will be upgraded:
      qemu-system-common qemu-system-data qemu-system-gui qemu-system-x86
However, the version of the block driver must match exactly the version
of the qemu-img tool, so the above leads to:
    $ qemu-img convert -f qcow2 -O raw /home/ubuntu/cephtest/qemu/base.client.0.0.qcow2 rbd:rbd/client.0.0
    Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so
    Note: only modules from the same build can be loaded.
    qemu: module block-block-rbd not found, do you want to install qemu-block-extra package?
    qemu-img: Unknown protocol 'rbd'
Matt Benjamin [Thu, 15 Dec 2022 19:55:16 +0000 (14:55 -0500)]
rgw/notifications: fetch object state to get size, in rgw_lc.cc
Failure to call get_obj_state() leaves object size and other members
uninitialized, and appears to result in lc delete notifications
with 0 for object size.
Fixes: https://tracker.ceph.com/issues/58287
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit b20a66767f782c06258fb0a5551ee45d6dccb91c)
Vedansh Bhartia [Thu, 2 Mar 2023 13:04:53 +0000 (18:34 +0530)]
rgw: use unique_ptr for flat_map emplace in BucketTrimWatcher
When emplacing objects into the trim notify handler of
BucketTrimWatcher, use a unique_ptr for the handler so that it is
destroyed if the emplace fails.
Though the destructor is already called, this behaviour cannot be relied
upon. std::map does not exhibit the same behaviour, and would have
leaked memory had it been used instead.
Matt Benjamin [Sat, 11 Mar 2023 19:58:54 +0000 (14:58 -0500)]
Do not duplicate query-string in ops-log
Fixes: https://tracker.ceph.com/issues/59059
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit 3f2313f0e67c444407139c80dff596c5d5b5903e)
Yuval Lifshitz [Sun, 26 Mar 2023 10:02:17 +0000 (10:02 +0000)]
rgw/notifications: support bucket notification with bucket policy
following policy should be used to allow any user to get, put and delete
bucket notification on a bucket called "my-bucket":
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetBucketNotification", "s3:PutBucketNotification"],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}
note that notification deletion uses the "PUT" permission.
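As a rough illustration, the policy above can be built and serialized from Python with only the standard library. The `REQUIRED_ACTION` table and the `is_allowed` helper are hypothetical names introduced for this sketch; they only encode the note that deletion is covered by the "PUT" permission and do not reflect how rgw evaluates policies internally.

```python
import json

BUCKET = "my-bucket"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetBucketNotification", "s3:PutBucketNotification"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        }
    ],
}

# Hypothetical lookup: the policy action each notification operation needs.
REQUIRED_ACTION = {
    "get": "s3:GetBucketNotification",
    "put": "s3:PutBucketNotification",
    "delete": "s3:PutBucketNotification",  # deletion uses the PUT permission
}


def is_allowed(operation: str) -> bool:
    """Check an operation against the single Allow statement above."""
    allowed_actions = policy["Statement"][0]["Action"]
    return REQUIRED_ACTION[operation] in allowed_actions


# The JSON string is what would be sent as the bucket policy document.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
print(all(is_allowed(op) for op in ("get", "put", "delete")))
```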
Tongliang Deng [Tue, 13 Dec 2022 06:42:34 +0000 (06:42 +0000)]
rgw/sse-s3: fix bucket encryption of multipart upload
Multipart uploads were not encrypted when a bucket encryption policy
was set. Fix it by fetching the bucket encryption policy and resolving
defaults at the multipart init op.
Fixes: https://tracker.ceph.com/issues/59218
Signed-off-by: Tongliang Deng <dengtongliang@gmail.com>
(cherry picked from commit 6d9e4f7924c6149d23919ef82bc09406e1290164)
Marcus Watts [Fri, 8 May 2020 05:41:35 +0000 (01:41 -0400)]
rgw/civetweb: handle old clients with transfer-encoding: chunked.
s3 clients *should* provide an x-amz-decoded-content-length field
when they use transfer-encoding: chunked. Some clients do not.
With swift we already allow chunked uploads that do not specify the
content length in advance. This commit adds similar support
for s3. Known client affected by this: boto2.
lichaochao [Tue, 28 Mar 2023 03:17:26 +0000 (05:17 +0200)]
rgw: fix rgw cache invalidation after unregister_watch() error
When a metadata osd fails, an unregister_watch() error may occur,
resulting in an rgw cache invalidation.
Fix this by adding an unregister_done flag and performing reinit()
again when a register_watch() error occurs. After the first reinit()
failure, register_watch() will be performed again.
Fixes: https://tracker.ceph.com/issues/59217
Signed-off-by: lichaochao <lichaochao2_yewu@cmss.chinamobile.com>
(cherry picked from commit f9aae71af3ad8eee5996c31544d98041968dbbec)
Casey Bodley [Thu, 23 Mar 2023 19:02:51 +0000 (15:02 -0400)]
rgw: RGWCopyObj loads src_bucket in init_processing()
if `RGWCopyObj::verify_permissions()` returns an error, it may leave
some zipper objects uninitialized. when the user has admin or system
privileges, we'll ignore that error and call `execute()` anyway. this
moves the initialization into `RGWCopyObj::init_processing()` instead
Casey Bodley [Thu, 23 Mar 2023 18:49:09 +0000 (14:49 -0400)]
rgw: don't reuse s->bucket_instance_id for src_bucket
the bucket_instance_id gets parsed from the "rgwx-bucket-instance"
header, and corresponds to the destination bucket. don't reuse it
for the src_bucket, whose name either comes from the s3 header
"x-amz-copy-source" or the swift header "x-copy-from"
Casey Bodley [Thu, 23 Mar 2023 18:01:51 +0000 (14:01 -0400)]
rgw: don't duplicate sal handles in RGWCopyObj
the CopyObject request is issued against the destination bucket/object,
so refer directly to s->bucket and s->object instead of creating
separate dst_bucket and dst_object handles
```
error: /var/cache/dnf/baseos-00fe51d07def85f0/packages/kernel-core-4.18.0-483.el8.x86_64.rpm: signature hdr data: BAD, no. of bytes(459772) out of range
```
ceph-volume tests are failing, OSDs never get up and running.
For some reason, updating the OS early in the testing workflow
addresses that issue in the CI.
Remove confusing parentheses and add a clearer (as compared to the
parentheses) hyphen (actually an em-dash, or at least it is intended
to be an em-dash) to doc/rados/operations/monitoring-osd-pg.rst
Co-authored-by: Anthony D'Atri <anthony.datri@gmail.com>
Signed-off-by: Zac Dover <zac.dover@proton.me>
(cherry picked from commit 0c965c18d0e6ab1461b5fad42d481f25e4207940)
Ilya Dryomov [Tue, 28 Mar 2023 18:03:05 +0000 (20:03 +0200)]
librbd: avoid generating ESHUTDOWN in ManagedLock
EBLOCKLISTED has a very special meaning but happens to be an alias for
ESHUTDOWN. If the client gets blocklisted, we always want to propagate
EBLOCKLISTED error code since it's generated by the OSD.
For ManagedLock use case of indicating that an operation on the lock
raced with lock shut down, meaning that a higher level request can just
be restarted, ERESTART should do.
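A small Python sketch of why the aliasing matters. The constants below are defined locally for illustration, with the numeric values usually found on Linux (an assumption of this sketch), and `classify` is a hypothetical caller-side helper, not librbd code.

```python
# Illustrative constants only; the values are the usual Linux errno numbers.
ESHUTDOWN = 108
EBLOCKLISTED = ESHUTDOWN  # alias, as described above
ERESTART = 85


def classify(result: int) -> str:
    """Hypothetical handling of a negative-errno style result code."""
    if result == -EBLOCKLISTED:
        return "blocklisted"  # treated as coming from the OSD
    if result == -ERESTART:
        return "retry"  # operation raced with lock shut down
    return "ok" if result >= 0 else "error"


# Because of the alias, a lock-shutdown race reported as ESHUTDOWN would be
# indistinguishable from a real blocklisting.
print(classify(-ESHUTDOWN))
# Reporting the race as ERESTART keeps the two cases apart.
print(classify(-ERESTART))
```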
Ilya Dryomov [Tue, 28 Mar 2023 17:52:42 +0000 (19:52 +0200)]
librbd: fix recursive locking on owner_lock in ImageDispatch
needs_exclusive_lock() calls acquire_lock() with owner_lock held.
If lock acquisition races with lock shut down, ManagedLock completes
ImageDispatch context directly and dispatch is retried immediately on
the same thread (due to DISPATCH_RESULT_RESTART). This results in
recursion into needs_exclusive_lock() and, barring locking issues, can
lead to unbounded stack growth if lock shut down takes its time.
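The stack growth described above can be illustrated with a toy model in Python. Both functions are hypothetical and do not mirror ImageDispatch: the first retries directly from the completion, so every retry adds a frame, while the second re-queues the retry, which is a generic way to keep the depth constant and not necessarily what this commit does.

```python
from collections import deque


def dispatch_recursive(retries_left: int, depth: int = 1) -> int:
    """Retry immediately from the completion: one extra frame per retry."""
    if retries_left == 0:
        return depth
    return dispatch_recursive(retries_left - 1, depth + 1)


def dispatch_queued(retries_left: int) -> int:
    """Retry by re-queueing: every attempt starts from the loop again."""
    queue = deque([retries_left])
    max_depth = 0
    while queue:
        remaining = queue.popleft()
        max_depth = max(max_depth, 1)  # each attempt runs one frame deep
        if remaining > 0:
            queue.append(remaining - 1)
    return max_depth


# Depth reached after 50 retries in each model.
print(dispatch_recursive(50))
print(dispatch_queued(50))
```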
During send_acquire_lock, there's a case where
there's no watcher handle present and lock request is delayed.
If the client is blocklisted, the delayed request will not
continue and the call that requested lock will never complete.
The lock process will now propagate -EBLOCKLISTED to the callback
instead of delaying indefinitely.
Fixes: https://tracker.ceph.com/issues/59115
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
(cherry picked from commit 6a0aeadc31ab1942c42c6e466183148f1d3752be)
Ilya Dryomov [Thu, 30 Mar 2023 11:58:20 +0000 (13:58 +0200)]
librbd: clear Image::list_watchers() list before populating it
The "append to the passed list" behavior is confusing and not what the
corresponding C API (rbd_watchers_list) or other similar C++ APIs (e.g.
list_lockers) do.
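The difference between the two behaviours, sketched in Python with an explicit output list. The function names and the string watcher entries are made up for this example and are not part of the librbd API.

```python
from typing import List


def list_watchers_append(current: List[str], out: List[str]) -> None:
    """Old behaviour as described: results are appended to whatever is in out."""
    out.extend(current)


def list_watchers_clear(current: List[str], out: List[str]) -> None:
    """New behaviour: out is cleared first and holds only current watchers."""
    out.clear()
    out.extend(current)


# A list reused from an earlier call still contains a stale entry.
reused = ["watcher-a"]
list_watchers_append(["watcher-b"], reused)
print(reused)  # ['watcher-a', 'watcher-b']

reused = ["watcher-a"]
list_watchers_clear(["watcher-b"], reused)
print(reused)  # ['watcher-b']
```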
Dongsheng Yang [Wed, 15 Mar 2023 06:54:39 +0000 (06:54 +0000)]
librbd: fix wrong attribute for rbd_quiesce_complete api
When using the rbd_quiesce_complete api, we got an error:
/usr/bin/ld: undefined reference to `rbd_quiesce_complete'
Then we found the problem is the symbol of rbd_quiesce_complete
in librbd.so is LOCAL. After some investigation, we found
the attribute of rbd_quiesce_complete api is CEPH_RADOS_API
rather than expected CEPH_RBD_API.
Fixes: https://tracker.ceph.com/issues/59208
Signed-off-by: Dongsheng Yang <dongsheng.yang.linux@gmail.com>
(cherry picked from commit 51a2b707a3074e000b310fc20901d5038b15ea0c)