Casey Bodley [Thu, 22 Feb 2024 21:54:54 +0000 (16:54 -0500)]
rgw/aio: avoid infinite recursion in aio_abstract()
a recent regression from 320a2179a3c6c1981a0fd2494938515997c1bfad causes
aio_abstract() to recurse when given an empty optional_yield. this is
exposed by the librgw_file tests
common/tracer: fix decoding when jaeger tracing is disabled
We aren't currently using jaeger tracing on Windows. The issue is
that Windows hosts (or any other host that doesn't use jaeger)
are experiencing message decoding failures after a recent change [1].
This change updates the tracer encoding so that messages from
non-jaeger hosts may be decoded by services that use jaeger.
[1] https://github.com/ceph/ceph/pull/47457
Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com>
This commit reintroduces 3701ffa6733b001d4278a0b68395c5efe2382f25, which
was reverted due to an implicit dependency on another revert. Please
see https://github.com/ceph/ceph/pull/52114#issuecomment-1950288188.
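
The compatibility idea described above can be sketched roughly as follows. This is only an illustration, not Ceph's wire format: the field sizes, the length prefix, and the names encode_span_context/decode_span_context are assumptions made for the example. The point is that a build without tracing still writes a well-formed "invalid context" record, so a tracing-enabled peer can decode it.

```python
import struct

# Assumed sizes, for illustration only.
TRACE_ID_LEN = 16
SPAN_ID_LEN = 8
_FULL = f"<B{TRACE_ID_LEN}s{SPAN_ID_LEN}sB"


def encode_span_context(ctx=None):
    """Encode an optional (trace_id, span_id, flags) tuple.

    A build without tracing always passes ctx=None and therefore
    always emits a valid record whose is_valid flag is 0.
    """
    if ctx is None:
        payload = struct.pack("<B", 0)
    else:
        trace_id, span_id, flags = ctx
        payload = struct.pack(_FULL, 1, trace_id, span_id, flags)
    # The length prefix lets any reader skip a payload it cannot use.
    return struct.pack("<I", len(payload)) + payload


def decode_span_context(buf, tracing_enabled=True):
    """Decode a record written by either kind of build."""
    (length,) = struct.unpack_from("<I", buf, 0)
    payload = buf[4:4 + length]
    (is_valid,) = struct.unpack_from("<B", payload, 0)
    if not is_valid or not tracing_enabled:
        return None
    _, trace_id, span_id, flags = struct.unpack_from(_FULL, payload, 0)
    return (trace_id, span_id, flags)
```

Both directions work with the same layout: a record from a non-tracing host decodes to None on a tracing host, and a non-tracing host can skip a populated record.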
Omri Zeneva [Wed, 24 Aug 2022 13:57:11 +0000 (09:57 -0400)]
tracer/osd/librados/build/rgw: rgw and osd end2end tracing using opentelemetry
* build: add opentelemetry to cmake system
crimson targets that use Message.cc/h are built before opentelemetry (o-tel), so we need to build o-tel earlier, and we also add the library to the include path earlier
this should work for both the ON and OFF cases of the WITH_JAEGER flag, and for librados, where the compilation flag is ignored
* msg/tracer: add o-tel trace to Messages with decode/encode function in tracer.h
some files that use Message.cc/h just need the encode/decode functions and not all the other functions.
some crimson targets do not link with ceph_context (common), which is required for the tracer.cc file. so we just need to include those functions
* librados: Add opentelemetry trace param for aio_operate and operate methods
in order to propagate the trace info I added the otel-trace as an extra param.
in some places, there already was a blkin trace info, and since it is not used in other places we can safely change it to o-tel trace info.
this will be done in another commit, so the cleanup of blkin trace will be in a dedicated commit
* osd: use the o-tel trace of the msg as a parent span of the osd trace
if there is a valid span in the msg, we will add this op to the request
trace, otherwise it will start a new trace for the OSD op
* rgw: pass put obj trace info to librados
in order to make it possible, I saved the trace info inside the sal::Object, so we can use it later when writing the object to rados
it could be used also later for read ops.
note the trace field of req_state is initialized only in rgw_process, so it's also required in librgw request flow
* prevent breaking changes to kSize. make sure that changes between components built with
different versions of OTEL do not break message compatibility
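
The parent-span rule from the "osd" item above can be sketched as follows. This is a hedged illustration rather than the OSD code: SpanContext and start_osd_span are hypothetical names, and the random hex identifiers only stand in for whatever the tracing library generates.

```python
import secrets
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpanContext:
    trace_id: str
    span_id: str

    def is_valid(self) -> bool:
        return bool(self.trace_id) and bool(self.span_id)


def start_osd_span(msg_ctx: Optional[SpanContext]) -> Tuple[SpanContext, Optional[str]]:
    """Return (context of the new OSD span, parent span id or None)."""
    new_span_id = secrets.token_hex(8)
    if msg_ctx is not None and msg_ctx.is_valid():
        # Join the request's trace: same trace id, message span as parent.
        return SpanContext(msg_ctx.trace_id, new_span_id), msg_ctx.span_id
    # No usable context on the message: start a fresh trace for this op.
    return SpanContext(secrets.token_hex(16), new_span_id), None
```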
When doing a PG dump using 'ceph pg dump --format json-pretty',
the output is so big that the command hangs, and the
ceph-mgr also hangs and eventually fails over.
The exact size depends on the number of OSDs in the cluster
and the number of peers for each OSD.
In tests, the network ping times were identified as
the largest component in terms of size, so they are now removed
from the output to limit the overall size.
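
A rough sketch of the effect, assuming a dump shaped like {"osd_stats": [{..., "network_ping_times": [...]}]}. The key names and the helper strip_ping_times are assumptions for illustration, not a description of the mgr code. Because each OSD carries one ping entry per peer, this section grows with OSDs times peers, which is why dropping it shrinks the output so much.

```python
import copy
import json


def strip_ping_times(dump, key="network_ping_times"):
    """Return a copy of the dump without per-OSD ping-time sections."""
    out = copy.deepcopy(dump)
    for osd in out.get("osd_stats", []):
        osd.pop(key, None)
    return out


def serialized_size(obj):
    """Size in bytes of the pretty-printed JSON form."""
    return len(json.dumps(obj, indent=4).encode())
```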
Ronen Friedman [Mon, 12 Feb 2024 14:50:22 +0000 (08:50 -0600)]
test/osd: fix test_scrub_sched following scrubber changes
Replacing PgScrubber::determine_scrub_time() with a local copy,
as a stop-gap measure to keep the test running.
The scrub scheduling refactoring will remove the need for
this function, and the test will be updated accordingly.
Adam King [Thu, 15 Feb 2024 14:42:50 +0000 (09:42 -0500)]
Merge pull request #55566 from zdover23/wip-doc-2024-02-14-cephadm-services-nfs
doc/cephadm: correct nfs config pool name
Reviewed-by: Adam King <adking@redhat.com>
Reviewed-by: Anthony D'Atri <anthonyeleven@users.noreply.github.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Casey Bodley [Wed, 14 Feb 2024 14:43:14 +0000 (09:43 -0500)]
rgw/putobj: RadosWriter uses part head object for multipart parts
the cleanup logic in the RadosWriter destructor was using the wrong
`head_obj` to avoid races between cleanup and part re-uploads. it
pointed at the final location of the multipart upload, rather than the
head object of the current part
Ilya Dryomov [Mon, 12 Feb 2024 12:07:22 +0000 (13:07 +0100)]
librbd: refactor merge() for SparseBufferlistExtent
- pass left.length + right.length instead of bl.length()
for consistency and to avoid circumventing the assert in
SparseBufferlistExtent constructor
- claim_append() takes an lvalue reference, no need to move
- follow the pattern used in split()
Ilya Dryomov [Mon, 12 Feb 2024 10:00:45 +0000 (11:00 +0100)]
librbd: fix split() for SparseExtent and SparseBufferlistExtent
SparseExtents and SparseBufferlist are typedefs for interval_map. In
both cases, the split() handler is broken: for the former the extent isn't
actually split and for the latter an incorrect bufferlist is attached to
the split extent.
Fortunately, both SnapshotDelta as produced by ObjectListSnapsRequest
and SparseBufferlist used in a couple of places seem to be collections
where only disjoint intervals are inserted and splitting doesn't occur
(at least in the common case). But still, this is a landmine waiting
for someone to step on it.
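
What a correct split handler has to do can be sketched as follows. This is a simplified illustration with a bytes object standing in for a bufferlist; split_extent is a hypothetical helper, not the librbd implementation. The sub-extent must carry exactly the slice of data that corresponds to the requested sub-range, not the whole original buffer.

```python
def split_extent(ext_offset, ext_data, new_offset, new_length):
    """Return the data for [new_offset, new_offset + new_length).

    ext_offset is where the original extent starts and ext_data is the
    data it carries; the requested range must lie inside the extent.
    """
    start = new_offset - ext_offset
    if start < 0 or new_length < 0 or start + new_length > len(ext_data):
        raise ValueError("sub-range lies outside the extent")
    # Slice relative to the extent start so data and range stay aligned.
    return ext_data[start:start + new_length]
```

Returning ext_data unchanged, or slicing from index 0 regardless of new_offset, would reproduce the kind of mismatch described above.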
Venky Shankar [Wed, 14 Feb 2024 05:10:49 +0000 (10:40 +0530)]
Merge PR #54690 into main
* refs/pull/54690/head:
client: handle callback completion if the async I/O failed
client: make sure the callback is finished when returning ENOTCONN
client: do not accept zero byte write request
client: check for negative value of iovcnt
src/test: test zero bytes async i/o
src/test: test async I/O with negative iov structures count
src/test: test async I/O if the client is not mounted
src/test: test async I/O with read only file
src/test: test async I/O with a file created with O_PATH
Reviewed-by: Frank S. Filz <ffilzlnx@mindspring.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Rishabh Dave <ridave@redhat.com>
Reviewed-by: Matt Benjamin <mbenjami@redhat.com>
osd: always send returnvec-on-errors for client's retry
Currently there is a discrepancy in terms of the returnvec's
presence between MOSDOpReplys sent for original requests and
those on dups. The former always contain the returnvec if
an error happened, even if `allows_returnvec()` is `false`.
This commit extends the behavior on dups.
For RCA please see: https://tracker.ceph.com/issues/64192#note-9
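
The rule being unified can be sketched as a single helper shared by the original-request path and the dup path. The dictionary reply and the name build_reply are illustrative assumptions, not the MOSDOpReply structure.

```python
def build_reply(result, op_returns, allows_returnvec):
    """Build a reply, attaching the returnvec on errors unconditionally.

    result: overall op result (negative means error)
    op_returns: per-op return values recorded for the request
    allows_returnvec: whether the client asked for return values
    """
    reply = {"result": result}
    if allows_returnvec or result < 0:
        reply["returnvec"] = list(op_returns)
    return reply


def reply_for_dup(dup_entry, allows_returnvec):
    """A retry answered from the dup record follows the same rule."""
    return build_reply(dup_entry["result"], dup_entry["op_returns"], allows_returnvec)
```

With both paths going through the same condition, a client retry sees the same reply shape as the original request would have produced.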
Matt Benjamin [Fri, 2 Feb 2024 19:59:20 +0000 (14:59 -0500)]
rgw_sigv4: fixes to bootstrap maven/junit5 suite
The junit5 suite in fact selects transport security (SSL)
strictly from the endpoint URL. The test_awssdkv4_sig.sh (or its
caller?) only needs to export RGW_HTTP_ENDPOINT_URL appropriately
to get one or the other.
Fix several mistakes in refactoring caught by Ali Maredia.
Print AccessKey, SecretKey and EndpointURL on startup
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
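
The selection rule described above (SSL is decided strictly by the endpoint URL) can be illustrated like this. The suite itself is Java; this Python sketch only shows the rule, and endpoint_uses_ssl is a hypothetical helper name.

```python
from urllib.parse import urlparse


def endpoint_uses_ssl(url):
    """Decide transport security from the URL scheme alone."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported endpoint scheme: {scheme!r}")
    return scheme == "https"
```

A caller would export RGW_HTTP_ENDPOINT_URL with either an http:// or an https:// URL and let a check like this pick the transport.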
These changes address checksum header identification and signing
algorithm selection, including checksum trailer verification
for signed- and unsigned-payload cases.
These changes address all the actual S3 request failures I have
so far been able to reproduce, with and without content checksums
and/or new trailing checksum headers, and with and without
SSL.
Fixes: https://tracker.ceph.com/issues/63153
Specifically, it fixes the request failures that motivated the
initial tracker filing. It extracts but does not validate new client
content checksums if present. Validation and management of new
S3 content-checksum headers will follow in a subsequent change.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
squashed commits:
* wip chunk meta parsing--seem to have first AWSv4ComplMulti::ChunkMeta::create_next sort of parsing
* use constexpr sarlen(...) for static array lengths throughout rgw_auth_s3.cc
* link AWSv4ComplMulti::ChunkMeta to its enclosing completer
* capture original content-length header before AWSv4ComplMulti overwrites it
* mostly extract the trailer
* fix misordered content-length, experiment w/exbuf
* save leftover bytes between calls to AWSv4ComplMulti::recv_chunk()
* propagate data_offset_in_stream from AWSv4ComplMulti::recv_chunk()
* clean up trailer section extract
* trailer section cleanup and introduce extract_helper
* unrolled checksum extract--fixup
* fix sv_trailer end pos, and cleanup
* add proplist interface to rgw::auth::Completer and AWSv4ComplMulti
* spliterate trailers
* check completer props
* redefine prop_map to point into already-allocated trailer_vec
* hax: thread a counter onto AWSv4ComplMulti recv_body() and recv_chunk path
* fix apparent bug where reads smaller than chunk_size induce a final, zero-length read that was skipped before forcing recognition of the last chunk in the stream
* check only for a trailing checksum named in x-amz-trailer
* don't try to match signatures when no signature provided (because streaming unsigned)
* oops, fix content_length decl
* fix recognition of next chunk envelope in unsigned aws-chunk case
* clean up AWSv4ComplMulti flags and correctly detect aws unsigned chunked
* rework checksum-trailer extraction and introduce AWSv4ComplMulti::calc_v4_trailing_signature
* thread const struct req_state* into AWSv4ComplMulti
* large cleanup of trailer parsing, no regression
* fix trailer signature calculation--checks
* correctly generate final chunk hmac
* typo in comment
* verify trailing signature when expected (using expected final chunk signature)
* move trailer_vec back onto recv_body()'s stack
* remove strange completer comment
* remove last_frag (now points into parsing_buf)
* remove implied dependency on content_length
* move trailer recognition to AWSv4ComplMulti::complete()
* remove now-unused is_last_chunk() predicate
* remove unused ChunkMeta::completer
* responses to review comments
* when trailer sig is expected, fail (only) if none present or if it does not match calculated
* remove stale parse_content_length(...) decl
* remove now-unused AWSv4ComplMulti::content_length
* fix extract_helper end search position as in mut_extract_helper
* change "\n" reserve term in get_canon_amz_hdrs() part of the sum (review)
and initialize length to 0
* remove debugging code
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
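
For orientation, the unsigned aws-chunked case with a trailing checksum can be sketched as below. This is a simplified illustration and not the AWSv4ComplMulti code: it assumes the whole body is in memory and framed as hex-size CRLF data CRLF, ending with a zero-size chunk followed by trailer lines. The function names are hypothetical. In line with the description above, the checksum is extracted but not validated.

```python
def parse_unsigned_chunked(body):
    """Split an unsigned aws-chunked body into (payload, trailers)."""
    pos = 0
    data = bytearray()
    while True:
        eol = body.index(b"\r\n", pos)
        # Ignore any ";extension" after the hex size.
        size = int(body[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break  # final chunk: what follows is the trailer section
        data += body[pos:pos + size]
        pos += size
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("chunk is not terminated by CRLF")
        pos += 2
    trailers = {}
    for line in body[pos:].split(b"\r\n"):
        if not line:
            continue
        name, _, value = line.partition(b":")
        trailers[name.decode().strip().lower()] = value.decode().strip()
    return bytes(data), trailers


def declared_trailing_checksum(trailers, x_amz_trailer):
    """Look up only the trailer named in the x-amz-trailer header."""
    return trailers.get(x_amz_trailer.strip().lower())
```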
Vallari Agrawal [Thu, 1 Feb 2024 13:07:27 +0000 (18:37 +0530)]
qa: add qa/tasks/nvmeof.py and rbd/nvmeof_basic_task and fio workunits
This is v2 of the rbd/nvmeof test: It deploys 1 gateway and 1 initiator.
Then does basic verification on nvme commands and runs fio.
This commit creates:
1. qa/tasks/nvmeof.py: adds a new 'Nvmeof' task which deploys
the gateway and shares config with the initiator hosts.
Sharing config was previously done by 'nvmeof_gateway_cfg' task
in qa/tasks/cephadm.py (that task is removed in this commit).
2. qa/workunits/rbd/nvmeof_basic_tests.sh:
Runs nvme commands (discovery, connect, connect-all, disconnect-all,
and list-subsys) and does basic verification of the output.
3. qa/workunits/rbd/nvmeof_fio_test.sh:
Runs fio command. Also runs iostat in parallel if IOSTAT_INTERVAL
variable is set. This variable configures the delay between each iostat
print.
nvmeof-cli upgrade from v0.0.6 to v0.0.7 introduced major changes
to all nvmeof commands. This commit changes v0.0.6 commands to
v0.0.7 in qa/workunits/rbd/nvmeof_initiator.sh
Kefu Chai [Sun, 11 Feb 2024 08:53:26 +0000 (16:53 +0800)]
cmake: find_package(cap) before linking against it
before this change, we link against libcap without finding it. this
works fine as long as libcap-devel or libcap-dev is installed in the
system. but if it is not, the source would fail to build due to missing
`sys/capability.h`. this is not a great developer experience.
in this change, a `Findcap.cmake` is added to find the capability
library, so that a missing libcap fails the build at the configure phase.
Kefu Chai [Sun, 11 Feb 2024 07:42:14 +0000 (15:42 +0800)]
cmake: build boost debug variant when CMAKE_BUILD_TYPE is Debug
boost has several predefined build variants. they are quite
like CMake's CMAKE_BUILD_TYPE. among them, "debug" enables some
debugging-related features. so it would be nice if we can have it
enabled for the Debug build, if boost is built from source.
see also
https://www.boost.org/build/doc/html/bbv2/overview/builtins/features.html
before this change, we always build the "release" variant. in this
change, the "debug" variant is built if Ceph is built with
CMAKE_BUILD_TYPE=Debug. please note, this change does not change
how boost is built when packaging Ceph, as our debian/rpm
recipes do not define CMAKE_BUILD_TYPE and respect the distros'
settings. in that case, the "release" variant is still built.
Zac Dover [Sat, 10 Feb 2024 14:36:29 +0000 (00:36 +1000)]
doc/radosgw: undo 55524
Roll back the docs changes made in
https://github.com/ceph/ceph/pull/55524, in accordance with Casey
Bodley's instructions to me here:
https://github.com/ceph/ceph/pull/55524#issuecomment-1937020543.
Zac Dover [Sat, 10 Feb 2024 03:14:59 +0000 (13:14 +1000)]
doc/radosgw: remove invalid LUA context options
Remove "background", "getdata", and "putdata" from the list of LUA
context options. Passing these options throws the following error:
"ERROR: invalid script context: background. must be one of: preRequest,
postRequest".