The inter-connectedness of RadosStore and RGWRados resulted in a
segfault during RGWRados::init_complete due to the rados pointer not
being set in RadosStore yet.
Split the calls to RGWRados::initialize and RGWRados::init_complete, so
that we can set up RadosStore between them, allowing the services
created in RGWRados::init_complete to access the RadosStore.
Fixes: https://tracker.ceph.com/issues/55512
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Omri Zeneva [Sun, 27 Mar 2022 17:10:33 +0000 (20:10 +0300)]
rgw: support bucket name in pre request context
Because the bucket object is created only after authentication, it can
still be null when Request.Bucket.Name is accessed in the pre-request
context. In that case, return req_state->init_state.url_bucket instead.
Anthony D'Atri [Sat, 30 Apr 2022 07:56:21 +0000 (00:56 -0700)]
SubmittingPatches: Improve SubmittingPatches example
The example commit didn't show the convention of prefixing the message with
the relative directory path where the file lives, which has led new
contributors to innocently submit changes that aren't formatted ideally.
This adds a path to the example.
Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
For debugging purposes, allow radosgw-admin to run with stores other
than RadosStore. Many operations will still fail (by crashing), so care
must be taken when running this way.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
RGW - Allow starting RGW/dbstore without connecting to Mons
DBStore, and some other Stores like Motr, don't need to connect to the
Mons to work. However, startup automatically connects to the mons.
There's provision to not connect, but the split isn't quite right. We
need to call global_pre_init() to get config from the file, to determine
which store to start, but we then need to decide before calling
global_init() whether the configured store needs to connect to the mons.
This requires a slight change to global_init() to set no_mon_config from
the new flags.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
J. Eric Ivancich [Tue, 26 Apr 2022 16:46:08 +0000 (12:46 -0400)]
rgw: "bucket check --fix" should delete damaged multipart uploads from bi
As one of the steps in `radosgw-admin bucket check --fix ...` it looks
for bucket index entries for incomplete multipart uploads that do not
have a corresponding ".meta" entry in the same bucket index. It then
intends to delete those entries. However, the function that it calls
to perform the bucket index deletions was flawed and did not direct
the removals to the appropriate shard(s), but instead to a nonexistent
oid.
This commit determines the appropriate shard for each of the entries
to be removed and asynchronously issues a librados call to
omap_rm_keys.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
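The per-shard grouping described above can be sketched roughly as follows. This is an illustration only, not the RGW implementation: the function name group_keys_by_shard is hypothetical, and zlib.crc32 is a stand-in for whatever hash RGW actually uses to map an entry to a bucket index shard.

```python
import zlib
from collections import defaultdict


def group_keys_by_shard(keys, num_shards):
    """Group index keys by shard so each shard gets one removal request."""
    by_shard = defaultdict(list)
    for key in keys:
        # Stand-in hash (assumption): RGW uses its own hashing scheme.
        shard = zlib.crc32(key.encode("utf-8")) % num_shards
        by_shard[shard].append(key)
    return dict(by_shard)


# Hypothetical multipart index keys awaiting removal.
orphans = ["_multipart_a.1", "_multipart_a.2", "_multipart_b.1"]
grouped = group_keys_by_shard(orphans, num_shards=4)
```

Each value in grouped would then be sent to its own shard object in a single key-removal call, rather than sending every key to one oid.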
Rishabh Dave [Mon, 24 Jan 2022 18:33:05 +0000 (00:03 +0530)]
qa/cephfs: don't remove sudo from the command arguments
run_shell() in qa.tasks.cephfs.mount.CephFSMount prepends "sudo" to its
command arguments but it doesn't specify to the underlying method that
"sudo" shouldn't be deleted from the command arguments.
Fixes: https://tracker.ceph.com/issues/53601
Signed-off-by: Rishabh Dave <ridave@redhat.com>
Kotresh HR [Tue, 14 Dec 2021 10:13:41 +0000 (15:43 +0530)]
qa: Fix a few tracebacks in vstart_runner
1. CommandFailedError: Command failed with status 127: \
['None/archive/coverage', 'rados' ...]
2. TypeError: a bytes-like object is required, not '_io.BytesIO'
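The second traceback is the kind of error Python raises when a BytesIO buffer is passed where raw bytes are expected. A minimal sketch of the general pattern, assuming nothing about the actual vstart_runner change; the variable names are made up for illustration:

```python
import io

# Stands in for a command's captured stdout (contents are illustrative).
captured = io.BytesIO(b"HEALTH_OK\n")
sink = io.BytesIO()

try:
    sink.write(captured)          # wrong: passes the buffer object itself
    raised = False
except TypeError:
    raised = True                 # write() needs a bytes-like object

sink.write(captured.getvalue())   # right: pass the underlying bytes
result = sink.getvalue()
```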
Nizamudeen A [Tue, 26 Apr 2022 10:19:09 +0000 (15:49 +0530)]
mgr/dashboard: prometheus rules internal server error
After we increase/decrease the count of the node-exporter, we get a 500
- Internal server error from api/prometheus/rules endpoint. On further
debugging, it appears to be caused by the JSON decoder, presumably
because the input passed to json.loads() is not valid JSON. To fix
this, I could either add error handling around json.loads() or move
the json.loads() call into the existing try block. I went for the
second approach here.
Fixes: https://tracker.ceph.com/issues/54356
Signed-off-by: Nizamudeen A <nia@redhat.com>
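The idea of decoding inside the guarded block can be sketched like this. It is a simplified illustration, not the dashboard code: parse_rules and the shape of the error value are hypothetical.

```python
import json


def parse_rules(raw):
    """Decode a backend response, reporting bad JSON instead of raising."""
    try:
        # Decoding happens inside the try block, so a non-JSON body
        # (for example an HTML error page) is handled like other failures.
        return json.loads(raw)
    except json.JSONDecodeError as e:
        return {"error": f"invalid JSON from backend: {e.msg}"}


ok = parse_rules('{"groups": []}')
bad = parse_rules("<html>502 Bad Gateway</html>")
```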
J. Eric Ivancich [Tue, 12 Apr 2022 18:47:45 +0000 (14:47 -0400)]
rgw: address crash and race in RGWIndexCompletionManager
An atomic int was used in a modulo operator to distribute contention
among a set of locks and to track completions. Because it was an int,
enough increments would cause it to go negative (due to
twos-complement encoding and overflow) thereby causing a
crash. Additionally, even though it was atomic, the read and increment
were separate operations, leading to a race.
This commit addresses both of these issues.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
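The overflow half of the problem can be illustrated with a small emulation. This sketch uses Python's ctypes to mimic 32-bit C++ integers; NUM_LOCKS and c_rem are hypothetical, and switching to an unsigned counter is shown only as one common remedy, not necessarily the one this commit chose.

```python
import ctypes
import math

NUM_LOCKS = 7  # hypothetical number of locks


def c_rem(a, b):
    """Remainder as C++ computes it: truncated toward zero, sign follows a."""
    return int(math.fmod(a, b))


# A signed 32-bit counter at its maximum; one more increment wraps negative.
signed = ctypes.c_int32(2**31 - 1)
signed.value += 1
bad_index = c_rem(signed.value, NUM_LOCKS)      # negative, so out of range

# An unsigned counter wraps to zero instead, so the index stays in range.
unsigned = ctypes.c_uint32(2**32 - 1)
unsigned.value += 1
good_index = unsigned.value % NUM_LOCKS
```

The race is a separate matter: reading the counter and incrementing it in two steps lets two threads observe the same value, which a single fetch-and-add style operation avoids.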
Used the https://www.npmjs.com/package/@grafana/e2e npm packages and
followed
https://github.com/grafana/grafana/blob/main/contribute/style-guides/e2e.md
to understand the style of the grafana e2e testing.
This PR introduces the tests for Hosts Overall
Performance and also for RGW per Daemon and Overall Performance.
Fixes: https://tracker.ceph.com/issues/54356
Signed-off-by: Nizamudeen A <nia@redhat.com>
Rishabh Dave [Wed, 9 Feb 2022 18:16:27 +0000 (23:46 +0530)]
qa/cephfs: change default timeout from 900 secs to 300
15 minutes is unnecessarily large as a default value for timeout for a
command. Not having to wait unnecessarily on a crash of a command will
reduce teuthology's testing queue and will save individual developers'
time while running tests locally.
Whatever lines are modified for this purpose are also modified to follow
the style guideline, specifically wrapping at 80 characters.
Fixes: https://tracker.ceph.com/issues/54236
Signed-off-by: Rishabh Dave <ridave@redhat.com>
fix endianness issue with WriteLogCacheEntry encoding. abandon the
use of bit-fields in the union. instead, apply the '&' operation to the
whole union field (flags) to get the bit information.
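Masking the whole flags value, rather than relying on how a compiler lays out bit-fields, can be sketched as below. The flag names and bit positions are hypothetical; the log does not show the real WriteLogCacheEntry layout.

```python
import struct

# Hypothetical flag bits, for illustration only.
FLAG_VALID = 1 << 0
FLAG_SYNC_POINT = 1 << 1
FLAG_HAS_DATA = 1 << 3


def has_flag(flags, mask):
    """Test a bit by masking the whole integer; independent of host layout."""
    return (flags & mask) != 0


# Encode the flags as a plain integer with an explicit format, then decode.
encoded = struct.pack("<B", FLAG_VALID | FLAG_HAS_DATA)
(decoded,) = struct.unpack("<B", encoded)
```

Because the encoded value is an ordinary integer, the same mask finds the same bit on any host, whereas the order of bit-fields inside a storage unit is left to the compiler and platform ABI.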
Merge pull request #45904 from cfsnyder/fix_rocksdb_iter_perf
os/bluestore: set upper and lower bounds on rocksdb omap iterators
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Mark Nelson <mnelson@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Or Friedmann [Tue, 19 Apr 2022 12:00:28 +0000 (12:00 +0000)]
rgw: RGWCoroutine::set_sleeping() checks for null stack
users of the RGWOmapAppend coroutine don't manage the lifetime of its
underlying coroutine stack, so end up making calls on RGWOmapAppend
after its stack goes away. this null check is a band-aid, and there are
still several other calls in RGWCoroutine that don't check for null
stack
Fixes: https://tracker.ceph.com/issues/49302
Signed-off-by: Or Friedmann <ofriedma@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
this mutex was only held by one function, OpsLogFile::flush(). this
private member function is only ever called from the background thread,
so doesn't need to be protected by a mutex
as a further cleanup, i renamed 'cond' and 'mutex' now that we don't
need to differentiate between different locks
this shuts up ceph::debug_condition_variable's assertion that the
associated mutex is held during notify_one(). this is not strictly
required for correct use, but is a common source of bugs
Laura Flores [Fri, 22 Apr 2022 23:06:09 +0000 (23:06 +0000)]
.github: add a "Contribution Guidelines" to the pull request template
These guidelines refer contributors to the "Submitting Patches to Ceph" doc
and the "Submitting Patches to Ceph - Backports" doc. Even though there are
already tips for titling/signing commits in the PR template, these tips
are commented out and easy to gloss over once the contributor creates the
PR. These existing tips do not include any pointers about staging backports.
Fixes: https://tracker.ceph.com/issues/55418
Signed-off-by: Laura Flores <lflores@redhat.com>
mgr/cephadm: Adding support to store ceph conf per cluster fsid
Fixes: https://tracker.ceph.com/issues/55185
Signed-off-by: Redouane Kachach <rkachach@redhat.com>
Mark Nelson [Wed, 20 Apr 2022 19:45:45 +0000 (19:45 +0000)]
crimson/osd: fix argument parsing after seastar changes
Last fall seastar changed the way that app-template works, separating internal "seastar" options from "app" options. Part of that change was to only return app_opts when get_options_description() is called, which is what we use to filter arguments that should be passed to seastar instead of crimson. This has the unfortunate effect of breaking all "seastar" options we pass to seastar such as "--memory" or "--cpuset". There is no way currently to access the internal seastar options short of scraping and parsing stdout (private member without an accessor). The PR that made the change can be seen here:
Potentially we could use our existing code if we got the seastar devs to provide something like "get_all_options_description()", but I don't think we should rely on a function like this. They clearly aren't intending for projects to rely on this behavior for argument filtering. It's brittle and something we can't easily fix ourselves if there are future problems.
Instead, we should filter our own options from argv and then pass what remains to seastar. Previously we didn't do this because crimson::common::ConfigProxy isn't available until seastar starts up, so we can't use it to filter out which options to give seastar (chicken and egg problem). We don't actually need ConfigProxy to filter the arguments though. It's good enough to create a throw-away md_config_t instance, give it a dummy tracker, and then let it parse the arguments as it normally does. This lets us filter out the arguments to give seastar before seastar itself starts up, which then lets us filter which arguments we should eventually pass to crimson's ConfigProxy.
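The "parse our own options first, hand the rest on" approach can be sketched with Python's argparse as an analogy. This is not the crimson code: split_args and the --osd-data option are hypothetical stand-ins for the throw-away md_config_t parse, and the leftover list plays the role of the arguments passed to seastar.

```python
import argparse


def split_args(argv):
    """Consume the options we own and return the remainder untouched."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--osd-data")   # hypothetical application option
    ours, rest = parser.parse_known_args(argv)
    return ours, rest


ours, rest = split_args(
    ["--osd-data", "/tmp/osd0", "--memory", "2G", "--cpuset", "0-3"]
)
```

Here rest keeps "--memory" and "--cpuset" with their values in their original order, so the framework that owns those options can parse them itself without the application needing to know its full option list.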
Also, '-' has higher precedence than '<<'; for more detail, please
see https://en.cppreference.com/w/c/language/operator_precedence.
Fixes: https://tracker.ceph.com/issues/55409
Reported-by: Jos Collin <jcollin@redhat.com>
Reported-by: Rishabh Dave <ridave@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
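The precedence point is easy to check. The commit concerns C, but Python happens to rank subtraction above shifts in the same way, so a short Python snippet shows the effect; the value of n is arbitrary.

```python
n = 4

# Without parentheses, the subtraction binds first: 1 << (n - 1).
unparenthesized = 1 << n - 1

# Parentheses are needed to shift first and subtract afterwards.
shift_then_subtract = (1 << n) - 1
```

With n = 4 the first expression is 8 and the second is 15, which is why a mask written without parentheses can silently be the wrong value.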
Adds a precondition to RocksDBStore::get_cf_handle(string, IteratorBounds)
to avoid duplicating logic of the only caller (RocksDBStore::get_iterator).
Assertions will fail if preconditions are not met.
amazon docs for PutBucketLifecycleConfiguration do say that a
Content-MD5 header is required, but clients in FIPS mode may not
be able to generate this header.
MD5 should not be used as a security feature, so rgw shouldn't require
it here. if no Content-MD5 is given, just skip the checksum verification
instead of rejecting the request
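The "verify only when the header is present" behavior can be sketched as follows. This is an illustration, not the rgw code: check_content_md5 and the plain dict of headers are hypothetical.

```python
import base64
import hashlib


def check_content_md5(body, headers):
    """Return True if the body passes, or if there is nothing to verify."""
    supplied = headers.get("Content-MD5")
    if supplied is None:
        return True  # no header given: skip verification, do not reject
    # Content-MD5 carries the base64 encoding of the raw 16-byte digest.
    expected = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    return supplied == expected


body = b"<LifecycleConfiguration/>"
good = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
```

A request with a matching header passes, one with a mismatched header fails, and one with no header at all is accepted without a checksum comparison.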
bluestore: add config option to allow rocksdb iterator bounds to be disabled
Add osd_rocksdb_iterator_bounds_enabled config option to allow rocksdb iterator bounds to be disabled.
Also includes minor refactoring to shorten code associated with IteratorBounds initialization in bluestore.