Josh Salomon [Thu, 13 Jan 2022 02:23:07 +0000 (02:23 +0000)]
osd, tools: refactor OSDMap::calc_pg_upmaps (simplify the code)
This is the first commit in a series of commits that aims at adding a primary balancer to Ceph and improving the current upmap balancer functionality. This first commit focuses on simplifying (refactoring) the code of `calc_pg_upmaps` so it is easier to change in the future. This PR keeps the existing functionality as-is and does not change anything but the code structure.
As part of the major refactoring of OSDMap::calc_pg_upmaps, the first step is adding an --upmap-seed param to osdmaptool so that test results can be compared without the random factor.
Other changes made:
- Divided sections of `OSDMap::calc_pg_upmaps` into their own separate functions
- Renamed tmp to tmp_osd_map
- Changed all the occurrences of 'first' and 'second' in the function to more meaningful names.
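A minimal Python sketch of why a fixed seed makes test runs comparable. This is not Ceph code: `pick_candidate_osds` is a hypothetical stand-in for any balancer step that orders candidates randomly, and its `seed` argument only mirrors the idea behind `--upmap-seed`.

```python
import random

def pick_candidate_osds(osds, seed=None):
    """Return the OSD ids in a shuffled order (hypothetical helper)."""
    # A private RNG keeps the result independent of global random state;
    # the same seed always produces the same order.
    rng = random.Random(seed)
    shuffled = list(osds)
    rng.shuffle(shuffled)
    return shuffled

osds = list(range(8))
run_a = pick_candidate_osds(osds, seed=42)
run_b = pick_candidate_osds(osds, seed=42)
print(run_a == run_b)
```

With the seed pinned, two runs over the same input can be diffed directly, so any difference in output comes from the code change rather than from the random factor.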
gal salomon [Mon, 12 Apr 2021 05:54:37 +0000 (08:54 +0300)]
parquet implementation:
(1) adding arrow/parquet to make (install is missing)
(2) s3select-operation contains 2 flows CSV and Parquet
(3) in the parquet flow, the s3select processing engine calls (via callback) get-size and range-request; the range-requests are async, thus the caller waits until notified.
(4) flow : execute --> s3select --(arrow layer)--> range-request --> GetObj::execute --> send_response_data --> notify-range-request --> (back-to) --> s3select
(5) in the parquet flow, s3select handles the response (using callbacks) because of the aws-response-limitation (16 MB)
add unique pointer (rgw_api); verify magic number for parquet objects; s3select module update
fix buffer-over-flow (copy range request)
change the range-request flow. now, it needs to use the callback parameters (ofs & len) and not the element length
refactoring. separate the CSV flow from the parquet flow, a phase before adding conditional build (depends on arrow package installation)
adding arrow/parquet installation to debian/control
align s3select repo with RGW (missing APIs, such as get_error_description)
undefined reference to arrow symbol
fix comment: using optional_yield by value
fix comments; remove future/promise
s3select: a leak fix
s3select: fixing result production
s3select,s3tests : parquet alignments
typo: git-remote --> git_remote
s3select: remove redundant comma (end of projections); bug fix in parquet flow upon aggregation queries
adding arrow/parquet
editorial. remove blank lines
s3select: merged with master (output serialization, presto alignments)
merging (not rebase) master functionalities into parquet branch
(*) dedicated source files for s3select operation.
(*) s3select-engine: fix leaks on parquet flows, enabling allocate csv_object and parquet_object on stack
(*) the csv_object and parquet object allocated on stack (no heap allocation)
move data-members from heap to stack allocation, refactoring, separate flows for CSV and parquet. s3select: bug fix
conditional build: when the arrow package is installed, the parquet flow becomes visible, which enables processing parquet objects. in case the package is not installed, only CSV is usable
RGW Zipper - don't load stats for every bucket load
This was a side-effect of consolidating the Zipper API, and resulted in
a large performance hit. Stats are only needed if they are requested,
so don't load them every time.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
Laura Flores [Tue, 4 Jan 2022 22:54:33 +0000 (22:54 +0000)]
mgr/telemetry: add the rocksdb version number to telemetry
Capturing the RocksDB version number in Telemetry would allow us to check that users are using the appropriate RocksDB version for their Ceph cluster. For instance, if a user is working in a Pacific cluster, but their RocksDB version is meant for Nautilus, that might be a problem.
It is structured as "rocksdb_stats" --> "version" in anticipation of more stats that will be added under "rocksdb_stats".
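The nesting described above can be sketched in Python as follows. Only the "rocksdb_stats" --> "version" layout comes from the commit message; `build_report` is a hypothetical helper and the version string is a made-up placeholder, not output from the telemetry module.

```python
import json

def build_report(rocksdb_version):
    """Build a report fragment with the nested layout (hypothetical helper)."""
    # Nesting under "rocksdb_stats" leaves room for more keys later
    # without changing the top-level shape of the report.
    return {"rocksdb_stats": {"version": rocksdb_version}}

report = build_report("6.15.5")  # placeholder version string
print(json.dumps(report, indent=2))
```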
Kalpesh Pandya [Thu, 11 Nov 2021 06:46:16 +0000 (12:16 +0530)]
qa/tasks: Checking for kafka cleanup
Adding a sleep after running ./kafka-server-stop.sh and ./zookeeper-server-stop.sh
scripts so that nothing gets logged into the kafka logs after the sleep time.
And finally killing the process.
This resolves: https://tracker.ceph.com/issues/53220
osd: Display scheduler specific info when dumping an OpSchedulerItem
Implement logic to dump information relevant to the scheduler type being
employed when dumping details about an OpSchedulerItem. For example, the
'priority' field is relevant for the 'wpq' scheduler, but for the
'mclock_scheduler', the 'qos_cost' gives more information during debugging.
A couple of additional fields called 'qos_cost' and 'is_qos_request' are
introduced in OpSchedulerItem class. These are mainly used to facilitate
dumping of relevant information depending on the scheduler type. The
interesting points are when an item is enqueued and dequeued.
For the 'mclock_scheduler', the 'class_id' and the 'qos_cost' fields are
dumped during enqueue and dequeue op respectively. For the 'wpq' scheduler
things remain the same as before.
An additional benefit of this change is to help immediately identify the
type of scheduler being used for a given shard depending on what is dumped
in the debug messages while debugging.
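A rough Python sketch of the dumping rule described in this commit message. `SchedItem` and `dump_item` are hypothetical names, not the Ceph classes, and the choice to branch on `is_qos_request` is an assumption about how the new field might be used.

```python
from dataclasses import dataclass

@dataclass
class SchedItem:
    """Hypothetical stand-in for OpSchedulerItem; not the Ceph class."""
    priority: int
    class_id: str
    qos_cost: int
    is_qos_request: bool

def dump_item(item: SchedItem, event: str) -> str:
    """Format only the fields relevant to the scheduler handling the item."""
    if not item.is_qos_request:
        # wpq-style item: priority is the meaningful field, as before.
        return f"{event} priority={item.priority}"
    if event == "enqueue":
        # mclock-style item: class_id is dumped on enqueue ...
        return f"{event} class_id={item.class_id}"
    # ... and qos_cost on dequeue.
    return f"{event} qos_cost={item.qos_cost}"

wpq_item = SchedItem(priority=63, class_id="client", qos_cost=0, is_qos_request=False)
mclock_item = SchedItem(priority=63, class_id="client", qos_cost=4096, is_qos_request=True)
print(dump_item(wpq_item, "enqueue"))
print(dump_item(mclock_item, "enqueue"))
print(dump_item(mclock_item, "dequeue"))
```

Because each scheduler type prints a different field, the shape of a debug line also reveals which scheduler produced it.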
Ilya Dryomov [Fri, 7 Jan 2022 12:31:08 +0000 (13:31 +0100)]
test/librbd: make diff-iterate clone tests exercise fast-diff mode
The fast-diff feature wasn't propagated to the clone so these tests
were exercising the slow list_snaps path no matter what RBD_FEATURES
value was supplied to ceph_test_librbd.
Ilya Dryomov [Wed, 5 Jan 2022 19:24:40 +0000 (20:24 +0100)]
librbd: restore diff-iterate include_parent functionality in fast-diff mode
Commit 4429ed4f3f4c ("librbd: switch diff iterate API to use new snaps
list dispatch methods") removed the recursive execute() call. The new
list_snaps method does indeed handle parent diffs internally but it is
not used in fast-diff mode. Nothing changed there -- we still need to
load the parent object map, calculate parent object_diff_state, etc.
John Mulligan [Thu, 6 Jan 2022 21:36:32 +0000 (16:36 -0500)]
cephadm: check if cephadm is root after cli is parsed
Fixes: https://tracker.ceph.com/issues/53572
Check whether cephadm is running as root after the CLI arguments are
parsed but before logging is configured. This allows a user to get help
on cephadm without being root or using sudo, etc.
The root check must be done before logging is configured because the
logging setup function creates dirs and files in system dirs.
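The ordering described above can be sketched in Python with argparse. This is not the cephadm source: the parser, the `main` signature, and the injectable `geteuid` and `setup_logging` parameters are hypothetical and exist only to make the ordering visible.

```python
import argparse
import os
import sys

def build_parser():
    """Build a tiny CLI parser (hypothetical, not cephadm's parser)."""
    parser = argparse.ArgumentParser(prog="tool-sketch")
    parser.add_argument("command", choices=["version", "deploy"])
    return parser

def main(argv, geteuid=os.geteuid, setup_logging=lambda: None):
    # 1. Parse first: "--help" prints usage and exits here, so it never
    #    reaches the root check below.
    args = build_parser().parse_args(argv)
    # 2. Root check after parsing but before logging, because the
    #    logging setup may create files in system directories.
    if geteuid() != 0:
        sys.stderr.write("ERROR: %s must be run as root\n" % args.command)
        return 1
    # 3. Only now is it safe to configure logging.
    setup_logging()
    return 0
```

Calling `main(["--help"])` as an unprivileged user prints the usage text and exits with status 0, while `main(["deploy"])` fails the root check before any logging setup runs.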
Signed-off-by: John Mulligan <jmulligan@redhat.com>
John Mulligan [Thu, 6 Jan 2022 21:26:37 +0000 (16:26 -0500)]
cephadm: split cli parsing & ctx setup from logging setup
Split the parsing and processing of the CLI from the logging
setup in the cephadm main func. Move logging setup to occur after
we've determined that there's a runnable command func.
This is preparatory work to allow running `cephadm --help` without being
root.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
yuval Lifshitz [Mon, 13 Dec 2021 18:56:20 +0000 (20:56 +0200)]
rgw/notifications: add cloudevents support to HTTP endpoint
following the cloudevents HTTP spec:
https://github.com/cloudevents/spec/blob/v1.0.1/http-protocol-binding.md
and more specifically this aws-s3 spec:
https://github.com/cloudevents/spec/blob/main/cloudevents/adapters/aws-s3.md
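A hedged Python sketch of what binary-mode CloudEvents headers for an S3-style notification record might look like. The `ce-` header prefix comes from the HTTP protocol binding; the attribute mapping (type, source, id, time, subject) is an assumption based on the aws-s3 adapter document and should be verified against it. `cloudevent_headers` and the sample record are hypothetical and are not taken from the RGW code.

```python
def cloudevent_headers(record):
    """Map an S3-style event record to binary-mode CloudEvents headers."""
    resp = record["responseElements"]
    bucket = record["s3"]["bucket"]["name"]
    return {
        # Required context attributes: specversion, type, source, id.
        "ce-specversion": "1.0",
        "ce-type": "com.amazonaws.s3." + record["eventName"],
        "ce-source": f'{record["eventSource"]}.{record["awsRegion"]}.{bucket}',
        "ce-id": f'{resp["x-amz-request-id"]}.{resp["x-amz-id-2"]}',
        # Optional context attributes.
        "ce-time": record["eventTime"],
        "ce-subject": record["s3"]["object"]["key"],
        # In binary mode the event data travels in the HTTP body.
        "content-type": "application/json",
    }

# Made-up sample record, for illustration only.
sample = {
    "eventName": "ObjectCreated:Put",
    "eventSource": "ceph:s3",
    "awsRegion": "default",
    "eventTime": "2021-12-13T18:56:20Z",
    "responseElements": {"x-amz-request-id": "req-1", "x-amz-id-2": "host-1"},
    "s3": {"bucket": {"name": "mybucket"}, "object": {"key": "photo.jpg"}},
}
headers = cloudevent_headers(sample)
```

The sketch skips the percent-encoding of header values that the binding requires for characters outside printable ASCII; a real sender would need to handle that.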