Ernesto Puerta [Fri, 14 Jan 2022 11:56:55 +0000 (12:56 +0100)]
Merge pull request #44507 from votdev/issue_53813_nfs_page_not_found
mgr/dashboard: NFS pages shows 'Page not found'
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Tatjana Dehler <tdehler@suse.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Yaarit Hatuka [Wed, 12 Jan 2022 05:01:48 +0000 (05:01 +0000)]
mgr/telemetry: verify there are new collections when nagging due to a major upgrade
When adding a new collection we define whether to nag the user about it.
We may add many collections and nag about none of them. However, in case
of a major upgrade, we wish to notify the user about these new
collections. This commit verifies there are indeed new collections when
nagging due to a major upgrade.
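A minimal sketch of the check described above, with illustrative names (not the actual mgr/telemetry implementation):
```
# Only nag on a major upgrade if collections were actually added since the
# user last opted in.
def should_nag_on_major_upgrade(all_collections, enrolled_collections,
                                is_major_upgrade):
    new_collections = [c for c in all_collections
                       if c not in enrolled_collections]
    return is_major_upgrade and len(new_collections) > 0
```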
Yaarit Hatuka [Wed, 12 Jan 2022 04:36:27 +0000 (04:36 +0000)]
mgr/telemetry: improve output of `ceph telemetry collection ls`
The STATUS column now indicates whether a collection is being reported
and, if it is not, the reason why (either the user is not opted in to
this collection, or its channel is off).
Also, removed the ENROLLED and DEFAULT columns due to potential
confusion they may cause.
If a user is not opted in to certain collections, a message listing the
missing collections will appear above the table:
New collections are available:
['basic_base', 'basic_mds_metadata', 'crash_base', 'device_base',
'ident_base', 'perf_perf']
Run `ceph telemetry on` to opt-in to these collections.
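A rough sketch, with assumed names, of how such a notice could be built from the enrollment state (illustrative only, not the module's actual code):
```
def new_collection_notice(all_collections, enrolled_collections):
    # Collections that exist in the module but are not yet opted-in to.
    missing = sorted(c for c in all_collections
                     if c not in enrolled_collections)
    if not missing:
        return ''
    return ('New collections are available:\n'
            f'{missing}\n'
            'Run `ceph telemetry on` to opt-in to these collections.')
```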
Yaarit Hatuka [Tue, 7 Dec 2021 18:30:56 +0000 (18:30 +0000)]
mgr/telemetry: add command to list all collections
List all collections, their current enrollment state, status, default,
and description, with:
$ ceph telemetry collection ls
NAME ENROLLED STATUS DEFAULT DESC
basic_base TRUE ON ON Basic information about the cluster (capacity, number and type of daemons, version, etc.)
basic_mds_metadata TRUE ON ON MDS metadata
crash_base TRUE ON ON Information about daemon crashes (daemon type and version, backtrace, etc.)
device_base TRUE ON ON Information about device health metrics
ident_base TRUE OFF OFF User-provided identifying information about the cluster
perf_perf TRUE OFF OFF Information about performance counters of the cluster
Please note:
NAME:
=====
Collection name; prefix indicates the channel the collection belongs to.
ENROLLED:
=========
Signifies the collections that were available in the module when the
user last opted-in to telemetry. Please note: Even if a collection is
'enrolled', its metrics will be reported only if its channel is enabled.
STATUS:
=======
Indicates whether the collection metrics are reported; this is
determined by the status (enabled / disabled) of the channel the
collection belongs to, along with the enrollment status of the
collection.
DEFAULT:
========
The default status (enabled / disabled) of the channel the collection
belongs to.
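An illustrative sketch (assumed function name, not the module's actual code) of how STATUS follows from the channel state and the enrollment state described above:
```
def collection_status(enrolled: bool, channel_enabled: bool) -> str:
    # A collection is reported only if it is enrolled and its channel is on.
    if not enrolled:
        return 'NOT REPORTED: NOT OPTED-IN'
    if not channel_enabled:
        return 'NOT REPORTED: CHANNEL OFF'
    return 'REPORTED'
```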
Yaarit Hatuka [Tue, 23 Nov 2021 21:28:47 +0000 (21:28 +0000)]
mgr/telemetry: add preview-device and preview-all commands
`ceph telemetry show` will show a sample cluster report if the user is
opted-in to telemetry. The report will be compiled from the collections
the user is opted-in to. To preview a report compiled from the most
recent collections available, use `ceph telemetry preview`.
The device channel is not included in the cluster report, since it is
sent to a different endpoint; use `ceph telemetry show-device` if the
user is opted-in to telemetry and the device channel is enabled. If not,
it can be previewed with `ceph telemetry preview-device`.
If telemetry is on and the device channel is enabled, both reports can
be reviewed with `ceph telemetry show-all`; otherwise use
`ceph telemetry preview-all`.
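A minimal sketch of the rule above for fetching both reports at once (assumed helper name; illustrative, not the module's actual code):
```
def full_report_command(telemetry_on: bool, device_channel_on: bool) -> str:
    # Both the cluster and device reports are available only when telemetry
    # is on and the device channel is enabled; otherwise fall back to preview.
    if telemetry_on and device_channel_on:
        return 'ceph telemetry show-all'
    return 'ceph telemetry preview-all'
```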
Yaarit Hatuka [Tue, 23 Nov 2021 17:11:38 +0000 (17:11 +0000)]
mgr/telemetry: add command to list all channels
List all channels, their current state, default, and description, with:
$ ceph telemetry channel ls
NAME ENABLED DEFAULT DESC
basic ON ON Share basic cluster information (size, version)
ident OFF OFF Share a user-provided description and/or contact email for the cluster
crash ON ON Share metadata about Ceph daemon crashes (version, stack traces, etc)
device ON ON Share device health metrics (e.g., SMART data, minus potentially identifying info like serial numbers)
perf ON OFF Share perf counter metrics summed across the whole cluster
Yaarit Hatuka [Tue, 23 Nov 2021 00:12:10 +0000 (00:12 +0000)]
mgr/telemetry: add commands to enable/disable channels
Currently we enable/disable a telemetry channel via CLI with:
`ceph config set mgr mgr/telemetry/channel_basic true`
`ceph config set mgr mgr/telemetry/channel_crash false`
We can now do this with:
`ceph telemetry enable channel basic`
`ceph telemetry disable channel crash`
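A rough sketch of how the new subcommands could map onto the existing per-channel module options (assumed handler name; the real command handler may differ):
```
def handle_channel_command(mgr_module, action: str, channel: str) -> None:
    # e.g. `ceph telemetry enable channel basic`
    #   -> set_module_option('channel_basic', True)
    enabled = (action == 'enable')
    mgr_module.set_module_option(f'channel_{channel}', enabled)
```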
Yaarit Hatuka [Mon, 15 Nov 2021 16:53:59 +0000 (16:53 +0000)]
mgr/telemetry: introduce new design for adding new data
The current design requires increasing the telemetry revision each time
we add new data to the report. As a result, users need to re-opt-in to
telemetry. This new design allows for adding new data to the report,
while allowing users to keep sending only what they already opted-in to,
hence no re-opt-in is required. If users wish to report the new data as
well, they need to opt in again and enable any new channels.
Also, move formatting perf histograms to a function, so we can use it
both in `show` and `preview` commands.
Fix get_report call in dashboard to use get_report_locked.
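A rough sketch of the collection-based design described above, with assumed names (not the actual module code):
```
from collections import namedtuple

# Each collection records the channel it belongs to, whether to nag about
# it, and a human-readable description.
Collection = namedtuple('Collection', ['name', 'channel', 'nag', 'description'])

ALL_COLLECTIONS = [
    Collection('basic_base', 'basic', False,
               'Basic information about the cluster'),
    Collection('perf_perf', 'perf', True,
               'Information about performance counters of the cluster'),
]

def collections_to_report(enrolled_names, enabled_channels):
    # Only collections the user has opted in to, and whose channel is on,
    # are included in the report, so adding new collections never forces
    # a re-opt-in by itself.
    return [c for c in ALL_COLLECTIONS
            if c.name in enrolled_names and c.channel in enabled_channels]
```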
mgr/prometheus: Fix regression with OSD/host details/overview dashboards
Fix issues with PromQL expressions and vector matching with the
`ceph_disk_occupation` metric.
As it turns out, `ceph_disk_occupation` cannot simply be used as
expected, as there seem to be some edge cases for users that have
several OSDs on a single disk. This leads to issues which cannot be
solved by PromQL alone (many-to-many PromQL errors). In some rare cases
the data is simply different from what we expected.
I have not found a PromQL-only solution to this issue. What we basically
need is the following:
1. Match on labels `host` and `instance` to get one or more OSD names
from a metadata metric (`ceph_disk_occupation`) to let a user know
about which OSDs belong to which disk.
2. Match on the `ceph_daemon` label of the `ceph_disk_occupation` metric,
in which case the value of `ceph_daemon` must not refer to more than
a single OSD; the exact opposite of requirement 1.
As both operations are currently performed on a single metric, and there
is no way to satisfy both requirements on a single metric, the intention
of this commit is to extend the metric by providing a similar metric
that satisfies one of the requirements. This enables the queries to
differentiate between a vector matching operation to show a string to
the user (where `ceph_daemon` could possibly be `osd.1` or
`osd.1+osd.2`) and to match a vector by having a single `ceph_daemon` in
the condition for the matching.
Although the `ceph_daemon` label is used on a variety of daemons, only
OSDs seem to be affected by this issue (only if more than one OSD is run
on a single disk). This means that only the `ceph_disk_occupation`
metadata metric seems to need to be extended and provided as two
metrics.
`ceph_disk_occupation` is supposed to be used for matching the
`ceph_daemon` label value.
`ceph_disk_occupation_human` is supposed to be used for anything where
the resulting data is displayed to be consumed by humans (graphs, alert
messages, etc).
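A simplified sketch (assumed function and label names, not the module's actual code) of how the two metric variants described above could be emitted for OSDs that share a disk:
```
def disk_occupation_samples(osds_by_device):
    """osds_by_device maps (host, device) -> list of OSD names, e.g.
    ('node1', '/dev/sdb') -> ['osd.1', 'osd.2']."""
    samples = []  # (metric_name, labels, value) tuples
    for (host, device), osd_names in osds_by_device.items():
        # One series per OSD: safe for exact vector matching on ceph_daemon.
        for osd in osd_names:
            samples.append(('ceph_disk_occupation',
                            {'instance': host, 'device': device,
                             'ceph_daemon': osd}, 1))
        # One combined series for human-readable output, where ceph_daemon
        # may look like 'osd.1+osd.2'.
        samples.append(('ceph_disk_occupation_human',
                        {'instance': host, 'device': device,
                         'ceph_daemon': '+'.join(osd_names)}, 1))
    return samples
```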
Josh Salomon [Thu, 13 Jan 2022 02:23:07 +0000 (02:23 +0000)]
osd, tools: refactor OSDMap::calc_pg_upmaps (simplify the code)
This is the first commit in a series of commits that aims at adding a primary balancer to Ceph and improving the current upmap balancer functionality. This first commit focuses on simplifying (refactoring) the code of `calc_pg_upmaps` so it is easier to change in the future. This PR keeps the existing functionality as-is and does not change anything but the code structure.
Since this work involves major refactoring of OSDMap::calc_pg_upmaps, the first step is adding an --upmap-seed param to osdmaptool so that test results can be compared without the random factor.
Other changes made:
- Divided sections of `OSDMap::calc_pg_upmaps` into their own separate functions
- Renamed tmp to tmp_osd_map
- Changed all the occurrences of 'first' and 'second' in the function to more meaningful names.
gal salomon [Mon, 12 Apr 2021 05:54:37 +0000 (08:54 +0300)]
parquet implementation:
(1) adding arrow/parquet to the build (install is missing)
(2) the s3select operation contains 2 flows: CSV and Parquet
(3) in the Parquet flow the s3select processing engine calls (via callback) get-size and range-request; the range requests are async, thus the caller waits until notification.
(4) flow: execute --> s3select --(arrow layer)--> range-request --> GetObj::execute --> send_response_data --> notify-range-request --> (back to) --> s3select
(5) in the Parquet flow s3select handles the response (using callbacks) because of the AWS response limitation (16 MB)
add unique pointer (rgw_api); verify magic number for parquet objects; s3select module update
fix buffer overflow (copy range request)
change the range-request flow: it now needs to use the callback parameters (ofs & len) and not the element length
refactoring: separate the CSV flow from the Parquet flow, a phase before adding the conditional build (depends on arrow package installation)
adding arrow/parquet installation to debian/control
align s3select repo with RGW (missing APIs, such as get_error_description)
undefined reference to arrow symbol
fix comment: using optional_yield by value
fix comments; remove future/promise
s3select: a leak fix
s3select: fixing result production
s3select,s3tests : parquet alignments
typo: git-remote --> git_remote
s3select: remove redundant comma (end of projections); bug fix in parquet flow for aggregation queries
adding arrow/parquet
editorial. remove blank lines
s3select: merged with master (output serialization, presto alignments)
merging (not rebasing) master functionalities into the parquet branch
(*) dedicated source files for the s3select operation.
(*) s3select engine: fix leaks in Parquet flows, enabling allocation of csv_object and parquet_object on the stack
(*) csv_object and parquet_object are allocated on the stack (no heap allocation)
move data members from heap to stack allocation, refactoring, separate flows for CSV and Parquet. s3select: bug fix
conditional build: when the arrow package is installed the Parquet flow becomes visible, thus enabling processing of Parquet objects; if the package is not installed only CSV is usable
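For illustration, a query against a Parquet object can be issued through the S3 Select API; a minimal sketch using boto3, where the endpoint, credentials, bucket, key, and SQL expression are all placeholders:
```
import boto3

s3 = boto3.client('s3',
                  endpoint_url='http://rgw.example.com:8000',  # placeholder RGW endpoint
                  aws_access_key_id='ACCESS_KEY',
                  aws_secret_access_key='SECRET_KEY')

resp = s3.select_object_content(
    Bucket='mybucket',                    # placeholder bucket
    Key='data.parquet',                   # placeholder Parquet object
    ExpressionType='SQL',
    Expression='select count(*) from s3object;',
    InputSerialization={'Parquet': {}},   # Parquet flow (requires arrow support)
    OutputSerialization={'CSV': {}},
)

# The result is streamed back as events; print the record payloads.
for event in resp['Payload']:
    if 'Records' in event:
        print(event['Records']['Payload'].decode())
```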
blk: introduce multi-size huge page pools to KernelDevice.
When testing, keep `bluestore_max_blob_size` in mind, as it is
only 64 KB by default while the entire huge page-based pools
machinery targets far bigger scenarios (initially 4 MB!).
blk: bring MAP_HUGETLB-based buffer pool to KernelDevice.
The idea here is to bring a pool of `mmap`-allocated,
constant-sized buffers which would take precedence
over the 2 MB-aligned, THP-based mechanism. On the first
attempt to acquire a 4 MB buffer, KernelDevice mmaps
`bdev_read_preallocated_huge_buffer_num` (default 128)
memory regions using the MAP_HUGETLB option. If this
fails, the entire process is aborted. When their lifetimes
end, buffers are recycled via a lock-free queue shared
across the entire process.
Remember to allocate the appropriate number of
huge pages on the system! For instance:
```
echo 256 | sudo tee /proc/sys/vm/nr_hugepages
```
Ilya Dryomov [Tue, 11 Jan 2022 12:13:01 +0000 (13:13 +0100)]
qa/run_xfstests_qemu.sh: harden against wget failures
If wget fails (e.g. due to a certificate issue), it still creates
an empty file. This file is then marked executable, ./"${SCRIPT}"
immediately returns 0, and run_xfstests_qemu.sh exits successfully
without running a single xfstest.
This started on Sep 30, 2021 with the expiration of the Let's Encrypt
root certificate -- all qemu jobs with "test: qa/run_xfstests_qemu.sh"
just booted the VM for a couple of seconds and reported success.