Kotresh HR [Tue, 16 Aug 2022 11:41:33 +0000 (17:11 +0530)]
mgr/volumes: Remove stale snapshot user metadata
This patch adds the capability to remove the stale snapshot user
metadata while loading the subvolume if it is present. It can't
be done in 'SubvolumeBase.discover' since v1 and v2 snapshot paths
are different. This is done just after the discover before returning
the specific version object.
mgr/volumes: Allow forceful snapshot removal on osd full
When the osd is full, if the snapshot has metadata set, it
can't be removed as user metadata can't be removed when osd
is full. This patch provides a way to remove the snapshot
with 'force' option while keeping the corresponding metadata
which gets removed on subvolume discover when it finds space.
Kotresh HR [Tue, 16 Aug 2022 11:38:16 +0000 (17:08 +0530)]
mgr/volumes: Better handle config file on osd full scenario
The 'metadata_mgr.flush()' used to truncate the config file
before flushing the new config data. This could lead to an
empty config file when there is no space to write new config
data. This patch handles this scenario by writing it to
temporary file and rename it to config file. This would
retain the config file without truncating it.
Also, there are bunch of places which wasn't handling
'MetadataMgrException' because of this. Fixed those.
If any clone is in pending or in-progress state then
show these clones in 'fs subvolume snapshot info'
command output. This field only exists if clones are
in pending or in progress state.
Adam King [Thu, 25 Aug 2022 16:09:49 +0000 (12:09 -0400)]
mgr/orchestrator/tests: don't match exact whitespace in table output
It seems that the exact spacing may differ a bit between
python versions. Currently seeing py3 (which cooresponds to py 3.6
on my system) passing these tests and py37 (which is python 3.7
obviously) failing. I think verifying against the exact whitespace
is unnecessary anyhow. As long as it isn't egregious, we don't
really need to worry about exactly what the spacing is.
Adam King [Wed, 24 Aug 2022 14:36:53 +0000 (10:36 -0400)]
doc/cephadm: fix example for specifying networks for rgw
count_per_host must be used with underscores rather
than dashes to work, you need to pass service_id not
service_name and the option for the port is called
rgw_frontend_port not just "port"
Zac Dover [Tue, 23 Aug 2022 06:59:04 +0000 (16:59 +1000)]
doc/mgr: edit orchestrator.rst
This PR improves the English language in the "Orchestrator CLI"
section of the MGR documentation. It adds a couple of section
headers in order to signpost the information in the document
a bit more than had already been done, but it makes no major
structural changes to the presentation of the information here.
This PR was motivated by feedback from the 2022 Ceph User Survey
in which one of the respondents wrote "better ceph orch documen-
tation".
The final section on this page, "Current Implementation Status",
must be verified by someone who is familiar with the current state
of "ceph orch" and a date stamp should be applied to the top of
the section so that the word "current" has a meaningful referent.
Xiubo Li [Wed, 6 Apr 2022 00:12:26 +0000 (08:12 +0800)]
ceph-fuse: add dedicated snap stag map for each directory
This will fix the fino colliding bug, which is caused when the
snapid is later than 0xffff.
From mds 'mds_max_snaps_per_dir' option, we can see that the max
snapshots for each directory is 4_K, and in ceph-fuse we have
around 64_K, which is from 0xffff - 2, stags could be used to make
the fake fuse inode numbers for each directory.
Xiubo Li [Thu, 24 Mar 2022 02:01:57 +0000 (10:01 +0800)]
ceph-fuse: return EINVAL if get invalid fino instead of assert
All the snap ids of the finos returned to libfuse from libcephfs
will be recorded in the map of 'stag_snap_map', and will never be
erased before unmounting. So if libfuse passes invalid fino the
ceph-fuse should return EINVAL errno instead of crash itself.
Xiubo Li [Wed, 23 Mar 2022 02:05:32 +0000 (10:05 +0800)]
mds-client: make the fake inos option unchangeable in runtime
If the flags is empty then in option.h in can_update_at_runtime()
it will return true. That means this opetion could be changed in
runtime, which is buggy. Because if this is false, ceph-fuse will
use its own fake inos instead of libcephfs'. If this is changed
during runtime, we will hit inos dosn't exist assert bugs.
John Mulligan [Tue, 2 Aug 2022 13:45:59 +0000 (09:45 -0400)]
mgr/volumes: drop pre-python 3.2 version checks
Based on other conversations we believe that there is no need to support
python versions lower than Python 3.6 for pacific and later. This means
it is safe to drop the remaining version checks for python
3.2.
John Mulligan [Mon, 11 Jul 2022 20:44:00 +0000 (16:44 -0400)]
mgr/volumes: a lock to guard against races reading/writing config
Fixes: https://tracker.ceph.com/issues/55583
Use a python threading lock to avoid race conditions where the
config file is being both read and written to at the same time.
Before this change, the content of the config file being parsed could be
'corrupted' by the MetadataManager racing with itself. Along with the
previous two patches, additional logging was added to the mgr code to
produce the simplified version of the mgr log below:
```
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'[GLOBAL]\nversion = 2\ntype = clone\npath = /volumes/Park/babydino2/c9f773af-5221-49c6-846c-d65c0920ae3f\nstate = pending\n\n[source]\nvolume = cephfs\ngroup = Park\nsubvolume = Jurrasic\nsnapshot = dinodna0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b''
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'[GLOBAL]\nversion = 2\ntype = clone\npath = /volumes/Park/babydino2/c9f773af-5221-49c6-846c-d65c0920ae3f\nstate = pending\n\n[source]\nvolume = cephfs\ngroup = Park\nsubvolume = Jurrasic\nsnapshot = dinodna0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] wrote 203 bytes to config b'/volumes/Park/babydino2/.meta'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b'a0\n\n'
[volumes INFO volumes.fs.operations.versions.metadata_manager] READ: b''
[volumes ERROR volumes.module] Failed _cmd_fs_clone_cancel(clone_name:babydino2, format:json, group_name:Park, prefix:fs clone cancel, vol_name:cephfs) < "":
Traceback (most recent call last):
...
File "/usr/lib64/python3.6/configparser.py", line 1111, in _read
raise e
configparser.ParsingError: Source contains parsing errors: b'/volumes/Park/babydino2/.meta'
[line 13]: 'a0\n'
```
Looking at the above you can see that the log indicates a write to the
config file (of 203 bytes). This happens before the file has finished
reading and thus instead of getting an empty string indicating EOF, it
gets that last four bytes of the new content of the file. The lock
prevents the MetadataManager from both reading and writing the config
file at the same time.
John Mulligan [Tue, 12 Jul 2022 22:33:07 +0000 (18:33 -0400)]
mgr/volumes: write volume metadata with shim class
Add a class that works a bit like a python file object so that we
can simplify the flush function. Providing a file-like object to
the ConfigParser's write function avoids unnecessary copies to
a StringIO object and makes the code easier to read.
With no more uses of StringIO, the StringIO imports are removed.
John Mulligan [Tue, 12 Jul 2022 22:32:54 +0000 (18:32 -0400)]
mgr/volumes: read volume metadata file using read_string
The read_string method, available in Python 3.2 (we assume Python 3.6 as
our current minimum python versino), supports parsing a provided string
for ini-style configuration parameters. Refactoring the reading of the
config file from cephfs into a simple iterator function and then
providing it to the ConfigParser as a single string, allows us to avoid
using StringIO and simplifies the refresh function.
Nizamudeen A [Tue, 26 Apr 2022 10:19:09 +0000 (15:49 +0530)]
mgr/dashboard: prometheus rules internal server error
After we increase/decrease the count of the node-exporter, we get a 500
- Internal server error from api/prometheus/rules endpoint. On further
debugging its caused by the jsonDecodder, because I guess the expected
input for the json.loads() is not a json formatted input. So to fix
that issue I can either do an error handling on the json.loads() or I
can move the json.loads() on the already existing try block. I went for
the second approach here.
qa: filter internal directories in 'subvolumegroup ls' command
Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolmegroup ls' command.
2. Internal directories should not be accepted as a group name.
mgr/volumes: filter internal directories in 'subvolumegroup ls' command
Internal directories: '_nogroup', '_index', '_legacy', '_deleting'
1. Internal directories should be filtered in 'subvolmegroup ls' command.
2. Internal directories should not be accepted as a group name.
Used the https://www.npmjs.com/package/@grafana/e2e npm packages and
followed
https://github.com/grafana/grafana/blob/main/contribute/style-guides/e2e.md
to understand the style of the grafana e2e testing.
In this PR I introduces the tests for the Hosts Overall
Performance and also RGW per Daemon and Overall Performance
To be able to recreate and test pg log duplicate entries, a new option
added to the COT: --op pg-log-inject-dups we will also need to provide
--file json_arry of dups, it can get as many dups that need to be inject
the json for dups is in the following format:
{"reqid": "client.n.n:n", "version": "n'n", "user_version": n, "return_code": n}
Xiubo Li [Mon, 7 Mar 2022 07:42:42 +0000 (15:42 +0800)]
mds: flush mdlog if locked and still has wanted caps not satisfied
In _do_cap_update() if one client is releasing the Fw caps the
relevant client range will be erased, and then new_max will be 0.
It will skip flushing the mdlog after it submitting a journal log,
which will keep holding the wrlock for the filelock.
So when a new client is trying to open the file for reading, since
the wrlock is locked for the filelock the file_eval() is possibly
couldn't changing the lock state and at the same time if the
filelock is in stable state, such as in EXECL, MIX. The mds may
skip flushing the mdlog in the open related code too.
We need to flush the mdlog if there still has any wanted caps
couldn't be satisfied and has any lock for the filelock after the
file_eval().
Kefu Chai [Mon, 8 Aug 2022 14:41:17 +0000 (22:41 +0800)]
pybind/mgr/dashboard: do not use distutils.version.StrictVersion
replace `distutils.version.StrictVersion` with
`pkg_resources.parse_version()`
as the former is deprecated, see https://peps.python.org/pep-0632/.
let's use `pkg_resources` instead. this change also addresses
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1010894.
we have this issue when testing with an ubuntu jammy test node.
see https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1967139
Kefu Chai [Mon, 8 Aug 2022 14:41:17 +0000 (22:41 +0800)]
pybind/mgr/dashboard: do not use distutils.version.StrictVersion
replace `distutils.version.StrictVersion` with
`pkg_resources.parse_version()`
as the former is deprecated, see https://peps.python.org/pep-0632/.
let's use `pkg_resources` instead. this change also addresses
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1010894.
we have this issue when testing with an ubuntu jammy test node.
see https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1967139