Since inject_facts_as_vars is set to false in the ansible.cfg file then we
have to update the references to use ansible_facts[<thing>] instead of
ansible_<thing>.
We already install the dependency from ceph-ansible requirements.txt and to
avoid false positive (like after rebooting a node) we can retry failing test.
Without loading the ansible.cfg file from ceph-ansible project, we don't
have the pipelining enabled which can result in significant performance
improvement.
This removes the ANSIBLE_ACTION_PLUGINS, ANSIBLE_RETRY_FILES_ENABLED and
ANSIBLE_SSH_RETRIES environment variables as it is already included in the
ansible.cfg file.
ceph-volume/tests: update ansible ssh_args env var
The ansible ssh_args parameter is usually defined in the ansible.cfg file.
Currently this variable is overrided in tox to manage the vagrant ssh file
but we lost all default values.
rpm: cleanup: drop %service_del_postun_without_restart
SUSE needs %service_del_postun (with or without restart) *only* if there
is a possibility that the RPM containing the unit file will be upgraded
from a version that packaged SysVinit scripts instead of systemd unit
files. (Which is not the case here.)
Dan van der Ster [Tue, 29 Jun 2021 20:36:00 +0000 (22:36 +0200)]
mgr/DaemonServer: skip redundant update of pgp_num_actual
During PG merge the MGR was observed repeatedly sending identical
set pgp_num_actual values, leading to osdmap churn at 2000/hr.
Skip the redundant osd set pgp_num_actual command if the
pgp_num is already our computed next.
Fixes: https://tracker.ceph.com/issues/51433 Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 3f15749de0d550a124f8c6afbd457f17ef020963)
Ilya Dryomov [Wed, 26 Aug 2020 12:12:29 +0000 (14:12 +0200)]
rbd: fix default pool handling for krbd map/unmap
The default pool name does not get passed to the kernel since commit 96f05a7956b3 ("rbd: delay determination of default pool name"). The
kernel ends up interpreting the image name as the pool name (and the
snapshot name as the image name).
Alfonso Martínez [Mon, 19 Jul 2021 07:57:26 +0000 (09:57 +0200)]
mgr/dashboard: run cephadm-backend e2e tests with KCLI
Fixes: https://tracker.ceph.com/issues/51300 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 5c03b49c4da55cf8d0c679ecb2c58182e4d3361a)
Conflicts:
- Added content in HACKING.rst as dash-devel.rst does not exist in octopus:
doc/dev/developer_guide/dash-devel.rst
src/pybind/mgr/dashboard/HACKING.rst
- Adapted code to octopus branch in the following files due to branch divergence:
src/pybind/mgr/dashboard/frontend/cypress.json
src/pybind/mgr/dashboard/frontend/cypress/integration/cluster/configuration.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/cluster/hosts.po.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/cluster/osds.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/orchestrator/workflow/01-hosts.e2e-spec.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/page-helper.po.ts
src/pybind/mgr/dashboard/frontend/cypress/integration/ui/dashboard.e2e-spec.ts
Jason Dillaman [Thu, 29 Oct 2020 14:10:56 +0000 (10:10 -0400)]
librbd: refresh full global config when applying metadata
The ConfigProxy contains a point-in-time copy of the global config
that is dynamically updated in CephContext::_conf. Upon an image
refresh, pull the latest version of the global config from the
CephContext and apply it to the config stored within the ImageCtx.
Fixes: https://tracker.ceph.com/issues/48035 Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 352dec753ead8b61e19b46d096255e06393b740f)
Adam C. Emerson [Wed, 14 Jul 2021 15:02:21 +0000 (11:02 -0400)]
rgw: Robust notify invalidates on cache timeout
This avoids a potential race condition in which updates are delayed.
Fixes: https://tracker.ceph.com/issues/51674 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 76247990ff38049ee32dd47d31482b9648353673)
Conflicts:
src/rgw/services/svc_notify.cc
- The original patchset was post-DPP, this branch is pre-DPP
- Skip the renaming, since this is a backport and that's mostly a
matter of futureproofing.
Backport: https://tracker.ceph.com/issues/51678 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Wed, 7 Jul 2021 22:47:00 +0000 (18:47 -0400)]
rgw: distribute() takes RGWCacheNotifyInfo
So we don't have to parse the bufferlist back out to find what object
to throw out of the cache.
Fixes: https://tracker.ceph.com/issues/51674 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 7f952ad80114096322f202ba58279aaa4a002313)
Conflicts:
src/rgw/services/svc_notify.cc
src/rgw/services/svc_notify.h
src/rgw/services/svc_sys_obj_cache.cc
- The original version is post-DPP, this is pre-DPP
Backport: https://tracker.ceph.com/issues/51678 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Kefu Chai [Tue, 25 May 2021 06:23:13 +0000 (14:23 +0800)]
osd/osd_type: use f->dump_unsigned() when appropriate
it is more explicit than `dump_stream()`, as we can not tell if dumped
variable is an integer or not by reading the code. it helps us to figure
out the scheme of the dumped object.
f7c5b01e18 tried to fix this, but adding peer_purged.erase() into
the peer_info loop made no effect because in purge_strays() when
inserting an osd to peer_purged we simultaneously remove it from
peer_info.
So it should be a separate loop through peer_purged list.
these parameters have proven to catch some of the uncaught bugs such as:
https://tracker.ceph.com/issues/48417, adopting them will help in
preventing more such hard to debug bugs.
This _mkdir_p should never have worked as the first directory it tries
to stat/mkdir is "", the empty string. This causes an assertion in the
client. I'm not sure how this code ever functioned without causing
faults. They look like:
2021-07-01 02:15:04.449 7f7612b5ab80 3 client.178735 statx enter (relpath want 2047)
胡玮文 [Mon, 21 Jun 2021 13:31:49 +0000 (21:31 +0800)]
mgr/dashboard: fix OSD out count
Think we have 3 OSDs out but up (prepare for re-formatting to change min_alloc_size), and another OSD down but in
(during reboot). The dashboard will display "1 down, 2 out", which is obviously incorrect. It should be "1 down, 3 out"
The rgw bucket creation form has the Name field which have an async
validator. The validator calls all the bucket name and check if the
entered name is unique or not. This happens on every keystroke. So if
100 or more buckets are there, then the async validation can be real
slow and causes misvalidations in different fields.
I changed the validation logic and did some cleanups to improve the
performance of the async validation.
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/ceph/rgw/rgw-bucket-form/rgw-bucket-form.component.ts
- Solved some import conflicts. Used the I18N import and removed the
forkJoin import
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.spec.ts
- Dont need ${RgwHelper.DAEMON_QUERY_PARAM}
src/pybind/mgr/dashboard/frontend/src/app/shared/api/rgw-bucket.service.ts
- Removed enumerate function
Paul Cuzner [Wed, 2 Jun 2021 23:34:19 +0000 (11:34 +1200)]
mgr/cephadm:fix alerts sent to wrong URL
The path_prefix in prometheus.yml was specifying an
endpoint prefix, which was invalid. This resulted in 404
errors when trying to send alerts to alertmanager and
blocked alerts being sent on to the ceph-dashboard API
receiver. This fix remves this prefix.
Fixes: https://tracker.ceph.com/issues/51073 Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
(cherry picked from commit 9d408a70c7d01fd7c94f9b814af916396d7cbf1f)
mgr/cephadm: fix issue with missing prometheus alerts
Files passed as configuration to the cephadm binary had not been created
and mapped to the container, if those files weren't included in the
required files section inside cephadm. This prevented optional file
includes in the configuration.
The configuration file for the Prometheus default alerts is not
mandatory and hence wasn't included in the required files section, still it
needs to be added to the container by cephadm.
This change enables optional files to be included in the configuration
for monitoring components, so that those files are created and mapped
within the container.
Note that a `required_files` variable has been removed at one position
in these changes, though it wasn't used to ensure that required files
were included in the configuration at that point anyway. The test which
ensures that all required files are passed is somewhere else.
Ilya Dryomov [Sun, 2 May 2021 21:13:29 +0000 (23:13 +0200)]
qa/workunits/rbd: disable qemu-iotest test 055 globally
It doesn't work on Focal and already disabled on CentOS 7 and 8. More
importantly, it doesn't actually test rbd -- it always tests "file", no
matter which protocol is specified in IMGPROTO.
Aaryan Porwal [Wed, 26 May 2021 08:58:15 +0000 (14:28 +0530)]
mgr/dashboard: fix for right sidebar nav icon not clickable
fixed the responsive sidebar not opening on click event, and close sidebar on clicking tasks and notification list item because it'll be over shadowed by the sidebar Signed-off-by: Aaryan Porwal <aaryanporwal2233@gmail.com>
(cherry picked from commit 4e53a139d96215477d00eb709c1662d8277cba1d)
Conflicts:
src/pybind/mgr/dashboard/frontend/src/app/core/navigation/navigation/navigation.component.html
- Adopt the master branch changes.
Patrick Donnelly [Fri, 18 Jun 2021 16:27:54 +0000 (09:27 -0700)]
mds: avoid journaling overhead for ceph.dir.subvolume for no-op case
In preparation for acquiring the xlock on the directory inode, the MDS
must journal a few events before continuing on with the setvxattr. This
can cause significant delays in the volumes ceph-mgr module which needs
to regularly enable this vxattr from multiple code paths. We could cache
in that module whether the vxattr is set but it's also pretty easy to
adjust the MDS to acquire a rdlock on the directory to check if the
subvolume flag is already set. That is much lighter weight and the lock
is generally readily available.
Fixes: https://tracker.ceph.com/issues/51276 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit b5f736eee408c220ffdfb67b10667a7b553dac25)