Paul Cuzner [Wed, 18 Aug 2021 05:02:32 +0000 (17:02 +1200)]
cephadm: Add listening ports to gather-facts output
This patch adds tcp and udp listening ports to the data
returned by gather-facts. This can be used to check port
availability prior to trying to deploy daemons, to
catch port conflicts earlier. IPv4 and IPv6 are supported.
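A minimal sketch of how listening ports could be collected, assuming the data comes from the Linux /proc/net/tcp and /proc/net/tcp6 tables (the patch text above does not say how cephadm gathers it). The function name and the sample data are hypothetical.

```python
TCP_LISTEN = "0A"  # state code used for LISTEN in /proc/net/tcp and tcp6


def parse_listening_ports(proc_net_text, wanted_state=TCP_LISTEN):
    """Return the sorted local ports whose socket state equals wanted_state."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state != wanted_state:
            continue
        # local_address is "<hex address>:<hex port>" for both IPv4 and IPv6
        ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return sorted(ports)


# Hypothetical sample; real files carry more columns after "st".
SAMPLE_TCP = "\n".join([
    "  sl  local_address rem_address   st",
    "   0: 00000000:0016 00000000:0000 0A",
    "   1: 0100007F:0CEA 0100007F:A1B2 01",
    "   2: 00000000000000000000000000000000:1F90 "
    "00000000000000000000000000000000:0000 0A",
])

listening = parse_listening_ports(SAMPLE_TCP)
```

A deployment check could then compare a daemon's intended port against this list before starting it.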
Oleander Reis [Wed, 18 Aug 2021 13:45:42 +0000 (15:45 +0200)]
cephadm: check for openntpd.service as time sync service
openntpd is an alternative implementation of time synchronization
by the OpenBSD project and is packaged for Debian and Ubuntu
since at least jessie / 18.04, with the service named openntpd.service.
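The check can be pictured as walking a list of candidate unit names and stopping at the first active one. This is only a sketch: apart from openntpd.service, the unit names in the list are assumptions, and both helper functions are hypothetical rather than cephadm's own code.

```python
import subprocess

# Only openntpd.service is named in the commit message; the other entries
# are illustrative assumptions about commonly used time sync units.
TIME_SYNC_UNITS = [
    "chronyd.service",
    "systemd-timesyncd.service",
    "ntpd.service",
    "openntpd.service",
]


def systemctl_is_active(unit):
    """Ask systemd whether a unit is active (exit status 0 means active)."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def find_time_sync_unit(is_active=systemctl_is_active,
                        candidates=TIME_SYNC_UNITS):
    """Return the first active candidate unit, or None if none is active."""
    for unit in candidates:
        if is_active(unit):
            return unit
    return None
```

Passing the `is_active` callable in makes the lookup testable on hosts without systemd.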
mgr/dashboard: rgw service creation form: add realm and zone to service spec.
Align rgw service id pattern with cephadm: https://github.com/ceph/ceph/pull/39877
- Update rgw pattern to allow service id for non-multisite config.
- Extract realm and zone from service id (when detected) and add them to the service spec.
Fixes: https://tracker.ceph.com/issues/44605 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 0575844192502ded32962b75a91cf51de22e97e6)
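A rough sketch of the extraction step, assuming a multisite service id has the shape `<realm>.<zone>` with an optional `.<suffix>`; the exact pattern is defined in the linked pull request and may differ. The regular expression, the function name, and the spec keys are illustrative assumptions.

```python
import re

# Assumed shape of a multisite service id: realm.zone[.suffix]
SERVICE_ID_RE = re.compile(r"^(?P<realm>[^.]+)\.(?P<zone>[^.]+)(\.[^.]+)?$")


def build_rgw_spec(service_id):
    """Build a service spec dict, adding realm and zone when detected."""
    spec = {"service_type": "rgw", "service_id": service_id}
    match = SERVICE_ID_RE.match(service_id)
    if match:
        # Multisite form detected: carry realm and zone in the spec.
        spec["rgw_realm"] = match.group("realm")
        spec["rgw_zone"] = match.group("zone")
    return spec
```

A plain id such as `foo` yields a spec without realm or zone, which corresponds to the non-multisite case.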
mgr/dashboard: connect-rgw: rename to set-rgw-credentials; refactoring
- Rename the dashboard command to better reflect its behavior.
- Rename '_radosgw_admin' method to 'send_rgwadmin_command' for consistency with
'send_mon_command' and move it to the mgr_module.py .
- Cleanup: remove unneeded rgw settings.
- Better error handling and test coverage.
Fixes: https://tracker.ceph.com/issues/44605 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 6e20ef1dd35f3681d14cd4e08ca63eb20edc2c88)
Alfonso Martínez [Wed, 28 Jul 2021 07:48:18 +0000 (09:48 +0200)]
mgr/dashboard: connect-rgw: adaptation and test coverage
- Align Dashboard with cephadm: configure credentials using the same logic.
- Fix: create a 'dashboard' user per realm (before: only on 1st realm).
- Lint fixes, test coverage, method renaming to better reflect behavior and method visibility.
Fixes: https://tracker.ceph.com/issues/44605 Signed-off-by: Alfonso Martínez <almartin@redhat.com>
(cherry picked from commit 0fcf0a7827cf4e8748a382613f9c8d1715c4a1e8)
Sage Weil [Thu, 5 Aug 2021 14:24:13 +0000 (10:24 -0400)]
mgr/cephadm: enable prometheus module before deploying prometheus
The mon will restart the mgr when the module is enabled, so we don't
really have to do anything here. The raise is there just in case the
mgr doesn't immediately get the new mgrmap and respawn, although there is
likely no harm done if we continue to deploy prometheus in the meantime,
even if we're interrupted partway through.
Igor Fedotov [Tue, 9 Feb 2021 15:29:01 +0000 (18:29 +0300)]
os/bluestore: cap omap naming scheme upgrade transaction.
We shouldn't use a single per-onode transaction for such an upgrade when the onode's omap list is huge. This results in similarly sized WAL/SST files, which are inefficient, might cause high memory usage, and are sometimes error-prone.
Fixes: https://tracker.ceph.com/issues/49170 Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit e897fa243c1dd38329733b452872616023f14ac8)
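The idea of capping the transaction size can be illustrated in Python as splitting a large key list into bounded batches, each of which would be submitted as its own transaction. This is a generic sketch, not BlueStore's C++ code; the function name and the limit parameter are hypothetical.

```python
from itertools import islice


def batched_transactions(omap_keys, max_keys_per_txn):
    """Yield lists of at most max_keys_per_txn keys, in original order."""
    if max_keys_per_txn < 1:
        raise ValueError("max_keys_per_txn must be at least 1")
    iterator = iter(omap_keys)
    while True:
        batch = list(islice(iterator, max_keys_per_txn))
        if not batch:
            return
        yield batch  # each batch stands for one bounded transaction
```

Because the generator consumes its input lazily, memory use is bounded by the batch size rather than by the full omap list.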
libudev uses fnmatch(3) for matching attributes, meaning that shell
glob pattern matching is employed instead of literal string matching.
Escape glob metacharacters to suppress pattern matching.
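The same pitfall can be reproduced with Python's fnmatch module, which also applies shell glob rules. The sketch below is an analogy only: the actual fix is in C against libudev, and `glob.escape()` brackets metacharacters, which may not be the escaping style that fix uses. The helper name is hypothetical.

```python
import fnmatch
import glob


def matches_literally(value, attribute):
    """Compare value against attribute with glob metacharacters escaped."""
    # glob.escape() wraps '*', '?' and '[' in brackets so they match only
    # themselves instead of acting as wildcards.
    return fnmatch.fnmatchcase(value, glob.escape(attribute))


# Unescaped, the attribute behaves as a pattern and matches too much.
unescaped_hit = fnmatch.fnmatchcase("axbc", "a?b[c]")
# Escaped, only the literal string matches.
escaped_hit = matches_literally("axbc", "a?b[c]")
literal_hit = matches_literally("a?b[c]", "a?b[c]")
```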
Kamoltat [Fri, 25 Jun 2021 22:40:43 +0000 (22:40 +0000)]
pybind/mgr/autoscaler: don't scale pools with overlapping roots
In the previous version of get_subtree_resource_status() in
src/pybind/mgr/pg_autoscaler/module.py, pools with overlapping roots were
not accounted for. In some cases, combined with the new `scale-down`
algorithm in https://github.com/ceph/ceph/pull/38805, this can cause
some pools to scale up/down to an inappropriate number of pgs.
Therefore, this PR identifies the overlapping roots and prevents the pools
with such roots from scaling. This only happens with the `scale-down` profile,
as we see no problem with the default `scale-up` profile.
Removed the variable `pool_root` since it is not used anywhere in
the code; it only gets assigned and reassigned.
Also included a unit test, test_overlapping_roots.py, that tests the function
identify_subtrees_and_overlaps(), and edited test_cal_final_pg_target.py
to account for pools that contain overlapping roots; those pools
are expected not to scale.
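One way to picture overlap detection is to treat each CRUSH root as the set of OSDs beneath it and flag every pair of roots that share an OSD. This is a simplified sketch and not the body of identify_subtrees_and_overlaps(); the function name and the input layout are assumptions.

```python
def find_overlapping_roots(root_osds):
    """Return the set of root names that share at least one OSD.

    root_osds maps a root name to the set of OSD ids beneath it.
    """
    overlapped = set()
    roots = sorted(root_osds)
    for index, first in enumerate(roots):
        for second in roots[index + 1:]:
            if root_osds[first] & root_osds[second]:  # non-empty intersection
                overlapped.update((first, second))
    return overlapped


# Hypothetical layout: "default~ssd" is a subset of "default".
example = {
    "default": {0, 1, 2, 3},
    "default~ssd": {2, 3},
    "archive": {7, 8},
}
skip_scaling = find_overlapping_roots(example)
```

Pools whose root appears in the returned set would then be left out of scaling decisions.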
Kefu Chai [Mon, 28 Jun 2021 04:28:17 +0000 (12:28 +0800)]
pybind/mgr/pg_autoscaler: extract CrushSubtreeResourceStatus out
as it also serves as part of interface of get_subtree_resource_status(),
not only its internals. to ease adding the type annotations, this class
is promoted out of its enclosing class.
The autoscaler by default will start out each pool with minimal
pgs and `scale-up` the pgs when there is more usage in each pool.
Users can now use the commands:
`osd pool set autoscale-profile scale-down` to make the pools
start out with a full complement of pgs and only `scale-down`
when the usage ratio across the pools is not even.
`osd pool set autoscale-profile scale-up` (by default) to make the pools
start out with minimal pgs and `scale-up` the pgs when there
is more usage in each pool.
Edited KVMonitor.cc file to make the `autoscale_profile` variable
persistent.
Edited tests/test_cal_final_pg_target.py so that it takes into account
the new `profile` argument when calling cal_final_pg_target(). Also,
added some new test cases for when profile is `scale-up`.
Renamed tests/test_autoscaler.py to a more appropriate name:
tests/test_cal_ratio.py
Kamoltat [Thu, 7 Jan 2021 15:39:19 +0000 (15:39 +0000)]
mgr/pg_autoscaler: avoid scale-down until there is pressure
The autoscaler will start out by scaling each
pool to have a full complement of pgs from the start
and will only decrease a pool's pgs when other pools need more due to
increased usage.
Introduced a unit test that tests only the
function get_final_pg_target_and_ratio(), which
deals with the distribution of pgs amongst the
pools.
Edited the workunit script to reflect the change
in how pgs are calculated and distributed.
Greg Farnum [Thu, 17 Jun 2021 19:56:20 +0000 (19:56 +0000)]
mon: Sanely set the default CRUSH rule when creating pools in stretch mode
If we get a pool create request while in stretch mode that does not explicitly
specify a crush rule, look at the stretch-mode pools and their rules, and
select the most common one.
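The selection step amounts to a frequency count over the rules already used by stretch-mode pools. A minimal sketch, with a hypothetical function name and with rule ids passed in as a plain list (the monitor itself is C++ and works on its own pool structures):

```python
from collections import Counter


def pick_default_crush_rule(stretch_pool_rules):
    """Return the most common rule id among stretch pools, or None."""
    if not stretch_pool_rules:
        return None
    # most_common(1) returns a list with one (rule, count) pair.
    return Counter(stretch_pool_rules).most_common(1)[0][0]
```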
Also update set_up_stretch_mode.sh to add a few more rules that let me test
this locally.
Casey Bodley [Tue, 10 Aug 2021 19:40:25 +0000 (15:40 -0400)]
cls/cmpomap: empty values are 0 in U64 comparisons
previously, when trying to use cmpomap interfaces on an omap key with
an empty value, U64 comparisons would fail to decode with -EIO. so
cmp_set_vals() and cmp_rm_keys() are unable to update or remove such
keys
for backward-compatibility with rgw's data sync error repo, where the
keys used to have empty values, enable these comparisons by treating an
empty value as 0
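The decoding rule can be sketched as follows, assuming the stored value is an 8-byte little-endian unsigned integer. The function name is hypothetical, and ValueError stands in for the -EIO decode failure mentioned above.

```python
import struct


def decode_u64(value):
    """Decode an omap value as u64, treating an empty value as 0."""
    if not value:
        return 0  # backward compatibility: empty values compare as zero
    if len(value) != 8:
        raise ValueError("expected 8 bytes for a u64 value")
    return struct.unpack("<Q", value)[0]
```

With this rule, a comparison such as "input greater than stored" succeeds for any positive input against a key whose value is empty, so the key can be updated or removed.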
Ramana Raja [Mon, 28 Jun 2021 23:39:10 +0000 (19:39 -0400)]
mds: create file system with specific ID
File system will need to be recreated when monitor databases are lost
and rebuilt. Some applications (e.g., CSI) expect that the recovered
file system have the same ID as before. Allow creating a file system
with a specific ID to help in such scenarios. This can now be done by
the `fs new` command using the argument 'fscid' and 'force' flag.
Newer file systems will no longer have increasing IDs as a corollary.
Kefu Chai [Fri, 20 Aug 2021 14:50:40 +0000 (22:50 +0800)]
cmake: exclude "grafonnet-lib" target from "all"
so we don't build this target when running "make", and hence avoid
accessing the internet in a build environment where internet
access is not allowed.
Conflicts:
monitoring/grafana/dashboards/CMakeLists.txt
- pacific does not have "LOG_DOWNLOAD ON", "LOG_MERGED_STDOUTERR ON", or
"LOG_OUTPUT_ON_FAILURE ON", but that fact is orthogonal to the substance of
this backport
Adam Kupczyk [Wed, 14 Jul 2021 21:35:12 +0000 (23:35 +0200)]
kv/RocksDBStore: Add handling of block_cache option for resharding
Synchronized all situations in which we initialize the DB to include handling of the block_cache option.
The lack of this handling made it impossible to reshard into the specification that we have as default.
Conflicts:
src/kv/RocksDBStore.cc
Trivial conflict, related to gist of the change. No logic involved in resolving.
Deepika Upadhyay [Wed, 23 Jun 2021 05:12:38 +0000 (10:42 +0530)]
mon/PGMap: DIRTY field as N/A in `df detail` when cache tier not in use
'ceph df detail' reports a column for DIRTY objects under POOLS even
though cache tiers are not being used. In a replicated or EC pool, all objects
in the pool are reported as logically DIRTY as they have never been
flushed.
We display N/A for DIRTY objects if the pool is not a cache tier.
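The display rule is small enough to state as a one-line function. This is an illustrative sketch with a hypothetical name, not the PGMap code itself:

```python
def format_dirty(dirty_objects, is_cache_tier):
    """Return the DIRTY column text for one pool."""
    # Only cache tier pools have a meaningful dirty object count.
    return str(dirty_objects) if is_cache_tier else "N/A"
```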
Kefu Chai [Tue, 17 Aug 2021 07:53:51 +0000 (15:53 +0800)]
mgr/dashboard/api: set a UTF-8 locale when running pip
ansible-core started to include files whose filenames are encoded in
non-ascii characters, so we have to use a more capable encoding for the
locale in order to install this package. otherwise we'd have following
error:
Collecting ansible-core<2.12,>=2.11.3
Using cached ansible-core-2.11.4.tar.gz (6.8 MB)
ERROR: Exception:
Traceback (most recent call last):
File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
status = self.run(options, args)
...
File "/tmp/tmp.fX76ASIrch/venv/lib/python3.8/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
with open(path, "wb") as destfp:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 137-140: ordinal not in range(256)
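A sketch of the workaround expressed in Python: run pip with a UTF-8 locale in its environment so non-ASCII filenames can be encoded. The locale name `C.UTF-8` is an assumption (the actual patch may set a different UTF-8 locale, and `C.UTF-8` is not available on every system), and both helper names are hypothetical.

```python
import os
import subprocess
import sys


def utf8_env(base_env=None, locale_name="C.UTF-8"):
    """Return a copy of the environment with a UTF-8 locale forced."""
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = locale_name
    env["LANG"] = locale_name
    return env


def pip_install(requirement):
    """Run 'pip install' for one requirement under the UTF-8 locale."""
    command = [sys.executable, "-m", "pip", "install", requirement]
    return subprocess.run(command, env=utf8_env(), check=True)
```

Copying the environment instead of mutating `os.environ` keeps the locale change limited to the pip subprocess.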
rgw/sts: correcting the evaluation of session policies
passed in with AssumeRoleWithWebIdentity.
Session Policies are used to restrict the permissions
granted by the identity-based policy (the Role's permission policy)
and, in some cases, the resource policy (bucket policy).