Jan Fajerski [Tue, 10 Dec 2019 13:56:37 +0000 (14:56 +0100)]
lvm: add sizing arguments to prepare and create.
This adds options to size to-be-created LVs in the prepare and create
subcommands. Sizing can be done explicitly by passing a size, or
implicitly by specifying the number of slots per [data|journal|wal|db]
device. The former tries to create an LV of the specified size and, if
it succeeds, uses it to create the OSD. The latter carves the device
size into $n slots and uses one of those slots for the to-be-created
OSD. If partitions or LVs are passed, these options are ignored.
This also lays the foundation for moving to byte-based sizing, by
moving VolumeGroup lvm querying and size calculation to bytes as the
base unit.
Fixes: https://tracker.ceph.com/issues/43299
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 8b8913ad3c0b8ceae9f458fd8d3ec292c5ff5eb1)
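A minimal sketch of the implicit slot sizing, assuming a 1 TiB device split into four db slots (names and numbers are illustrative, not the actual ceph-volume code):

    device_size = 1 << 40             # bytes -- the new byte-based base unit
    slots = 4                         # e.g. four db slots on one device
    slot_size = device_size // slots  # each to-be-created LV gets one slot
    print(slot_size)                  # 274877906944 bytes (256 GiB per slot)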
Tiago Pasqualini [Fri, 31 Jan 2020 18:22:19 +0000 (15:22 -0300)]
rgw: make max_connections configurable in beast
The Beast frontend currently accepts a hardcoded maximum number of
connections, defined by boost::asio::socket_base::max_connections.
This commit makes it configurable via a 'max_connections' config
option on the rgw frontend.
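A hedged example of how the option could be set in ceph.conf (section name, port, and value are illustrative):

    [client.rgw.gateway]
    rgw frontends = beast port=8000 max_connections=2048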
Jan Fajerski [Tue, 22 Oct 2019 12:32:11 +0000 (14:32 +0200)]
ceph-volume: api/lvm create or reuse a vg
This changes create_lv so one can pass the desired device; a VG whose
name starts with ceph is then reused, or a new one is created.
This commit also adds two new lvm primitives that make use of lvm's
select feature. The goal is to eventually avoid keeping a full list of
LVs (or VGs) around and instead query the lvm system as needed.
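A hedged sketch of such a select-based query, shelling out to vgs from Python (the helper name and selection string are assumptions, not the actual ceph-volume primitives):

    import subprocess

    def ceph_vg_names():
        # Let lvm filter server-side instead of keeping a full vg list around.
        out = subprocess.check_output(
            ["vgs", "--noheadings", "-o", "vg_name", "--select", "vg_name =~ ^ceph"]
        )
        return out.decode().split()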
Neha [Sat, 14 Dec 2019 00:49:27 +0000 (00:49 +0000)]
osd/PeeringState.cc: don't let num_objects become negative
If num_objects becomes negative, we may end up incorrectly setting
the PG_STATE_DEGRADED flag. Instead, reset num_objects to zero and set
the PG_STATE_INCONSISTENT flag when this happens, expecting scrub to
fix the inconsistency.
Fixes: https://tracker.ceph.com/issues/43308
Co-authored-by: David Zafman <dzafman@redhat.com>
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 54c5eca48ce94dc4a1416836a06203db880c3842)
Conflicts:
src/osd/PeeringState.cc
- file is missing in nautilus; backported the code change manually
to src/osd/PG.cc
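A minimal Python stand-in for the clamping logic (the real change is in C++; names are illustrative):

    num_objects = -2                  # e.g. after a stats underflow
    pg_state = set()
    if num_objects < 0:
        num_objects = 0               # reset to zero instead of going negative
        pg_state.add("INCONSISTENT")  # let scrub find and repair the true count
    print(num_objects, pg_state)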
Neha [Wed, 11 Dec 2019 23:47:19 +0000 (23:47 +0000)]
osd/PeeringState.cc: skip peer_purged when discovering all missing
We hit a couple of bugs because discover_all_missing() sends a
pg_query to an OSD that was marked stray and has already been purged.
This results in a state machine crash on the purged OSD. Fix this by
skipping any purged peers.
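A minimal sketch of the guard, with illustrative OSD ids (not the actual PeeringState members):

    might_have_unfound = {1, 3, 7}  # peers we would send pg_query to
    peer_purged = {3}               # stray peers that were already purged
    to_query = [osd for osd in sorted(might_have_unfound) if osd not in peer_purged]
    print(to_query)                 # [1, 7] -- the purged peer is skipped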
xie xingguo [Wed, 20 Nov 2019 00:31:54 +0000 (08:31 +0800)]
osd/PeeringState: do not exclude up from acting_recovery_backfill
If we choose a primary that does not belong to the current up set,
and all up peers are still recoverable, then we might end up excluding
some up peer from the acting_recovery_backfill set due to the
"want size <= pool size" constraint (since
https://github.com/ceph/ceph/pull/24035); as a result, not all up
peers may get recovered in one go.
Fix by falling through any oversized want set to async recovery, which
should be able to handle it nicely.
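A rough sketch of the fall-through with toy OSD ids (the real PeeringState code also weighs recovery cost when picking async targets):

    pool_size = 3
    want = [0, 1, 2, 3]                       # primary 0 is outside up = [1, 2, 3]
    acting = want[:pool_size]                 # keep "want size <= pool size"
    async_recovery = set(want) - set(acting)  # extras recover asynchronously
    print(acting, async_recovery)             # [0, 1, 2] {3}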
songshuangyang [Tue, 13 Nov 2018 09:32:41 +0000 (17:32 +0800)]
rgw: add list user admin OP API
The radosgw-admin tool supports the `user list` subcommand for listing radosgw users, but the admin OP API has no user-listing function. This commit adds one.
Shilpa Jagannath [Tue, 26 Nov 2019 08:03:52 +0000 (13:33 +0530)]
rgw: when a period lookup for oldest_realm_epoch returns ENOENT,
find the oldest existing period and update RGWMetadataLogHistory. This
avoids an empty cursor being passed to ceph_assert() in
PurgePeriodLogsCR::operate() in case of incomplete period history.
Conflicts:
src/rgw/services/svc_mdlog.cc
- file does not exist in mimic; made the changes manually to
src/rgw/rgw_metadata.cc
- in mimic, find_oldest_period takes an argument: store
- in mimic, write_history takes an additional argument: store
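A minimal sketch of the fallback, using in-memory stand-ins for the period store and RGWMetadataLogHistory (not the real rgw types):

    periods = {5: "period-5", 7: "period-7"}  # realm_epoch -> period; epoch 3 is gone
    history = {"oldest_realm_epoch": 3}       # stale RGWMetadataLogHistory entry

    def oldest_log_period():
        cursor = periods.get(history["oldest_realm_epoch"])  # ENOENT -> None
        if cursor is None:
            history["oldest_realm_epoch"] = min(periods)     # find the oldest period
            cursor = periods[history["oldest_realm_epoch"]]  # and repair the history
        assert cursor is not None  # an empty cursor would trip ceph_assert() later
        return cursor

    print(oldest_log_period())  # period-5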
Ilya Dryomov [Wed, 4 Dec 2019 18:08:46 +0000 (19:08 +0100)]
qa: kernel.sh: unlock before rolling back
"rbd snap rollback" expects an unlocked image, but we may get there
locked if object map is enabled (or if lock_on_read is specified in
rbd_default_map_options).
Ilya Dryomov [Wed, 4 Dec 2019 14:26:54 +0000 (15:26 +0100)]
qa: krbd_exclusive_option.sh: update for recent kernel changes
Since 5.3:
- a plain "rbd map" acquires the lock, so it's not different from
"rbd map -o exclusive" in this regard
- if the lock is held by the exclusive peer, I/O is failed right away
instead of blocking
- lock_timeout option is respected only by "rbd map" and not by I/O
Since 5.5:
- if the mapping is read-only, the lock isn't acquired
Added a blacklisting test case and dropped the lock_timeout test case.
taodd [Mon, 13 Jan 2020 14:18:45 +0000 (22:18 +0800)]
rgw: update the hash source for multipart entries during resharding
Fixes: https://tracker.ceph.com/issues/43583
Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
(cherry picked from commit fb6f78a3a54a39fb2f43fa7846cb847e4917860d)
Conflicts:
src/rgw/rgw_reshard.cc
- in master, *store has type "rgw::sal::RGWRadosStore", while in nautilus
it has type "RGWRados", but this line appears to be merely incidental
to the patch
rgw: add MFA code validation when bucket versioning status is changed
When the user changes the bucket versioning status from Enabled to
Suspended or vice versa, the MFA code needs to be validated if MFA has
been enabled for the bucket.
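A hedged boto3 example of passing the MFA token along with the versioning change (endpoint, bucket, serial, and code are placeholders; on rgw the serial comes from radosgw-admin mfa create):

    import boto3

    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8000")
    s3.put_bucket_versioning(
        Bucket="mybucket",
        MFA="my-mfa-serial 123456",  # "<serial> <totp-code>", both placeholders
        VersioningConfiguration={"Status": "Suspended"},
    )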
J. Eric Ivancich [Tue, 29 Oct 2019 23:25:51 +0000 (19:25 -0400)]
rgw: allow reshard log entries for non-existent buckets to be cancelled
The radosgw-admin tool allows admins to add buckets to the reshard log
and to cancel buckets from the reshard log. Both operations check for
the existence of the bucket before proceeding and fail for nonexistent
buckets.
It's possible, however, for an admin to add a bucket to the reshard
log and then, before the bucket is resharded, for a user to delete the
bucket. This leaves the entry in the reshard log.
Prior to this commit, an attempt to use radosgw-admin to cancel such a
reshard log entry would fail. With this commit it still fails, *but*
notifies the user that they can use the --yes-i-really-mean-it
command-line option to cancel it nonetheless. If the user includes
that option, the cancellation succeeds.
Conflicts:
src/rgw/rgw_admin.cc
- mimic has "g_conf->rgw_reshard_bucket_lock_duration" where master has
store->ctx()->_conf.get_val<uint64_t>("rgw_reshard_bucket_lock_duration")
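The resulting workflow for a bucket that no longer exists (bucket name illustrative):

    radosgw-admin reshard cancel --bucket deleted-bucket
    # fails, but suggests the override; rerun with it to remove the entry:
    radosgw-admin reshard cancel --bucket deleted-bucket --yes-i-really-mean-it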
rgw: auto-clean reshard queue entries for non-existent buckets
It is possible for a bucket to be added to the reshard queue and then
removed before its entry in the reshard queue is processed. Currently,
when this is encountered, processing of the reshard queue errors out.
This fix recognizes when a reshard queue entry refers to a
non-existent bucket and removes the entry from the reshard queue,
allowing processing of the queue to continue.
Jan Fajerski [Tue, 3 Dec 2019 12:44:00 +0000 (13:44 +0100)]
ceph-volume/batch: fail on filtered devices when non-interactive
When batch is called non-interactively and a user explicitly
specifies, say, a db-device, this device is silently filtered out when
unavailable. This can cause the resulting OSD to be very different
from the user's intention (standalone vs. external db when the
db-device was filtered). If devices get filtered in non-interactive
mode, ceph-volume should fail instead.
Fixes: https://tracker.ceph.com/issues/43105
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 2e985053deec6c4cf60c0b85aec3df16cd77ceeb)
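A hedged example of a non-interactive invocation that now fails fast if the db-device gets filtered (device paths are illustrative, and the exact flag names should be checked against the batch docs of that release):

    ceph-volume lvm batch --yes /dev/sdb /dev/sdc --db-devices /dev/nvme0n1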
Michal Skalski [Wed, 29 Jan 2020 00:29:58 +0000 (01:29 +0100)]
OSD: Allow 64-char hostname to be added as the "host" in CRUSH
On Linux systems it is possible to set a 64-character hostname when
HOST_NAME_MAX is set to 64. This means that when we call the
gethostname function, we should expect HOST_NAME_MAX characters plus
one for the null character terminating the hostname string, as
described here:
http://man7.org/linux/man-pages/man2/sethostname.2.html
With the current code, on a host with a 64-character hostname, the OSD
updates the CRUSH map with host=unknown_host during startup.
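A minimal illustration of the buffer-size reasoning, calling gethostname(2) from Python via ctypes (Linux only):

    import ctypes, ctypes.util

    HOST_NAME_MAX = 64  # Linux limit, per sethostname(2)
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    buf = ctypes.create_string_buffer(HOST_NAME_MAX + 1)  # +1 for the trailing NUL
    if libc.gethostname(buf, len(buf)) != 0:
        raise OSError(ctypes.get_errno(), "gethostname failed")
    print(buf.value.decode())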
We want this in mimic anyway because ceph_manager.py is now passing it
for all of the raw_cluster_cmds, and it would be even more awkward to
make that behavior conditional on version.
Fixes: https://tracker.ceph.com/issues/43946
Signed-off-by: Sage Weil <sage@redhat.com>