Casey Bodley [Sat, 11 Aug 2018 15:39:35 +0000 (11:39 -0400)]
rgw: data sync holds lease over transition from full to incremental
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3e9ac0f)
Signed-off-by: Jonathan Brielmaier <jbrielmaier@suse.de>
Conflicts:
src/rgw/rgw_data_sync.cc: use ldout instead of tn->log, reflect
state of luminous in multiple places
Erwan Velu [Wed, 10 Oct 2018 18:26:01 +0000 (20:26 +0200)]
ceph_volume: Checking device validity at init time
When initializing the Device structure, it has to run is_valid() to
ensure the data structures (_is_valid & rejected_reasons) are populated
according to the device state.
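ceph-volume's Device is implemented in Python; the block below is only a
minimal C++ sketch of the constructor-time validation pattern described
here, with illustrative names.
```
// Minimal sketch: run the validity check at construction time so that
// _is_valid and rejected_reasons are populated before anyone queries them.
// Names are illustrative, not the real ceph-volume (Python) implementation.
#include <string>
#include <vector>

struct Device {
  explicit Device(const std::string& p) : path(p) {
    is_valid();  // populate _is_valid and rejected_reasons at init time
  }

  bool is_valid() {
    rejected_reasons.clear();
    if (path.empty())
      rejected_reasons.push_back("no device path");
    // ... further checks against the device state would go here ...
    _is_valid = rejected_reasons.empty();
    return _is_valid;
  }

  std::string path;
  bool _is_valid = false;
  std::vector<std::string> rejected_reasons;
};
```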
Erwan Velu [Tue, 9 Oct 2018 20:28:19 +0000 (22:28 +0200)]
ceph_volume: Reporting nr_requests
We are already reporting the rotational & scheduler attributes of a disk
device. Reporting nr_requests could be useful to see how many concurrent
IOs the device supports/reports.
That could help detect badly detected or misconfigured devices.
Erwan Velu [Tue, 9 Oct 2018 20:26:28 +0000 (22:26 +0200)]
ceph_volume: Reporting firmware revision
We are already reporting the model & vendor of a given disk; let's also
report the firmware revision. That is useful to filter out some known
broken revisions.
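Both this entry and the previous one report values that the kernel exposes
through sysfs. A hedged sketch (ceph-volume itself is Python) of where the
two attributes typically come from on Linux, assuming the standard
/sys/block/<dev>/queue/nr_requests path and, for SCSI/SATA disks,
/sys/block/<dev>/device/rev:
```
// Hedged sketch: read nr_requests and the firmware revision from sysfs.
// "sda" is an illustrative device name; attributes may be absent for some
// device types, in which case an empty string is returned.
#include <fstream>
#include <iostream>
#include <string>

static std::string read_sysfs(const std::string& path) {
  std::ifstream f(path);
  std::string value;
  std::getline(f, value);  // empty if the attribute does not exist
  return value;
}

int main() {
  const std::string dev = "sda";
  std::cout << "nr_requests: "
            << read_sysfs("/sys/block/" + dev + "/queue/nr_requests") << "\n";
  std::cout << "firmware rev: "
            << read_sysfs("/sys/block/" + dev + "/device/rev") << "\n";
  return 0;
}
```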
Yuri Weinstein [Wed, 17 Oct 2018 23:27:54 +0000 (16:27 -0700)]
Excluded 'python34-cephfs','python34-rados','python34-rbd','python34-rgw','python34-ceph-argparse','python3-cephfs','python3-rados' from the install tasks
In ceph/ceph-container we've realized that `e2fsprogs` isn't installed in
the centos container image because ceph doesn't declare a dependency on it.
As a consequence, deploying a containerized cluster with dmcrypt fails
when using the centos image.
Typical error encountered:
```
......
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda5 uuid path is /sys/dev/block/8:5/dm/uuid
populate: Creating lockbox fs on %s: mkfs -t ext4 /dev/sda5
command_check_call: Running command: /usr/sbin/mkfs -t ext4 /dev/sda5
mkfs.ext4: No such file or directory
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
......
```
Kefu Chai [Thu, 18 Oct 2018 10:29:49 +0000 (18:29 +0800)]
osd: cast `whoami` to unsigned so it can be used as the seed for RNG
default_random_engine's result_type is `unsigned int`, so we need to
pass an `unsigned int` as its seed.
Fixes: http://tracker.ceph.com/issues/26890
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts:
src/osd/OSD.cc: this breaks the build with clang. In master we are not
using std::default_random_engine for setting the scrub interval, so this
change is not cherry-picked from master.
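A minimal sketch of the cast this backport describes, with whoami standing
in for the OSD id:
```
// Sketch: std::default_random_engine expects an unsigned seed, while the
// OSD id (whoami) is a signed int, so it is cast explicitly before use.
#include <iostream>
#include <random>

int main() {
  int whoami = 3;  // stands in for the OSD id
  std::default_random_engine rng(static_cast<unsigned int>(whoami));
  std::uniform_int_distribution<int> dist(0, 99);
  std::cout << "first draw for osd." << whoami << ": " << dist(rng) << "\n";
  return 0;
}
```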
qa/tasks/cram: tasks now must live in the repository
Commit 0d8887652d53 ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but, as it turned out,
git.ceph.com is still used for nightlies. There is no good way to
accommodate the different URL schemes, so let's get rid of URLs altogether.
Conflicts:
qa/suites/krbd/basic/tasks/krbd_blkroset.yaml
qa/suites/krbd/basic/tasks/krbd_huge_image.yaml
qa/suites/krbd/basic/tasks/krbd_msgr_segments.yaml
qa/suites/krbd/basic/tasks/krbd_parent_overlap.yaml
qa/suites/krbd/basic/tasks/krbd_whole_object_discard.yaml
- in master, the cram task is referred to in these additional yaml
files, but in luminous it's only referred to in
qa/suites/krbd/unmap/tasks/unmap.yaml
qa/tasks/cram: use suite_repo repository for all cram jobs
Currently git.ceph.com is hardcoded for all cram jobs. Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developers' forks fail.
Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.
Conflicts:
qa/suites/krbd/basic/tasks/krbd_blkroset.yaml
qa/suites/krbd/basic/tasks/krbd_huge_image.yaml
qa/suites/krbd/basic/tasks/krbd_msgr_segments.yaml
qa/suites/krbd/basic/tasks/krbd_parent_overlap.yaml
qa/suites/krbd/basic/tasks/krbd_whole_object_discard.yaml
- in master, the cram task is referred to in these additional yaml
files, but in luminous it's only referred to in
qa/suites/krbd/unmap/tasks/unmap.yaml
Dan van der Ster [Tue, 25 Sep 2018 08:39:37 +0000 (10:39 +0200)]
osd: add creating to pg_string_state
Fixes: http://tracker.ceph.com/issues/36174
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit d38f6a11701ec788e4d384aa5b0ae65b8e57da64)
Conflicts:
src/osd/osd_types.cc: Resolved in pg_string_state
Andrew Schoen [Mon, 8 Oct 2018 13:57:07 +0000 (09:57 -0400)]
ceph-volume: filter devices used by journals/block.db
If, after filtering of data/block devices, there is only one device left,
it cannot be used if it is an SSD that has previously been used as a
journal or block.db.
Andrew Schoen [Thu, 27 Sep 2018 20:22:17 +0000 (15:22 -0500)]
ceph-volume: pick strategy for batch with only the unused devices
This will pick a strategy, filter out any devices already used by ceph,
and then pick a strategy again. If the strategy has changed, the call
should error; if the strategy is the same, proceed. If there are no unused
devices, the command is a noop.
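ceph-volume's batch code is Python; the sketch below only illustrates the
control flow described above, with pick_strategy and is_used_by_ceph as
hypothetical stand-ins for its helpers.
```
// Control-flow sketch only; ceph-volume is Python, and pick_strategy,
// is_used_by_ceph and the Strategy values are hypothetical stand-ins.
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

enum class Strategy { bluestore_single, bluestore_mixed, filestore };

// Hypothetical helpers (illustration only).
Strategy pick_strategy(const std::vector<std::string>& devices) {
  return devices.size() > 1 ? Strategy::bluestore_mixed
                            : Strategy::bluestore_single;
}
bool is_used_by_ceph(const std::string& device) {
  return device.find("used") != std::string::npos;  // toy predicate
}

void batch(std::vector<std::string> devices) {
  if (devices.empty())
    return;                                  // nothing to do
  Strategy first = pick_strategy(devices);

  // Drop devices already carrying ceph data (journals, block.db, OSDs).
  devices.erase(std::remove_if(devices.begin(), devices.end(),
                               is_used_by_ceph),
                devices.end());
  if (devices.empty())
    return;                                  // all devices in use: a noop

  // Re-pick with only the unused devices; a changed strategy is an error.
  if (pick_strategy(devices) != first)
    throw std::runtime_error("strategy changed after filtering used devices");

  // ... proceed with the original strategy on the unused devices ...
}
```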
Neha Ojha [Tue, 9 Oct 2018 22:57:15 +0000 (15:57 -0700)]
osd/PrimaryLogPG.cc: reassign size only when object size > truncate_size
Before setting size equal to op.extent.truncate_size, we need to check
whether the size of the object is greater than the truncate_size. We do
not need to set size to op.extent.truncate_size when the size of the
object is less than op.extent.truncate_size.
Without this change, we were always setting size = op.extent.truncate_size
whenever (seq < op.extent.truncate_seq) and
(op.extent.offset + op.extent.length > op.extent.truncate_size) were both
true. This resulted in:
1. overestimating the size of the object
2. not considering the correct size of the object in the later checks,
which calculate op.extent.length for the read ops
3. crashes when trying to read more data than was present
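A sketch of the clamping logic described above (field names follow the
commit message; this is not the verbatim PrimaryLogPG code):
```
// Sketch: only clamp the size down to truncate_size when the object is
// actually larger than truncate_size, per the description above.
#include <cstdint>

struct Extent {
  uint64_t offset = 0;
  uint64_t length = 0;
  uint64_t truncate_size = 0;
  uint32_t truncate_seq = 0;
};

uint64_t effective_size(uint64_t size, uint32_t seq, const Extent& op) {
  if (seq < op.truncate_seq &&
      op.offset + op.length > op.truncate_size &&
      size > op.truncate_size) {   // the extra condition added by this fix
    size = op.truncate_size;
  }
  return size;
}
```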
Jason Dillaman [Tue, 25 Sep 2018 18:18:00 +0000 (14:18 -0400)]
osdc/Objecter: possible race condition with connection reset
If the connection quickly fails before the private session reference
can be associated with the connection, the connection will remain
closed and any OSD ops against the session will remain stuck.
Fixes: http://tracker.ceph.com/issues/36183
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 74ca33cb49d2c258324447b1ca366ed4e604202a)
Conflicts:
src/osdc/Objecter.cc: Resolved in ms_handle_reset
rgw: copy actual stats from the source shards during reshard
Currently we don't copy the actual_stats field during reshard, so
resharded buckets show size_utilized as 0. This has the further problem
that a subsequent object removal subtracts the object size from the zero
utilized size, which wraps around to huge uint64_t values. Copy the
size_actual from the source object in both cls and in reshard_process.
This fixes new buckets; existing buckets will still have to go through a
bucket check --fix for their stats to be corrected.
Neha Ojha [Mon, 21 May 2018 19:34:31 +0000 (12:34 -0700)]
PG: add custom_reaction Backfilled and release reservations after backfill
After backfill completes, we directly go to the Recovered state without
releasing reservations. The outstanding reservations cause double reservation
issues.
Creating a custom_reaction Backfilled allows us to release reservations
before transiting to the Recovered state.
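PG's recovery states are built on Boost.Statechart; below is a
stripped-down sketch of the custom_reaction pattern described here, not
the real Backfilling/Recovered states (which carry much more context in
src/osd/PG.h).
```
// Sketch of the custom_reaction pattern: the reaction gives us a hook
// between "backfill finished" and entering Recovered, where reservations
// can be released before the transition.
#include <boost/mpl/list.hpp>
#include <boost/statechart/custom_reaction.hpp>
#include <boost/statechart/event.hpp>
#include <boost/statechart/simple_state.hpp>
#include <boost/statechart/state_machine.hpp>
#include <iostream>

namespace sc = boost::statechart;

struct Backfilled : sc::event<Backfilled> {};

struct Backfilling;
struct RecoveryMachine : sc::state_machine<RecoveryMachine, Backfilling> {};

struct Recovered : sc::simple_state<Recovered, RecoveryMachine> {
  Recovered() { std::cout << "entered Recovered\n"; }
};

struct Backfilling : sc::simple_state<Backfilling, RecoveryMachine> {
  typedef boost::mpl::list<sc::custom_reaction<Backfilled>> reactions;

  sc::result react(const Backfilled&) {
    std::cout << "releasing backfill reservations\n";
    return transit<Recovered>();
  }
};

int main() {
  RecoveryMachine m;
  m.initiate();                   // starts in Backfilling
  m.process_event(Backfilled());  // release reservations, then -> Recovered
  return 0;
}
```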
The output JSON string of the 'osd crush tree --format=json' command is
invalid. It contains an array of 'nodes' and an array of 'stray', but they
are not wrapped in a JSON object, and the stray array was not implemented.
Applications that depend on the output of this MonCommand will hit JSON
parse errors.
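A hedged sketch of an output shape that would address both problems,
using Ceph's Formatter interface (buildable only inside the Ceph tree;
the real mon code and the dumped fields differ):
```
// Sketch: wrap both arrays in a single JSON object and emit a (possibly
// empty) stray array, so the output is always a parseable JSON object.
#include "common/Formatter.h"
#include <iostream>

void dump_crush_tree(ceph::Formatter *f) {
  f->open_object_section("crush_tree");  // wrap everything in one object
  f->open_array_section("nodes");
  // ... dump the in-tree buckets and devices here ...
  f->close_section();
  f->open_array_section("stray");        // previously missing entirely
  // ... dump devices not linked into the tree here ...
  f->close_section();
  f->close_section();
}

int main() {
  ceph::JSONFormatter jf(true);
  dump_crush_tree(&jf);
  jf.flush(std::cout);  // prints {"nodes": [], "stray": []} pretty-printed
  return 0;
}
```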
* refs/pull/24403/head:
qa: add timeout to cleaning up workunit sandbox
qa: cleanup workunit dir for each unit
qa: add timeout to kclient umount
qa: do not cleanup sandbox on error
qa: use default timeout in fs workunits
qa: use sudo to cleanup workspace
qa: cleanup parallel execution of fsstress
qa/workunit: implement cleanup option
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 9 Aug 2018 13:33:42 +0000 (08:33 -0500)]
osd: vary tick interval +/- 5% to avoid scrub livelocks
If you have two pgs that need to scrub on two OSDs, each the primary
for one pg and the replica for the other, you can end up in a livelock:
- both osds locally reserve a scrub slot
- both osds send a scrub schedule request
- both scrub requests are rejected
- both osds wait exactly 1 second
- repeat
Seems a bit unlikely, but I've seen test cases where it goes on for more
than an hour.
Conflicts:
src/osd/OSD.cc
- luminous does not have src/include/random.h; use #include <random>
instead, seeding with whoami so each OSD gets a different series
of pseudo-random numbers
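A self-contained sketch of the +/- 5% variation described above, seeded
with whoami as the conflict note explains (OSD_TICK_INTERVAL and whoami
are stand-ins for the real OSD members):
```
// Sketch: jitter the nominal tick interval by +/-5%, with the engine
// seeded from the OSD id so each OSD draws a different sequence and the
// two reservation requests stop colliding every second.
#include <iostream>
#include <random>

int main() {
  const double OSD_TICK_INTERVAL = 1.0;  // nominal tick, in seconds
  int whoami = 0;                        // OSD id

  std::default_random_engine rng(static_cast<unsigned int>(whoami));
  std::uniform_real_distribution<double> jitter(-0.05, 0.05);

  for (int i = 0; i < 3; ++i) {
    double next_tick = OSD_TICK_INTERVAL * (1.0 + jitter(rng));
    std::cout << "osd." << whoami << " next tick in "
              << next_tick << "s\n";     // somewhere in 0.95..1.05
  }
  return 0;
}
```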