git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log

]> git-server-git.apps.pok.os.sepia.ceph.com Git - ceph.git/log

projects / ceph.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Xiaoguang Wang [Thu, 30 Aug 2018 02:26:41 +0000 (10:26 +0800)]

os/bluestore: fix deep-scrub operation againest disk silent errors

Say a object who has data caches, but in a while later, caches' underlying
physical device has silent disk erros accidentally, then caches and physical
data are not same. In such case, deep-scrub operation still tries to read
caches firstly and won't do crc checksum, then deep-scrub won't find such
data corruptions timely.

Here introduce a new flag 'CEPH_OSD_OP_FLAG_BYPASS_CLEAN_CACHE' which tells
deep-scrub to bypass object caches. Note that we only bypass cache who is in
STATE_CLEAN state. For STATE_WRITING caches, currently they are not written
to physical device, so deep-scrub operation can not read physical device and
can read these dirty caches safely. Once they are in STATE_CLEAN state(or not
added to bluestore cache), next round deep-scurb can check them correctly.

As to above discussions, I refactor BlueStore::BufferSpace::read sightly,
adding a new 'flags' argument, whose value will be 0 or:
     enum {
       BYPASS_CLEAN_CACHE = 0x1,     // bypass clean cache
     };

flags 0: normal read, do not bypass clean or dirty cache
flags BYPASS_CLEAN_CACHE: bypass clean cache, currently only for deep-scrube
                        operation

Test:
   I deliberately corrupt a object with cache, with this patch, deep-scrub
   can find data error very timely.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@easystack.cn>
(cherry picked from commit a7f1af25dd2ba88a322ed21828f073a277b09d02)

Conflicts:
src/include/rados.h
src/os/bluestore/BlueStore.cc: trivial resolution

commit | commitdiff | tree

Alfredo Deza [Fri, 26 Oct 2018 13:46:27 +0000 (09:46 -0400)]

Merge pull request #24759 from ceph/backport-luminous-24587

luminous: ceph-volume: adds a --prepare flag to `lvm batch`

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:22:14 +0000 (18:22 -0400)]

Merge pull request #24482 from pdvian/wip-36149-luminous

luminous: crush/CrushWrapper: fix crush tree json dumper

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:21:46 +0000 (18:21 -0400)]

Merge pull request #24532 from dzafman/wip-test-luminous

luminous: backport and other test fixes for osd-scrub-repair.sh

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:21:09 +0000 (18:21 -0400)]

Merge pull request #24538 from pdvian/wip-36229-luminous

luminous: qa: add test that builds example librados programs

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:20:47 +0000 (18:20 -0400)]

Merge pull request #24574 from pdvian/wip-36295-luminous

luminous: osdc/Objecter: possible race condition with connection reset

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:20:10 +0000 (18:20 -0400)]

Merge pull request #24582 from smithfarm/wip-36437-luminous

luminous: ec: src/common/interval_map.h: 161: FAILED assert(len > 0)

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:19:46 +0000 (18:19 -0400)]

Merge pull request #24593 from dzafman/wip-36419

luminous: osd: get loadavg per cpu for scrub load threshold check

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:19:20 +0000 (18:19 -0400)]

Merge pull request #24602 from pdvian/wip-36297-luminous

luminous: osd: add creating to pg_string_state

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>
Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 22:18:10 +0000 (18:18 -0400)]

Merge pull request #24659 from tchaikov/luminous-26890

luminous: osd: cast 'whoami' to unsigned so it can be used as the seed for RNG

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

commit | commitdiff | tree

David Galloway [Thu, 25 Oct 2018 21:47:04 +0000 (17:47 -0400)]

Merge commit 'f0d49da1be98291f26f70b524f182eba4fe78cee' into luminous

commit | commitdiff | tree

Andrew Schoen [Thu, 25 Oct 2018 12:51:41 +0000 (07:51 -0500)]

ceph-volume: update man page for batch --prepare

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit faa3aed4ad4df2dbdd853235e364d6e3b9b176b6)

commit | commitdiff | tree

Andrew Schoen [Mon, 15 Oct 2018 15:38:43 +0000 (10:38 -0500)]

ceph-volume: docs for --prepare flag of lvm batch

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit b4825e1bd7b75a0a53ee5678230b262286293d75)

commit | commitdiff | tree

Andrew Schoen [Mon, 15 Oct 2018 15:35:12 +0000 (10:35 -0500)]

ceph-volume: when --prepare is added to batch the OSDs are only prepared

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit abd238dc1f33252a43adc8b86efb3d3b4a076964)

commit | commitdiff | tree

Andrew Schoen [Mon, 15 Oct 2018 14:33:30 +0000 (09:33 -0500)]

ceph-volume: adds a --prepare flag to ceph-volume lvm batch

This flag will only prepare the OSDs, not activate them. This is useful
in our containerized ceph solution.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 4368ca7d48f4ce35f67854e8b0ca2be9babc5371)

commit | commitdiff | tree

Andrew Schoen [Thu, 25 Oct 2018 16:27:42 +0000 (11:27 -0500)]

Merge pull request #24754 from alfredodeza/luminous-rm36386

luminous ceph-volume remove version reporting from help menu

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 25 Oct 2018 16:20:23 +0000 (12:20 -0400)]

Merge pull request #23493 from VictorDenisov/backport_24333

luminous: PG: add custom_reaction Backfilled and release reservations after bac…

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Wed, 10 Oct 2018 19:35:31 +0000 (15:35 -0400)]

ceph-volume remove version reporting from help menu

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 20db7bf585f6433c6705396264005a4227cbf2e3)

commit | commitdiff | tree

Jenkins Build Slave User [Wed, 24 Oct 2018 21:04:19 +0000 (21:04 +0000)]

12.2.9

commit | commitdiff | tree

Andrew Schoen [Wed, 24 Oct 2018 20:06:35 +0000 (15:06 -0500)]

Merge pull request #24741 from alfredodeza/luminous-rm36492

luminous ceph-volume: do not send (lvm) stderr/stdout to the terminal, use the logfile

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Wed, 24 Oct 2018 14:55:06 +0000 (10:55 -0400)]

ceph-volume tests.api update monkeypatching to use **kw

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 6d6bcda75f846bc7c44c7b16b3687a65fed7cafc)

commit | commitdiff | tree

Alfredo Deza [Wed, 24 Oct 2018 14:54:48 +0000 (10:54 -0400)]

ceph-volume tests allow **kw when monkeypatching

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 04e51c956eadcbd38f39ec3dd4b78009d2cd4a70)

commit | commitdiff | tree

Alfredo Deza [Wed, 24 Oct 2018 14:54:18 +0000 (10:54 -0400)]

ceph-volume api.lvm: do not spit out errors on the terminal from LVM

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 581eed36ea0e7b5c5e41f8b83350403bcf73a31a)

commit | commitdiff | tree

Yuri Weinstein [Sun, 21 Oct 2018 16:58:21 +0000 (12:58 -0400)]

Merge pull request #24620 from dillaman/wip-36431-luminous

luminous: qa/workunits: replace 'realpath' with 'readlink -f' in fsstress.sh

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Thu, 18 Oct 2018 19:35:16 +0000 (15:35 -0400)]

Merge pull request #24650 from yuriw/wip-yuriw-fix-jewel-x-luminous

Excluded 'python34-cephfs' from the install tasks

Reviewed-by: David Galloway <dgallowa@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Wed, 17 Oct 2018 23:27:54 +0000 (16:27 -0700)]

Excluded 'python34-cephfs','python34-rados','python34-rbd','python34-rgw','python34-ceph-argparse','python3-cephfs','python3-rados' from the install tasks

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Thu, 18 Oct 2018 15:21:10 +0000 (10:21 -0500)]

Merge pull request #24663 from alfredodeza/luminous-rm23650

luminous: specs: require of e2fsprogs

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Thu, 18 Oct 2018 14:39:55 +0000 (11:39 -0300)]

rpm: require 2fsprogs needed for ceph-disk when using dmcrypt

Signed-off-by: Alfredo Deza <adeza@redhat.com>

commit | commitdiff | tree

Guillaume Abrioux [Wed, 11 Apr 2018 10:05:33 +0000 (12:05 +0200)]

specs: require of e2fsprogs

in ceph/ceph-container we've realized that `e2fsprogs` isn't installed in
centos container image because ceph hasn't a dependency for it.
It has for consequence to fail when deploying a containerized cluster
with dmcrypt when using centos image.

Typical error encountered:

typical error:
```
......
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda uuid path is /sys/dev/block/8:0/dm/uuid
get_dm_uuid: get_dm_uuid /dev/sda5 uuid path is /sys/dev/block/8:5/dm/uuid
populate: Creating lockbox fs on %s: mkfs -t ext4 /dev/sda5
command_check_call: Running command: /usr/sbin/mkfs -t ext4 /dev/sda5
mkfs.ext4: No such file or directory
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
......
```

Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
(cherry picked from commit a99177834120e7a2c4592054f6a8b8736e0ffb92)

commit | commitdiff | tree

Kefu Chai [Thu, 18 Oct 2018 10:29:49 +0000 (18:29 +0800)]

osd: cast `whoami` to unsigned so it can be used as the seed for RNG

default_random_engine's result_type is `unsigned int`, so we need to
pass an `unsigned int` as its seed.

Fixes: http://tracker.ceph.com/issues/26890
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts:
src/osd/OSD.cc: this breaks the build with clang. and in master
we are not using std::default_random_engine for setting the scrub
interval. so this change is not cherry-picked from master.

commit | commitdiff | tree

Andrew Schoen [Tue, 16 Oct 2018 19:40:41 +0000 (14:40 -0500)]

Merge pull request #24589 from ceph/backport-luminous-24404

luminous: ceph-volume: make `lvm batch` idempotent

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Tue, 16 Oct 2018 18:13:59 +0000 (13:13 -0500)]

Merge pull request #24451 from alfredodeza/luminous-wip-rm24795

luminous: ceph-volume lvm.prepare update help to indicate partitions are needed, not devices

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Jason Dillaman [Fri, 12 Oct 2018 14:02:35 +0000 (10:02 -0400)]

qa/tasks/workunit: use suite branch/SHA1 when cloning workunits

Right now it's using the Ceph branch/SHA1 but it's using the suite
Git URL.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 1e3dc02604cf7c0d3991dfd9fe2596ea34e80bad)

commit | commitdiff | tree

Jason Dillaman [Thu, 11 Oct 2018 20:21:35 +0000 (16:21 -0400)]

qa/tasks: qemu task now uses a relative path in suite repo for test

This makes it easier to re-run tests against a suite branch without
requiring a full ceph-ci build and repo.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c1f950236b2f1ba3b588fc638ebf8cb785c36e08)

commit | commitdiff | tree

Jason Dillaman [Thu, 11 Oct 2018 19:17:25 +0000 (15:17 -0400)]

qa/workunits: replace 'realpath' with 'readlink -f' in fsstress.sh

Fixes: http://tracker.ceph.com/issues/36409
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ddb7e5eb80ebb78beff6dfb25042f0016a474491)

commit | commitdiff | tree

Ilya Dryomov [Thu, 6 Sep 2018 14:07:08 +0000 (16:07 +0200)]

qa/tasks/cram: tasks now must live in the repository

Commit 0d8887652d53 ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but as it turned out
it is still used for nightlies. There is no good way to accommodate
the different URL schemes, so let's get rid of URLs altogether.

Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 592f566b4e270d3833ba896dabb193d5241dd5de)

Conflicts:
qa/suites/krbd/basic/tasks/krbd_blkroset.yaml
qa/suites/krbd/basic/tasks/krbd_huge_image.yaml
qa/suites/krbd/basic/tasks/krbd_msgr_segments.yaml
qa/suites/krbd/basic/tasks/krbd_parent_overlap.yaml
qa/suites/krbd/basic/tasks/krbd_whole_object_discard.yaml
- in master, the cram task is referred to in these additional yaml
files, but in luminous it's only referred to in
qa/suites/krbd/unmap/tasks/unmap.yaml

commit | commitdiff | tree

Ilya Dryomov [Thu, 6 Sep 2018 14:53:25 +0000 (16:53 +0200)]

qa/tasks/workunit: factor out overrides and refspec logic

Allow for reuse in the cram task.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit e1c89b51c80407fadbfee82d5d396cfcbd791aae)

commit | commitdiff | tree

Ilya Dryomov [Mon, 3 Sep 2018 15:40:08 +0000 (17:40 +0200)]

qa/tasks/cram: use suite_repo repository for all cram jobs

Currently git.ceph.com is hardcoded for all cram jobs. Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developer's forks fail.

Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.

Fixes: https://tracker.ceph.com/issues/27211
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 0d8887652d5312f7059ab2bdb52c948aa01680b0)

Conflicts:
qa/suites/krbd/basic/tasks/krbd_blkroset.yaml
qa/suites/krbd/basic/tasks/krbd_huge_image.yaml
qa/suites/krbd/basic/tasks/krbd_msgr_segments.yaml
qa/suites/krbd/basic/tasks/krbd_parent_overlap.yaml
qa/suites/krbd/basic/tasks/krbd_whole_object_discard.yaml
- in master, the cram task is referred to in these additional yaml
files, but in luminous it's only referred to in
qa/suites/krbd/unmap/tasks/unmap.yaml

commit | commitdiff | tree

Dan van der Ster [Tue, 25 Sep 2018 08:39:37 +0000 (10:39 +0200)]

osd: add creating to pg_string_state

Fixes: http://tracker.ceph.com/issues/36174
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit d38f6a11701ec788e4d384aa5b0ae65b8e57da64)

Conflicts:
src/osd/osd_types.cc : Resolved in pg_string_state

commit | commitdiff | tree

Andrew Schoen [Wed, 10 Oct 2018 19:28:29 +0000 (15:28 -0400)]

ceph-volume: extracts batch.filter_devices from Batch._get_strategy

This allows us to easily provide tests for that method.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit df7ef5383bb476ee020b898cd2e8fbce044fc07e)

commit | commitdiff | tree

Andrew Schoen [Wed, 10 Oct 2018 18:05:25 +0000 (14:05 -0400)]

ceph-volume: failing to get block db size from conf logs an exception

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 172d4af9b97d8d64ce35dc95efb72d9f190bc170)

commit | commitdiff | tree

Andrew Schoen [Tue, 9 Oct 2018 18:05:54 +0000 (14:05 -0400)]

ceph-volume: when all devices are filtered exit gracefully

Even if all devices are filtered we want to return a 0 exit code and
make sure the json reporting still works.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit aa4fcd602f8b1a1b17bc12e59334b19508f97d6d)

commit | commitdiff | tree

Andrew Schoen [Mon, 8 Oct 2018 13:57:07 +0000 (09:57 -0400)]

ceph-volume: filter devices used by journals/block.db

If after filterering of data/block devices there are only
one device left it can not be used if it is an SSD and
has been used previously as a journal or block.db

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit fc9a10e54813a402ecd1d823ea5a33f85e8eb963)

commit | commitdiff | tree

Andrew Schoen [Mon, 8 Oct 2018 13:39:05 +0000 (09:39 -0400)]

ceph-volume: add rotational property to Device class

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 7f6bfaaf0cf30a56481d6d67f2fa2a7e785cc344)

commit | commitdiff | tree

Andrew Schoen [Fri, 5 Oct 2018 21:18:48 +0000 (16:18 -0500)]

ceph-volume: add info about filtered devices to batch pretty reports

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 62426db799c54436c50b5751c8e72b3dbd710a2a)

commit | commitdiff | tree

Andrew Schoen [Fri, 5 Oct 2018 15:45:35 +0000 (10:45 -0500)]

ceph-volume: remove the used_by_ceph key in the json output

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 3b6d82afe7ee83283b25cefda00c7a59cf5144af)

commit | commitdiff | tree

Andrew Schoen [Fri, 5 Oct 2018 15:39:54 +0000 (10:39 -0500)]

ceph-volume: fix idempotency checks for lvm batch tests

The mixed type tests will change strategy after the idempotency test so
we need to handle that in test playbook.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1143482061da50910f4f883aafdbe0a0e4269f39)

commit | commitdiff | tree

Andrew Schoen [Fri, 5 Oct 2018 15:38:11 +0000 (10:38 -0500)]

ceph-volume: fix bluestore strategy json reporting and type

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit b3cf90604d31709996538e612b1767d6fd4da8b5)

commit | commitdiff | tree

Andrew Schoen [Thu, 4 Oct 2018 17:47:48 +0000 (12:47 -0500)]

ceph-volume: ignore failure to load ceph configuration for block.db size

If we fail to load a ceph configureation file when trying to get the
block.db size then just use defaults instead of throwing an error.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit a7ee36ca92b6592b6b3e218252c6a4c30416591b)

commit | commitdiff | tree

Andrew Schoen [Thu, 4 Oct 2018 16:54:06 +0000 (11:54 -0500)]

ceph-volume: fix strategy comparison in 'lvm batch'

This also fixes some small json reporting issues with the
filestore MixedType strategy

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 1dd15025bfd52af9c31cd281f92d743b9ca0eeb8)

commit | commitdiff | tree

Andrew Schoen [Wed, 3 Oct 2018 20:01:08 +0000 (15:01 -0500)]

ceph-volume: raise a non zero exit code if strategy changes with batch

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 9dfc00f8e10e10fbdc52c44259596c6a96a90edd)

commit | commitdiff | tree

Andrew Schoen [Wed, 3 Oct 2018 17:13:27 +0000 (12:13 -0500)]

ceph-volume: add functional tests to ensure lvm batch is idempotent

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 9752d03cc16664cddd9ac7741284a2fb5b31f0e7)

commit | commitdiff | tree

Andrew Schoen [Wed, 3 Oct 2018 15:19:45 +0000 (10:19 -0500)]

ceph-volume: add tests for util.device.Device.used_by_ceph

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit a28e6531e47d60d9eb7f62b67f122578daa2a683)

commit | commitdiff | tree

Andrew Schoen [Tue, 2 Oct 2018 20:23:39 +0000 (15:23 -0500)]

ceph-volume: update tests to account for filtered_devices in batch

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 36396229d3b8d20862565b35d431ff22fa92cd1c)

commit | commitdiff | tree

Andrew Schoen [Tue, 2 Oct 2018 20:08:10 +0000 (15:08 -0500)]

ceph-volume: consider block and data devices used_by_ceph

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 67512530116c16d072c91867a259aeb429d32ff6)

commit | commitdiff | tree

Andrew Schoen [Tue, 2 Oct 2018 14:48:27 +0000 (09:48 -0500)]

ceph-volume: add filtered_devices and used_by_ceph to all batch reports

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 0718d2e2663c8e807dbb5171a58759c76ed92c08)

commit | commitdiff | tree

Andrew Schoen [Thu, 27 Sep 2018 20:22:17 +0000 (15:22 -0500)]

ceph-volume: pick strategy for batch with only the unused devices

This will pick a strategy, filter out any devices already been used by
ceph and then pick a strategy again. If the strategy has changed the
call should error, if the strategy is the same proceed. If there are no
unused devices then the command is a noop.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 4529f2d6053b0b07583a3d7501f4e05e08cad385)

commit | commitdiff | tree

Andrew Schoen [Thu, 27 Sep 2018 13:55:20 +0000 (08:55 -0500)]

ceph-volume: adds a 'changed' key to lvm batch --report

This will indicate if the command would result in any OSDs being created
or not. Other tooling can use that key for idempotency checks.

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 10f1d577d4c4e66c77046fc3b274d6653af99586)

commit | commitdiff | tree

Andrew Schoen [Wed, 26 Sep 2018 21:07:30 +0000 (16:07 -0500)]

ceph-volume: adds used_by_ceph to filestore singletype batch report

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 48d10c9ccf13ba7f42750b37d26aeb28f6b1c606)

commit | commitdiff | tree

Andrew Schoen [Wed, 26 Sep 2018 21:01:30 +0000 (16:01 -0500)]

ceph-volume: adds a used_by_ceph property to the Device class

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 9d49a3708e34bccd183d5c32d022c36d8b118b42)

commit | commitdiff | tree

Andrew Schoen [Wed, 26 Sep 2018 20:53:26 +0000 (15:53 -0500)]

ceph-volume: adds a lvs property to the Device class

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 3dae3247adc1b96ca688ec81cec03180d5943823)

commit | commitdiff | tree

Andrew Schoen [Wed, 26 Sep 2018 19:29:41 +0000 (14:29 -0500)]

ceph-volume: add vg_name to the Device class

Signed-off-by: Andrew Schoen <aschoen@redhat.com>
(cherry picked from commit 619810c0ef48c1f19db8821a33f9c48cc11a0161)

commit | commitdiff | tree

Abhishek L [Mon, 15 Oct 2018 15:13:52 +0000 (17:13 +0200)]

Merge pull request #24527 from theanalyst/wip-luminous-36382

luminous: rgw: resharding produces invalid values of bucket stats

Reviewed-By: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Neha Ojha [Tue, 9 Oct 2018 22:57:15 +0000 (15:57 -0700)]

osd/PrimaryLogPG.cc: reassign size only when object size > truncate_size

Before setting size equal to op.extent.truncate_size, we need to check
if the size of the object is greater than the truncate_size. We do not
need to set size to op.extent.truncate_size, in the case where the size of
the object is less than op.extent.truncate_size.

Without this change, we were always setting size =
op.extent.truncate_size, when (seq < op.extent.truncate_seq) and
(op.extent.offset + op.extent.length > op.extent.truncate_size), were both
true. This ended up in:

1. overestimating the size of the object
2. not considering the correct size of the object, for
the later checks, which calculate op.extent.length for the read ops
3. causing crashes when trying to read more data than what was present

Fixes: http://tracker.ceph.com/issues/21931
Fixes: http://tracker.ceph.com/issues/22330
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 76c57810ee2346c392834206331aacb0faaa5b54)

commit | commitdiff | tree

Neha Ojha [Wed, 26 Sep 2018 23:31:44 +0000 (16:31 -0700)]

osd: print offset and length to track trimtrunc

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 07f7bd69425ed2c6b2a97bbf0034232a56fed046)

Conflicts:
src/osd/ECBackend.cc
- trivial: luminous uses assert() instead of ceph_assert()

commit | commitdiff | tree

Jason Dillaman [Tue, 25 Sep 2018 18:18:00 +0000 (14:18 -0400)]

osdc/Objecter: possible race condition with connection reset

If the connection quickly fails before the private session reference
can be associated with the connection, the connection will remain
closed and any OSD ops against the session will remain stuck.

Fixes: http://tracker.ceph.com/issues/36183
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 74ca33cb49d2c258324447b1ca366ed4e604202a)

Conflicts:
src/osdc/Objecter.cc : Resolved in ms_handle_reset

commit | commitdiff | tree

kungf [Thu, 14 Sep 2017 06:09:30 +0000 (14:09 +0800)]

osd: get loadavg per cpu for scrub load threshold check

Signed-off-by: kungf <yang.wang@easystack.cn>
(cherry picked from commit c1dba46084a7800d4cd34eda4966295faf0a6366)

commit | commitdiff | tree

Nathan Cutler [Thu, 19 Jul 2018 15:59:04 +0000 (17:59 +0200)]

qa: add test that builds example librados programs

Fixes: http://tracker.ceph.com/issues/15100
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit c46c890d0241972cee10260f071f65b4beedf92c)

commit | commitdiff | tree

Yuri Weinstein [Wed, 10 Oct 2018 18:53:24 +0000 (14:53 -0400)]

Merge pull request #24396 from smithfarm/wip-26932-luminous

luminous: osd: scrub livelock

Reviewed-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Abhishek Lekshmanan [Fri, 5 Oct 2018 09:19:18 +0000 (11:19 +0200)]

rgw: copy actual stats from the source shards during reshard

Currently we don't copy the actual_stats field during reshard, which makes
resharded buckets show a size_utilized as 0, which further has the problem that
a subsequent object removal would subtract the object size from the 0 size
utilized showing up large uint64_t values. Copy the size_actual from the source
object in both cls and in reshard_process. This will fix the new buckets,
existing buckets will still have to go through a bucket check --fix for their
stats to be corrected.

Fixes: http://tracker.ceph.com/issues/36290
Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit beb90638ae3d5329653b61bae0d6714796c41d04)

commit | commitdiff | tree

David Zafman [Wed, 10 Oct 2018 02:57:50 +0000 (19:57 -0700)]

test: Luminous specific fixes for osd-scrub-repair.sh

Signed-off-by: David Zafman <dzafman@redhat.com>

commit | commitdiff | tree

David Zafman [Tue, 9 Oct 2018 23:12:36 +0000 (16:12 -0700)]

osd: Adjust Luminous only messages for consistency

Signed-off-by: David Zafman <dzafman@redhat.com>

commit | commitdiff | tree

Neha Ojha [Mon, 21 May 2018 19:34:31 +0000 (12:34 -0700)]

PG: add custom_reaction Backfilled and release reservations after backfill

After backfill completes, we directly go to the Recovered state without
releasing reservations. The outstanding reservations cause double reservation
issues.

Creating a custom_reaction Backfilled, allows us to release reservations,
before transiting to the Recovered state.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 1abc2320283e9247bec7b0821a6134f31b9b5e29)

Conflicts:
src/osd/PG.cc
src/osd/PG.h

commit | commitdiff | tree

Neha Ojha [Tue, 9 Oct 2018 01:01:58 +0000 (18:01 -0700)]

Merge pull request #24479 from neha-ojha/wip-36347-luminous

qa/suites/rados/upgrade/jewel-x-singleton: exclude python3-rados, python3-cephfs

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

commit | commitdiff | tree

songshuangyang [Fri, 14 Sep 2018 06:53:17 +0000 (14:53 +0800)]

crush/CrushWrapper: fix crush tree json dumper

    The output json string is invalid for 'osd crush tree --format=json'
    command. It contains a array of 'nodes' and a array of 'stray', but
    not in a json object, and the stray array was not implemented.
    Applications which depends on the output of the above MonCommand will
    occur json parse error.

Signed-off-by: Oshyn Song <dualyangsong@gmail.com>
(cherry picked from commit 35c0d1f45cd676f201d4031cb8f447f7ea6aee0e)

commit | commitdiff | tree

Patrick Donnelly [Mon, 8 Oct 2018 20:26:07 +0000 (13:26 -0700)]

Merge PR #24403 into luminous

* refs/pull/24403/head:
qa: add timeout to cleaning up workunit sandbox
qa: cleanup workunit dir for each unit
qa: add timeout to kclient umount
qa: do not cleanup sandbox on error
qa: use default timeout in fs workunits
qa: use sudo to cleanup workspace
qa: cleanup parallel execution of fsstress
qa/workunit: implement cleanup option

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Neha Ojha [Mon, 8 Oct 2018 19:12:39 +0000 (15:12 -0400)]

qa/suites/rados/upgrade/jewel-x-singleton: exclude python3-rados, python3-cephfs

This fix goes directly into the luminous branch since these packages do not need
to be installed on jewel, when upgrading to luminous.

Fixes: https://tracker.ceph.com/issues/36347
Signed-off-by: Neha Ojha <nojha@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:57:44 +0000 (13:57 -0700)]

Merge pull request #24410 from smithfarm/wip-36196-luminous

luminous: mds: internal op missing events time 'throttled', 'all_read', 'dispatched'

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:57:18 +0000 (13:57 -0700)]

Merge pull request #24421 from vshankar/wip-35937

luminous: mds: track average session uptime

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

commit | commitdiff | tree

Sage Weil [Thu, 9 Aug 2018 13:33:42 +0000 (08:33 -0500)]

osd: vary tick interval +/- 5% to avoid scrub livelocks

If you have two pgs that need to scrub on two OSDs, each the primary
for one pg and the replica for the other, you can end up in a livelock:

- both osds locally reserve a scrub slot
- both osds send a scrub schedule request
- both scrub requests are rejected
- both osds wait exactly 1 second
- repeat

Seems a bit unlikely, but I've seen test cases where it goes on more an
hour.

Fixes: http://tracker.ceph.com/issues/26890
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2011377c379c9d53a3a0a693a7874fc330278898)

Conflicts:
src/osd/OSD.cc
- luminous does not have src/include/random.h; use #include <random>
instead, seeding with whoami so each OSD gets a different series
of pseudo-random numbers

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:12:15 +0000 (13:12 -0700)]

Merge pull request #23483 from pdvian/wip-26840-luminous

luminous: librados application's symbol could conflict with the libceph-common

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:11:45 +0000 (13:11 -0700)]

Merge pull request #24405 from dillaman/wip-36143-luminous

luminous: librbd: blacklisted client might not notice it lost the lock

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:11:15 +0000 (13:11 -0700)]

Merge pull request #24415 from dillaman/wip-36224-luminous

luminous: librbd: object map improperly flagged as invalidated

Reviewed-by: Mykola Golub <mgolub@mirantis.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:10:38 +0000 (13:10 -0700)]

Merge pull request #24419 from pdvian/wip-36157-luminous

luminous: msg: ceph_abort() when there are enough accepter errors in msg server

Reviewed-by: Kefu Chai <kchai@redhat.com>

commit | commitdiff | tree

Yuri Weinstein [Fri, 5 Oct 2018 20:09:56 +0000 (13:09 -0700)]

Merge pull request #24424 from theanalyst/wip-luminous-36311

luminous: multi-site: object name should be urlencoded when we put it into ES

Reviewed-by: Casey Bodley <cbodley@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Fri, 5 Oct 2018 14:29:58 +0000 (10:29 -0400)]

ceph-volume lvm.prepare update help to remove old basic usage example

Signed-off-by: Alfredo Deza <adeza@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Wed, 3 Oct 2018 12:11:58 +0000 (08:11 -0400)]

ceph-volume lvm.prepare update help to indicate partitions are needed, not devices

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit d31dd95b95445825f5e6669dd4ecb3118b09fdcf)

commit | commitdiff | tree

Jeffrey Zhang [Tue, 3 Apr 2018 07:03:22 +0000 (15:03 +0800)]

fix typo in ceph-volume lvm prepare help

Signed-off-by: Jeffrey Zhang <zhang.lei.fly@gmail.com>
(cherry picked from commit d65b8844d16d71df01b57f368badc100db505506)

commit | commitdiff | tree

Alfredo Deza [Tue, 13 Mar 2018 19:30:22 +0000 (15:30 -0400)]

ceph-volume lvm.prepare simplify help menu with bluestore default flags

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit b7140d23741db33b68444f29a744f552f355a6f5)

commit | commitdiff | tree

Alfredo Deza [Tue, 13 Mar 2018 19:28:28 +0000 (15:28 -0400)]

ceph-volume lvm.create simplify help menu with bluestore default flags

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 2d1b1918285c941ca7b1953d77d0fc465f60a0c5)

commit | commitdiff | tree

Alfredo Deza [Tue, 13 Mar 2018 19:26:46 +0000 (15:26 -0400)]

doc/ceph-volume document multipath support

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit 1dca5eac386ff2f67f2449700cd4bf27b81dafda)

commit | commitdiff | tree

Sage Weil [Thu, 9 Aug 2018 13:22:05 +0000 (08:22 -0500)]

osd: tick at OSD_TICK_INTERVAL, not heartbeat interval

Heartbeat inveral is not related!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7d76458354661f7575c4a2cae251a9b828513580)

commit | commitdiff | tree

Patrick Donnelly [Sun, 30 Sep 2018 00:37:12 +0000 (17:37 -0700)]

qa: add timeout to cleaning up workunit sandbox

If there is a bug preventing rm from completing, the workunit will get stuck.

Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 3a10d74f3aa4901dd9edffc0061992073ae67085)

commit | commitdiff | tree

Patrick Donnelly [Mon, 24 Sep 2018 18:29:10 +0000 (11:29 -0700)]

qa: cleanup workunit dir for each unit

This was wrongly dropped and moved to the finalizer.

Introduced-by: de824f74dd8ac909e47335ccd53d7a085e388e41
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 70844f3f55004024a747854013a1efb409705d81)

commit | commitdiff | tree

Patrick Donnelly [Sun, 30 Sep 2018 00:34:37 +0000 (17:34 -0700)]

qa: add timeout to kclient umount

Otherwise QA sits forever waiting for the kclient to umount when there is a
problem.

Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 7a64eb9dfb908a1a8e5d2b0dcaa7ca9df52a9ab1)

commit | commitdiff | tree

Patrick Donnelly [Wed, 26 Sep 2018 14:38:58 +0000 (07:38 -0700)]

qa: do not cleanup sandbox on error

Otherwise the command will hang if the mount is broken.

Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit d4b8f94cf8d95ebb277b550fc6ebc3468052a39c)

commit | commitdiff | tree

Patrick Donnelly [Mon, 1 Oct 2018 01:10:05 +0000 (18:10 -0700)]

qa: use default timeout in fs workunits

Six hours is unnecessarily long.

Fixes: http://tracker.ceph.com/issues/36184
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit bdd2ddcfd862b65dfd73bc1ea09b0ad07040d445)

commit | commitdiff | tree

Patrick Donnelly [Mon, 24 Sep 2018 18:02:49 +0000 (11:02 -0700)]

qa: use sudo to cleanup workspace

Files in scratch_tmp may not be owned by ubuntu.

Fixes: http://tracker.ceph.com/issues/36165
Introduced-by: de824f74dd8ac909e47335ccd53d7a085e388e41
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 1eaf78a75498d0f739b40bf310d036c851465fad)

commit | commitdiff | tree

Patrick Donnelly [Tue, 18 Sep 2018 21:57:05 +0000 (14:57 -0700)]

qa: cleanup parallel execution of fsstress

Two instances of fsstress clobber each other. Just build it in the local sandbox.

Fixes: http://tracker.ceph.com/issues/24177
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit de824f74dd8ac909e47335ccd53d7a085e388e41)

commit | commitdiff | tree

Nathan Cutler [Wed, 3 Oct 2018 19:13:11 +0000 (21:13 +0200)]

qa/workunit: implement cleanup option

This is a partial backport of 91942df5a690809ed872f5aa8c35b56e8048e485
just to get the workunit.py changes.

Signed-off-by: Nathan Cutler <ncutler@suse.com>

Unnamed repository; edit this file 'description' to name the repository.