git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
9 years agoMerge pull request #6397 from SUSE/wip-13615-infernalis
Sage Weil [Mon, 8 Feb 2016 13:49:41 +0000 (08:49 -0500)]
Merge pull request #6397 from SUSE/wip-13615-infernalis

OSD::build_past_intervals_parallel() shall reset primary and up_primary when begin a new past_interval.

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6840 from SUSE/wip-13791-infernalis
Sage Weil [Mon, 8 Feb 2016 13:49:17 +0000 (08:49 -0500)]
Merge pull request #6840 from SUSE/wip-13791-infernalis

Objecter: potential null pointer access when do pool_snap_list.

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6851 from Abhishekvrshny/wip-14018-infernalis
Sage Weil [Mon, 8 Feb 2016 13:48:49 +0000 (08:48 -0500)]
Merge pull request #6851 from Abhishekvrshny/wip-14018-infernalis

infernalis: osd/PG.cc: 288: FAILED assert(info.last_epoch_started >= info.history.last_epoch_started)

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6849 from Abhishekvrshny/wip-13979-infernalis
Sage Weil [Mon, 8 Feb 2016 13:48:25 +0000 (08:48 -0500)]
Merge pull request #6849 from Abhishekvrshny/wip-13979-infernalis

osd: call on_new_interval on newly split child PG

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6907 from Abhishekvrshny/wip-13929-infernalis
Sage Weil [Mon, 8 Feb 2016 13:48:03 +0000 (08:48 -0500)]
Merge pull request #6907 from Abhishekvrshny/wip-13929-infernalis

infernalis: Ceph Pools' MAX AVAIL is 0 if some OSDs' weight is 0

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #7421 from Abhishekvrshny/wip-14494-infernalis
Sage Weil [Mon, 8 Feb 2016 13:47:36 +0000 (08:47 -0500)]
Merge pull request #7421 from Abhishekvrshny/wip-14494-infernalis

infernalis: pgs stuck inconsistent after infernalis upgrade

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6627 from Abhishekvrshny/wip-13771-infernalis
Sage Weil [Mon, 8 Feb 2016 13:46:25 +0000 (08:46 -0500)]
Merge pull request #6627 from Abhishekvrshny/wip-13771-infernalis

Objecter: pool op callback may hang forever.

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #7543 from SUSE/wip-14676-infernalis
Loic Dachary [Mon, 8 Feb 2016 11:18:07 +0000 (18:18 +0700)]
Merge pull request #7543 from SUSE/wip-14676-infernalis

infernalis: rgw: radosgw-admin --help doesn't show the orphans find command

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6993 from badone/wip-13993-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:22:28 +0000 (11:22 +0700)]
Merge pull request #6993 from badone/wip-13993-infernalis

log: Log.cc: Assign LOG_DEBUG priority to syslog calls

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6882 from dachary/wip-13988-reuse-osd-id-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:21:11 +0000 (11:21 +0700)]
Merge pull request #6882 from dachary/wip-13988-reuse-osd-id-infernalis

tests: verify it is possible to reuse an OSD id

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6852 from Abhishekvrshny/wip-14013-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:18:35 +0000 (11:18 +0700)]
Merge pull request #6852 from Abhishekvrshny/wip-14013-infernalis

infernalis: systemd/ceph-disk@.service assumes /bin/flock

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6846 from Abhishekvrshny/wip-13638-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:16:23 +0000 (11:16 +0700)]
Merge pull request #6846 from Abhishekvrshny/wip-13638-infernalis

FileStore: potential memory leak if getattrs fails.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6836 from SUSE/wip-13891-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:14:18 +0000 (11:14 +0700)]
Merge pull request #6836 from SUSE/wip-13891-infernalis

infernalis: auth/cephx: large amounts of log are produced by osd

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6833 from SUSE/wip-13935-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:12:47 +0000 (11:12 +0700)]
Merge pull request #6833 from SUSE/wip-13935-infernalis

infernalis: Ceph daemon failed to start, because the service name was already used.

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6694 from xiexingguo/xxg-wip-13869
Loic Dachary [Mon, 8 Feb 2016 04:12:00 +0000 (11:12 +0700)]
Merge pull request #6694 from xiexingguo/xxg-wip-13869

osd: fix race condition during send_failures

Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
9 years agoMerge pull request #6626 from Abhishekvrshny/wip-13655-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:09:30 +0000 (11:09 +0700)]
Merge pull request #6626 from Abhishekvrshny/wip-13655-infernalis

crush: crash if we see CRUSH_ITEM_NONE in early rule step

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agoMerge pull request #6449 from dachary/wip-13671-infernalis
Loic Dachary [Mon, 8 Feb 2016 04:06:41 +0000 (11:06 +0700)]
Merge pull request #6449 from dachary/wip-13671-infernalis

tests: testprofile must be removed before it is re-created

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agorgw-admin: document orphans commands in usage 7543/head
Yehuda Sadeh [Tue, 2 Feb 2016 00:33:55 +0000 (16:33 -0800)]
rgw-admin: document orphans commands in usage

Fixes: #14516
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 105a76bf542e05b739d5a03ca8ae55432350f107)

9 years agoMerge pull request #6880 from dachary/wip-14044-infernalis
Sage Weil [Thu, 4 Feb 2016 21:23:51 +0000 (16:23 -0500)]
Merge pull request #6880 from dachary/wip-14044-infernalis

infernalis: ceph-disk list fails on /dev/cciss!c0d0

9 years agoMerge pull request #6392 from SUSE/wip-13589-infernalis
Sage Weil [Fri, 29 Jan 2016 14:05:14 +0000 (09:05 -0500)]
Merge pull request #6392 from SUSE/wip-13589-infernalis

mon: should not set isvalid = true when cephx_verify_authorizer retur…

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6500 from SUSE/wip-13678-infernalis
Sage Weil [Fri, 29 Jan 2016 13:55:45 +0000 (08:55 -0500)]
Merge pull request #6500 from SUSE/wip-13678-infernalis

systemd: no rbdmap systemd unit file

9 years agoosd/PG: For performance start scrub scan at pool to skip temp objects 7421/head
David Zafman [Thu, 24 Sep 2015 15:38:41 +0000 (11:38 -0400)]
osd/PG: For performance start scrub scan at pool to skip temp objects

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 05d79faa512210b0f0a91640d18db33b887a6e73)

9 years agoosd/OSD: clear_temp_objects() include removal of Hammer temp objects
David Zafman [Fri, 18 Dec 2015 17:08:19 +0000 (09:08 -0800)]
osd/OSD: clear_temp_objects() include removal of Hammer temp objects

Fixes: #13862
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 10b4a0825d9917b6fdd0d6450640238b78ba05d4)

9 years agoosd: Improve log message which isn't about a particular shard
David Zafman [Fri, 18 Dec 2015 02:04:08 +0000 (18:04 -0800)]
osd: Improve log message which isn't about a particular shard

Remove redundant dout()

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit e85907fcc582922925609f595f68c597a88c39dc)

9 years agoMerge pull request #7225 from dillaman/wip-13810-infernalis
Josh Durgin [Thu, 14 Jan 2016 01:15:41 +0000 (17:15 -0800)]
Merge pull request #7225 from dillaman/wip-13810-infernalis

tests: notification slave needs to wait for master

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agotests: notification slave needs to wait for master 7225/head
Jason Dillaman [Wed, 13 Jan 2016 17:44:01 +0000 (12:44 -0500)]
tests: notification slave needs to wait for master

If the slave instance starts before the master, race
conditions are possible.

Fixes: #13810
Backport: infernalis, hammer
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3992d6fe67bbf82322cedc1582406caaf6d4de60)

9 years agotests: verify it is possible to reuse an OSD id 6882/head
Loic Dachary [Thu, 10 Dec 2015 14:20:32 +0000 (15:20 +0100)]
tests: verify it is possible to reuse an OSD id

When an OSD id is removed via ceph osd rm, it will be reused by the next
ceph osd create command. Verify that an OSD reusing such an id
successfully comes up.

http://tracker.ceph.com/issues/13988 Refs: #13988

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 7324615bdb829f77928fa10d4e988c6422945937)

9 years agoceph-disk: list accepts absolute dev names 6880/head
Loic Dachary [Tue, 5 Jan 2016 16:33:45 +0000 (17:33 +0100)]
ceph-disk: list accepts absolute dev names

The ceph-disk list subcommand now accepts /dev/sda as well as sda.
The filtering is done on the full list of devices instead of restricting
the number of devices explored. Always obtaining the full list of
devices makes things simpler when trying to match a dmcrypted device to
the corresponding raw device.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 591d581c84cfd72d7c655ac88b0911a318b96e95)

Conflicts:
src/ceph-disk: as part of the implementation of deactivate /
destroy in master, the prototype of list_device was changed
        to take a list of paths instead of the all arguments (args).
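
The filtering described above (always enumerate every device, then keep the ones that were asked for, whether given as sda or /dev/sda) can be sketched as follows. This is only an illustration: normalize and filter_devices are hypothetical names, not the functions in src/ceph-disk.

```python
def normalize(name):
    """Strip a leading /dev/ so that 'sda' and '/dev/sda' compare equal."""
    prefix = '/dev/'
    return name[len(prefix):] if name.startswith(prefix) else name


def filter_devices(all_devices, selected):
    """Filter the full device list instead of limiting what is explored.

    all_devices: every device path found on the system.
    selected: names given on the command line, absolute or not.
    An empty selection means "list everything".
    """
    if not selected:
        return list(all_devices)
    wanted = {normalize(name) for name in selected}
    return [dev for dev in all_devices if normalize(dev) in wanted]
```

Because the full list is always available, a later step that needs to match a dmcrypted device to its raw device can look at every entry, not only the selected ones.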

9 years agoceph-disk: display OSD details when listing dmcrypt devices
Loic Dachary [Tue, 5 Jan 2016 13:25:51 +0000 (14:25 +0100)]
ceph-disk: display OSD details when listing dmcrypt devices

The details about a device that is mapped via dmcrypt are directly
available. Do not try to fetch them from the device entry describing the
devicemapper entry.

http://tracker.ceph.com/issues/14230 Fixes: #14230

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 7aab4ed6f108ddc7bc90300f1999a38f30da3a57)

Conflicts:
src/ceph-disk: an incorrect attempt was made to fix the same
                       problem. It was not backported and does not
                       need to be. It is entirely contained in the
                       code block removed and is the reason for the
                       conflict.

9 years agotests: limit ceph-disk unit tests to test dir
Loic Dachary [Wed, 9 Dec 2015 15:52:10 +0000 (16:52 +0100)]
tests: limit ceph-disk unit tests to test dir

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 499c80db606fe3926a8a603e03fdba6967d66003)

9 years agoceph-disk: factorize duplicated dmcrypt mapping
Loic Dachary [Tue, 5 Jan 2016 16:38:59 +0000 (17:38 +0100)]
ceph-disk: factorize duplicated dmcrypt mapping

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 35a0c94c4cd3a57cfc382c64eaa9cfb9306dd2e6)

9 years agoceph-disk: fix regression in cciss devices names
Loic Dachary [Tue, 5 Jan 2016 16:42:11 +0000 (17:42 +0100)]
ceph-disk: fix regression in cciss devices names

The cciss driver has device paths such as /dev/cciss/c0d1 with a
matching /sys/block/cciss!c0d1. The general case is that whenever a
device name is found in /sys/block, the / is replaced by the !.

When refactoring the ceph-disk list subcommand, this conversion was
overlooked in a few places. All explicit concatenations of /dev with a
device name are replaced with a call to get_dev_name which does the same
but also converts all ! to /.

http://tracker.ceph.com/issues/13970 Fixes: #13970

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit a2fd3a535e66b3a2b694cda9c6add33383ccfa4a)

Conflicts:
src/ceph-disk : trivial resolution
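
A minimal sketch of the name conversion described above, assuming only the rule stated in the message (/ in the /dev path corresponds to ! in /sys/block). The two function names are hypothetical and are not the helpers in src/ceph-disk.

```python
def sys_block_to_dev_path(name):
    """Turn a /sys/block entry name into a /dev path ('!' becomes '/')."""
    return '/dev/' + name.replace('!', '/')


def dev_path_to_sys_block(path):
    """Turn a /dev path into the matching /sys/block name ('/' becomes '!')."""
    prefix = '/dev/'
    if not path.startswith(prefix):
        raise ValueError('not a /dev path: %r' % path)
    return path[len(prefix):].replace('/', '!')
```

Plain names such as sda contain neither character, so they pass through both functions unchanged apart from the /dev/ prefix.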

9 years agoMerge pull request #7001 from dachary/wip-14145-infernalis
Loic Dachary [Thu, 7 Jan 2016 14:06:32 +0000 (15:06 +0100)]
Merge pull request #7001 from dachary/wip-14145-infernalis

infernalis: ceph-disk: use blkid instead of sgdisk -i

On CentOS 7.1 and other operating systems with a version of udev greater or equal to 214,
running ceph-disk prepare triggered unexpected removal and addition of partitions on
the disk being prepared. That created problems ranging from the OSD not being activated
to failures because /dev/sdb1 does not exist although it should.

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agotests: ceph-disk cryptsetup close must try harder 7001/head
Loic Dachary [Wed, 6 Jan 2016 22:36:57 +0000 (23:36 +0100)]
tests: ceph-disk cryptsetup close must try harder

Similar to how it's done in dmcrypt_unmap in master (
132e56615805cba0395898cf165b32b88600d633 ), the infernalis test helpers
that were deprecated by the addition of the deactivate / destroy
ceph-disk subcommands must try cryptsetup close a few times in some
contexts.

Signed-off-by: Loic Dachary <loic@dachary.org>
9 years agoceph-disk: protect deactivate with activate lock
Loic Dachary [Fri, 18 Dec 2015 23:53:03 +0000 (00:53 +0100)]
ceph-disk: protect deactivate with activate lock

When ceph-disk prepares the disk, it triggers udev events and each of
them runs ceph-disk activate. If systemctl stop ceph-osd@2 happens while
ceph-disk activate calls are still in flight, the systemctl stop may be
cancelled by the systemctl enable issued by one of the pending ceph-disk
activate calls.

This only matters in a test environment where disks are destroyed
shortly after they are activated.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 6395bf856b4d4511f0758174ef915ebcafbe3777)

Conflicts:

        src/ceph-disk: ceph-disk deactivate does not exist in ceph-disk
            on infernalis. But the same feature is implemented in
            ceph-test-disk.py for test purposes and has the same
            problem. The patch is adapted to ceph-test-disk.py.

9 years agoceph-disk: retry cryptsetup remove
Loic Dachary [Wed, 6 Jan 2016 10:15:19 +0000 (11:15 +0100)]
ceph-disk: retry cryptsetup remove

Retry a cryptsetup remove ten times. After the ceph-osd terminates, the
device is released asynchronously and an attempt to cryptsetup remove
may fail because it is considered busy. Although a few attempts are
made before giving up, the number of attempts / the duration of the
attempts cannot be controlled with a cryptsetup option. The workaround
is to increase this by trying a few times.

If cryptsetup remove fails for a reason that is unrelated to timeout,
the error will be repeated a few times. There is no undesirable side
effect. It will not hide a problem.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 132e56615805cba0395898cf165b32b88600d633)
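
The retry described above could look roughly like the sketch below. The ten attempts come from the message; the one second delay, the function names, and the injectable run and sleep parameters are assumptions made for illustration.

```python
import subprocess
import time


def retry_command(run, attempts=10, delay=1.0, sleep=time.sleep):
    """Call run() until it returns exit status 0 or attempts are used up.

    run: callable returning an exit status (0 means success).
    Returns the last exit status that was observed.
    """
    status = None
    for attempt in range(attempts):
        status = run()
        if status == 0:
            return 0
        if attempt < attempts - 1:
            sleep(delay)  # give the kernel time to release the device
    return status


def cryptsetup_remove(name):
    """Retry 'cryptsetup remove <name>' because the device may still be busy."""
    return retry_command(lambda: subprocess.call(['cryptsetup', 'remove', name]))
```

A failure unrelated to the device being busy is simply repeated and then reported through the final non-zero status, which matches the point that retrying does not hide a problem.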

9 years agoceph-disk: use blkid instead of sgdisk -i
Loic Dachary [Fri, 18 Dec 2015 16:03:21 +0000 (17:03 +0100)]
ceph-disk: use blkid instead of sgdisk -i

sgdisk -i 1 /dev/vdb opens /dev/vdb in write mode which indirectly
triggers a BLKRRPART ioctl from udev (version 214 and later) when
the device is closed (see below for the udev release note). The
implementation of this ioctl by the kernel (even old kernels) removes
all partitions and adds them again (similar to what partprobe does
explicitly).

The side effects of partitions disappearing while ceph-disk is running
are devastating.

sgdisk is replaced by blkid which only opens the device in read mode and
will not trigger this unexpected behavior.

The problem does not show on Ubuntu 14.04 because it is running udev <
214 but shows on CentOS 7 which is running udev > 214.

git clone git://anonscm.debian.org/pkg-systemd/systemd.git
systemd/NEWS:
CHANGES WITH 214:

        * As an experimental feature, udev now tries to lock the
          disk device node (flock(LOCK_SH|LOCK_NB)) while it
          executes events for the disk or any of its partitions.
          Applications like partitioning programs can lock the
          disk device node (flock(LOCK_EX)) and claim temporary
          device ownership that way; udev will entirely skip all event
          handling for this disk and its partitions. If the disk
          was opened for writing, the close will trigger a partition
          table rescan in udev's "watch" facility, and if needed
          synthesize "change" events for the disk and all its partitions.
          This is now unconditionally enabled, and if it turns out to
          cause major problems, we might turn it on only for specific
          devices, or might need to disable it entirely. Device Mapper
          devices are excluded from this logic.

http://tracker.ceph.com/issues/14080 Fixes: #14080

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 9dce05a8cdfc564c5162885bbb67a04ad7b95c5a)
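
As a rough illustration of reading partition details through blkid instead of sgdisk, the sketch below runs blkid with low-level probing (-p) and udev-style KEY=VALUE output (-o udev) and parses the result. The function names are hypothetical and this is not the code in src/ceph-disk.

```python
import subprocess


def parse_udev_pairs(text):
    """Parse KEY=VALUE lines, as printed by 'blkid -o udev', into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition('=')
        if sep:  # skip lines without '='
            info[key.strip()] = value.strip()
    return info


def probe_partition(dev):
    """Query a partition with blkid, which opens the device read-only."""
    out = subprocess.check_output(['blkid', '-o', 'udev', '-p', dev])
    return parse_udev_pairs(out.decode('utf-8'))
```

Since the device is never opened for writing, closing it should not cause udev to rescan the partition table.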

9 years agoceph-disk: dereference symlinks in destroy and zap
Loic Dachary [Wed, 16 Dec 2015 14:57:03 +0000 (15:57 +0100)]
ceph-disk: dereference symlinks in destroy and zap

The behavior of partprobe or sgdisk may be subtly different if given a
symbolic link to a device instead of an actual device. The debug output
is also more confusing when the symlink shows instead of the device it
points to.

Always dereference the symlink before running destroy and zap.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit fe71647bc9bd0f9ddc6d470ee7bee1e6b0983e2b)

Conflicts:
        src/ceph-disk
          trivial, because destroy is not implemented
          in infernalis
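
Dereferencing before acting on a device can be sketched with os.path.realpath. The helper name is hypothetical.

```python
import os


def dereference(dev):
    """Return the real device path, following any chain of symlinks."""
    return os.path.realpath(dev)
```

For example, a /dev/disk/by-id/... link would be resolved to the device node it points to before being handed to partprobe or sgdisk, so debug output shows the actual device.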

9 years agoceph-disk: increase partprobe / udevadm settle timeouts
Loic Dachary [Wed, 16 Dec 2015 11:33:25 +0000 (12:33 +0100)]
ceph-disk: increase partprobe / udevadm settle timeouts

The default of 120 seconds may be exceeded when the disk is very slow
which can happen in cloud environments. Increase it to 600 seconds
instead.

The partprobe command may fail for the same reason but it does not have
a timeout parameter. Instead, try a few times before failing.

The udevadm settle calls guarding partprobe are not necessary because
partprobe already does the same. However, partprobe does not provide a
way to control the timeout. Having one udevadm settle after another is
going to be a noop most of the time and not add any delay. It matters
when the udevadm settle run by partprobe fails with a timeout because
partprobe silently ignores the failure.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 730b5d62d3cda7de4076bafa6e9e35f1eb8e2190)
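
The sequence described above (settle with a longer timeout, partprobe tried a few times, settle again) might be sketched as follows. The 600 seconds come from the message; the number of partprobe tries, the function name, and the injectable run parameter are assumptions.

```python
import subprocess

SETTLE_TIMEOUT = 600   # seconds, as stated in the commit message
PARTPROBE_TRIES = 5    # assumption: the message only says "a few times"


def refresh_partitions(dev, run=subprocess.call):
    """Guard partprobe with udevadm settle and retry partprobe on failure.

    run: callable taking an argv list and returning an exit status.
    Returns the exit status of the last partprobe attempt.
    """
    settle = ['udevadm', 'settle', '--timeout=%d' % SETTLE_TIMEOUT]
    run(settle)
    status = 1
    for _ in range(PARTPROBE_TRIES):
        status = run(['partprobe', dev])
        if status == 0:
            break
    run(settle)  # usually a noop, but its timeout is under our control
    return status
```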

9 years agotests: ceph-disk workunit increase verbosity
Loic Dachary [Wed, 16 Dec 2015 11:36:47 +0000 (12:36 +0100)]
tests: ceph-disk workunit increase verbosity

So that reading the teuthology log is enough in most cases to figure out
the cause of the error.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit fd7fe8c4977658f66651dad5efb0d816ae71b38b)

Conflicts:
qa/workunits/ceph-disk/ceph-disk-test.py:
          trivial, because destroy/deactivate are not implemented
          in infernalis. The existing destroy_osd function
          has to be modified so the id returned by sh() does
          not have a trailing newline.

9 years agoceph-disk: log parted output
Loic Dachary [Wed, 16 Dec 2015 11:30:20 +0000 (12:30 +0100)]
ceph-disk: log parted output

Should parted output fail to parse, it is useful to get the full output
when running in verbose mode.

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit f5d36b9ac299e9f6d52cc32d540cc1c3342de6e7)

9 years agoceph-disk: do not discard stderr
Loic Dachary [Wed, 16 Dec 2015 11:29:17 +0000 (12:29 +0100)]
ceph-disk: do not discard stderr

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 5fa35ba10e10b56262757afc43929ab8ee4164f2)

Conflicts:
src/ceph-disk : trivial, because destroy/deactivate
        are not implemented in infernalis

9 years agoMerge pull request #7038 from dillaman/wip-14121-infernalis
Josh Durgin [Wed, 23 Dec 2015 18:47:30 +0000 (10:47 -0800)]
Merge pull request #7038 from dillaman/wip-14121-infernalis

tests: rebuild exclusive lock test should acquire exclusive lock

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agotests: rebuild exclusive lock test should acquire exclusive lock 7038/head
Jason Dillaman [Wed, 23 Dec 2015 15:31:07 +0000 (10:31 -0500)]
tests: rebuild exclusive lock test should acquire exclusive lock

Starting with Jewel, the object map will not be loaded until the
exclusive lock is acquired since it might be updated by the
lock owner.

Fixes: #14121
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
9 years agolog: Log.cc: Assign LOG_DEBUG priority to syslog calls 6993/head
Brad Hubbard [Mon, 7 Dec 2015 01:31:28 +0000 (11:31 +1000)]
log: Log.cc: Assign LOG_DEBUG priority to syslog calls

Fixes: #13993
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 8e93f3f45db681f82633ca695a7dc4e7bd030584)

9 years agomon/PGMonitor: MAX AVAIL is 0 if some OSDs' weight is 0 6907/head
Chengyuan Li [Fri, 20 Nov 2015 05:29:39 +0000 (22:29 -0700)]
mon/PGMonitor: MAX AVAIL is 0 if some OSDs' weight is 0

In get_rule_avail(), p->second can be 0 and still be used as a divisor.
The quotient is then infinity, which becomes a negative value when it is
converted to an integer.
So we should check the value of p->second before the calculation.

It fixes BUG #13840.

Signed-off-by: Chengyuan Li <chengyli@ebay.com>
(cherry picked from commit 18713e60edd1fe16ab571f7c83e6de026db483ca)
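
The guard can be illustrated with a simplified model. This is not the PGMonitor code: rule_avail, its arguments, and the "minimum of available space divided by weight" formula are assumptions chosen to show why a zero weight must be skipped. In Python the division would raise an exception instead of producing infinity, but the check is the same.

```python
def rule_avail(weights, avail_bytes):
    """Estimate usable space for a rule from per-OSD weights and free space.

    weights: dict mapping OSD id to its normalized weight in the rule.
    avail_bytes: dict mapping OSD id to free bytes.
    OSDs with weight 0 hold no data for the rule and are skipped, so they
    are never used as a divisor.
    """
    best = None
    for osd, weight in weights.items():
        if weight <= 0:
            continue  # the guard: never divide by a zero weight
        projected = avail_bytes[osd] / weight
        if best is None or projected < best:
            best = projected
    return int(best) if best is not None else 0
```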

9 years agoMerge pull request #6395 from SUSE/wip-13593-infernalis
Abhishek Varshney [Wed, 9 Dec 2015 05:52:26 +0000 (11:22 +0530)]
Merge pull request #6395 from SUSE/wip-13593-infernalis

Ceph-fuse won't start correctly when the option log_max_new in ceph.conf set to zero

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
9 years agoMerge pull request #6828 from dachary/wip-ceph-disk-augeas
Loic Dachary [Tue, 8 Dec 2015 23:06:33 +0000 (00:06 +0100)]
Merge pull request #6828 from dachary/wip-ceph-disk-augeas

tests: ceph-disk workunit uses configobj

Reviewed-by: Loic Dachary <ldachary@redhat.com>
9 years agotests: ceph-disk workunit uses the ceph task 6828/head
Loic Dachary [Wed, 21 Oct 2015 23:48:31 +0000 (01:48 +0200)]
tests: ceph-disk workunit uses the ceph task

The ceph-disk workunit deploys keys that are not deployed by default by
the ceph teuthology task.

The OSDs created by the ceph task are removed from the default
bucket (via osd rm) so they do not interfere with the tests.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit 163de5b0f8f46695ab41b3f2288e9b5c1feaedab)

9 years agotests: ceph-disk workunit uses configobj
Loic Dachary [Wed, 21 Oct 2015 22:21:49 +0000 (00:21 +0200)]
tests: ceph-disk workunit uses configobj

Instead of using augtool to modify the configuration file, use
configobj. It is also used by the install teuthology task. The .ini
lens (puppet lens really) is unable to read ini files created by
configobj.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit f4906a124cc194dccd855679a04a5c7ffc125a44)

9 years agoMerge pull request #6845 from dachary/wip-14019-infernalis
Loic Dachary [Tue, 8 Dec 2015 08:34:39 +0000 (09:34 +0100)]
Merge pull request #6845 from dachary/wip-14019-infernalis

infernalis: libunwind package missing on CentOS 7

Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
9 years agobuild/ops: systemd ceph-disk unit must not assume /bin/flock 6852/head
Loic Dachary [Fri, 4 Dec 2015 20:11:09 +0000 (21:11 +0100)]
build/ops: systemd ceph-disk unit must not assume /bin/flock

The flock command may be installed elsewhere, depending on the
system. Let the PATH search figure that out.

http://tracker.ceph.com/issues/13975 Fixes: #13975

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit c8f7d44c935bd097db7d131b785bdab78a7a650c)
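
The same idea, resolving a command through PATH instead of hard-coding /bin/flock, looks like this in Python. It is only an analogy to the unit file change; resolve_command is a hypothetical helper.

```python
import shutil


def resolve_command(name):
    """Look a command up in PATH; return its full path or None if absent."""
    return shutil.which(name)
```

On one system resolve_command('flock') may return /usr/bin/flock and on another /bin/flock, which is why the location should not be assumed.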

9 years agoosd: Test osd_find_best_info_ignore_history_les config in another assert 6851/head
David Zafman [Thu, 3 Dec 2015 22:52:24 +0000 (14:52 -0800)]
osd: Test osd_find_best_info_ignore_history_les config in another assert

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 02a9a41f151a3d968bf8066749658659dc6e3ac4)

9 years agoosd: call on_new_interval on newly split child PG 6849/head
Sage Weil [Wed, 2 Dec 2015 19:50:28 +0000 (14:50 -0500)]
osd: call on_new_interval on newly split child PG

We must call on_new_interval() on any interval change *and* on the
creation of the PG.  Currently we call it from PG::init() and
PG::start_peering_interval().  However, PG::split_into() did not
do so for the child PG, which meant that the new child feature
bits were not properly initialized and the bitwise/nibblewise
debug bit was not correctly set.  That, in turn, could lead to
various misbehaviors, the most obvious of which is scrub errors
due to the sort order mismatch.

Fixes: #13962
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit fb120d7b2da5715e7f7d1baa65bfa70d2e5d807a)

9 years agoFileStore: potential memory leak if _fgetattrs fails 6846/head
xiexingguo [Mon, 26 Oct 2015 10:38:01 +0000 (18:38 +0800)]
FileStore: potential memory leak if _fgetattrs fails

Memory leak happens if _fgetattrs encounters some error and simply returns.
Fixes: #13597
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit ace7dd096b58a88e25ce16f011aed09269f2a2b4)

9 years agobuild/ops: enable CR in CentOS 7 6845/head
Loic Dachary [Tue, 8 Dec 2015 07:02:56 +0000 (08:02 +0100)]
build/ops: enable CR in CentOS 7

To get libunwind from the CR repositories until CentOS 7.2.1511 is released.

http://tracker.ceph.com/issues/13997 Fixes: #13997

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 247ee6084b58861da601d349bdba739b252d96de)

9 years agoObjecter: remove redundant result-check of _calc_target in _map_session. 6840/head
xiexingguo [Mon, 2 Nov 2015 13:46:11 +0000 (21:46 +0800)]
Objecter: remove redundant result-check of _calc_target in _map_session.

Result-code check is currently redundant since _calc_target never returns a negative value.
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 5a6117e667024f51e65847f73f7589467b6cb762)

9 years agoObjecter: potential null pointer access when do pool_snap_list.
xiexingguo [Thu, 29 Oct 2015 09:32:50 +0000 (17:32 +0800)]
Objecter: potential null pointer access when do pool_snap_list.

Objecter: potential null pointer access when do pool_snap_list. Shall check pool existence first.
Fixes: #13639
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 865541605b6c32f03e188ec33d079b44be42fa4a)

9 years agoauth/cephx: large amounts of log are produced by osd 6836/head
qiankunzheng [Thu, 5 Nov 2015 12:29:49 +0000 (07:29 -0500)]
auth/cephx: large amounts of log are produced by osd
If the auth of osd is deleted when the osd is running, the osd will produce large amounts of log.

Fixes: #13610
Signed-off-by: Qiankun Zheng <zheng.qiankun@h3c.com>
(cherry picked from commit 102f0b19326836e3b0754b4d32da89eb2bc0b03c)

9 years agoinit-ceph: fix systemd-run can't start ceph daemon sometimes 6833/head
wangchaunhong [Tue, 20 Oct 2015 10:40:23 +0000 (18:40 +0800)]
init-ceph: fix systemd-run can't start ceph daemon sometimes

Fixes: #13474
Signed-off-by: Chuanhong Wang <wang.chuanhong@zte.com.cn>
(cherry picked from commit 2f36909e1e08bac993e77d1781a777b386335669)

9 years agotests: test/librados/test.cc must create profile 6449/head
Loic Dachary [Mon, 2 Nov 2015 23:21:51 +0000 (00:21 +0100)]
tests: test/librados/test.cc must create profile

Now that the create_one_ec_pool function removes the testprofile each
time it is called, it must create the testprofile erasure code profile
again for the test to use.

http://tracker.ceph.com/issues/13664 Refs: #13664

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit a60342942b5a42ee04d59af77a6b904ce62eefc4)

9 years agotests: destroy testprofile before creating one
Loic Dachary [Mon, 2 Nov 2015 19:24:51 +0000 (20:24 +0100)]
tests: destroy testprofile before creating one

The testprofile erasure code profile is destroyed before creating a new
one so that it does not fail when another testprofile erasure code
profile already exists with different parameters.

This must be done when creating erasure coded pools with the C++
interface, in the same way it's done with the C interface.

http://tracker.ceph.com/issues/13664 Fixes: #13664

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit 47abab9a6f182aa0abe5047c04402850379bcd6d)

9 years agotests: add destroy_ec_profile{,_pp} helpers
Loic Dachary [Mon, 2 Nov 2015 19:23:52 +0000 (20:23 +0100)]
tests: add destroy_ec_profile{,_pp} helpers

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit ab46d79bc09fc711fa35302f49eecac81a98519b)

9 years agorbdmap: systemd support 6500/head
Boris Ranto [Mon, 2 Nov 2015 13:07:47 +0000 (14:07 +0100)]
rbdmap: systemd support

Fixes: #13374
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit 9224ac2ad25f7d017916f58b642c0ea25305c3e5)

9 years agorbdmap: Move do_map and do_unmap shell functions to rbdmap script
Boris Ranto [Fri, 30 Oct 2015 17:33:36 +0000 (18:33 +0100)]
rbdmap: Move do_map and do_unmap shell functions to rbdmap script

This patch creates an rbdmap shell script that is called from the
init-rbdmap init script. The patch also renames the src/rbdmap
configuration file to src/etc-rbdmap so that the rbdmap shell script can
be installed via the build system directly. Finally, the patch
accommodates these changes in the spec file and build system.

Fixes: #13374
Signed-off-by: Boris Ranto <branto@redhat.com>
(cherry picked from commit c0980af3c72f01e6f99fd1e7e91c446934d6d856)

Conflicts:
src/init-rbdmap
            Trivial resolution.

9 years agoMerge pull request #6634 from Abhishekvrshny/wip-13761-infernalis
Abhishek Varshney [Tue, 1 Dec 2015 12:14:24 +0000 (17:44 +0530)]
Merge pull request #6634 from Abhishekvrshny/wip-13761-infernalis

unknown argument --quiet in udevadm settle

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6650 from Abhishekvrshny/wip-13830-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:26:40 +0000 (21:56 +0530)]
Merge pull request #6650 from Abhishekvrshny/wip-13830-infernalis

init script reload doesn't work on EL7

Reviewed-by: Boris Ranto <branto@redhat.com>
9 years agoMerge pull request #6477 from SUSE/wip-13705-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:25:55 +0000 (21:55 +0530)]
Merge pull request #6477 from SUSE/wip-13705-infernalis

rbd : enable feature objectmap

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
9 years agoMerge pull request #6474 from SUSE/wip-13619-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:25:22 +0000 (21:55 +0530)]
Merge pull request #6474 from SUSE/wip-13619-infernalis

rbd clone issue

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
9 years agoMerge pull request #6633 from Abhishekvrshny/wip-13759-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:24:44 +0000 (21:54 +0530)]
Merge pull request #6633 from Abhishekvrshny/wip-13759-infernalis

rbd: pure virtual method called

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6632 from Abhishekvrshny/wip-13756-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:24:18 +0000 (21:54 +0530)]
Merge pull request #6632 from Abhishekvrshny/wip-13756-infernalis

QEMU hangs after creating snapshot and stopping VM

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6630 from Abhishekvrshny/wip-13754-infernalis
Abhishek Varshney [Mon, 30 Nov 2015 16:20:40 +0000 (21:50 +0530)]
Merge pull request #6630 from Abhishekvrshny/wip-13754-infernalis

Avoid re-writing old-format image header on resize
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6396 from SUSE/wip-13342-infernalis
Loic Dachary [Mon, 30 Nov 2015 14:01:23 +0000 (15:01 +0100)]
Merge pull request #6396 from SUSE/wip-13342-infernalis

ceph upstart script rbdmap.conf incorrectly processes parameters

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoosd: fix send_failures() locking 6694/head
Sage Weil [Fri, 18 Sep 2015 01:42:53 +0000 (21:42 -0400)]
osd: fix send_failures() locking

It is unsafe to check failure_queue.empty() without the lock.
Fixes: #13869
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b3ca828ae8ebc9068073494c46faf3e8e1443ada)
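
A minimal sketch of the pattern this commit describes, written in Python rather than Ceph's C++: the emptiness check happens only after the lock is held. The class, its attributes, and queue_failure() are hypothetical stand-ins; only the names send_failures and failure_queue come from the commit message.

```python
import threading
from collections import deque


class FailureReporter:
    """Hypothetical stand-in for the state touched by send_failures()."""

    def __init__(self):
        # Plays the role of whichever lock guards failure_queue.
        self._lock = threading.Lock()
        self._failure_queue = deque()
        self.sent = []

    def queue_failure(self, osd_id):
        with self._lock:
            self._failure_queue.append(osd_id)

    def send_failures(self):
        # Take the lock first, then test for emptiness. Checking before
        # locking would race with queue_failure() on another thread.
        with self._lock:
            if not self._failure_queue:
                return 0
            drained = list(self._failure_queue)
            self._failure_queue.clear()
            self.sent.extend(drained)
            return len(drained)
```

The early return sits inside the locked region, so the check and the drain see the same queue state.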

9 years agoinit-rbdmap: fix CMDPARAMS 6396/head
Sage Weil [Wed, 30 Sep 2015 12:29:05 +0000 (08:29 -0400)]
init-rbdmap: fix CMDPARAMS

Fixes: #13214
Reported-by: Wyllys Ingersoll <wyllys.ingersoll@keepertech.com>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 02113ac449cd7631f1c9a3840c94bbf253c052bd)

9 years agorgw: fix reload on non Debian systems. 6650/head
Herve Rousseau [Fri, 6 Nov 2015 08:52:28 +0000 (09:52 +0100)]
rgw: fix reload on non Debian systems.

When using reload on non-Debian systems, /bin/sh's kill is used to send the HUP signal to the radosgw process.
This version of kill doesn't accept -SIGHUP as a valid signal; -HUP does work.

Fix: #13709
Backport: hammer
Signed-off-by: Hervé Rousseau <hroussea@cern.ch>
(cherry picked from commit 1b000abac3a02d1e788bf25eead4b6873133f5d2)

9 years agokrbd: remove deprecated --quiet param from udevadm 6634/head
Jason Dillaman [Tue, 27 Oct 2015 14:13:27 +0000 (10:13 -0400)]
krbd: remove deprecated --quiet param from udevadm

This parameter has been removed since systemd 213, so this
affects Fedora 21+, Debian Jessie, and potentially future
releases of RHEL 7.

Fixes: #13560
Backport: hammer, infernalis
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 4300f2a9fe29627eea580564ff2d576de3647467)

9 years agorun_cmd: close parent process console file descriptors
Jason Dillaman [Tue, 27 Oct 2015 14:12:34 +0000 (10:12 -0400)]
run_cmd: close parent process console file descriptors

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f46f7dc94139c0bafe10361622416d7dc343d31f)

9 years agoWorkQueue: new PointerWQ base class for ContextWQ 6633/head
Jason Dillaman [Tue, 7 Jul 2015 16:11:13 +0000 (12:11 -0400)]
WorkQueue: new PointerWQ base class for ContextWQ

The existing work queues do not properly function if added to a running
thread pool.  librbd uses a singleton thread pool which requires
dynamically adding/removing work queues as images are opened and closed.

Fixes: #13636
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3e78b18b09d75626ca2599bac3b9f9c9889507a5)

9 years agolibrbd: fixed deadlock while attempting to flush AIO requests 6632/head
Jason Dillaman [Mon, 9 Nov 2015 16:22:24 +0000 (11:22 -0500)]
librbd: fixed deadlock while attempting to flush AIO requests

In-flight AIO requests might force a flush if a snapshot was created
out-of-band.  The flush completion was previously invoked asynchronously,
potentially via the same thread worker handling the AIO request. This
resulted in the flush operation deadlocking since it can't complete.

Fixes: #13726
Backport: infernalis, hammer
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit bfeb90e5fe24347648c72345881fd3d932243c98)

9 years agotests: new test case to catch deadlock on RBD image refresh
Jason Dillaman [Mon, 9 Nov 2015 15:48:10 +0000 (10:48 -0500)]
tests: new test case to catch deadlock on RBD image refresh

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit a9729d9553e7fb925509cad8d388cf52a9fede9c)

9 years agolibrbd: resize should only update image size within header 6630/head
Jason Dillaman [Mon, 2 Nov 2015 21:50:19 +0000 (16:50 -0500)]
librbd: resize should only update image size within header

Previously, the whole RBD image format 1 header struct was
re-written to disk on a resize operation.

Fixes: #13674
Backport: infernalis, hammer, firefly
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit d5be20b6d4646284571568ab28cbf45b0729390b)
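
A rough illustration of the idea, in Python: update only the size field at its offset instead of re-serializing the whole header. The header layout below (HEADER, SIZE_OFFSET) is invented for the example and is not the real RBD format 1 on-disk struct; resize_in_place is likewise a hypothetical helper.

```python
import io
import struct

# Hypothetical layout for illustration only: name, image_size, snap_count.
HEADER = struct.Struct("<16sQI")
SIZE_OFFSET = 16                  # byte offset of image_size in this layout
SIZE_FIELD = struct.Struct("<Q")


def resize_in_place(dev, new_size):
    """Overwrite only the image_size field, leaving other bytes untouched."""
    dev.seek(SIZE_OFFSET)
    dev.write(SIZE_FIELD.pack(new_size))


# Usage: the name and snap_count bytes are never rewritten.
dev = io.BytesIO(HEADER.pack(b"img", 1024, 5))
resize_in_place(dev, 4096)
name, size, snap_count = HEADER.unpack(dev.getvalue())
```

Writing a single field means a resize cannot clobber other header fields with stale values, which is presumably the motivation for the change.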

9 years agoObjecter: pool_op callback may hang forever. 6627/head
xiexingguo [Thu, 29 Oct 2015 12:04:11 +0000 (20:04 +0800)]
Objecter: pool_op callback may hang forever.

pool_op callback may hang forever due to osdmap update during reply handling.
Fixes: #13642
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 00c6fa9e31975a935ed2bb33a099e2b4f02ad7f2)

9 years agocrush/mapper: ensure take bucket value is valid 6626/head
Sage Weil [Tue, 13 Oct 2015 13:55:01 +0000 (09:55 -0400)]
crush/mapper: ensure take bucket value is valid

Ensure that the take argument is a valid bucket ID before indexing the
buckets array.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 93ec538e8a667699876b72459b8ad78966d89c61)

9 years agocrush/mapper: ensure bucket id is valid before indexing buckets array
Sage Weil [Wed, 28 Oct 2015 00:55:26 +0000 (20:55 -0400)]
crush/mapper: ensure bucket id is valid before indexing buckets array

We were indexing the buckets array without verifying the index was within
the [0,max_buckets) range.  This could happen because a multistep rule
does not have enough buckets and has CRUSH_ITEM_NONE
for an intermediate result, which would feed in CRUSH_ITEM_NONE and
make us crash.

Fixes: #13477
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 976a24a326da8931e689ee22fce35feab5b67b76)
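
The range check described above can be sketched as follows, in Python rather than the mapper's C. This assumes the CRUSH convention that bucket ids are negative and map to array index -1 - id, and that CRUSH_ITEM_NONE is 0x7fffffff; bucket_index and lookup_bucket are hypothetical helper names, not functions from crush/mapper.

```python
CRUSH_ITEM_NONE = 0x7FFFFFFF  # assumed sentinel value for "no result"


def bucket_index(item_id, max_buckets):
    """Return the buckets-array index for item_id, or None if out of range."""
    index = -1 - item_id
    # Reject anything outside [0, max_buckets) before it is used as an index.
    if index < 0 or index >= max_buckets:
        return None
    return index


def lookup_bucket(buckets, item_id):
    """Return the bucket for item_id, or None when the id is not valid."""
    index = bucket_index(item_id, len(buckets))
    if index is None:
        return None
    return buckets[index]
```

With this check, an intermediate CRUSH_ITEM_NONE yields "no bucket" instead of an out-of-bounds access.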

9 years agolibrbd : fix enable objectmap feature issue 6477/head
xinxin shu [Wed, 21 Oct 2015 11:01:21 +0000 (19:01 +0800)]
librbd : fix enable objectmap feature issue

Fixes: #13558
Signed-off-by: xinxin shu <xinxin.shu@intel.com>
(cherry picked from commit b0536ebab4e1f34e1ed87fe5efbb00d0f7b48abb)

9 years agorbd: fix clone issue when we specify image feature 6474/head
xinxin shu [Wed, 21 Oct 2015 06:56:17 +0000 (14:56 +0800)]
rbd: fix clone issue when we specify image feature

Fixes: #13553
Signed-off-by: xinxin shu <xinxin.shu@intel.com>
(cherry picked from commit da48dbb8f8c9417343d2ca7819c58b7c46ef7ad0)

9 years ago9.2.0 v9.2.0
Jenkins Build Slave User [Tue, 3 Nov 2015 16:58:32 +0000 (16:58 +0000)]
9.2.0

9 years agoMerge pull request #6444 from liewegas/wip-pg-key
Samuel Just [Mon, 2 Nov 2015 16:17:19 +0000 (08:17 -0800)]
Merge pull request #6444 from liewegas/wip-pg-key

osd/PG: tolerate missing epoch key

Reviewed-by: Samuel Just <sjust@redhat.com>
9 years agoosd/PG: tolerate missing epoch key 6444/head
Sage Weil [Sat, 24 Oct 2015 23:51:15 +0000 (19:51 -0400)]
osd/PG: tolerate missing epoch key

An orphan PG may have an info attr but no epoch key.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6344 from dillaman/wip-13559-infernalis
Jason Dillaman [Sat, 31 Oct 2015 03:41:34 +0000 (23:41 -0400)]
Merge pull request #6344 from dillaman/wip-13559-infernalis

librbd: potential assertion failure during cache read

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6363 from ceph/wip-init-rbdmap
Kefu Chai [Thu, 29 Oct 2015 15:01:17 +0000 (23:01 +0800)]
Merge pull request #6363 from ceph/wip-init-rbdmap

Drop redhat-lsb-core dependency

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
9 years agoOSD:shall reset primary and up_primary fields when beginning a new past_interval. 6397/head
xiexingguo [Tue, 13 Oct 2015 06:04:20 +0000 (14:04 +0800)]
OSD:shall reset primary and up_primary fields when beginning a new past_interval.

Reset the primary and up_primary fields when starting a new past_interval in OSD::build_past_intervals_parallel().
Fixes: #13471
Signed-off-by: xie.xingguo@zte.com.cn
(cherry picked from commit 65064ca05bc7f8b6ef424806d1fd14b87add62a4)

9 years agoceph-fuse.cc: While starting ceph-fuse, start the log thread first 6395/head
wenjunhuang [Sat, 10 Oct 2015 06:30:56 +0000 (14:30 +0800)]
ceph-fuse.cc: While starting ceph-fuse, start the log thread first

http://tracker.ceph.com/issues/13443 Fixes: #13443

Signed-off-by: Wenjun Huang <wenjunhuang@tencent.com>
(cherry picked from commit f2763085754462610730a23bb5652237714abc2a)

9 years agomon: should not set isvalid = true when cephx_verify_authorizer return false 6392/head
yangruifeng [Mon, 19 Oct 2015 12:08:12 +0000 (08:08 -0400)]
mon: should not set isvalid = true when cephx_verify_authorizer return false

Fixes: #13525
Signed-off-by: Ruifeng Yang <yangruifeng.09209@h3c.com>
(cherry picked from commit c7f75b8f7c0a773148ec16141941efd00ee76626)

9 years agoMerge pull request #6366 from liewegas/wip-osd-fixboot-infernalis
Samuel Just [Mon, 26 Oct 2015 19:23:40 +0000 (12:23 -0700)]
Merge pull request #6366 from liewegas/wip-osd-fixboot-infernalis

osd: fix OSDService vs Objecter init order

Reviewed-by: Samuel Just <sjust@redhat.com>
9 years agoosd: fix OSDService vs Objecter init order 6366/head
Sage Weil [Fri, 23 Oct 2015 17:27:39 +0000 (13:27 -0400)]
osd: fix OSDService vs Objecter init order

This reverts c7d96a5ed1d2cb844622af29b13705b8f7be6be7, but still keeps
the Objecter init *after* we have authenticated.  This way we don't
crash when we get mon messages like MOSDPGCreate, and we also don't
request maps we aren't prepared to handle.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoceph.spec.in: We no longer need redhat-lsb-core 6363/head
Boris Ranto [Fri, 23 Oct 2015 14:39:16 +0000 (16:39 +0200)]
ceph.spec.in: We no longer need redhat-lsb-core

Drop the redhat-lsb-core dependency as it is no longer necessary on
fedora/rhel.

The other two init scripts do not use redhat-lsb-core either. The
init-ceph.in conditionally requires /lib/lsb/init-functions and does not
use any of the functions defined in that file (at least not directly).
The init-radosgw file includes /etc/rc.d/init.d/functions on non-debian
platforms instead of /lib/lsb/init-functions file so it does not require
redhat-lsb-core either.

Signed-off-by: Boris Ranto <branto@redhat.com>
9 years agoinit-rbdmap: Rewrite to use logger + clean-up
Boris Ranto [Fri, 23 Oct 2015 13:31:27 +0000 (15:31 +0200)]
init-rbdmap: Rewrite to use logger + clean-up

This patch rewrites the init-rbdmap init script so that it uses logger
instead of the log_* functions. The patch also fixes various smaller
bugs like:
* MAP_RV was undefined if mapping already existed
* UMNT_RV and UMAP_RV were almost always empty (if they succeeded) ->
  removed them
* use of continue instead of RET_OP in various places (RET_OP was not being
  checked after the switch to logger messages)
* removed use of DESC (used only twice and only one occurrence actually
  made sense)

Signed-off-by: Boris Ranto <branto@redhat.com>
9 years agolibrbd: potential assertion failure during cache read 6344/head
Jason Dillaman [Wed, 21 Oct 2015 17:12:48 +0000 (13:12 -0400)]
librbd: potential assertion failure during cache read

It's possible for a cache read from a clone to trigger a writeback if a
previous read op determined the object doesn't exist in the clone,
followed by a cached write to the non-existent clone object, followed
by another read request to the same object.  This causes the cache to
flush the pending writeback ops while not holding the owner lock.

Fixes: #13559
Backport: hammer
Signed-off-by: Jason Dillaman <dillaman@redhat.com>