git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
8 years agomerge ceph-qa-suite 12454/head
Sage Weil [Wed, 14 Dec 2016 17:29:59 +0000 (11:29 -0600)]
merge ceph-qa-suite

8 years agomove ceph-qa-suite dirs into qa/
Sage Weil [Wed, 14 Dec 2016 17:29:55 +0000 (11:29 -0600)]
move ceph-qa-suite dirs into qa/

8 years agoRevert "tasks/workunit.py: depth 1 clone"
Sage Weil [Wed, 14 Dec 2016 17:27:58 +0000 (12:27 -0500)]
Revert "tasks/workunit.py: depth 1 clone"

This reverts commit e6f61ea9f19d0f1fad4a6547775fa80616eeeb89.

8 years agotasks/workunit.py: depth 1 clone
Sage Weil [Wed, 14 Dec 2016 17:19:44 +0000 (12:19 -0500)]
tasks/workunit.py: depth 1 clone

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4faf77a649cb3f8ddf497ca81937b3dbf63a18dc)

8 years agotasks/workunit: remove kludge to use git.ceph.com
Sage Weil [Wed, 14 Dec 2016 17:18:29 +0000 (12:18 -0500)]
tasks/workunit: remove kludge to use git.ceph.com

This was hard-coded to ceph.git (almost) and breaks when
you specify --ceph-repo.  Remove it entirely.  We'll see if
github.com is better at handling our load than it used to
be!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 159c455a0326eef2c017b3e3cf510f918b5ec76c)

8 years agotasks/ceph: restore context of osd mount path before mkfs
Kefu Chai [Fri, 9 Dec 2016 18:36:52 +0000 (02:36 +0800)]
tasks/ceph: restore context of osd mount path before mkfs

All newly created files and directories under the mount dir inherit the
SELinux type of their parent directory, so we need to set it before
mkfs.

Fixes: http://tracker.ceph.com/issues/16800
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 53225d5272a1d35d4183fcfa55a139f55f77e122)

8 years ago10.2.5 v10.2.5
Jenkins Build Slave User [Fri, 9 Dec 2016 20:08:24 +0000 (20:08 +0000)]
10.2.5

8 years agoMerge pull request #12376 from liewegas/wip-msgr-eagain-loop-jewel
Samuel Just [Thu, 8 Dec 2016 15:55:27 +0000 (07:55 -0800)]
Merge pull request #12376 from liewegas/wip-msgr-eagain-loop-jewel

msg/simple/Pipe: avoid returning 0 on poll timeout

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
8 years agomsg/simple/Pipe: avoid returning 0 on poll timeout 12376/head
Sage Weil [Thu, 8 Dec 2016 00:25:55 +0000 (18:25 -0600)]
msg/simple/Pipe: avoid returning 0 on poll timeout

If poll times out it will return 0 (no data to read on socket).  In
165e5abdbf6311974d4001e43982b83d06f9e0cc we changed tcp_read_wait from
returning -1 to returning -errno, which means we return 0 instead of -1
in this case.

This makes tcp_read() get into an infinite loop by repeatedly trying to
read from the socket and getting EAGAIN.

Fix by explicitly checking for a 0 return from poll(2) and returning
EAGAIN in that case.

Fixes: http://tracker.ceph.com/issues/18184
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6c3d015c6854a12cda40673848813d968ff6afae)
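
The timeout handling described above can be sketched in Python. This is an illustration only, not the C++ code in msg/simple/Pipe: the function name is borrowed from the commit message, the -EIO mapping for error events is an assumption, and Python's poll() reports a timeout as an empty list where poll(2) returns 0.

```python
import errno
import select
import socket


def tcp_read_wait(sock, timeout_ms):
    """Return 0 when sock is readable, otherwise a negative errno value."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    events = poller.poll(timeout_ms)
    if not events:
        # Timeout: no descriptor became ready. Returning 0 here would tell
        # the caller that data is available, so report EAGAIN explicitly.
        return -errno.EAGAIN
    _fd, revents = events[0]
    if revents & select.POLLIN:
        return 0
    # Assumed mapping for error or hangup events without readable data.
    return -errno.EIO
```

A caller that loops on a 0 return now sees -EAGAIN on a timeout and can give up instead of retrying the read forever.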

8 years ago10.2.4 v10.2.4
Jenkins Build Slave User [Mon, 5 Dec 2016 22:15:20 +0000 (22:15 +0000)]
10.2.4

8 years agoMerge pull request #12167 from liewegas/wip-osdmap-encoding-jewel
Loic Dachary [Mon, 5 Dec 2016 13:50:23 +0000 (14:50 +0100)]
Merge pull request #12167 from liewegas/wip-osdmap-encoding-jewel

jewel: osd: condition OSDMap encoding on features

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #1290 from SUSE/wip-18014-jewel
Nathan Cutler [Sun, 4 Dec 2016 10:46:45 +0000 (11:46 +0100)]
Merge pull request #1290 from SUSE/wip-18014-jewel

thrashosds: try ceph-objectstore-tool for 10 minutes

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agothrashosds: try ceph-objectstore-tool for 10 minutes
Nathan Cutler [Thu, 24 Nov 2016 10:25:35 +0000 (11:25 +0100)]
thrashosds: try ceph-objectstore-tool for 10 minutes

If ceph-objectstore-tool binary is not present, it's likely because we're in
the middle of an upgrade. Do not try to run the binary until we verify that
it's really present. If it is absent, spend up to 10 minutes waiting for it to
appear.

Before this patch there was quite a large window for a race to occur. This
patch doesn't entirely eliminate it, but drastically reduces it.

Fixes: http://tracker.ceph.com/issues/18014
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 862b47faac1fc9f05ee3322ee4b65cf3d3d666c5)
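
A minimal sketch of the wait-until-present idea, assuming a local filesystem check; the real task runs against a remote test node, and the function name, polling interval, and injectable clock are hypothetical.

```python
import os
import time


def wait_for_binary(path, timeout=600, interval=5,
                    sleep=time.sleep, clock=time.monotonic):
    """Return True once path is an executable file, False after timeout seconds."""
    deadline = clock() + timeout
    while True:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return True
        if clock() >= deadline:
            return False
        # The binary may be missing in the middle of an upgrade; try again later.
        sleep(interval)
```

The default timeout of 600 seconds matches the 10 minutes named in the commit. As the message notes, a check like this narrows the race window but does not close it, because the binary can still disappear between the check and the run.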

8 years agoMerge pull request #12067 from SUSE/wip-17953-jewel
Loic Dachary [Sat, 3 Dec 2016 09:57:18 +0000 (10:57 +0100)]
Merge pull request #12067 from SUSE/wip-17953-jewel

jewel: mon: OSDMonitor: only reject MOSDBoot based on up_from if inst matches

Reviewed-by: Samuel Just <sjust@redhat.com>
8 years agoMerge pull request #1297 from ceph/wip-14.04
Zack Cerza [Fri, 2 Dec 2016 20:25:32 +0000 (13:25 -0700)]
Merge pull request #1297 from ceph/wip-14.04

suites/rados: s/trusty/"14.04"/

8 years agoOSDMonitor: only reject MOSDBoot based on up_from if inst matches 12067/head
Samuel Just [Mon, 14 Nov 2016 19:50:23 +0000 (11:50 -0800)]
OSDMonitor: only reject MOSDBoot based on up_from if inst matches

If the osd actually restarts, there is no guarantee that the epoch will
advance past up_from.  If the inst is different, it can't really be a
dup.  At worst, it might be a queued MOSDBoot from a previous inst, but
in that case, the real inst would see itself marked up, and then back
down causing it to try booting again.

Fixes: http://tracker.ceph.com/issues/17899
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 033ad5b46c0492134e72a8372e44e3ef1358d2df)

8 years agoMerge pull request #12207 from jdurgin/wip-librados-setxattr-overload-jewel
Josh Durgin [Fri, 2 Dec 2016 16:16:27 +0000 (08:16 -0800)]
Merge pull request #12207 from jdurgin/wip-librados-setxattr-overload-jewel

librados: remove new setxattr overload to avoid breaking the C++ ABI

Reviewed-by: Sage Weil <sage@redhat.com>
8 years agosuites/rados: s/trusty/"14.04"/
Sage Weil [Fri, 2 Dec 2016 14:37:09 +0000 (09:37 -0500)]
suites/rados: s/trusty/"14.04"/

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1296 from dachary/wip-ceph-coverage-jewel
Loic Dachary [Fri, 2 Dec 2016 10:42:34 +0000 (11:42 +0100)]
Merge pull request #1296 from dachary/wip-ceph-coverage-jewel

upgrade: ceph-test is needed for ceph-coverage

Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
8 years agoupgrade/infernalis-client-x: ceph-test is needed for ceph-coverage
Loic Dachary [Fri, 2 Dec 2016 09:32:42 +0000 (10:32 +0100)]
upgrade/infernalis-client-x: ceph-test is needed for ceph-coverage

Do not exclude the ceph-test package otherwise the ceph-coverage
executable is not installed.

Fixes: http://tracker.ceph.com/issues/16506
Signed-off-by: Loic Dachary <loic@dachary.org>
8 years agoupgrade: ceph-test is needed for ceph-coverage
Loic Dachary [Fri, 2 Dec 2016 09:27:25 +0000 (10:27 +0100)]
upgrade: ceph-test is needed for ceph-coverage

Do not exclude the ceph-test package otherwise the ceph-coverage
executable is not installed.

Fixes: http://tracker.ceph.com/issues/16506
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 8122494530589e27df929652e38c74137d6d823a)

8 years agoMerge pull request #12267 from dachary/wip-17904-jewel
Loic Dachary [Fri, 2 Dec 2016 09:01:39 +0000 (10:01 +0100)]
Merge pull request #12267 from dachary/wip-17904-jewel

 jewel: Error EINVAL: removing mon.a at 172.21.15.16:6789/0, there will be 1 monitors

Reviewed-by: Samuel Just <sjust@redhat.com>
8 years agoMerge pull request #1284 from ceph/jewel-name-limits
Sage Weil [Thu, 1 Dec 2016 21:57:20 +0000 (16:57 -0500)]
Merge pull request #1284 from ceph/jewel-name-limits

drop broken name length config args

8 years agoMerge pull request #1292 from ceph/jewel-avoid-xenial
Sage Weil [Thu, 1 Dec 2016 21:56:55 +0000 (16:56 -0500)]
Merge pull request #1292 from ceph/jewel-avoid-xenial

rados: avoid xenial for upgrade tests

8 years agomon: MonmapMonitor: drop unnecessary 'goto' statements 12267/head
Joao Eduardo Luis [Wed, 2 Nov 2016 15:38:36 +0000 (15:38 +0000)]
mon: MonmapMonitor: drop unnecessary 'goto' statements

Signed-off-by: Joao Eduardo Luis <joao@suse.de>
(cherry picked from commit 20dcb597e35e6961db81831facefbe22cecddec3)

8 years agomon: MonmapMonitor: return success when monitor will be removed
Joao Eduardo Luis [Wed, 2 Nov 2016 15:33:52 +0000 (15:33 +0000)]
mon: MonmapMonitor: return success when monitor will be removed

Fixes: http://tracker.ceph.com/issues/17725
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
(cherry picked from commit c9d46cfbf2512bc3495c6901de2b8f711bef9bae)

8 years agoMerge pull request #12001 from dachary/wip-17915-jewel
Samuel Just [Thu, 1 Dec 2016 19:08:04 +0000 (11:08 -0800)]
Merge pull request #12001 from dachary/wip-17915-jewel

jewel: filestore: can get stuck in an unbounded loop during scrub

Reviewed-by: Samuel Just <sjust@redhat.com>
8 years agoupgrade/client-upgrade: correct distros/ location
Sage Weil [Thu, 1 Dec 2016 15:03:49 +0000 (10:03 -0500)]
upgrade/client-upgrade: correct distros/ location

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/client-upgrade: fix distro symlinks
Sage Weil [Thu, 1 Dec 2016 14:53:53 +0000 (09:53 -0500)]
upgrade/client-upgrade: fix distro symlinks

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/client-upgrade: specify centos or trusty (not xenial)
Sage Weil [Wed, 30 Nov 2016 16:55:45 +0000 (11:55 -0500)]
upgrade/client-upgrade: specify centos or trusty (not xenial)

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agorados: avoid ubuntu xenial on upgrade tests
Sage Weil [Wed, 30 Nov 2016 16:00:39 +0000 (11:00 -0500)]
rados: avoid ubuntu xenial on upgrade tests

Not all of the older package builds are present for xenial.

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1152 from ceph/wip-objecstore
Sage Weil [Wed, 30 Nov 2016 02:54:15 +0000 (21:54 -0500)]
Merge pull request #1152 from ceph/wip-objecstore

rados/objectstore/objectstore.yaml: skip bluestore tests

8 years agoMerge pull request #1288 from dachary/wip-shec-upgrade-jewel
Sage Weil [Tue, 29 Nov 2016 15:36:22 +0000 (10:36 -0500)]
Merge pull request #1288 from dachary/wip-shec-upgrade-jewel

upgrade/hammer-x: verify shec before the full upgrade

Reviewed-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/hammer-x: verify shec before the full upgrade
Loic Dachary [Tue, 29 Nov 2016 08:49:15 +0000 (09:49 +0100)]
upgrade/hammer-x: verify shec before the full upgrade

The hammer-x/stress-split-erasure-code upgrade sequence comes from
hammer-x/stress-split and was modified to fully upgrade the cluster. It
previously upgraded only half of it. Verifying that the shec plugin is
not available and that trying to set it does not crash the OSD or the
MON must be tried before the upgrade is complete.

Signed-off-by: Loic Dachary <loic@dachary.org>
8 years agolibrados: remove new setxattr overload to avoid breaking the C++ ABI 12207/head
Josh Durgin [Tue, 29 Nov 2016 06:06:56 +0000 (22:06 -0800)]
librados: remove new setxattr overload to avoid breaking the C++ ABI

Fixes: http://tracker.ceph.com/issues/18058
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit b8ff781ddcf737882163cf56d7c9b11e815fb699)

Conflicts:
src/include/rados/librados.hpp (trivial namespace change in removed line)

8 years agoMerge pull request #1287 from ceph/jewel-failed-to-encode
Samuel Just [Mon, 28 Nov 2016 21:59:47 +0000 (13:59 -0800)]
Merge pull request #1287 from ceph/jewel-failed-to-encode

upgrade/hammer-x: encoding fixes (jewel)

Reviewed-by: Samuel Just <sjust@redhat.com>
8 years agocrush: condition latest tunable encoding on features 12167/head
Sage Weil [Wed, 23 Nov 2016 19:15:50 +0000 (14:15 -0500)]
crush: condition latest tunable encoding on features

This avoids throwing hammer OSDMap encodings off.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9e5ff86487bd1f5979866b5e16300dd4a3979f97)

8 years agocrush/CrushWrapper: encode with features
Sage Weil [Mon, 28 Nov 2016 19:35:53 +0000 (14:35 -0500)]
crush/CrushWrapper: encode with features

No behavior change yet; just fixing callers.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b7c9e055848c8aa951bc48c957cff3ef323ea441)

[Updated write_file to use all features]
[Updated OSDMonitor.cc to use mon->quorum_features instead of the
 mon->get_quorum_con_features() helper]
[trivial conflict from removed write_file and read_file]

Conflicts:
src/crush/CrushWrapper.h
src/mgr/PyModules.cc
src/mon/OSDMonitor.cc
src/tools/ceph_monstore_tool.cc

8 years agocrush/CrushWrapper: drop unused 'lean' encode() argument
Sage Weil [Mon, 28 Nov 2016 19:35:24 +0000 (14:35 -0500)]
crush/CrushWrapper: drop unused 'lean' encode() argument

No callers, no users.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 638a38bbb55c07ad0358a35a56418e66874d1c26)

Conflicts:
src/crush/CrushWrapper.h

[trivial conflict due to removal of write_file and read_file]

8 years agoupgrade/hammer-x/stress-split-*: disable sighup injection
Sage Weil [Mon, 28 Nov 2016 16:56:02 +0000 (11:56 -0500)]
upgrade/hammer-x/stress-split-*: disable sighup injection

We already did this for stress-split; do the same
here.  It triggers a File closed exception when the greenlet
is joined.

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/hammer-x/parallel: white 'failed to encode'
Sage Weil [Mon, 28 Nov 2016 15:29:40 +0000 (10:29 -0500)]
upgrade/hammer-x/parallel: white 'failed to encode'

The problem here has nothing to do with osdmap
encoding, but that hammer -> jewel makes the systemd
transition and installing the package starts
the mons.. before the osds.  I'm not sure what
the workaround for that is but the osdmap issue
appears okay, so ignore this for now.

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1285 from ceph/jewel-failed-to-encode
Yuri Weinstein [Sun, 27 Nov 2016 19:06:14 +0000 (11:06 -0800)]
Merge pull request #1285 from ceph/jewel-failed-to-encode

upgrade/hammer-x: do not whitelist 'failed to encode map'

8 years agoupgrade/hammer-x/parallel: upgrade osds first
Sage Weil [Sat, 26 Nov 2016 23:31:23 +0000 (18:31 -0500)]
upgrade/hammer-x/parallel: upgrade osds first

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/hammer-x: do not whitelist 'failed to encode map'
Sage Weil [Sat, 26 Nov 2016 23:37:37 +0000 (18:37 -0500)]
upgrade/hammer-x: do not whitelist 'failed to encode map'

Well, on parallel.  For the others, keep it in
place because we don't upgrade osds first (we are testing
other things).

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agodrop broken name length config args
Sage Weil [Thu, 5 May 2016 13:07:36 +0000 (09:07 -0400)]
drop broken name length config args

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2a44c3d20de9a75065c271e9ad8dfceeed1186d9)

8 years agoupgrade/hammer-x: fix symlinks
Sage Weil [Wed, 23 Nov 2016 21:40:38 +0000 (16:40 -0500)]
upgrade/hammer-x: fix symlinks

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoosd/osd_types: encode pg_pool_t like hammer if features indicate hammer
Sage Weil [Wed, 23 Nov 2016 18:51:59 +0000 (13:51 -0500)]
osd/osd_types: encode pg_pool_t like hammer if features indicate hammer

If the target features are missing the new OSDOp encoding, the
first feature we added post-hammer, encode like hammer.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2f8cfb632823ba4e63eaff394392d6af7979d7c8)

8 years agoosd/osd_types: conditional pg_pool_t encoding
Sage Weil [Wed, 23 Nov 2016 18:48:35 +0000 (13:48 -0500)]
osd/osd_types: conditional pg_pool_t encoding

Align this with decode.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 01d9e8a20bbc3c039f67b040da95018e2c7b00b6)

8 years agoMerge pull request #1273 from ceph/wip-whitelist-crc
Tamilarasi Muthamizhan [Wed, 23 Nov 2016 19:21:53 +0000 (11:21 -0800)]
Merge pull request #1273 from ceph/wip-whitelist-crc

whitelist CRC mismatch

8 years agoMerge pull request #1281 from ceph/wip-jewel-debug-fuse
Sage Weil [Wed, 23 Nov 2016 14:38:08 +0000 (09:38 -0500)]
Merge pull request #1281 from ceph/wip-jewel-debug-fuse

upgrade/hammer-x: debug mds

8 years agoupgrade/hammer-x: debug mds
Sage Weil [Wed, 23 Nov 2016 14:37:49 +0000 (09:37 -0500)]
upgrade/hammer-x: debug mds

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1280 from ceph/wip-jewel-debug-fuse
Sage Weil [Wed, 23 Nov 2016 14:37:00 +0000 (09:37 -0500)]
Merge pull request #1280 from ceph/wip-jewel-debug-fuse

upgrade/hammer-x: debug client

8 years agoupgrade/hammer-x: debug client
Sage Weil [Wed, 23 Nov 2016 14:36:19 +0000 (09:36 -0500)]
upgrade/hammer-x: debug client

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1272 from ceph/wip-rgw-sync-agent-retry-jewel
Loic Dachary [Tue, 22 Nov 2016 17:42:30 +0000 (18:42 +0100)]
Merge pull request #1272 from ceph/wip-rgw-sync-agent-retry-jewel

jewel: rgw: fix some races with radosgw and radosgw-agent startup

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agotasks.rgw: 'time' imported but unused
Owen Synge [Thu, 17 Nov 2016 10:37:59 +0000 (11:37 +0100)]
tasks.rgw: 'time' imported but unused

flake8 was failing.

Signed-off-by: Owen Synge <osynge@suse.com>
(cherry picked from commit ef1d2a6aabe91282e28dbb6200fb7c2fab816720)

8 years agoRestart OSDs that belong to first node only
Tamil Muthamizhan [Fri, 18 Nov 2016 21:43:30 +0000 (13:43 -0800)]
Restart OSDs that belong to first node only

Restart only the first half of the OSDs, as only the first node
is upgraded.

Signed-off-by: Tamil Muthamizhan <tmuthami@redhat.com>
8 years agoThis is triggering failures like
Tamil Muthamizhan [Fri, 18 Nov 2016 21:35:19 +0000 (13:35 -0800)]
This is triggering failures like

2016-11-18T01:17:08.865 INFO:tasks.ceph.osd.3:Stopping old one...
2016-11-18T01:17:08.865 DEBUG:tasks.ceph.osd.3:waiting for process to exit
2016-11-18T01:17:08.865 INFO:teuthology.orchestra.run:waiting for 300
2016-11-18T01:17:09.199 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/ceph-qa-suite_wip-whitelist-crc/tasks/ceph_manager.py", line 660, in wrapper
    return func(self)
  File "/home/teuthworker/src/ceph-qa-suite_wip-whitelist-crc/tasks/ceph_manager.py", line 677, in do_sighup
    self.ceph_manager.signal_osd(osd, signal.SIGHUP, silent=True)
  File "/home/teuthworker/src/ceph-qa-suite_wip-whitelist-crc/tasks/ceph_manager.py", line 1865, in signal_osd
    self.cluster).signal(sig, silent=silent)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 111, in signal
    self.proc.stdin.write(struct.pack('!b', sig))
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 377, in write
    raise IOError('File is closed')
IOError: File is closed

so trying to avoid this error.

Signed-off-by: Tamil Muthamizhan <tmuthami@redhat.com>
8 years agowhitelist CRC mismatch
Tamil Muthamizhan [Thu, 17 Nov 2016 21:41:32 +0000 (13:41 -0800)]
whitelist CRC mismatch

whitelisted CRC mismatch and added upgrade for second
half of the cluster

Signed-off-by: Tamil Muthamizhan <tmuthami@redhat.com>
8 years agorgw: remove unnecessary sleeps
Casey Bodley [Tue, 15 Nov 2016 19:42:23 +0000 (14:42 -0500)]
rgw: remove unnecessary sleeps

remove the sleeps that were added to address radosgw startup races

Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 3e68bfdbb29e933edb06f73c88eed176ffacc2e3)

8 years agorgw: start_rgw() polls gateway until it accepts connections
Casey Bodley [Tue, 15 Nov 2016 18:44:27 +0000 (13:44 -0500)]
rgw: start_rgw() polls gateway until it accepts connections

resolves various races between radosgw startup and further operations -
both within the rgw task itself (such as the 'radosgw-admin realm pull'),
and in later tasks

Fixes: http://tracker.ceph.com/issues/17794
Fixes: http://tracker.ceph.com/issues/17872
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 5e6538e623c0dea46203cd0c74201b7700f4767c)

8 years agorgw: add retry/backoff to sync agent requests
Casey Bodley [Tue, 15 Nov 2016 16:24:25 +0000 (11:24 -0500)]
rgw: add retry/backoff to sync agent requests

resolves an issue where startup of the radosgw-agent races with the
requests we send to it to run sync. uses the requests package with
urllib3 to add retry with backoff to these requests

Fixes: http://tracker.ceph.com/issues/16129
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 69bbafa804993d2a6cec608ac4f2eb4bfbb01753)

8 years agoMerge pull request #1266 from ceph/wip-add-point-jewel
Yuri Weinstein [Wed, 16 Nov 2016 18:06:11 +0000 (10:06 -0800)]
Merge pull request #1266 from ceph/wip-add-point-jewel

Added /upgrade/jewel-x/point-to-point-x

8 years agoAdded /upgrade/jewel-x/point-to-point-x
Yuri Weinstein [Tue, 15 Nov 2016 21:39:25 +0000 (21:39 +0000)]
Added /upgrade/jewel-x/point-to-point-x

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
8 years agoMerge pull request #1265 from ceph/wip-dont-whitelist
Yuri Weinstein [Tue, 15 Nov 2016 22:34:40 +0000 (14:34 -0800)]
Merge pull request #1265 from ceph/wip-dont-whitelist

DO NOT whitelist CRC mismatch

8 years agoDO NOT whitelist CRC mismatch
Tamil Muthamizhan [Tue, 15 Nov 2016 22:40:29 +0000 (14:40 -0800)]
DO NOT whitelist CRC mismatch

Signed-off-by: Tamil Muthamizhan <tmuthami@redhat.com>
8 years agoMerge pull request #1264 from ceph/jewel-osds-before-mon
Tamilarasi Muthamizhan [Tue, 15 Nov 2016 22:13:57 +0000 (14:13 -0800)]
Merge pull request #1264 from ceph/jewel-osds-before-mon

Jewel osds before mon

8 years agoos/filestore/HashIndex: fix list_by_hash_* termination on reaching end 12001/head
Sage Weil [Thu, 10 Nov 2016 18:56:24 +0000 (13:56 -0500)]
os/filestore/HashIndex: fix list_by_hash_* termination on reaching end

If we set *next to max, then the caller (a few lines up) doesn't terminate
the loop and will keep trying to list objects in every following hash
dir until it reaches the end of the collection.  In fact, if we have an
end bound we will never do an efficient listing unless we hit the max
first.

For one user, this was causing OSD suicides when scrub ran because it
wasn't able to list all objects before the timeout.  In general, this would
cause scrub to stall a PG for a long time and slow down requests.

Broken by refactor in 921c4586f165ce39c17ef8b579c548dc8f6f4500.

Fixes: http://tracker.ceph.com/issues/17859
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit c5180262a086c2d3895aff4bf0fb0ff9a6666149)

8 years agoupgrade/hammer-x: osds first
Sage Weil [Mon, 14 Nov 2016 23:13:04 +0000 (18:13 -0500)]
upgrade/hammer-x: osds first

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoupgrade/hammer-x/f-h-x-offline: osds first
Sage Weil [Mon, 14 Nov 2016 23:10:10 +0000 (18:10 -0500)]
upgrade/hammer-x/f-h-x-offline: osds first

Signed-off-by: Sage Weil <sage@redhat.com>
8 years agoMerge pull request #1252 from SUSE/wip-17683
Yuri Weinstein [Mon, 14 Nov 2016 16:10:00 +0000 (08:10 -0800)]
Merge pull request #1252 from SUSE/wip-17683

upgrade: disable ceph-objectstore-tool test in infernalis-x

8 years agoMerge pull request #1256 from ceph/wip-17734-jewel
Yuri Weinstein [Mon, 14 Nov 2016 16:08:34 +0000 (08:08 -0800)]
Merge pull request #1256 from ceph/wip-17734-jewel

upgrade/hammer-x/stress-split: set require_jewel_osds

8 years agoupgrade/hammer-x/stress-split: set require_jewel_osds
Loic Dachary [Mon, 14 Nov 2016 12:29:11 +0000 (13:29 +0100)]
upgrade/hammer-x/stress-split: set require_jewel_osds

It was missing, and the cluster stays permanently in WARNING state after
the upgrade of the OSDs.

Signed-off-by: Loic Dachary <loic@dachary.org>
8 years agoupgrade: disable ceph-objectstore-tool test in infernalis-x
Nathan Cutler [Fri, 11 Nov 2016 23:51:30 +0000 (00:51 +0100)]
upgrade: disable ceph-objectstore-tool test in infernalis-x

Fixes: http://tracker.ceph.com/issues/17683
Signed-off-by: Nathan Cutler <ncutler@suse.com>
8 years agoMerge pull request #1245 from ceph/wip-fix-infernalis-s
Nathan Cutler [Sat, 12 Nov 2016 20:04:23 +0000 (21:04 +0100)]
Merge pull request #1245 from ceph/wip-fix-infernalis-s

Added require_jewel_osds flag

Reviewed-by: Nathan Cutler <ncutler@suse.com>
8 years agoAdded require_jewel_osds flag
Yuri Weinstein [Thu, 10 Nov 2016 17:28:18 +0000 (17:28 +0000)]
Added require_jewel_osds flag
Added to point-to-point as well

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>
8 years agosuites/powercycle: no ext4
Sage Weil [Thu, 10 Nov 2016 22:34:19 +0000 (17:34 -0500)]
suites/powercycle: no ext4

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2a71071bc132c8d14306f8ef4ba6ab043426c85a)

8 years agorgw: add sleep to let the sync agent to init
Orit Wasserman [Tue, 7 Jun 2016 10:13:01 +0000 (12:13 +0200)]
rgw: add sleep to let the sync agent to init

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
(cherry picked from commit 54d311a991cfe658687d0f5a69f40718b7bea707)

8 years agorgw: add debug info when comparing bucket metadata
Orit Wasserman [Tue, 23 Aug 2016 14:27:50 +0000 (16:27 +0200)]
rgw: add debug info when comparing bucket metadata

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
(cherry picked from commit ad5a2a2f199f8fcecadf2c91b33c0530a26d8c3d)

8 years agoMerge pull request #1240 from ceph/wip-17734-jewel
Loic Dachary [Thu, 10 Nov 2016 10:25:35 +0000 (11:25 +0100)]
Merge pull request #1240 from ceph/wip-17734-jewel

upgrade/hammer-x: wait for osdmaps to propagate

8 years agoMerge pull request #11822 from SUSE/wip-17816-jewel
Loic Dachary [Wed, 9 Nov 2016 19:53:18 +0000 (20:53 +0100)]
Merge pull request #11822 from SUSE/wip-17816-jewel

jewel: Missing comma in ceph-create-keys causes concatenation of arguments

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoupgrade/hammer-x: wait for osdmaps to propagate
Loic Dachary [Wed, 9 Nov 2016 09:36:20 +0000 (10:36 +0100)]
upgrade/hammer-x: wait for osdmaps to propagate

Fixes: http://tracker.ceph.com/issues/17808
Signed-off-by: Loic Dachary <loic@dachary.org>
8 years agoceph-create-keys: add missing argument comma 11822/head
Patrick Donnelly [Sun, 18 Sep 2016 20:26:29 +0000 (16:26 -0400)]
ceph-create-keys: add missing argument comma

The arguments "get" and "client.admin" were being concatenated into
"getclient.admin".

Found using ceph-ansible + strace:

    13031 execve("/usr/bin/ceph", ["ceph", "--cluster=ceph", "--name=mon.", "--keyring=/var/lib/ceph/mon/ceph-ceph-mon0/keyring", "auth", "getclient.admin"], ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin", "LANG=en_US.UTF-8", "CLUSTER=ceph", "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728", "CEPH_AUTO_RESTART_ON_UPGRADE=no"] <unfinished ...>

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 482022233d845b75876b04ca23fb137281a9f6ab)
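
The bug class is easy to reproduce in Python, where adjacent string literals are concatenated at compile time. The argument list below is abbreviated from the strace output and is not the actual ceph-create-keys source; both function names are hypothetical.

```python
def build_args_buggy():
    return [
        "ceph",
        "auth",
        "get"            # missing comma: the next literal is glued onto this one
        "client.admin",
    ]


def build_args_fixed():
    return [
        "ceph",
        "auth",
        "get",
        "client.admin",
    ]
```

The buggy list has three elements, the last being "getclient.admin", while the fixed list has four. No syntax error is raised, which is why the problem only surfaced in the execve() arguments.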

8 years agoMerge pull request #1226 from dachary/wip-17734-jewel
Loic Dachary [Mon, 7 Nov 2016 13:41:16 +0000 (14:41 +0100)]
Merge pull request #1226 from dachary/wip-17734-jewel

releases/jewel: set require_jewel_osds

8 years agoMerge pull request #11679 from dachary/wip-17734-jewel
Loic Dachary [Mon, 7 Nov 2016 13:39:48 +0000 (14:39 +0100)]
Merge pull request #11679 from dachary/wip-17734-jewel

jewel: Upgrading 0.94.6 -> 0.94.9 saturating mon node networking

Reviewed-by: Kefu Chai <kchai@redhat.com>
8 years agoupgrade/hammer-x: set require_jewel_osds
Loic Dachary [Mon, 31 Oct 2016 17:44:19 +0000 (18:44 +0100)]
upgrade/hammer-x: set require_jewel_osds

And replace infernalis with jewel where relevant.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
8 years agorados/singleton-nomsgr/all: set require_jewel_osds
Loic Dachary [Mon, 31 Oct 2016 17:25:17 +0000 (18:25 +0100)]
rados/singleton-nomsgr/all: set require_jewel_osds

those tests do not exist anymore in master, no backport possible.

Signed-off-by: Loic Dachary <ldachary@redhat.com>
8 years agomon: expose require_jewel_osds flag to user 11679/head
xie xingguo [Sat, 21 May 2016 06:11:55 +0000 (14:11 +0800)]
mon: expose require_jewel_osds flag to user

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 83ffc2b761742d563777e50959faa6a6010edae0)

8 years agomon/OSDMonitor: encode OSDMap::Incremental with same features as OSDMap
Sage Weil [Fri, 21 Oct 2016 16:25:08 +0000 (12:25 -0400)]
mon/OSDMonitor: encode OSDMap::Incremental with same features as OSDMap

The Incremental encode stashes encode_features, which is
what we use later to reencode the updated OSDMap.  Use
the same features so that the encoding will match!

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 916ca6a0aaa32bd9c2b449e0d7fbd312c29f06e5)

8 years agomon/OSDMonitor: health warn if require_{jewel,kraken} flags aren't set
Sage Weil [Thu, 13 Oct 2016 16:16:40 +0000 (12:16 -0400)]
mon/OSDMonitor: health warn if require_{jewel,kraken} flags aren't set

We want to prompt users to set these flags as soon as their
upgrades complete.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 12e508313dbd5d1d38c76859cb7de2ce22404e12)

Conflicts:
   src/mon/OSDMonitor.cc: remove references to kraken

    if ((osdmap.get_up_osd_features() & CEPH_FEATURE_SERVER_KRAKEN) &&
        !osdmap.test_flag(CEPH_OSDMAP_REQUIRE_KRAKEN)) {
      string msg = "all OSDs are running kraken or later but the"
                   " 'require_kraken_osds' osdmap flag is not set";
      summary.push_back(make_pair(HEALTH_WARN, msg));
      if (detail) {
        detail->push_back(make_pair(HEALTH_WARN, msg));
      }
    } else

8 years agomon/OSDMonitor: encode canonical full osdmap based on osdmap flags
Sage Weil [Fri, 30 Sep 2016 22:02:39 +0000 (18:02 -0400)]
mon/OSDMonitor: encode canonical full osdmap based on osdmap flags

If the JEWEL or KRAKEN flags aren't set, encode the full map without
those features.  This ensures that older OSDs in the cluster will be able
to correctly encode the full map with a matching CRC.  At least, that is
true as long as the encoding changes are guarded by those feature bits.
That appears to be true currently, and we plan to ensure that it is true
in the future as well.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 5e0daf6642011bf1222c4dc20aa284966fa5df9f)

Conflicts:
   src/mon/OSDMonitor.cc: removed reference to kraken

    if (!tmp.test_flag(CEPH_OSDMAP_REQUIRE_KRAKEN)) {
      dout(10) << __func__ << " encoding without feature SERVER_KRAKEN" << dendl;
      features &= ~CEPH_FEATURE_SERVER_KRAKEN;
    }
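
The flag-based masking in this commit can be sketched in Python as follows. The bit values and names are hypothetical stand-ins; the real CEPH_FEATURE_* and CEPH_OSDMAP_* constants differ.

```python
# Hypothetical bit positions, chosen only for illustration.
FEATURE_SERVER_JEWEL = 1 << 0
FEATURE_SERVER_KRAKEN = 1 << 1
FLAG_REQUIRE_JEWEL = 1 << 0
FLAG_REQUIRE_KRAKEN = 1 << 1


def canonical_encode_features(all_features, osdmap_flags):
    """Drop feature bits whose matching require_* flag is not set."""
    features = all_features
    if not osdmap_flags & FLAG_REQUIRE_JEWEL:
        features &= ~FEATURE_SERVER_JEWEL
    if not osdmap_flags & FLAG_REQUIRE_KRAKEN:
        features &= ~FEATURE_SERVER_KRAKEN
    return features
```

Because every encoder derives the feature set from the same osdmap flags, old and new daemons should produce byte-identical maps and therefore matching CRCs, provided the encoding changes are guarded by those bits.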

8 years agoMerge pull request #11742 from tchaikov/wip-17728-jewel
Loic Dachary [Fri, 4 Nov 2016 14:31:05 +0000 (15:31 +0100)]
Merge pull request #11742 from tchaikov/wip-17728-jewel

jewel: test/ceph_test_msgr: do not use Message::middle for holding transient…

Reviewed-by: Loic Dachary <ldachary@redhat.com>
8 years agoMerge pull request #11746 from liewegas/wip-post-file-key-jewel
Loic Dachary [Thu, 3 Nov 2016 14:54:28 +0000 (15:54 +0100)]
Merge pull request #11746 from liewegas/wip-post-file-key-jewel

jewel: ceph-post-file: use new ssh key

Reviewed-by: Loic Dachary <ldachary@redhat.com>
Reviewed-by: Brad Hubbard <bhubbard@redhat.com>
8 years agoceph-post-file: Ignore keys offered by ssh-agent 11746/head
David Galloway [Fri, 19 Aug 2016 20:11:32 +0000 (16:11 -0400)]
ceph-post-file: Ignore keys offered by ssh-agent

In my case, I had multiple private keys in ssh-agent which resulted in
the sftp connection failing despite explicitly specifying the private
key to use.

Signed-off-by: David Galloway <dgallowa@redhat.com>
(cherry picked from commit a61fcb2eac35a149b49efdc9b2ffa675afb968e8)

8 years agoceph-post-file: migrate to RSA SSH keys
Sage Weil [Wed, 2 Nov 2016 13:37:41 +0000 (09:37 -0400)]
ceph-post-file: migrate to RSA SSH keys

DSA keys are being deprecated: http://www.openssh.com/legacy.html

drop.ceph.com will continue to allow the old DSA key but eventually,
users submitting logs using ceph-post-file will run into issues when
OpenSSH completely drops support for the algorithm.

Fixes: http://tracker.ceph.com/issues/14267
Signed-off-by: David Galloway <dgallowa@redhat.com>
(cherry picked from commit ecd02bf3f1c7a07a3271b2736a9e12dd6e897821)

# Conflicts:
# src/CMakeLists.txt

8 years agomsg: adjust byte_throttler from Message::encode 11742/head
Sage Weil [Sun, 23 Oct 2016 23:40:57 +0000 (18:40 -0500)]
msg: adjust byte_throttler from Message::encode

Normally we never call encode on a message that has a byte_throttler set
because we only use it for messages we received.  However, for forwarded
messages that we clear_payload() before resending, we *do* reencode, and in
that case we need to retake the appropriate number of bytes from the
throttler--just like we release them in clear_payload().

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit a9651282f7c16df872757b82d3d2995d92458d5c)

8 years agomsg/Message: fix set_middle vs throttler
Sage Weil [Sun, 23 Oct 2016 23:10:00 +0000 (18:10 -0500)]
msg/Message: fix set_middle vs throttler

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit e7bf50b27a495ed75def67025d1ceca83861ba35)

8 years agomessages/MForward: reencode forwarded message if target has differing features
Sage Weil [Sat, 22 Oct 2016 18:01:34 +0000 (14:01 -0400)]
messages/MForward: reencode forwarded message if target has differing features

This ensures we reencode the payload with the
appropriate set of features if the client, us, or the
target do not have identical features.  Otherwise we
may forward an encoding with more features than the
target can handle.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit a433455e59067a844c3df4a0d6080db2ceb4ec59)

8 years agomessages/MForward: fix encoding features
Sage Weil [Wed, 28 Sep 2016 15:44:28 +0000 (11:44 -0400)]
messages/MForward: fix encoding features

We were encoding the message with the sending client's
features, which makes no sense: we need to encode with
the recipient's features so that it can decode the
message.

The simplest way to fix this is to rip out the bizarre
msg_bl handling code and simply keep a decoded Message
reference, and encode it when we send.

We encode the encapsulated message with the intersection
of the target mon's features and the sending client's
features.  This probably doesn't matter, but it's
conceivable that there is some feature-dependent
behavior in the message encode/decode that is important.

Fixes: http://tracker.ceph.com/issues/17365
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit d4f5e88f36e5388ae9e062c4bc49ac1c684a3f3c)
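
A rough Python sketch of the approach: keep the decoded message and encode it only at send time, using the intersection of the target's and the client's features. The classes, the feature bit, and the string encoding are all hypothetical and far simpler than the real Message and MForward types.

```python
FEATURE_NEW_FIELD = 1 << 0  # hypothetical feature bit


class Message:
    def __init__(self, body, extra):
        self.body = body
        self.extra = extra

    def encode(self, features):
        # Only emit the newer field when the feature set allows it.
        fields = [self.body]
        if features & FEATURE_NEW_FIELD:
            fields.append(self.extra)
        return "|".join(fields)


class Forward:
    def __init__(self, msg, client_features):
        self.msg = msg                      # decoded message, not raw bytes
        self.client_features = client_features

    def encode_for(self, target_features):
        # Encode at send time with features both sides understand.
        return self.msg.encode(target_features & self.client_features)
```

Holding the decoded message instead of pre-encoded bytes means the wire format is chosen once the recipient is known, so a target that lacks a feature is not sent an encoding it cannot decode.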

8 years agoall: add const to operator<< param
Michal Jarzabek [Sat, 4 Jun 2016 22:24:06 +0000 (23:24 +0100)]
all: add const to operator<< param

Signed-off-by: Michal Jarzabek <stiopa@gmail.com>
(cherry picked from commit 0a157e088b2e5eb66177421f19f559ca427240eb)

8 years agotest/ceph_test_msgr: do not use Message::middle for holding transient data
Kefu Chai [Fri, 28 Oct 2016 17:54:58 +0000 (01:54 +0800)]
test/ceph_test_msgr: do not use Message::middle for holding transient data

Message::middle is used for holding encoded data, so we cannot stuff
it with payload and leave the "payload" field empty. this change
refactors the ceph_test_msgr by introducing a Payload class which
encodes all test data in it.

Fixes: http://tracker.ceph.com/issues/17728
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 56896a7ed20869ce91ade4c77c1d6cbab8d50de1)
Conflicts:
src/test/msgr/test_msgr.cc: do not use the new-style DENC()
framework for implementing the encoder of Payload class. DENC() was
introduced after jewel was released.