Adam C. Emerson [Fri, 17 Nov 2017 20:51:42 +0000 (15:51 -0500)]
rgw: Add retry_raced_bucket_write
If the OSD informs us that our bucket info is out of date when we need
to write, we should have a way to update it.
This template function allows us to wrap relevant sections of code so
they'll be retried against new bucket info on -ECANCELED.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 1a3fcc70c0747791aa423cd0aa7d2596eaf3d73c)
Fixes: http://tracker.ceph.com/issues/22517
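The retry-on-race pattern described in this commit can be sketched as follows. This is a Python illustration of the idea only, not the C++ template itself: the callables `write` and `refresh` and the `max_retries` bound are hypothetical names chosen for the example.

```python
import errno


def retry_raced_bucket_write(write, refresh, max_retries=15):
    """Run write(); while it reports -ECANCELED, refresh and try again.

    write and refresh are hypothetical callables that return 0 (or a
    positive value) on success and a negative errno on failure.
    max_retries is an assumed bound so a persistent race cannot loop forever.
    """
    r = write()
    attempts = 0
    while r == -errno.ECANCELED and attempts < max_retries:
        attempts += 1
        # We were raced: fetch fresh bucket info before writing again.
        r = refresh()
        if r >= 0:
            r = write()
    return r
```

Any error other than -ECANCELED, from either the write or the refresh, ends the loop and is returned to the caller unchanged.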
Adam C. Emerson [Thu, 16 Nov 2017 19:42:58 +0000 (14:42 -0500)]
rgw: Add try_refresh_bucket_info function
Sometimes operations fail with -ECANCELED. This means we got raced. If
this happens we should update our bucket info from cache and try again.
Some user reports suggest that our cache may be getting and staying
out of sync. This is a bug and should be fixed, but it would also be
nice if we were robust enough to notice the problem and refresh.
So in that case, we invalidate the cache and fetch direct from the
OSD, putting a warning in the log.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 9114e5e50995f0c7d2be5c24aa4712d89cd89f48)
Fixes: http://tracker.ceph.com/issues/22517
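A rough sketch of the refresh logic described above, in Python for brevity. The commit message does not say how a stale cache is detected; comparing a version number is an assumption made for this example, and the dict-based `cache`, the `fetch_from_osd` callable, and the `version` field are all hypothetical.

```python
import logging

log = logging.getLogger("rgw-sketch")


def try_refresh_bucket_info(bucket, current_version, cache, fetch_from_osd):
    """Return bucket info newer than current_version.

    cache is a hypothetical dict mapping bucket name to an info dict with a
    "version" key; fetch_from_osd is a hypothetical callable that reads the
    authoritative copy.
    """
    info = cache.get(bucket)
    if info is not None and info["version"] != current_version:
        # The cache already holds something newer than what we raced with.
        return info

    # Assumed staleness test: the cache gave us nothing new, so distrust it.
    log.warning("cache for bucket %s looks out of sync; refetching", bucket)
    cache.pop(bucket, None)          # invalidate the cached entry
    info = fetch_from_osd(bucket)    # go straight to the backing store
    cache[bucket] = info
    return info
```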
Kefu Chai [Fri, 24 Nov 2017 05:56:02 +0000 (13:56 +0800)]
make-dist: exclude unused bits in boost
the docs, examples and tests are not used, so drop them. we could go
further by removing unused components in boost, but that'd be an issue
if somebody adds a component in CMakeLists but forgets to update this
script. also, we would need to remove boost/$component and lib/$component
to achieve this goal, which introduces extra complexity. so leave it
for another change.
Kefu Chai [Fri, 22 Dec 2017 14:42:16 +0000 (22:42 +0800)]
install-deps.sh: update g++ symlink also
we need to update g++ symlink also, if it points to the wrong version
http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master. as master builds
boost from source. so, it's not cherry-picked from master.
(cherry picked from commit 248a157635b46d3cf23e37ae263c62b0dc4e0e59)
Kefu Chai [Wed, 13 Dec 2017 05:36:54 +0000 (13:36 +0800)]
install-deps.sh: point gcc to the one shipped by distro
to define a struct in a method is legal in C++11, but it causes internal
compiler error due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82155
if we are using GCC-7. so we need to either work around it in our source
code by moving the struct definition out of the member method or revert
to a GCC without this bug. but if we go with the first route, the jewel
build still fails, because GCC-7 starts to use the new CXX11 ABI, which
is not compatible with the libboost we use in jewel. the libboost was
still built with the old ABI for backward compatibility. so let's just
fix the install-deps.sh to point gcc to the original one.
See: http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master. as master builds
boost from source. so, it's not cherry-picked from master.
(cherry picked from commit ccc4dea90e483ea8bf6bee0721ef929e7f48ff5a)
Deprecation warnings for ceph-disk will no longer be present in any
Luminous release beyond 12.2.2 - but are still present in master and any
newer release.
this is a follow-up of #19328. we need to get this change into 12.2.3.
so we are better off doing the switch somewhere after 12.2.2, which has
been tagged, and before 12.2.3, which is not tagged yet.
please note, this is not targeting master, because i want to make
sure the change number (the <num> in << 12.2.2-<num>) is correct. it
does not hurt if it's not, as long as it is ">> 12.2.2", so the replace
machinery in 12.2.3 works, and it covers the releases where the
ceph-{osdomap,kvstore,monstore}-tool are not moved yet. but why not
make it more right?
d3ac8d18 moves ceph-client-debug from ceph-test to ceph-base without
updating the package relationships between the two involved packages.
which results in:
dpkg: error processing archive /var/cache/apt/archives/ceph-test_12.2.1-241-g43e027b-1trusty_amd64.deb (--unpack):
trying to overwrite '/usr/bin/ceph-client-debug', which is also in package ceph-base 10.2.10-14-gcbaddae-1trusty
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
dpkg: error processing archive /var/cache/apt/archives/ceph-osd_13.0.0-2201-g6cc0b41-1trusty_amd64.deb (--unpack):
trying to overwrite '/usr/bin/ceph-osdomap-tool', which is also in package ceph-test 10.2.10-14-gcbaddae-1trusty
in 40caf6a6, we moved some tools from ceph-test out into ceph-osd,
ceph-mon and ceph-base respectively. but didn't update the relationships
between these packages accordingly. this causes the upgrade failure.
see https://www.debian.org/doc/debian-policy/#document-ch-relationships
for more details on "Breaks" and "Conflicts".
the reason why the package version to be replaced/conflicted is 12.2.2
is that: i assume that this change will be backported to luminous, and
the next release of it will be 12.2.2.
Song Shun [Tue, 28 Nov 2017 03:28:43 +0000 (11:28 +0800)]
ceph-disk: fix signed integer is greater than maximum when call major
fix the "signed integer is greater than maximum" error when calling
os.major using python 2.7.5 on CentOS 7
Sage Weil [Wed, 29 Nov 2017 21:20:59 +0000 (15:20 -0600)]
mon/Monitor: fix statfs handling before luminous switchover happens
After the mons are luminous but before we switch over to using the
MgrStatMonitor's new info, the version on mgrstat will generally be much lower
than that of pgmon, and the client will send that version with the
request. This means that the statfs message will perpetually appear to be
in the future and fail the is_readable() check.
Fix this with an ugly hack that resets the version to 1 if we haven't
completed the luminous upgrade yet.
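The problem and the workaround can be illustrated with a small model. This is a simplified Python sketch, not the monitor code: `is_readable` here is an assumed reduction of the real check to "the requested version must not be ahead of the service's version", and both function names and the sample version numbers are hypothetical.

```python
def is_readable(service_version, requested_version):
    """Assumed simplification: a request is readable only if the version it
    asks for is not ahead of what the service has committed."""
    return requested_version <= service_version


def statfs_request_version(requested_version, luminous_upgrade_complete):
    """Model of the workaround: before the luminous switchover completes,
    ignore the (pgmon-derived) version sent by the client and use 1."""
    if not luminous_upgrade_complete:
        return 1
    return requested_version


# Example: mgrstat is at version 5 while the client learned version 1000
# from pgmon. Without the reset the request looks like it is in the future.
mgrstat_version = 5
client_version = 1000
stuck = not is_readable(mgrstat_version, client_version)
fixed = is_readable(mgrstat_version,
                    statfs_request_version(client_version, False))
```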
Kefu Chai [Tue, 28 Nov 2017 06:42:31 +0000 (14:42 +0800)]
qa/ceph-disk: enlarge the simulated SCSI disk
100MB will be allocated for journal, and the remaining 100MB is for data
device. taking the inode into consideration, there will be approximately
87988 kB available for the activated OSD. and it will complain with a
"nearfull" state.
Sage Weil [Mon, 27 Nov 2017 16:28:16 +0000 (10:28 -0600)]
qa/suites/upgrade/jewel-x/point-to-point: skip ec tests when mons may be old
Early point release mons don't handle legacy ruleset-* ec profiles, new
ones do. Skip the ec tests that may trigger this when we are doing a
workload that races with mon upgrades.
Nathan Cutler [Tue, 21 Nov 2017 10:36:02 +0000 (11:36 +0100)]
tests: ceph-disk: ignore E722 in flake8 test
Very old, and very new, versions of flake8 treat E722 as an error:
flake8 runtests: commands[0] | flake8 --ignore=H105,H405,E127 ceph_disk tests
ceph_disk/main.py:1575:9: E722 do not use bare except'
ceph_disk/main.py:1582:9: E722 do not use bare except'
ceph_disk/main.py:3252:5: E722 do not use bare except'
ceph_disk/main.py:3288:21: E722 do not use bare except'
ceph_disk/main.py:3296:17: E722 do not use bare except'
ceph_disk/main.py:4358:5: E722 do not use bare except'
tests/test_main.py:26:1: E722 do not use bare except'
ERROR: InvocationError: '/opt/j/ws/mkck/src/ceph-disk/.tox/flake8/bin/flake8 --ignore=H105,H405,E127 ceph_disk tests'
Kefu Chai [Tue, 21 Nov 2017 13:47:30 +0000 (21:47 +0800)]
qa/workunits: silence py warnings for ceph-disk tests
ceph-disk now prints a "deprecated" warning message when it starts. but
the tests parse its stdout and stderr for a json string. so we need to
silence the warnings for the tests.
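To show why a warning breaks JSON parsing of the combined output, here is a small self-contained Python sketch. It does not run ceph-disk: a stand-in child script emits a warning and then a JSON document, and the `PYTHONWARNINGS=ignore` environment variable (a standard Python interpreter setting) is used to silence it. Whether the actual change uses this variable is an assumption; `CHILD` and `run_tool` are hypothetical names.

```python
import json
import os
import subprocess
import sys

# Stand-in for a tool that warns on startup and then prints JSON.
CHILD = (
    "import json, warnings\n"
    "warnings.warn('this tool is deprecated', UserWarning)\n"
    "print(json.dumps({'state': 'ok'}))\n"
)


def run_tool(silence):
    """Run the stand-in tool and return stdout and stderr merged together."""
    env = dict(os.environ)
    env.pop("PYTHONWARNINGS", None)
    if silence:
        # Tell the child interpreter to drop all warnings.
        env["PYTHONWARNINGS"] = "ignore"
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def parses_as_json(text):
    """Return True if the whole text is a single JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False
```

With `silence=False` the merged output contains the warning line as well as the JSON, so parsing the whole text fails; with `silence=True` only the JSON remains.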
Sage Weil [Wed, 15 Nov 2017 14:55:33 +0000 (08:55 -0600)]
mon/OSDMonitor: add option to fix up ruleset-* to crush-* for ec profiles
The jewel->luminous upgrade test will fail if we finish the upgrade while
a workload setting old-style ec profiles is running. Add option to
automatically fix them up. Warn to the cluster log when this happens.
For now, enable this option to ease upgrades and whitelist the warning.
Only include this option in luminous so that we implicitly sunset this
compatibility kludge immediately.
Fixes: http://tracker.ceph.com/issues/22128
Signed-off-by: Sage Weil <sage@redhat.com>
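The fix-up described in this commit amounts to renaming profile keys. A minimal Python sketch of that idea follows. It is not the OSDMonitor code: the function name is hypothetical, the profile is modeled as a plain dict, and letting an existing crush-* key win over a legacy ruleset-* key is a design choice made for this example.

```python
def fix_up_ec_profile(profile, old_prefix="ruleset-", new_prefix="crush-"):
    """Return (fixed_profile, renamed_keys) for an erasure-code profile dict.

    Keys starting with old_prefix are rewritten to start with new_prefix.
    renamed_keys lists the legacy keys that were found, so the caller can
    emit a warning when the list is not empty.
    """
    # Copy the keys that are already in the new style (or unrelated).
    fixed = {k: v for k, v in profile.items() if not k.startswith(old_prefix)}
    renamed = []
    for key, value in profile.items():
        if key.startswith(old_prefix):
            new_key = new_prefix + key[len(old_prefix):]
            # Assumed policy: do not overwrite an explicit new-style key.
            fixed.setdefault(new_key, value)
            renamed.append(key)
    return fixed, renamed
```

For example, a profile containing `ruleset-failure-domain` would come back with `crush-failure-domain` instead, and the non-empty `renamed` list signals that a warning should be logged.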