Huan Zhang [Fri, 24 Jun 2016 03:27:53 +0000 (11:27 +0800)]
rbd discard return -EINVAL if len > MAX_INT32
rbd discard uses 'int' to return the discarded length, but the 'len' the
user passes is 'uint64'. In some cases the return value is truncated
and becomes negative, which means the discard failed. Return -EINVAL
if len > MAX_INT32 to indicate that only len <= MAX_INT32 is supported.
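The guard described above can be sketched as follows. This is a minimal illustration, not the librbd code: `checked_discard_len` and `demo_checked_discard_len` are hypothetical names, and the sketch assumes `int` is at least 32 bits wide.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a discard call that takes a 64-bit length
// but reports the discarded length through an `int` return value.
int checked_discard_len(uint64_t len) {
  // Reject lengths that cannot be represented in the return value.
  if (len > static_cast<uint64_t>(std::numeric_limits<int32_t>::max())) {
    return -EINVAL;
  }
  // Safe: len fits in a non-negative 32-bit int.
  return static_cast<int>(len);
}

// Usage sketch: the largest accepted length is INT32_MAX.
void demo_checked_discard_len() {
  assert(checked_discard_len(4096) == 4096);
  assert(checked_discard_len(2147483647ULL) == 2147483647);
  assert(checked_discard_len(2147483648ULL) == -EINVAL);
}
```

Without the check, a length above INT32_MAX would be narrowed into the `int` result and could come back negative, which callers read as an error code.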
Jason Dillaman [Wed, 15 Nov 2017 14:09:15 +0000 (09:09 -0500)]
librbd: prevent overflow of discard API result code
Prevent discard/writesame lengths larger than 2GB.
Fixes: http://tracker.ceph.com/issues/21966
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3effd324db181e625665be33b5c6529dca723cc5)
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Conflicts:
PendingReleaseNotes (adapted for jewel)
src/librbd/librbd.cc (no writesame in jewel)
Li Wang [Wed, 1 Nov 2017 09:21:29 +0000 (09:21 +0000)]
rbd-nbd: fix unused nbd device search bug in container
In some container scenarios, the host may choose to map a
specific nbd device, for example /dev/nbd6, into the
container. In that case, the nbd devices available in the
container are not numbered from 0, and the current unused
nbd device search function returns no result.
This patch fixes it.
Fixes: http://tracker.ceph.com/issues/22012
Signed-off-by: Li Wang <laurence.liwang@gmail.com>
Reviewed-by: Yunchuan Wen <yunchuan.wen@kylin-cloud.com>
(cherry picked from commit be0f9581f9727187ca03232e0b368e7da7a60609)
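One way to picture the search problem described in this commit, as a sketch only: the function below walks the device indices that actually exist instead of assuming they start at 0. `find_unused_nbd`, `present`, and `in_use` are hypothetical names; how rbd-nbd really enumerates devices is not shown in this log.

```cpp
#include <cassert>
#include <set>

// Returns the lowest device index that exists in the container and is not
// in use, or -1 if none is free. Both sets are hypothetical inputs standing
// in for whatever the real tool reads from the system.
int find_unused_nbd(const std::set<int>& present, const std::set<int>& in_use) {
  for (int index : present) {  // iterate only over devices that exist
    if (in_use.count(index) == 0) {
      return index;
    }
  }
  return -1;
}

// Usage sketch: only /dev/nbd6 and /dev/nbd7 are mapped into the container.
void demo_find_unused_nbd() {
  const std::set<int> present = {6, 7};
  assert(find_unused_nbd(present, std::set<int>()) == 6);
  assert(find_unused_nbd(present, std::set<int>{6}) == 7);
  assert(find_unused_nbd(present, std::set<int>{6, 7}) == -1);
}
```

A search that starts at index 0 and stops at the first missing device would find nothing here, which matches the failure the commit message describes.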
Jason Dillaman [Fri, 27 Oct 2017 20:45:54 +0000 (16:45 -0400)]
cls/journal: ensure tags are properly expired
Previously, if only the local image was using the journal or if
a disconnected peer was attached, the tag entries could not be
expired even if unreferenced.
Fixes: http://tracker.ceph.com/issues/21960
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 19fa1c7f5b2809e9a223b7b196dfc031e97a5dcd)
Conflicts:
src/tools/ceph_objectstore_tool.cc (in jewel, ::encode() takes only two
arguments, while in luminous/master it takes a third which we omit
here)
Yan, Zheng [Thu, 11 Jan 2018 09:50:22 +0000 (17:50 +0800)]
client: fix cap revoke race
If caps are being revoked by the auth MDS, don't consider them as
issued even if they are still issued by a non-auth MDS. The non-auth
MDS should also be revoking/exporting these caps; the client just
hasn't received the cap revoke/export message yet.
The race I encountered is: when caps are being exported to a new MDS,
the client receives the cap import message and the cap revoke message
from the new MDS, then receives the cap export message from the old
MDS. When the client receives the cap revoke message from the new MDS,
the caps being revoked are still issued by the old MDS, so the client
does nothing. Later, when the cap export message is received, the
client removes the caps issued by the old MDS. (Another way to fix the
race is calling ceph_check_caps() in handle_cap_export())
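The rule in the first paragraph can be sketched with bitmasks. This is a simplified model under stated assumptions, not the client code: the `Cap` struct, its two fields, and `caps_issued` are hypothetical, and "being revoked" is modelled as bits that are still implemented but no longer issued by the auth MDS.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified per-MDS capability record.
struct Cap {
  unsigned issued;       // caps this MDS currently grants
  unsigned implemented;  // caps still held, including ones being revoked
};

// Union of issued caps across all MDSs, minus any caps that the auth MDS
// is in the middle of revoking.
unsigned caps_issued(const Cap& auth, const std::vector<Cap>& non_auth) {
  unsigned c = auth.issued;
  for (const Cap& cap : non_auth) {
    c |= cap.issued;
  }
  const unsigned revoking = auth.implemented & ~auth.issued;
  return c & ~revoking;
}

// Usage sketch: the auth MDS is revoking bit 0x2 while a non-auth MDS
// still reports it as issued.
void demo_caps_issued() {
  Cap auth = {0x1u, 0x3u};
  std::vector<Cap> non_auth;
  Cap old_mds = {0x2u, 0x2u};
  non_auth.push_back(old_mds);
  assert(caps_issued(auth, non_auth) == 0x1u);
}
```

In this model the stale grant from the non-auth MDS no longer hides the revoke, so the client can act on it without waiting for the export message.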
In cls_timeindex_list(), even though `to_index` has expired for a timespan, the marker is set for a subsequent index during the time boundary check.
This marker is then returned to RGWObjectExpirer::process_single_shard(), where this out_marker is trimmed from the respective shard,
resulting in a lost removal hint and a leaked object.
Jason Dillaman [Wed, 27 Sep 2017 13:40:08 +0000 (09:40 -0400)]
librbd: hold cache_lock while clearing cache nonexistence flags
When transitioning from a snapshot that had an associated parent
to a snapshot where the parent was flattened and removed, the cache
was being referenced without holding the required lock.
Fixes: http://tracker.ceph.com/issues/21558
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 16ef97830cde30efb96f7aee69834b3a5c2d5248)
Adam C. Emerson [Wed, 20 Dec 2017 22:06:32 +0000 (17:06 -0500)]
rgw: Plumb refresh logic into object cache
Now when we force a refetch of bucket info it will actually go to the
OSD rather than simply using the objects in the object cache.
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit d997f657750faf920170843e62deacab70008d8b)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Tue, 19 Dec 2017 21:47:09 +0000 (16:47 -0500)]
rgw: Add expiration in the object cache
We had it in the chained caches, but it doesn't do much good if
they just fetch objects out of the object cache.
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 82a7e6ca31b416a7f0e41b5fda4c403d1d6be947)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Tue, 19 Dec 2017 17:53:05 +0000 (12:53 -0500)]
rgw: retry CORS put/delete operations on ECANCELLED
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit bff7e61ca5a66b301ec49c1cf9054d1b74535832)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Fri, 17 Nov 2017 22:15:26 +0000 (17:15 -0500)]
rgw: Expire entries in bucket info cache
To bound the degree to which an RGW instance can go out to lunch if
the watch/notify breaks down, force refresh of any cache entry over a
certain age.
Fifteen minutes by default, and expiration can be turned off entirely.
This is separate from the LRU. The LRU removes entries based on the
last time of access. This expiration patch forces refresh based on the
last time they were updated.
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 4489cb58a15647a31ac0546d70400af5668404cb)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
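The difference between this expiration and an LRU can be sketched as below. This is an illustration only: `ExpiringCache` and its methods are hypothetical names, time is passed in explicitly so the behaviour is easy to check, and an expiry of zero stands in for "expiration turned off".

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

typedef std::chrono::steady_clock Clock;

// Hypothetical cache whose entries expire by time since last update,
// regardless of how recently they were read.
class ExpiringCache {
 public:
  // An expiry of zero disables expiration.
  explicit ExpiringCache(std::chrono::seconds expiry) : expiry_(expiry) {}

  void put(const std::string& key, const std::string& value,
           Clock::time_point now) {
    Entry e = {value, now};
    entries_[key] = e;
  }

  // Returns true and fills *out if the entry exists and is fresh enough.
  // A false result means the caller should refetch from the source.
  bool get(const std::string& key, Clock::time_point now,
           std::string* out) const {
    std::map<std::string, Entry>::const_iterator it = entries_.find(key);
    if (it == entries_.end()) return false;
    if (expiry_.count() > 0 && now - it->second.updated > expiry_) return false;
    *out = it->second.value;
    return true;
  }

 private:
  struct Entry {
    std::string value;
    Clock::time_point updated;
  };
  std::chrono::seconds expiry_;
  std::map<std::string, Entry> entries_;
};

// Usage sketch with the fifteen-minute default mentioned above.
void demo_expiring_cache() {
  const Clock::time_point t0 = Clock::now();
  ExpiringCache cache(std::chrono::minutes(15));
  cache.put("bucket", "info", t0);
  std::string v;
  assert(cache.get("bucket", t0 + std::chrono::minutes(14), &v));
  assert(v == "info");
  // Reads do not extend the lifetime: only the update time counts.
  assert(!cache.get("bucket", t0 + std::chrono::minutes(16), &v));
}
```

An LRU would have kept the entry alive because it was read at minute 14; here the read has no effect on when the entry goes stale.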
Adam C. Emerson [Fri, 17 Nov 2017 21:05:06 +0000 (16:05 -0500)]
rgw: Handle stale bucket info in RGWDeleteBucketWebsite
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit f4d274248e43cb38ff2b27782c010b2c35b12b2b)
Adam C. Emerson [Fri, 17 Nov 2017 21:03:13 +0000 (16:03 -0500)]
rgw: Handle stale bucket info in RGWSetBucketWebsite
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit b2b7385f194def1025a8947bab876c9856b06400)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Fri, 17 Nov 2017 20:59:44 +0000 (15:59 -0500)]
rgw: Handle stale bucket info in RGWSetBucketVersioning
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit a0a1e7c2ef992b8758bcfb20d893730c1b202475)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Fri, 17 Nov 2017 20:53:05 +0000 (15:53 -0500)]
rgw: Handle stale bucket info in RGWPutMetadataBucket
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit ebb86301b20098e15824f469001f6153b27965f5)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Adam C. Emerson [Fri, 17 Nov 2017 20:51:42 +0000 (15:51 -0500)]
rgw: Add retry_raced_bucket_write
If the OSD informs us that our bucket info is out of date when we need
to write, we should have a way to update it.
This template function allows us to wrap relevant sections of code so
they'll be retried against new bucket info on -ECANCELED.
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 1a3fcc70c0747791aa423cd0aa7d2596eaf3d73c)
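The wrapping idea in this commit can be sketched as a small template. This is not the rgw implementation: `retry_raced_write`, the `refresh` callback, and the explicit `max_retries` parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: run `op`; while it reports -ECANCELED, call
// `refresh` to reload state and then retry, at most `max_retries` times.
template <typename Op, typename Refresh>
int retry_raced_write(Op op, Refresh refresh, int max_retries) {
  int r = op();
  for (int i = 0; i < max_retries && r == -ECANCELED; ++i) {
    r = refresh();
    if (r >= 0) {
      r = op();
    }
  }
  return r;
}

// Usage sketch: the write loses the race twice, then succeeds.
void demo_retry_raced_write() {
  int attempts = 0;
  int refreshes = 0;
  auto op = [&attempts]() -> int { return ++attempts < 3 ? -ECANCELED : 0; };
  auto refresh = [&refreshes]() -> int { ++refreshes; return 0; };
  assert(retry_raced_write(op, refresh, 5) == 0);
  assert(attempts == 3);
  assert(refreshes == 2);
}
```

Any error other than -ECANCELED, including one returned by the refresh step, ends the loop and is passed back to the caller unchanged.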
Adam C. Emerson [Thu, 16 Nov 2017 19:42:58 +0000 (14:42 -0500)]
rgw: Add try_refresh_bucket_info function
Sometimes operations fail with -ECANCELED. This means we got raced. If
this happens we should update our bucket info from cache and try again.
Some user reports suggest that our cache may be getting and staying
out of sync. This is a bug and should be fixed, but it would also be
nice if we were robust enough to notice the problem and refresh.
So in that case, we invalidate the cache and fetch direct from the
OSD, putting a warning in the log.
Fixes: http://tracker.ceph.com/issues/22517
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 9114e5e50995f0c7d2be5c24aa4712d89cd89f48)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Marcus Watts [Tue, 2 Jan 2018 22:24:38 +0000 (17:24 -0500)]
radosgw: fix doubled underscore with s3/swift server-side copy
A name is almost an oid. Except that if a name starts
with a _, the corresponding oid starts with a double _.
The copy logic passes the oid straight in as a name,
which results in a triple _ on the oid, resulting in
an object name with a double __.
Fix: remove one _ when converting an oid back into
a name, so that the final oid only has a double _.
Fixes: http://tracker.ceph.com/issues/22529
This change is not needed on master or Luminous, due to
already-merged #18662.
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
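The underscore rule and the fix can be sketched as a pair of functions. This models only the leading-underscore behaviour described in the message; `name_to_oid` and `oid_to_name` are hypothetical names, and the real radosgw mapping may involve more than this.

```cpp
#include <cassert>
#include <string>

// A name that starts with '_' maps to an oid with one extra leading '_'.
std::string name_to_oid(const std::string& name) {
  if (!name.empty() && name[0] == '_') {
    return "_" + name;
  }
  return name;
}

// Inverse step: drop one '_' from an oid that starts with a double '_'.
std::string oid_to_name(const std::string& oid) {
  if (oid.size() >= 2 && oid[0] == '_' && oid[1] == '_') {
    return oid.substr(1);
  }
  return oid;
}

// Usage sketch: passing an oid where a name is expected adds a third '_';
// converting it back to a name first keeps the final oid at two.
void demo_oid_name() {
  const std::string oid = name_to_oid("_foo");
  assert(oid == "__foo");
  assert(name_to_oid(oid) == "___foo");               // the bug
  assert(name_to_oid(oid_to_name(oid)) == "__foo");   // the fix
}
```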
Kefu Chai [Fri, 22 Dec 2017 14:42:16 +0000 (22:42 +0800)]
install-deps.sh: update g++ symlink also
We also need to update the g++ symlink if it points to the wrong version.
http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master, as master builds
boost from source, so this is not cherry-picked from master.
Kefu Chai [Wed, 13 Dec 2017 05:36:54 +0000 (13:36 +0800)]
install-deps.sh: point gcc to the one shipped by distro
Defining a struct in a method is legal in C++11, but it causes an internal
compiler error due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82155
if we are using GCC-7. So we need to either work around it in our source
code by moving the struct definition out of the member method, or revert
to a GCC without this bug. But if we go the first route, the jewel
build still fails, because GCC-7 starts to use the new CXX11 ABI, which
is not compatible with the libboost we use in jewel. That libboost was
still built with the old ABI for backward compatibility. So let's just
fix install-deps.sh to point gcc to the original one.
See: http://tracker.ceph.com/issues/22220
Signed-off-by: Kefu Chai <kchai@redhat.com>
Conflicts: the libboost issue does not affect master, as master builds
boost from source, so this is not cherry-picked from master.
This is needed for jewel-x point to point upgrade because earlier point
releases can't handle our ec profiles with ruleset-* (later ones can) and
the test races with the mon upgrades.
Nathan Cutler [Tue, 21 Nov 2017 10:36:02 +0000 (11:36 +0100)]
tests: ceph-disk: ignore E722 in flake8 test
Very old, and very new, versions of flake8 treat E722 as an error:
flake8 runtests: commands[0] | flake8 --ignore=H105,H405,E127 ceph_disk tests
ceph_disk/main.py:1575:9: E722 do not use bare except'
ceph_disk/main.py:1582:9: E722 do not use bare except'
ceph_disk/main.py:3252:5: E722 do not use bare except'
ceph_disk/main.py:3288:21: E722 do not use bare except'
ceph_disk/main.py:3296:17: E722 do not use bare except'
ceph_disk/main.py:4358:5: E722 do not use bare except'
tests/test_main.py:26:1: E722 do not use bare except'
ERROR: InvocationError: '/opt/j/ws/mkck/src/ceph-disk/.tox/flake8/bin/flake8 --ignore=H105,H405,E127 ceph_disk tests'
David Zafman [Wed, 13 Sep 2017 00:17:13 +0000 (17:17 -0700)]
osd: Only scan for omap corruption once
Before
state 2: Can have complete tables (some may be bad)
state 3: Never had complete tables
After
state 2: Can have complete tables (some may be bad)
state 3 with legacy: Can have complete tables (bad ones are cleared)
state 3: Never had complete tables
Once OSDs boot with this change you can't downgrade to a previous release.
If someone does downgrade they could have unstable OSDs that hit assert(state.v < 3).
The following command run after shutting down the cluster but before downgrading
ceph packages would be a way to fix this.
Currently the iterator isn't advanced after the erase call, leading to a
second call on the same iterator, which crashes due to a double free.
Since C++11 the map::erase function returns an iterator pointing to the
next element. Use the return value to set the iterator after erasing.
Fixes: http://tracker.ceph.com/issues/21808
Signed-off-by: Peter Keresztes Schmidt <carbenium@outlook.com>
(cherry picked from commit 9e49c4124422e58dd40dfb6038425430d3845412)
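The erase pattern the message refers to looks like this in isolation. The container and predicate are made up for the example (`erase_empty` is a hypothetical name); only the use of the iterator returned by `std::map::erase` reflects the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Removes every entry whose value is empty and returns how many were removed.
std::size_t erase_empty(std::map<int, std::string>& m) {
  std::size_t removed = 0;
  for (std::map<int, std::string>::iterator it = m.begin(); it != m.end();) {
    if (it->second.empty()) {
      // erase() invalidates `it`; continue from the iterator it returns.
      it = m.erase(it);
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}

// Usage sketch: two adjacent entries are erased without touching a stale iterator.
void demo_erase_empty() {
  std::map<int, std::string> m;
  m[1] = "a";
  m[2] = "";
  m[3] = "";
  m[4] = "d";
  assert(erase_empty(m) == 2);
  assert(m.size() == 2);
  assert(m.count(2) == 0 && m.count(3) == 0);
}
```

The loop header has no increment on purpose: each branch advances the iterator exactly once, either through the return value of `erase` or through `++it`.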
Conflicts: remove bluestore.yaml, as jewel does not support it, and
remove links to objectstore from where the tests do not exist in jewel
yet, for instance qa/suites/mgr/basic.