J. Eric Ivancich [Fri, 12 Oct 2018 22:07:24 +0000 (18:07 -0400)]
rgw: failed resharding clears resharding status from shard heads
Previously, when resharding failed, we restored the shard status on
the bucket info object. However the status on each of the shards was
left indicating a reshard was underway. This prevented some write
operations from taking place, as they would wait for resharding to
complete. This adds the missing functionality. It also makes the
functionality available to other classes via static functions in
RGWBucketReshard.
J. Eric Ivancich [Fri, 12 Oct 2018 14:24:32 +0000 (10:24 -0400)]
rgw: change the bucket reshard lock to exclusive-ephemeral
The bucket reshard lock was simply an exclusive lock that existed on
an object solely for the purpose of representing the lock. This is now
changed to an exclusive-ephemeral lock, so as not to leave these objects
behind.
J. Eric Ivancich [Fri, 12 Oct 2018 14:23:57 +0000 (10:23 -0400)]
cls: add exclusive ephemeral locks that auto-clean
Add a new type of cls lock -- exclusive ephemeral -- for which the
object only exists to represent the lock and for which the object
should be deleted at unlock. This is to prevent the accumulation of
unneeded objects in the cluster by automatically cleaning them up.
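
A rough Python model of the intended semantics, not the cls_lock code itself. The in-memory `store` dict stands in for RADOS objects, and `LockType`, `lock`, and `unlock` are hypothetical names chosen for illustration:

```python
from enum import Enum


class LockType(Enum):
    EXCLUSIVE = 1
    EXCLUSIVE_EPHEMERAL = 2


# Hypothetical stand-in for the object store: object name -> lock record.
store = {}


def lock(obj, owner, lock_type):
    """Take the lock on obj, creating the object if it does not exist."""
    held = store.get(obj)
    if held is not None and held["owner"] not in (None, owner):
        raise RuntimeError("lock is held by " + held["owner"])
    store[obj] = {"owner": owner, "type": lock_type}


def unlock(obj, owner):
    """Release the lock; an ephemeral lock also removes its object."""
    held = store.get(obj)
    if held is None or held["owner"] != owner:
        raise RuntimeError("lock is not held by " + owner)
    if held["type"] is LockType.EXCLUSIVE_EPHEMERAL:
        del store[obj]        # nothing is left behind
    else:
        held["owner"] = None  # plain exclusive: the empty object remains
```

With a plain exclusive lock the object outlives the unlock; with the ephemeral type it is gone once the lock is released.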
J. Eric Ivancich [Thu, 27 Sep 2018 17:31:57 +0000 (13:31 -0400)]
rgw: renew resharding locks to prevent expiration
Fix lock expiration problem with resharding. The resharding process
will renew its bucket lock (and logshard lock if necessary) when half
the remaining time is left on the lock. If the lock has expired and
cannot be renewed, the process fails and errors out appropriately.
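
A minimal sketch of the renewal rule, assuming "half the remaining time" means the halfway point of the current lock period. `RenewingLock` and its fields are hypothetical and time is passed in explicitly; this is not the RGWBucketReshard code:

```python
class RenewingLock:
    """Hypothetical model of a timed lock that is renewed at its halfway point."""

    def __init__(self, duration, now):
        self.duration = duration
        self.expires = now + duration
        self.renew_at = now + duration / 2

    def renew_if_needed(self, now):
        """Return True if renewed; raise if the lock has already expired."""
        if now >= self.expires:
            # Too late: the caller must fail rather than continue unlocked.
            raise TimeoutError("lock expired before it could be renewed")
        if now < self.renew_at:
            return False
        self.expires = now + self.duration
        self.renew_at = now + self.duration / 2
        return True
```

A long-running loop would call renew_if_needed() between batches of work and abort on the exception.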
cls: add semantics for cls locks to require renewal without expiring
Add ability to *require* renewal of an existing lock in addition
to the existing ability to *allow* renewal of an existing lock. The key
difference is that a MUST_RENEW will fail if the lock has expired
(where a MAY_RENEW will succeed). This provides calling code with the
ability to verify that a lock is held continually and that it was
never lost/expired.
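
The difference between the two flags can be modelled roughly as follows. Only the names MAY_RENEW and MUST_RENEW come from the message above; `take_lock`, the exception types, and the dict layout are assumptions made for this sketch:

```python
MAY_RENEW = "may_renew"
MUST_RENEW = "must_renew"


class LockBusy(Exception):
    pass


class LockExpired(Exception):
    pass


def take_lock(current, owner, now, duration, flag):
    """current is None or {'owner', 'expires'}; return the new lock state."""
    live = current is not None and current["expires"] > now
    if live and current["owner"] != owner:
        raise LockBusy("held by " + current["owner"])
    if flag == MUST_RENEW and not live:
        # The lock lapsed at some point, so continuity cannot be guaranteed.
        raise LockExpired("lock was not held continuously")
    return {"owner": owner, "expires": now + duration}
```

Under MAY_RENEW an expired lock is simply taken again; under MUST_RENEW the same call fails, which lets the caller detect the gap.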
Erwan Velu [Wed, 10 Oct 2018 18:26:01 +0000 (20:26 +0200)]
ceph_volume: Checking device validity at init time
When initializing the Device structure, it has to run is_valid() to
ensure that the data structures (_is_valid & rejected_reasons) are
populated according to the device state.
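
A simplified sketch of the pattern, not the real ceph-volume class. Only the names is_valid(), _is_valid and rejected_reasons come from the message; the constructor arguments and the two rejection checks are made up for illustration:

```python
class Device:
    """Hypothetical, simplified stand-in for ceph-volume's Device class."""

    def __init__(self, path, removable=False, size=0):
        self.path = path
        self.removable = removable
        self.size = size
        self._is_valid = False
        self.rejected_reasons = []
        # Run the check at init time so both fields reflect the device state.
        self.is_valid()

    def is_valid(self):
        reasons = []
        if self.removable:                 # illustrative check
            reasons.append("removable")
        if self.size < 5 * 1024 ** 3:      # illustrative check
            reasons.append("smaller than 5 GiB")
        self.rejected_reasons = reasons
        self._is_valid = not reasons
        return self._is_valid
```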
Erwan Velu [Tue, 9 Oct 2018 20:28:19 +0000 (22:28 +0200)]
ceph_volume: Reporting nr_requests
We are already reporting the rotational & scheduler of a disk device.
Reporting the nr_requests could be useful to get how many concurrent IOs
the device supports/reports.
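
For illustration, these queue attributes can be read from sysfs as sketched below. The helper names and the `sys_root` parameter are assumptions made for this example, not the ceph-volume API; the attribute paths are the usual Linux block queue files:

```python
import os


def read_sys_block(device, relpath, sys_root="/sys/block"):
    """Return the stripped contents of a sysfs attribute, or '' if unreadable."""
    path = os.path.join(sys_root, device, relpath)
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def queue_report(device, sys_root="/sys/block"):
    """Collect the queue attributes mentioned above for one block device."""
    return {
        "rotational": read_sys_block(device, "queue/rotational", sys_root),
        "scheduler": read_sys_block(device, "queue/scheduler", sys_root),
        "nr_requests": read_sys_block(device, "queue/nr_requests", sys_root),
    }
```

Values are returned as the raw strings found in sysfs; a missing attribute yields an empty string rather than an error.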
That could help detect badly detected/configured devices.
Erwan Velu [Tue, 9 Oct 2018 20:26:28 +0000 (22:26 +0200)]
ceph_volume: Reporting firmware revision
We are already reporting model & vendor of a given disk, let's also
report the revision of the firmware. That is useful to filter out some
known broken revisions.
Sage Weil [Mon, 22 Oct 2018 19:38:48 +0000 (14:38 -0500)]
os/bluestore: fix race between SharedBlobSet::lookup and SharedBlob::put
        A                              B
SharedBlobSet::lookup()
  takes lock
  nref is not 0
                               SharedBlob::put()
                                 --nref
  returns SharedBlobRef,
  ++nref
                                 takes cache lock
                                 SharedBlobSet::remove
                                   takes lock
                                   removes
                                 deletes SharedBlob
-> A ends up with a ref to deleted SharedBlob
Fix by verifying that nref is still zero in SharedBlobSet::remove(),
while we are holding the SharedBlobSet::lock. The lock ensures that we
have increased the ref for the lookup before entering remove, so we can
verify that nref is still zero before removing it. If not, we have
raced, and put() bails out and does nothing.
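
The fix can be modelled in a few lines of Python. This is a simplified sketch, not the BlueStore code: the class and method names mirror the message, but the reference count is a plain integer instead of an atomic, lookup() does not model the "nref is not 0" check, and remove() returns a bool so the caller knows whether it lost the race:

```python
import threading


class SharedBlob:
    def __init__(self, sbid):
        self.sbid = sbid
        self.nref = 0


class SharedBlobSet:
    def __init__(self):
        self.lock = threading.Lock()
        self.blobs = {}

    def add(self, sb):
        with self.lock:
            self.blobs[sb.sbid] = sb

    def lookup(self, sbid):
        """Return the blob with one more reference, or None if absent."""
        with self.lock:
            sb = self.blobs.get(sbid)
            if sb is not None:
                sb.nref += 1
            return sb

    def remove(self, sb):
        """Remove sb only if nref is still zero while holding the lock."""
        with self.lock:
            if sb.nref != 0:
                return False  # a lookup raced with us; keep the blob
            self.blobs.pop(sb.sbid, None)
            return True


def put(sb, blob_set):
    """Drop one reference; return True only if the blob was really removed."""
    sb.nref -= 1
    if sb.nref == 0:
        return blob_set.remove(sb)
    return False
```

Replaying the interleaving above (B decrements, A looks up, B calls remove) now leaves the blob in the set, because remove() sees a non-zero nref.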
Brad Hubbard [Tue, 16 Oct 2018 01:57:05 +0000 (11:57 +1000)]
rpm: Use updated gperftools-libs at runtime
Due to ABI breakage in libtcmalloc.so.4 we need to specify the minimum
version to be used at runtime to be greater than or equal to the version
used at build time.
Make sure we only build with the higher version of gperftools on
distros where both 2.4 and 2.6.1 are packaged. See
https://git.centos.org/summary/rpms!gperftools.git . At the time of
writing, gperftools 2.6.1 is packaged for CentOS/RHEL 7. If gperftools
(>= 2.4) is required by Ceph and the user already has that version
installed, the updated gperftools 2.6.1 won't be pulled in as a
dependency when new Ceph packages are installed. Launching Ceph
compiled with tcmalloc enabled then fails with
symbol lookup error: ceph-osd: undefined symbol: _ZdaPvm
So, by bumping up the required version of gperftools, the updated
gperftools will be installed.
See https://software.opensuse.org/package/gperftools: openSUSE/SLE offer
2.5, so they are safe at this moment.
Casey Bodley [Wed, 15 Aug 2018 20:04:37 +0000 (16:04 -0400)]
rgw: incremental data sync uses truncated flag to detect end of listing
We call wait() after incremental sync if we've reached the end of the
datalog listing. The existing logic compares our local marker with the
remote's high marker, with some extra code to handle the case where the
remote log was trimmed.
All of this can be simplified by using the 'truncated' flag returned
with the RGWReadRemoteDataLogShardCR used to list the remote datalog.
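
The simplified control flow might look like the sketch below. `list_shard` is a hypothetical stand-in for the remote listing call (it is not RGWReadRemoteDataLogShardCR), and markers are plain list offsets here:

```python
def list_shard(entries, marker, max_entries):
    """Hypothetical remote listing: return (batch, new_marker, truncated)."""
    batch = entries[marker:marker + max_entries]
    new_marker = marker + len(batch)
    truncated = new_marker < len(entries)
    return batch, new_marker, truncated


def incremental_sync(entries, marker, max_entries, apply):
    """Consume the log until the listing reports it is no longer truncated."""
    while True:
        batch, marker, truncated = list_shard(entries, marker, max_entries)
        for entry in batch:
            apply(entry)
        if not truncated:
            # End of listing reached: this is the point where the caller waits.
            return marker
```

No comparison against a remote high marker is needed; a trimmed log simply produces a shorter, untruncated listing.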
test: use death_test_style="threadsafe" for Mutex.NormalAsserts
Use the threadsafe death test style for the Mutex assert test so that
the test is re-run from the beginning, which avoids the hang. This test
overrides the symbol ceph::__ceph_assert_fail() with a local one offered
by the application that links against libceph-common, but we
intentionally forbid this behavior: we do not allow libceph-common to
reference symbols of the same name exposed by the application. See
http://tracker.ceph.com/issues/25154
qa/tasks/cram: tasks now must live in the repository
Commit 0d8887652d53 ("qa/tasks/cram: use suite_repo repository for all
cram jobs") removed hardcoded git.ceph.com links, but as it turned out
it is still used for nightlies. There is no good way to accommodate
the different URL schemes, so let's get rid of URLs altogether.
qa/tasks/cram: use suite_repo repository for all cram jobs
Currently git.ceph.com is hardcoded for all cram jobs. Testing
modifications is a pain: one needs to push to either ceph/ceph.git or
ceph/ceph-ci.git (depending on where the ceph branch is at, triggering
unnecessary builds in the latter case) and wait for the mirror to sync.
Runs scheduled against branches in developer's forks fail.
Move away from git.ceph.com to allow mixing branches and repositories,
similar to workunits.
Boris Ranto [Fri, 14 Sep 2018 10:03:23 +0000 (12:03 +0200)]
mgr/dashboard: Do not require cert for http
The ceph dashboard currently requires a SSL certificate even if it is
not running in the SSL mode since it is always querying for the
certificate file/key pair.
This patch fixes the behaviour by querying for the certificate file/key
only if it is running in the SSL mode.
Although SSL is preferred and should be enabled by default, users might
want to disable SSL as the dashboard might be running behind a proxy
which terminates the SSL.
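
A minimal sketch of the changed behaviour, assuming a generic key lookup. `server_config`, `get_config`, and the key names "crt_file" and "key_file" are hypothetical and do not claim to match the dashboard module's actual option names:

```python
def server_config(get_config, ssl_enabled):
    """Build listener settings; get_config is a hypothetical key lookup."""
    config = {"ssl": ssl_enabled}
    if ssl_enabled:
        # The certificate/key pair is only queried in SSL mode.
        cert = get_config("crt_file")
        key = get_config("key_file")
        if not cert or not key:
            raise RuntimeError("SSL is enabled but no certificate/key is configured")
        config["certificate"] = cert
        config["private_key"] = key
    return config
```

With ssl_enabled set to False the lookup is never called, so a missing certificate no longer prevents startup.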
Fixes: https://tracker.ceph.com/issues/24674
Signed-off-by: Wido den Hollander <wido@42on.com>
(cherry picked from commit 21fbfc9c3a00edfe6063c33c738d49fdba21ea73)
Conflicts:
src/pybind/mgr/dashboard/controllers/docs.py: did not exist in
mimic
Venky Shankar [Mon, 6 Aug 2018 03:37:18 +0000 (23:37 -0400)]
mds: evict clients that do not respond to cap revoke by MDS
By default, preserve old behaviour. When configured with a
non-default value, evict clients that have not responded to cap
revoke by MDS for the configured number of seconds.
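
The selection rule can be sketched as below. This assumes that the default value is 0 and that 0 disables eviction; the function name and the shape of `pending_revokes` are hypothetical:

```python
def clients_to_evict(pending_revokes, now, timeout):
    """pending_revokes maps client id -> time its oldest unanswered revoke was sent."""
    if timeout <= 0:
        # Assumed default: keep the old behaviour and never evict.
        return []
    return sorted(
        client
        for client, sent in pending_revokes.items()
        if now - sent >= timeout
    )
```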
Paul Emmerich [Thu, 26 Jul 2018 19:24:38 +0000 (21:24 +0200)]
os/bluestore: handle spurious read errors
Some kernels (4.9+) sometimes fail to return data when reading
from a block device under memory pressure. This patch retries
the read if the checksum verification fails. Tests show that
the first retried read succeeds in ~99.5% of the cases, so
3 attempts are made by default before giving up on the data.
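
A minimal sketch of the retry loop, not the BlueStore implementation. It reads "3 attempts" as three reads in total, which is one possible reading of the message; `read_verified` and the use of zlib.crc32 as the checksum are assumptions for illustration:

```python
import zlib


def read_verified(read, expected_crc, attempts=3):
    """Call read() up to `attempts` times until its CRC matches expected_crc."""
    for _ in range(attempts):
        data = read()
        if zlib.crc32(data) == expected_crc:
            return data
    raise IOError("checksum mismatch after %d attempts" % attempts)
```

A read that spuriously returns no data fails verification and is simply issued again; a persistent mismatch still surfaces as an error.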
Works-around: http://tracker.ceph.com/issues/22464
Signed-off-by: Paul Emmerich <paul.emmerich@croit.io>
(cherry picked from commit cffcbc73aaaa874829d5fc9091af3042b887f9a7)
- conflict due to adjacent tests in store_test
- g_conf, not g_conf()