Greg Farnum [Tue, 1 Oct 2013 23:41:22 +0000 (16:41 -0700)]
ReplicatedPG: copy: use CopyCallback instead of CopyOp in OpContext
In order to make this happen, we make the switch to generate the complete
transaction in the generic copy code and save it into the Callback. Then
in finish_copy() we just take that transaction and prepend it to the existing
transaction.
With that change, and by making use of the existing CopyCallback data,
we no longer need to access the CopyOp from the OpContext, so we can remove it.
Hurray, the pipelines are now independent!
We implement enough of the CopyFromCallback that CopyOp no longer needs
a direct reference to the OpContext, so we remove it and replace all
references with calls to cop->cb->complete().
ReplicatedPG: follow the same finish path for failed copy ops
We don't necessarily want to respond to clients with a failure if
a copy got an error code. Instead, conditionally execute the success
path and always launch back into execute_ctx() when the copy has
stopped (either due to completion or failure).
Update the COPY_FROM section so it returns the CopyOp::rval (instead
of always zero) and only launches finish_copy() on success.
Greg Farnum [Tue, 1 Oct 2013 20:28:03 +0000 (13:28 -0700)]
ReplicatedPG: update pg stats correctly when doing a copy
The obs.oi.size needs to updated in the middle so that we actually
change the stats -- this got set backwards by mistake during one
of the refactors to support large objects!
(See 4e29e362e7981634d751ee982144fbf602782a9a)
Sage Weil [Tue, 1 Oct 2013 16:28:29 +0000 (09:28 -0700)]
osdc/ObjectCacher: limit writeback IOs generated while holding lock
While analyzing a log from Mike Dawson I saw a long stall while librbd's
objectcacher was starting lots (many hundreds) of IOs. Limit the amount of
time we spend doing this at a time to allow IO replies to be processed so
that the cache remains responsive.
I'm not sure this warrants a tunable (which we would need to add for both
libcephfs and librbd).
Yehuda Sadeh [Mon, 26 Aug 2013 18:16:08 +0000 (11:16 -0700)]
rgw: quiet down warning message
Fixes: #6123
We don't want to know about failing to read region map info
if it's not found, only if failed on some other error. In
any case it's just a warning.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Objecter: add "honor_cache_redirects" flag covering cache settings
When set to false, we do not redirect based on the cache_pool data
in the OSDMap. We'll use this so the OSDs can actually fetch data
into the cache pools on promotion! Signed-off-by: Greg Farnum <greg@inktank.com>
Sage Weil [Thu, 5 Sep 2013 04:29:11 +0000 (21:29 -0700)]
common/crc32c_intel_fast: avoid reading partial trailing word
The optimized intel code reads in word-sized chunks, knowing that the
allocator will only hand out memory in word-sized increments. This makes
valgrind unhappy. Whitelisting doesn't work because for some reason there
is no caller context (probably because of some interaction with yasm?).
Instead, just use the baseline code for the last few bytes. This should
not be significant.
Dan Mick [Fri, 27 Sep 2013 01:00:31 +0000 (18:00 -0700)]
ceph_argparse.py, cephtool/test.sh: fix blacklist with no nonce
It's legal to give a CephEntityAddr to osd blacklist with no nonce,
so allow it in the valid() method; also add validation of any nonce
given that it's a long >= 0.
Also fix comment on CephEntityAddr type description in MonCommands.h,
and add tests for invalid nonces (while fixing the existing tests to remove
the () around expect_false args).
Fixes: #6425 Signed-off-by: Dan Mick <dan.mick@inktank.com>
We now put CEPH_ARGS in the actual args we parse in python, which are passed
to rados piecemeal later. This lets you put things like --id ... in there
that need to be parsed before librados is initialized.
David Zafman [Wed, 18 Sep 2013 01:14:16 +0000 (18:14 -0700)]
osd: Cleanup init()/read_superblock()
Fix error handling in init()
Cleanup read_superblock() by moving unrelated code into init()
Move init() feature upgrade right after compatibility checking
Remove redundant whoami check
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Fri, 20 Sep 2013 01:54:36 +0000 (18:54 -0700)]
common, os, osd, test, tools: FileStore must work with ghobjects rather than hobjects
Add ghobject_t to hboject.h header
Add constants NO_SHARD/NO_GEN and change gen_t/shard_t
Convert other headers from hobject_t to ghobject_t
Mostly straight hobject_t to ghobject_t for src/os cc files
Fix tools and tests and enable ceph-dencoder
Add filename generation and parsing including unittest addition
Get ceph-filestore-dump to build
Add gen/shard to DBObjectMap::ghobject_key() and update test case
Add CEPH_FS_FEATURE_INCOMPAT_SHARDS new FileStore feature
Add CEPH_OSD_FEATURE_INCOMPAT_SHARDS new osd feature
Fixes: #5862 Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Wed, 25 Sep 2013 16:19:16 +0000 (09:19 -0700)]
os, osd, tools: Add backportable compatibility checking for sharded objects
OSD
New CEPH_OSD_FEATURE_INCOMPAT_SHARDS
FileStore
NEW CEPH_FS_FEATURE_INCOMPAT_SHARDS
Add FSSuperblock with feature CompatSet in it
Store sharded_objects state using CompatSet
Add set_allow_sharded_objects() and get_allow_sharded_objects() to FileStore/ObjectStore
Add read_superblock()/write_superblock() internal filestore functions
ceph_filestore_dump
Add OSDsuperblock to export format
Use CompatSet from OSD code itself in filestore-dump tool
Always check compatibility of OSD features with on-disk features
On import verify compatibility of on-disk features with export data
Bump super_ver due to export format change
Backport: dumpling, cuttlefish
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Tue, 17 Sep 2013 20:39:57 +0000 (13:39 -0700)]
include: Bug fixes for CompatSet
FeatureSet insert/remove
Use 64-bit arithmetic to allow features past 31
Allow feature 63 by fixing assert in insert
CompatSet::unsupported() bugs
Ignore feature 0 which became illegal
Use 64-bit arithmetic when computing mask
Use id in insert() and to get correct feature name
Use the right map to get name for diff.ro_compat