John Spray [Fri, 29 Aug 2014 17:34:39 +0000 (18:34 +0100)]
tools: use cout instead of cerr in journal tool
Aside from being a bit odd to begin with, using stderr
was causing tests to fail because the output was polluted
by log output which is also on stderr.
Fixes: 9281
Signed-off-by: John Spray <john.spray@redhat.com>
Sage Weil [Fri, 29 Aug 2014 15:29:35 +0000 (08:29 -0700)]
mds/RecoveryQueue: do not start prioritized items synchronously
When we prioritize an item, move it into a second priority list/set, but
do not start it immediately, so that we still obey the max. When we go to
start an item, pull items first off the priority list, then off the regular
list.
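The behaviour described above can be sketched roughly as follows. This is
not the actual RecoveryQueue code: the class name, the method names and the
use of plain ints for inodes are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>

// Hypothetical stand-in for the MDS recovery queue; all names are illustrative.
class RecoveryQueueSketch {
public:
  explicit RecoveryQueueSketch(std::size_t max_in_flight)
    : max_in_flight_(max_in_flight) {}

  // Queue an item on the regular list.
  void enqueue(int ino) { regular_.push_back(ino); }

  // Move a queued item to the priority list. It is NOT started here,
  // so the in-flight limit is still honoured.
  void prioritize(int ino) {
    for (std::deque<int>::iterator it = regular_.begin();
         it != regular_.end(); ++it) {
      if (*it == ino) {
        regular_.erase(it);
        priority_.push_back(ino);
        return;
      }
    }
  }

  // Start the next item if the limit allows; priority items go first.
  // Returns false when nothing could be started.
  bool start_next(int* started) {
    if (in_flight_.size() >= max_in_flight_)
      return false;
    std::deque<int>& src = !priority_.empty() ? priority_ : regular_;
    if (src.empty())
      return false;
    *started = src.front();
    src.pop_front();
    in_flight_.insert(*started);
    return true;
  }

  // Mark a started item as done, freeing a slot.
  void finish(int ino) {
    assert(in_flight_.count(ino) == 1);
    in_flight_.erase(ino);
  }

private:
  std::size_t max_in_flight_;
  std::deque<int> priority_;
  std::deque<int> regular_;
  std::set<int> in_flight_;
};
```

With a limit of one, a prioritized item jumps ahead of the regular list but
still waits for a free slot before it is started.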
John Spray [Thu, 28 Aug 2014 23:53:44 +0000 (00:53 +0100)]
client: fix dispatcher ordering (broken fuse)
Objecter never saw any OSD maps because of 1e1ee480 and
the dispatchers being in the wrong order -- ignoring map
in Client was hiding it from Objecter.
Fixes: #9266
Signed-off-by: John Spray <john.spray@redhat.com>
David Zafman [Wed, 20 Aug 2014 08:33:45 +0000 (01:33 -0700)]
ceph_objectstore_tool: Bug fixes and test improvements
ceph_objectstore_tool:
Fix bugs in the way collection_list_partial() was being called
which caused objects to be seen over and over again.
Unit test:
Fix get_objs() to walk pg tree for pg with sub-directories
Create more objects to test object listing code
Limit number of larger objects
Limit number of objects which get attributes and omaps
David Zafman [Wed, 30 Jul 2014 19:39:49 +0000 (12:39 -0700)]
Complete replacement of ceph_filestore_tool and ceph_filestore_dump
with unified ceph_objectstore_tool
Move list-lost-objects and fix-lost-objects features from
ceph_filestore_tool to ceph_objectstore_tool as list-lost, fix-lost
Change --type to --op for info, log, export...operations
Add --type for the ObjectStore type (defaults to filestore)
Change --filestore-path to --data-path
Update installation, Makefile.am, and .gitignore
Fix and rename test case to match
Add some additional invalid option checks
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Wed, 21 May 2014 19:45:33 +0000 (12:45 -0700)]
test: ceph_filestore_dump.sh test improvements
Add some usage error tests
Don't use the same var in second for loop
Add xattr/omap to rep pool and xattr to ec pool
Add list, get-bytes and set-bytes testing
Add list-attrs and get-attr
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Wed, 14 May 2014 19:42:21 +0000 (12:42 -0700)]
common,ceph_filestore_dump: Add ability for utilities to suppress library dout output
Suppress dout output with CODE_ENVIRONMENT_UTILITY_NODOUT
ceph_filestore_dump turns on dout output if --debug specified
When used, it can still be enabled with --log-to-stderr --err-to-stderr
Signed-off-by: David Zafman <david.zafman@inktank.com>
Sage Weil [Thu, 14 Aug 2014 21:52:40 +0000 (14:52 -0700)]
mds/RecoveryQueue: add method to prioritize a file recovery; fix logging
Add a prioritize() method to make file recovery start immediately for the
given inode. Note that this doesn't respect the max recovery limit: if
someone stats it, they are blocking, and we start the recovery immediately.
Also fix up the dout logging a bit so that everything is prefixed
consistently.
Sage Weil [Thu, 14 Aug 2014 21:39:29 +0000 (14:39 -0700)]
mds: change mds_max_file_recover from 5 -> 32
These are reasonably cheap operations (stat) and we should not be too worried
about queueing up a bunch of them.
Ideally this sort of thing would magically tune to the throughput we can
get from the cluster, but until then, let's choose a default that works for
more users.
Sage Weil [Thu, 28 Aug 2014 17:59:18 +0000 (10:59 -0700)]
test/mon/*: prime mon with initial command before injection
The osdmonitor_prepare_command is very fragile. Send an initial command
to the mon beforehand. This seems to prevent the initial command from
getting combined into an early mon proposal with some other stuff.
Alternatively, we could remove these tests and this mechanism entirely as
it is likely to break in the future when the next set of mon changes are
made, but they have shown themselves to be useful in catching other
regressions, so we'll patch them up for a bit longer.
Loic Dachary [Sat, 23 Aug 2014 09:07:29 +0000 (11:07 +0200)]
erasure-code: assert the PluginRegistry lock is held when it must
Add lock to the preload method and assert that it is held by methods
requiring it. Although preload is called at bootstrap and does not
require the lock, adding it does not hurt and makes the lock policy
clearer to understand.
Loic Dachary [Thu, 21 Aug 2014 16:38:52 +0000 (18:38 +0200)]
erasure-code: add Ceph version check to plugins
Add the __erasure_code_version function to all plugins, to return the
Ceph version against which they have been compiled. When a plugin is
loaded, an error is thrown if the version of the plugin does not match
the version of the daemon loading it.
If the symbol does not exist, which will be true of older plugins, set
the version to "an older version" so it never matches.
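A minimal sketch of the check, under stated assumptions: the real lookup
resolves the symbol in the loaded plugin, which is modelled here as a
function pointer that is null when the symbol is missing. The helper names,
the bool return and the "0.84" version string are illustrative only and are
not the actual registry code.

```cpp
#include <cassert>
#include <string>

// Assumed shape of the version symbol exported by a plugin.
typedef const char* (*version_fn_t)();

// Hypothetical plugin symbol, used only to exercise the sketch.
inline const char* sample_plugin_version() { return "0.84"; }

// A missing symbol (older plugin) maps to a string that never matches.
inline std::string plugin_version(version_fn_t fn) {
  return fn ? std::string(fn()) : std::string("an older version");
}

// True when the plugin was built against the daemon's version.
inline bool plugin_matches(version_fn_t fn, const std::string& daemon_version) {
  assert(!daemon_version.empty());
  return plugin_version(fn) == daemon_version;
}
```

A caller would refuse to load the plugin when plugin_matches() returns
false; how that error is reported is left out of the sketch.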
Loic Dachary [Thu, 21 Aug 2014 16:31:02 +0000 (18:31 +0200)]
erasure-code: jerasure preloads the plugin variant
The variant selection depending on the available CPU features is
encapsulated in a helper. The helper is used in the factory() method and
in the load() method.
The factory() method may load a variant that is not the default, for
benchmark purposes. Such a variant is not preloaded by the load() method
and upgrading while running may be problematic. However, running with a
non-standard variant is only done for benchmarking, and upgrades in this
context are not a concern.
Loic Dachary [Thu, 21 Aug 2014 16:22:18 +0000 (18:22 +0200)]
erasure-code: add directory to plugin init functions
The prototype of the init functions of erasure coded plugins is changed
from
int __erasure_code_init(char *plugin_name)
to
int __erasure_code_init(char *plugin_name, char *directory)
The jerasure plugin will find optimized variants in this directory and
load them. The load() and preload() functions of
ErasureCodePluginRegistry only use a directory instead of a more generic
parameters map. The parameters map was only used for the directory entry
anyway.
Samuel Just [Wed, 27 Aug 2014 23:21:41 +0000 (16:21 -0700)]
PG::can_discard_op: do discard old subopreplies
Otherwise, a sub_op_reply from a previous interval can stick around
until we either one day go active again and get rid of it or delete the
pg which is holding it on its waiting_for_active list. While it sticks
around futilely waiting for the pg to once more go active, it will cause
harmless slow request warnings.
Fixes: #9259
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
John Spray [Wed, 27 Aug 2014 21:32:12 +0000 (22:32 +0100)]
mds: restart on -EBLACKLISTED
Previously these cases would (hopefully) hit an
assert(r==0) in the various completion contexts,
and the MDS would "crash" from the user's point of view.
With the introduction of MDSIOContext, we have a single
place to filter all RADOS op responses, which also has
a handle to the global MDS instance. Check result values
for -EBLACKLISTED and call MDS::respawn() in response.
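The filtering idea can be sketched as below. This is an illustration, not
the MDSIOContext implementation: the class names are hypothetical, respawn()
only counts calls, and kEBlacklisted is a placeholder value standing in for
Ceph's EBLACKLISTED.

```cpp
#include <cassert>

// Placeholder value; Ceph defines its own EBLACKLISTED.
static const int kEBlacklisted = 108;

// Hypothetical stand-in for the global MDS instance.
struct MdsSketch {
  int respawn_calls = 0;
  void respawn() { ++respawn_calls; }
};

// Base context: every op result passes through complete() first.
class IOContextSketch {
public:
  explicit IOContextSketch(MdsSketch* mds) : mds_(mds) {
    assert(mds_ != nullptr);
  }
  virtual ~IOContextSketch() {}

  void complete(int r) {
    if (r == -kEBlacklisted) {
      // Restart instead of letting finish() trip an assert(r == 0).
      mds_->respawn();
      return;
    }
    finish(r);
  }

protected:
  virtual void finish(int r) = 0;

private:
  MdsSketch* mds_;
};

// Example completion that just records what it was given.
class RecordingContext : public IOContextSketch {
public:
  using IOContextSketch::IOContextSketch;
  bool finished = false;
  int last_result = 1;

protected:
  void finish(int r) override {
    finished = true;
    last_result = r;
  }
};
```

Because the check lives in the shared base class, individual completions do
not need to know about blacklisting at all.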