John Spray [Thu, 28 Aug 2014 23:53:44 +0000 (00:53 +0100)]
client: fix dispatcher ordering (broken fuse)
Objecter never saw any OSD maps because of 1e1ee480 and
the dispatchers being in the wrong order -- ignoring map
in Client was hiding it from Objecter.
Fixes: #9266
Signed-off-by: John Spray <john.spray@redhat.com>
David Zafman [Wed, 20 Aug 2014 08:33:45 +0000 (01:33 -0700)]
ceph_objectstore_tool: Bug fixes and test improvements
ceph_objectstore_tool:
Fix bugs in the way collection_list_partial() was being called
which caused objects to be seen over and over again.
Unit test:
Fix get_objs() to walk pg tree for pg with sub-directories
Create more objects to test object listing code
Limit number of larger objects
Limit number of objects which get attributes and omaps
David Zafman [Wed, 30 Jul 2014 19:39:49 +0000 (12:39 -0700)]
Complete replacement of ceph_filestore_tool and ceph_filestore_dump
with unified ceph_objectstore_tool
Move list-lost-objects and fix-lost-objects features from
ceph_filestore_tool to ceph_objectstore_tool as list-lost, fix-lost
Change --type to --op for info, log, export...operations
Add --type for the ObjectStore type (defaults to filestore)
Change --filestore-path to --data-path
Update installation, Makefile.am, and .gitignore
Fix and rename test case to match
Add some additional invalid option checks
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Wed, 21 May 2014 19:45:33 +0000 (12:45 -0700)]
test: ceph_filestore_dump.sh test improvements
Add some usage error tests
Don't use the same var in second for loop
Add xattr/omap to rep pool and xattr to ec pool
Add list, get-bytes and set-bytes testing
Add list-attrs and get-attr
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman [Wed, 14 May 2014 19:42:21 +0000 (12:42 -0700)]
common,ceph_filestore_dump: Add ability for utilities to suppress library dout output
Suppress dout output with CODE_ENVIRONMENT_UTILITY_NODOUT
ceph_filestore_dump turns on dout output if --debug specified
When used it can still be enabled with --log-to-stderr --err-to-stderr
Signed-off-by: David Zafman <david.zafman@inktank.com>
Sage Weil [Thu, 28 Aug 2014 17:59:18 +0000 (10:59 -0700)]
test/mon/*: prime mon with initial command before injection
The osdmonitor_prepare_command is very fragile. Send an initial command
to the mon beforehand. This seems to prevent the initial command from
getting combined into an early mon proposal with some other stuff.
Alternatively, we could remove these tests and this mechanism entirely as
it is likely to break in the future when the next set of mon changes is
made, but they have shown themselves to be useful in catching other
regressions, so we'll patch them up for a bit longer.
Loic Dachary [Sat, 23 Aug 2014 09:07:29 +0000 (11:07 +0200)]
erasure-code: assert the PluginRegistry lock is held when it must
Add lock to the preload method and assert that it is held by methods
requiring it. Although preload is called at bootstrap and does not
require the lock, adding it does not hurt and makes the lock policy
easier to understand.
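
A minimal sketch of the lock policy described above, assuming a hypothetical
TrackedMutex wrapper and Registry class (neither name comes from the commit).
The entry point takes the lock even though it is not strictly needed at
bootstrap, and the helpers assert that their caller holds it:

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical mutex wrapper that records whether it is currently held.
class TrackedMutex {
public:
  void lock() { m_.lock(); locked_ = true; }
  void unlock() { locked_ = false; m_.unlock(); }
  bool is_locked() const { return locked_; }
private:
  std::mutex m_;
  std::atomic<bool> locked_{false};
};

// Hypothetical registry; only the locking discipline matters here.
class Registry {
public:
  // Entry point: takes the lock, then delegates to load().
  int preload(const std::string& name) {
    std::lock_guard<TrackedMutex> guard(lock_);
    return load(name);
  }

  // Requires the lock; the assertion documents and enforces the contract.
  int load(const std::string& name) {
    assert(lock_.is_locked());
    plugins_[name] = true;
    return 0;
  }

  // Requires the lock, like load().
  bool is_loaded(const std::string& name) const {
    assert(lock_.is_locked());
    return plugins_.count(name) > 0;
  }

  TrackedMutex lock_;  // public only to keep the sketch short
private:
  std::map<std::string, bool> plugins_;
};
```

With this shape, a caller that forgets to take the lock before calling load()
or is_loaded() fails the assertion immediately in a debug build instead of
racing silently.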
Loic Dachary [Thu, 21 Aug 2014 16:38:52 +0000 (18:38 +0200)]
erasure-code: add Ceph version check to plugins
Add the __erasure_code_version function to all plugins, to return the
Ceph version against which they have been compiled. When a plugin is
loaded, an error is thrown if the version of the plugin does not match
the version of the daemon loading it.
If the symbol does not exist, which will be true of older plugins, set
the version to "an older version" so it never matches.
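
The check described above could look roughly like the following sketch. The
names VersionFn, plugin_version, check_version and example_plugin_version are
hypothetical, and the -1 return value is a placeholder for whatever error the
real loader reports. A null function pointer stands in for a plugin whose
version symbol could not be found:

```cpp
#include <cassert>
#include <string>

// Hypothetical type for the version entry point a plugin exports.
using VersionFn = const char* (*)();

// Sentinel for plugins without the symbol (wording from the commit message).
const char* const OLDER_VERSION = "an older version";

// Version reported by the plugin, or the sentinel if the symbol is missing.
std::string plugin_version(VersionFn fn) {
  return fn != nullptr ? std::string(fn()) : std::string(OLDER_VERSION);
}

// Returns 0 if the plugin matches the daemon version, -1 (placeholder) if not.
int check_version(VersionFn fn, const std::string& daemon_version) {
  assert(!daemon_version.empty());
  return plugin_version(fn) == daemon_version ? 0 : -1;
}

// Stand-in for what a plugin compiled against version "0.85" might export.
const char* example_plugin_version() { return "0.85"; }
```

Because the sentinel is not a valid version string, a plugin without the
symbol is rejected by any daemon, which matches the "never matches" behaviour
described in the message.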
Loic Dachary [Thu, 21 Aug 2014 16:31:02 +0000 (18:31 +0200)]
erasure-code: jerasure preloads the plugin variant
The variant selection depending on the available CPU features is
encapsulated in a helper. The helper is used in the factory() method and
in the load() method.
The factory() method may load a variant that is not the default, for
benchmark purposes. Such a variant is not preloaded by the load() method
and upgrading while running may be problematic. However, non-standard
variants are used for benchmarking, and upgrades in this context are
not a concern.
Loic Dachary [Thu, 21 Aug 2014 16:22:18 +0000 (18:22 +0200)]
erasure-code: add directory to plugin init functions
The prototype of the init functions of erasure code plugins is changed
from
int __erasure_code_init(char *plugin_name)
to
int __erasure_code_init(char *plugin_name, char *directory)
The jerasure plugin will find optimized variants in this directory and
load them. The load() and preload() functions of
ErasureCodePluginRegistry only use a directory instead of a more generic
parameters map. The parameters map was only used for the directory entry
anyway.
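
As an illustration of the two-argument shape, here is a sketch of an init
function that uses the directory to locate a variant. The function
example_erasure_code_init, the helper variant_path, the global g_loaded_path
and the "libec_<plugin>_<variant>.so" file layout are all assumptions made for
this example, not taken from the commit:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: path of a variant library inside the plugin directory.
// The file name layout is assumed for illustration only.
std::string variant_path(const char* directory, const char* plugin_name,
                         const char* variant) {
  assert(directory && plugin_name && variant);
  return std::string(directory) + "/libec_" + plugin_name + "_" + variant +
         ".so";
}

// Records which path the sketch would have loaded (no real dlopen here).
static std::string g_loaded_path;

// Hypothetical init function with the new (plugin_name, directory) prototype.
int example_erasure_code_init(char* plugin_name, char* directory) {
  if (plugin_name == nullptr || directory == nullptr)
    return -1;  // placeholder error value
  g_loaded_path = variant_path(directory, plugin_name, "generic");
  return 0;
}
```

Passing the directory explicitly means the plugin no longer has to dig it out
of a generic parameters map before it can look for its variants.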
Samuel Just [Wed, 27 Aug 2014 23:21:41 +0000 (16:21 -0700)]
PG::can_discard_op: do discard old subopreplies
Otherwise, a sub_op_reply from a previous interval can stick around
until we either one day go active again and get rid of it or delete the
pg which is holding it on its waiting_for_active list. While it sticks
around futilely waiting for the pg to once more go active, it will cause
harmless slow request warnings.
Fixes: #9259
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
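
A simplified sketch of the idea, assuming each reply carries the epoch at
which it was sent and that the pg knows the epoch at which its current
interval started. SubOpReply, can_discard_reply and drop_stale are
hypothetical names for this example and do not reproduce the actual PG code:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using epoch_t = std::uint32_t;

// Hypothetical reply record: only the epoch it was sent in matters here.
struct SubOpReply {
  epoch_t sent_epoch;
};

// A reply is stale if it was sent before the current interval began.
bool can_discard_reply(const SubOpReply& reply, epoch_t interval_start) {
  return reply.sent_epoch < interval_start;
}

// Removes stale replies from a waiting list; returns how many were dropped.
std::size_t drop_stale(std::vector<SubOpReply>& waiting,
                       epoch_t interval_start) {
  const std::size_t before = waiting.size();
  waiting.erase(std::remove_if(waiting.begin(), waiting.end(),
                               [interval_start](const SubOpReply& r) {
                                 return can_discard_reply(r, interval_start);
                               }),
                waiting.end());
  return before - waiting.size();
}
```

Discarding at the point of arrival, rather than queueing and waiting, is what
keeps such replies from lingering and triggering slow request warnings.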
Sage Weil [Tue, 19 Aug 2014 23:48:34 +0000 (16:48 -0700)]
mon/Paxos: make backend write async
Move into the WRITING state and do the write to leveldb (or whatever the
backend is) asynchronously.
A few tricks here:
- we can't do the is_updating() state check because we will always be in
REFRESH. Instead, make commit_proposal() tolerate the case where it is
called but the top proposal isn't the one we just did (or the list is
empty). This makes the callers simpler.
- do_refresh() may call bootstrap. If we do bootstrap while in REFRESH,
don't do a sync/flush on the backend store because *we* are the async
completion thread and we'll deadlock. All other callers need to wait
for this, though!
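
One possible reading of the first point, as a sketch: commit_proposal() simply
does nothing when the list is empty or when the front proposal is not the one
that was just written. The Proposal struct, the version field and the boolean
return value are assumptions for illustration, not the real Paxos code:

```cpp
#include <deque>

// Hypothetical pending proposal, identified by the version it would commit.
struct Proposal {
  int version;
};

// Retires the front proposal only if it is the one just written.
// Returns true if a proposal was retired, false if there was nothing to do.
bool commit_proposal(std::deque<Proposal>& pending, int written_version) {
  if (pending.empty() || pending.front().version != written_version)
    return false;  // tolerated: not our proposal, or nothing queued
  pending.pop_front();
  return true;
}
```

Since the function tolerates both cases itself, callers do not need a state
check before invoking it.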
Sage Weil [Tue, 19 Aug 2014 23:45:46 +0000 (16:45 -0700)]
mon/Paxos[Service]: allow reads during WRITING state
The REFRESH state is not readable; that's when we are re-reading our state
out of leveldb, and we hold the mon_lock during the period. So, strictly
speaking, it doesn't matter whether we include it here since none of these
call sites would be visited while in that state.
Sage Weil [Sun, 17 Aug 2014 05:29:04 +0000 (22:29 -0700)]
mon/Paxos: move post-commit finish work into commit_finish()
The main change here is that we are merging the singleton and clustered
finish code together. This is mostly a code shuffle, except for one
semantic change: we now trigger the commit waiters before finish_round()
in the singleton case, whereas before we did not. I don't think there
was a specific reason why it differed from the clustered case.