When opening a regular file, fuse assigns a 'struct Fh' pointer to
fuse_file_info::fh, but when opening a directory, fuse assigns a
'struct dir_result_t' pointer to fuse_file_info::fh. So we need a separate
function for fsyncdir (one that casts fuse_file_info::fh to a struct
dir_result_t pointer).
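A minimal sketch of why the two handlers need different casts. The struct
bodies and helper names below are hypothetical stand-ins, not the real
client or libfuse definitions; only the idea of one integer field carrying
two different pointer types comes from the message above.

```cpp
#include <stdint.h>

// Hypothetical stand-ins for the client's handle types.
struct Fh { int flags; };
struct dir_result_t { uint64_t offset; };

// Hypothetical stand-in for the fh field of fuse_file_info.
struct file_info_sketch { uint64_t fh; };

// open() path: stash a file handle pointer in fh.
inline void store_file(file_info_sketch &fi, Fh *f) {
  fi.fh = reinterpret_cast<uintptr_t>(f);
}

// opendir() path: stash a directory handle pointer in the same field.
inline void store_dir(file_info_sketch &fi, dir_result_t *d) {
  fi.fh = reinterpret_cast<uintptr_t>(d);
}

// fsync handler: fh has to be read back as an Fh pointer.
inline Fh *file_from(const file_info_sketch &fi) {
  return reinterpret_cast<Fh *>(static_cast<uintptr_t>(fi.fh));
}

// fsyncdir handler: fh has to be read back as a dir_result_t pointer.
inline dir_result_t *dir_from(const file_info_sketch &fi) {
  return reinterpret_cast<dir_result_t *>(static_cast<uintptr_t>(fi.fh));
}
```

Sharing one handler between fsync and fsyncdir would apply the wrong cast to
one of the two cases, which is the situation the separate function avoids.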
qa/workunits: cephtool: take EOPNOTSUPP as an alias of ENOTSUP
The problem breaks `test_mon_deprecated_commands` on Ubuntu Precise:
with the Python shipped with Ubuntu Precise, errno.errorcode[95]
evaluates to `EOPNOTSUPP` rather than `ENOTSUP`, although these two errnos
are equal in glibc.
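The same idea can be sketched in C++ rather than in the test script itself:
accept either name instead of relying on which one a platform reports. The
helper name is hypothetical.

```cpp
#include <cerrno>

// Treat EOPNOTSUPP and ENOTSUP as the same condition, so callers do not
// depend on which name (or value) a given platform uses.
inline bool is_not_supported(int err) {
  return err == ENOTSUP || err == EOPNOTSUPP;
}
```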
David Zafman [Tue, 14 Jul 2015 02:07:07 +0000 (19:07 -0700)]
common, tools, test: Add "rados purge" feature to remove all objects from a pool
This required creating an Object type, which is a pair of strings: an
object id and an object namespace. Functionally, nothing has changed
with regards to the bench and cleanup command semantics. Those
commands still allow operation in the default or a specified namespace.
Fixes: #12262
Signed-off-by: David Zafman <dzafman@redhat.com>
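A rough sketch of an object identified by a pair of strings, assuming the
first element is the object id and the second the namespace. The typedef
layout and the in_namespace() helper are illustrative assumptions, not the
code added by this commit.

```cpp
#include <string>
#include <utility>
#include <vector>

// Assumed layout: first = object id, second = object namespace.
typedef std::pair<std::string, std::string> Object;

// Select the objects that live in one namespace ("" stands for the default).
inline std::vector<Object> in_namespace(const std::vector<Object> &all,
                                        const std::string &nspace) {
  std::vector<Object> out;
  for (const Object &o : all) {
    if (o.second == nspace)
      out.push_back(o);
  }
  return out;
}
```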
Kefu Chai [Fri, 19 Jun 2015 14:57:57 +0000 (22:57 +0800)]
tools/ceph-monstore-tools: add rewrite command
The "rewrite" command will
- add a new osdmap version to update the current osdmap held by OSDMonitor
- add a new paxos version; as a proposal it will
  * rewrite all osdmap epochs from the specified epoch to the last_committed
    one with the specified crush map.
  * add the new osdmap which was added just now
so the leader monitor can trigger a recovery process to apply the transaction
to all monitors in quorum, and hence bring them back to normal after being
injected with a faulty crushmap.
Zhiqiang Wang [Fri, 17 Jul 2015 03:28:19 +0000 (11:28 +0800)]
rados.cc: fix an issue in the output of the 'rados df' command
The output doesn't indent correctly without this fix. Right align the df
stats with their headers. Before this change some of them were 1
character off, and there was a strange 'category' column.
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
tests: test/cephtool-test-mon.sh uses 7202 7203 and 7204
When running 3 mons, vstart uses ports starting from CEPH_PORT=7202 and
increments to 7203 for the second mon and 7204 for the third. Add a
comment showing the port number. The method for a test to figure out
which port is free is to grep for the port and if nothing matches, use
it. The grep will match 7204 and 7203 and 7202 from the comment and
reduce the chances of someone using 7203 because it was nowhere to be
found in the sources.
Don't simply put() a reference if it has gone unclaimed without
get()'ing it first. This can cause nefarious consequences for those
users of MForward that do not expect this to happen.
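A rough sketch of the balanced get()/put() pattern this describes. The
RefCounted and Forward types are hypothetical and much simpler than the real
Message and MForward classes; how the real fix is laid out is not stated
above.

```cpp
// Hypothetical intrusive ref-counted object.
struct RefCounted {
  int nref = 1;                        // the creator holds one reference
  RefCounted *get() { ++nref; return this; }
  void put() { --nref; }               // a real class would free itself at zero
};

// Hypothetical wrapper in the spirit of MForward.
struct Forward {
  RefCounted *msg;

  // Take our own reference instead of borrowing the caller's.
  explicit Forward(RefCounted *m) : msg(m->get()) {}

  // Hand our reference to the caller, who becomes responsible for put().
  RefCounted *claim() {
    RefCounted *m = msg;
    msg = nullptr;
    return m;
  }

  // Drop only the reference we took, and only if nobody claimed it.
  ~Forward() {
    if (msg)
      msg->put();
  }
};
```

With this shape, an unclaimed message ends up with the same count it started
with, so other holders of the message are not surprised by a missing
reference.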
When we introduced the MonOpRequest in the monitor and moved pretty much
every single function receiving messages in their arguments to take op
requests, we basically lost the type safety that was guaranteed from
Monitor::dispatch().
This patch adds an op_type field to the op request, as an easy fix for
this now lacking safety.
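A minimal sketch of a type tag on an op request, assuming a small set of
categories. The enum values and struct below are hypothetical; the message
above only states that an op_type field was added.

```cpp
// Hypothetical op categories.
enum class OpType { None, Service, Monitor, Session };

// Hypothetical op request carrying the tag.
struct OpRequestSketch {
  OpType op_type = OpType::None;
  bool is_type_service() const { return op_type == OpType::Service; }
};

// A handler meant for service ops can refuse anything else up front,
// recovering some of the checking that dispatch used to provide.
inline bool handle_service_op(const OpRequestSketch &op) {
  if (!op.is_type_service())
    return false;
  return true;
}
```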
mon: Monitor: drop PaxosServiceMessage reply functions
The services are now fully using MonOpRequest and should stay that way.
Drop PaxosService-specific reply functions as we want nothing to do with
them :)
mon: MonOpRequest: send_reply() belongs in the Monitor class
Op Requests should have no business replying to messages. Besides,
given the Monitor is currently the place to do this, because it is the
one with access to all things that may be required to validate state
(e.g., quorum features), permanently moving this code to the Monitor
class also avoids having duplicate/very similar code in two distinct
places.
mon: PaxosService: use wait_for_.*_ctx() in absence of an op
The vast majority of cases use PaxosService's wait_for_{state}()
functions to wait on given {state} before waking up a given op-related
callback. E.g., to reply to a command once a proposal finishes.
However, there are a few cases[1] in which the callback waiting for the
state change does not map to an op.
To maintain compatibility, we were keeping the functions just taking a
callback and no op with the same name as those taking ops (because c++
is amazing that way), but we realized that developers could keep on
using these functions just as before, disregarding the fact that they
likely want to use the version taking the op. As such, this patch
changes the name of the function taking only the callback, such that it
is used solely when the developer really wants to take just the
callback.
[1] At the time of this patch, only three calls were being made that would
use only a callback, out of over one hundred calls using ops.
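A rough illustration of the renaming, assuming a service with one wait
queue. The types are hypothetical, and wait_for_finished_proposal_ctx is an
assumed expansion of the wait_for_.*_ctx() pattern in the title.

```cpp
#include <functional>
#include <utility>
#include <vector>

struct Op { bool marked = false; };    // hypothetical op request
using Callback = std::function<void()>;

struct ServiceSketch {
  std::vector<Callback> waiting;

  // Op-taking variant: marks the op, then queues the callback.
  void wait_for_finished_proposal(Op &op, Callback cb) {
    op.marked = true;
    waiting.push_back(std::move(cb));
  }

  // Callback-only variant under a distinct name, so it cannot be selected
  // by accident through overload resolution.
  void wait_for_finished_proposal_ctx(Callback cb) {
    waiting.push_back(std::move(cb));
  }

  // Wake everything that was queued.
  void finish() {
    std::vector<Callback> ready;
    ready.swap(waiting);
    for (auto &cb : ready)
      cb();
  }
};
```

Because the two functions no longer share a name, forgetting to pass the op
becomes a compile error at the call site rather than a silent fallback to
the callback-only version.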
mon: PaxosService: have wait_for_* functions requiring an op
Basically, so we can mark the op accordingly; we'll leave context-only
functions to maintain compatibility with other users of these functions
that do not use them for op-related callbacks.
mon: PGMonitor: implement C_MonOp on op-related callback contexts
These contexts deal with MonOpRequests, and we need to track their life
cycle; use C_MonOp to mark events when the callbacks are woken up for
some reason.
mon: OSDMonitor: implement C_MonOp on op-related callback contexts
These contexts deal with MonOpRequests, and we need to track their life
cycle; use C_MonOp to mark events when the callbacks are woken up for
some reason.
mon: LogMonitor: implement C_MonOp on op-related callback contexts
These contexts deal with MonOpRequests, and we need to track their life
cycle; use C_MonOp to mark events when the callbacks are woken up for
some reason.
mon: PaxosService: implement C_MonOp on op-related callback contexts
These contexts deal with MonOpRequests, and we need to track their life
cycle; use C_MonOp to mark events when the callbacks are woken up for
some reason.
mon: Monitor: implement C_MonOp on op-related callback contexts
These contexts deal with MonOpRequests, and we need to track their life
cycle; use C_MonOp to mark events when the callbacks are woken up for some
reason.
Client: check dir is still complete after dropping locks in _readdir_cache_cb
We drop the lock when invoking the callback, which means the directory
we're looking at might get dentries trimmed out of memory. Make sure that
hasn't happened after we get the lock back. If it *has* happened, fall back
to requesting the directory contents from the MDS. Update the dirp location
pointers after each entry to facilitate this.
Because this requires that we update the dirp->at_cache_name value on every
loop, we rework the updating scheme a bit: we dereference dn->name before
unlocking, so we know it's filled in; and since we update it on every loop
we don't need to refer to the previous dentry explicitly like we did before.
This should also handle racing file deletes: we get back a trace on
the removed dentry and that will clear the COMPLETE|ORDERED flags.
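The loop described above can be sketched roughly as follows. DirSketch and
readdir_cache() are hypothetical and far simpler than the real client
structures; the `complete` flag stands in for the COMPLETE|ORDERED state,
and `last_name` for the dirp location that is updated after each entry.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

struct DirSketch {
  std::mutex lock;
  bool complete = true;                // cleared when dentries are trimmed
  std::vector<std::string> names;      // cached, ordered entries
};

// Returns true if the whole cache was walked. A false return means the
// caller should fall back to the MDS, resuming after `last_name`.
inline bool readdir_cache(DirSketch &dir,
                          const std::function<void(const std::string &)> &cb,
                          std::string &last_name) {
  std::unique_lock<std::mutex> l(dir.lock);
  for (std::size_t i = 0; i < dir.names.size(); ++i) {
    std::string name = dir.names[i];   // copy the name before unlocking
    l.unlock();
    cb(name);                          // the callback runs without the lock
    l.lock();
    last_name = name;                  // record progress on every loop
    if (!dir.complete)                 // the cache changed underneath us
      return false;
  }
  return true;
}
```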
* an erasure code plugin (or another part of the code) creates a
ruleset
* the ruleset crashes during mapping (for whatever reason)
* ceph osd pool create uses the buggy ruleset
* the monitors try to do mapping and crash
Having a buggy ruleset in the crush map is very difficult to prevent. The
catastrophic event of using it with a newly created pool can however be
prevented by calling the CrushTester just before creating the pool and
after all implicit or explicit crush ruleset creation happened.
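A minimal sketch of the "test the ruleset before using it" idea. Rule,
rule_is_safe() and create_pool() are hypothetical names, and a thrown
exception stands in for the crash: a real crash cannot be caught with
try/catch, and how CrushTester isolates it is not described above.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical: a rule maps an input x to a device id and may fail.
using Rule = std::function<int(int)>;

// Dry-run the rule over a range of inputs; any failure means "do not use".
inline bool rule_is_safe(const Rule &rule, int samples) {
  try {
    for (int x = 0; x < samples; ++x)
      rule(x);
  } catch (...) {
    return false;
  }
  return true;
}

// Pool creation refuses a rule that does not survive the dry run.
inline bool create_pool(const Rule &rule) {
  if (!rule_is_safe(rule, 1024))
    return false;
  return true;
}
```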
mon: MonOpRequest: have the monitor dealing with operations
Deal with op requests throughout the monitor state machine, instead of
Messages. These op requests implement TrackedOp, which will be
trackable by the monitor via an OpTracker. This will allow us to follow
the operation's life cycle, for the duration of any given operation.
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
mon: PaxosService: call post_refresh() instead of post_paxos_update()
Whenever the monitor finishes committing a proposal, we call
Monitor::refresh_from_paxos() to nudge the services to refresh. Once
all services have refreshed, we would then call each service's
post_paxos_update().
However, due to an unfortunate, non-critical bug, some services (mainly
the LogMonitor) could have messages pending in their
'waiting_for_finished_proposal' callback queue [1], and we need to nudge
those callbacks.
This patch adds a new step during the refresh phase: instead of directly
calling the service's post_paxos_update(), we introduce a
PaxosService::post_refresh() which will call the service's
post_paxos_update() function first and then nudge those callbacks when
appropriate.
[1] - Given the monitor will send MLog messages to itself, and given the
service is not readable before its initial state is proposed and
committed, some of the initial MLog's would be stuck waiting for the
proposal to finish. However, by design, we only nudge those messages'
callbacks when an election finishes or, if the leader, when the proposal
finishes. On peons, however, we would only nudge those callbacks if an
election happened to be triggered, hence the need for an alternate path
to retry any message waiting for the initial proposal to finish.
Fixes: #11470
Signed-off-by: Joao Eduardo Luis <joao@suse.de>
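A rough sketch of the new refresh step. PaxosServiceSketch is hypothetical;
in particular this version always wakes the queued callbacks, whereas the
message above says they are nudged "when appropriate" without spelling out
the condition.

```cpp
#include <functional>
#include <vector>

struct PaxosServiceSketch {
  std::vector<std::function<void()>> waiting_for_finished_proposal;
  int updates = 0;

  // Stand-in for the service-specific hook.
  void post_paxos_update() { ++updates; }

  // Run the usual hook first, then wake whatever was left waiting.
  void post_refresh() {
    post_paxos_update();
    std::vector<std::function<void()>> ready;
    ready.swap(waiting_for_finished_proposal);
    for (auto &cb : ready)
      cb();
  }
};
```

Swapping the queue out before running it means a callback that re-queues
itself waits for the next refresh instead of looping within the current one.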