common: LogClient: allow specifying facility for LogClient
Instead of allowing only one LogClient, we will now allow any daemon to
have any number of LogClients. They may either all log to the default
facility and level, or they may see their facility and level specified
upon creation (via a new constructor).
This patch also changes 'handle_log_ack' so that the LogClient handles
all acks bearing the LogClient's facility and simply ignores the rest.
The function returns true if the message has been handled and false
otherwise. It is the caller's responsibility to decide whether the
message is passed on to other LogClients, and when it is to be
released.
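A minimal sketch of the ack dispatch described above. The types SketchAck and SketchLogClient and the helper dispatch_ack are hypothetical stand-ins, not the real MLogAck/LogClient interfaces; the sketch only illustrates the "return true if handled" contract.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an ack message carrying a facility name.
struct SketchAck {
  std::string facility;   // facility the acked entries were sent with
  unsigned long last;     // last sequence number being acked
};

// Hypothetical stand-in for a LogClient bound to one facility.
class SketchLogClient {
public:
  explicit SketchLogClient(std::string facility)
    : facility_(std::move(facility)) {}

  // Returns true if the ack bears this client's facility and was handled,
  // false if it was ignored. The message is never released here.
  bool handle_log_ack(const SketchAck& ack) {
    if (ack.facility != facility_)
      return false;
    last_acked_ = ack.last;
    return true;
  }

  unsigned long last_acked() const { return last_acked_; }

private:
  std::string facility_;
  unsigned long last_acked_ = 0;
};

// One possible caller policy (an assumption): stop at the first client
// that handles the ack. The caller keeps ownership of the message.
inline bool dispatch_ack(std::vector<SketchLogClient>& clients,
                         const SketchAck& ack) {
  for (auto& c : clients)
    if (c.handle_log_ack(ack))
      return true;
  return false;
}
```

A caller could equally keep offering the ack to the remaining clients; the return value only reports whether a given client consumed it.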
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
common: str_map: have 'get_str_map' only handling plain-text
'get_str_map()' used to handle both JSON and plain-text. In fact it
would first try parsing the map as JSON and then fall back to
plain-text if that failed. Although useful, this posed a big issue
when we attempted to parse values that we knew to be plain-text but
that had some meaning in JSON -- e.g., 'false' or 'true'. In such a case
the JSON parser would spit out an error, stating it had been able to
parse the JSON but didn't expect the type, which in fairness is
acceptable.
In its stead we now have two functions: 'get_str_map()' will only handle
plain-text, whereas 'get_json_str_map()' will keep the previous
behavior, attempting to parse a JSON string and falling back to
'get_str_map()' should it fail.
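To illustrate why a plain-text-only path helps, here is a rough sketch of a plain-text parser. The function name parse_plain_str_map, the 'key=value' token format and the delimiter set are assumptions made for the example, not taken from this patch. A bare token such as 'false' simply becomes a key with an empty value instead of being interpreted as a JSON literal.

```cpp
#include <map>
#include <string>

// Sketch only: split the input on delimiters and treat each token as
// either "key=value" or a bare "key" with an empty value.
inline std::map<std::string, std::string>
parse_plain_str_map(const std::string& in,
                    const std::string& delims = ",;\t\n ") {
  std::map<std::string, std::string> out;
  std::string::size_type pos = 0;
  while ((pos = in.find_first_not_of(delims, pos)) != std::string::npos) {
    const std::string::size_type end = in.find_first_of(delims, pos);
    const std::string tok = in.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);
    const std::string::size_type eq = tok.find('=');
    if (eq == std::string::npos)
      out[tok] = "";                              // bare key
    else
      out[tok.substr(0, eq)] = tok.substr(eq + 1);
    pos = end;                                    // npos ends the loop
  }
  return out;
}
```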
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
common: str_map: add helper methods to get values from maps
Both methods obtain values for keys from a given map. The main distinction
is that one method will return a default value if key is not present and
if the default value is specified (i.e., not NULL), while the other
method will return the value of a fallback key if the key is not
present.
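The two lookups could look roughly like the sketch below. The names get_or_default and get_or_fallback are hypothetical, and returning an empty string when nothing matches is an assumption; the patch text does not say what happens in that case.

```cpp
#include <map>
#include <string>

using StrMap = std::map<std::string, std::string>;

// Returns the value for 'key'; if the key is absent and a default was
// supplied (non-null), returns the default. Otherwise returns "" (assumed).
inline std::string get_or_default(const StrMap& m, const std::string& key,
                                  const std::string* def = nullptr) {
  auto it = m.find(key);
  if (it != m.end())
    return it->second;
  return def ? *def : std::string();
}

// Returns the value for 'key'; if the key is absent, returns the value
// stored under 'fallback_key'. Returns "" if both are absent (assumed).
inline std::string get_or_fallback(const StrMap& m, const std::string& key,
                                   const std::string& fallback_key) {
  auto it = m.find(key);
  if (it != m.end())
    return it->second;
  it = m.find(fallback_key);
  return it != m.end() ? it->second : std::string();
}
```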
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
We now introduce the concept of 'channel', analogous to syslog
facilities, for log entries. This will, shortly, allow a LogClient
to send messages to more than just the default syslog facility and log
file, also allowing multiple LogClients and having a way to associate
a LogEntry with its rightful owner.
We also add the same field to the MLogAck message so that a LogClient
waiting on a given ack is able to recognize the MLogAck as its own.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
common: config: let us obtain a diff between current and default config
It's mildly annoying when trying to figure out what has been changed on
a running system's config options and having to rely on whatever is set
on ceph.conf and the admin's memory of what has been injected.
With this we can simply ask the daemon for the diff between what would be
its default and what is its current config.
The current form will output extraneous information that was not directly
supplied by the user though, such as 'host', 'fsid' and 'daemonize', as
well as defaults we may rewrite ourselves (leveldb tunables on the monitor
for instance). Nonetheless, it's way better than the alternative and
considering it should be used solely for debug purposes I think we can
get away with it.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Sage Weil [Tue, 26 Aug 2014 15:16:29 +0000 (08:16 -0700)]
osd/OSDMap: encode blacklist in deterministic order
When we use an unordered_map the encoding order is non-deterministic,
which is problematic for OSDMap. Construct an ordered map<> on encode
and use that. This lets us keep the hash table for lookups in the general
case.
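A small sketch of the idea, using a hypothetical encode_sorted helper and a toy text encoding rather than the real OSDMap encoder: the hash table is copied into an ordered map only at encode time, so the output no longer depends on hash iteration order.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <unordered_map>

// Sketch: keep the unordered_map for lookups, but encode through a
// temporary std::map so entries are always emitted in key order.
inline std::string
encode_sorted(const std::unordered_map<std::string, long>& blacklist) {
  const std::map<std::string, long> ordered(blacklist.begin(),
                                            blacklist.end());
  std::ostringstream out;
  for (const auto& kv : ordered)
    out << kv.first << '=' << kv.second << ';';
  return out.str();
}
```

The copy costs O(n log n) per encode, which is the price paid for keeping constant-time lookups everywhere else.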
Fixes: #9211
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
Loic Dachary [Mon, 25 Aug 2014 15:05:04 +0000 (17:05 +0200)]
common: ROUND_UP_TO accepts any rounding factor
The ROUND_UP_TO function was limited to rounding factors that are powers
of two. This saves a modulo but it is not used where it would make a
difference. The implementation is changed so it is generic.
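For illustration, the two approaches can be sketched as functions (the names round_up_to and round_up_to_pow2 are hypothetical; the real ROUND_UP_TO may be written differently). The mask-based form is only correct when d is a power of two, while the modulo-based form works for any non-zero d.

```cpp
#include <cassert>
#include <cstdint>

// Generic version: works for any non-zero rounding factor d.
inline std::uint64_t round_up_to(std::uint64_t n, std::uint64_t d) {
  assert(d > 0);
  const std::uint64_t rem = n % d;
  return rem ? n + d - rem : n;
}

// Mask-based version: avoids the modulo, but is only valid when d is
// a power of two.
inline std::uint64_t round_up_to_pow2(std::uint64_t n, std::uint64_t d) {
  return (n + d - 1) & ~(d - 1);
}
```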
We need to identify whether an object is just composed of a head, or
also has a tail. Test for pre-firefly objects ("explicit objs") was
broken as it was just looking at the number of explicit objs in the
manifest. However, this is insufficient, as we might have empty head,
and in this case it wouldn't appear, so we need to check whether the
sole object is actually pointing at the head.
Somnath Roy [Mon, 18 Aug 2014 23:59:36 +0000 (16:59 -0700)]
CollectionIndex: Collection name is added to the access_lock name
The CollectionIndex constructor is changed to accept the coll_t
so that the collection name can be used to form the access_lock (RWLock)
name. This is needed because otherwise lockdep will report a recursive
lock error and assert. lockdep needs unique lock names for each Index object.
Sage Weil [Mon, 25 Aug 2014 04:18:00 +0000 (21:18 -0700)]
msg/Accepter: do not unlearn_addr on bind()
It is dangerous to set need_addr = true as it means someone may set the
addr to something else (specifically the port) in a racing thread.
However, it is not necessary: the only reason we added it way back in 5d5045d31a9e10d21b44eb1bd137db9ae53128ff was so that
local_connection->peer_addr would get updated, and bind() now calls that
unconditionally.
Fixes: #9079
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
John Spray [Mon, 25 Aug 2014 00:45:22 +0000 (01:45 +0100)]
osd: update handle_osd_map call
I had changed the implementation in Objecter
to avoid a spurious get/put cycle in "osdc/Objecter: fix resource
management", but this guy was still doing a get() before
calling handle_osd_map.
John Spray [Mon, 25 Aug 2014 00:16:39 +0000 (01:16 +0100)]
osdc/Objecter: fix op_cancel on homeless session
Wrote this block without realizing that op_cancel
takes write lock on session lock, and that operation
is undefined when you already hold the read lock.
Fixes: #9214
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray [Sun, 24 Aug 2014 22:48:57 +0000 (23:48 +0100)]
osdc/Objecter: hold session ref longer in resend
This is mostly cosmetic: in fact we are getting an extra
ref in _map_session and holding the session lock, so
it's safe, but it's awkward to be giving up the ref on
a session and then continuing to refer to it.
John Spray [Fri, 22 Aug 2014 12:37:46 +0000 (13:37 +0100)]
osdc/Objecter: disable lockdep for double lock
There is a special case in _recalc_linger_op_target
where we lock two sessions at once to transfer an op
between them. It is deadlock safe because it's the only
place we lock two at once, and we hold rwlock for write
while we do it.
John Spray [Fri, 15 Aug 2014 00:26:20 +0000 (01:26 +0100)]
osdc/Objecter: fix resource management
The refactor introduced various reference leaks, and
lacked cleanup in shutdown.
Things done here:
* Reinstate _recalc_linger_op_target, which was accidentally
disabled and led to freezes in notify() (#9112)
* Make reference counting on OSDSessions much more explicit, using
put_session and get_session everywhere
* Add assertions in ~OSDSession and ~Objecter that the various
maps of operations have been emptied.
* Reassign ops away from closing session to homeless session in
close_session()
* Delete/deref all the ops from the objecter-wide maps of operations
in shutdown()
John Spray [Fri, 15 Aug 2014 00:28:28 +0000 (01:28 +0100)]
librados: separate ::notify return values
There is a return code from objecter for committing
the notify linger op, and then later a code in the
CEPH_MSG_WATCH_NOTIFY handled by RadosClient directly.
Afaict there isn't any nice ordering guarantee here,
so they could stamp on each other. Use a SaferCond
for the submit one.
I don't think this was related to #9112 but while
I'm here...
Get rid of a level of intermediate classes with confusing names and put
the notify and notify finish logic in a single place so that it is easier
to follow and understand.
Pass the return value from the notify completion message to the caller.
Sage Weil [Mon, 11 Aug 2014 00:52:18 +0000 (17:52 -0700)]
osd: include ETIMEDOUT in notify reply on timeout
If a notify operation times out (not all watchers ACK in time), include
an ETIMEDOUT in the final error message back to the client, so that they
know about it.
John Spray [Thu, 14 Aug 2014 13:39:10 +0000 (14:39 +0100)]
librados: avoid unnecessary locks
Revise wait_for_osdmap to be called outside of RadosClient::lock
and only take the lock if it has to wait for a map.
Also, now that objecter handles its own locking nicely,
there are various places where it is no longer necessary
for RadosClient to take its own lock -- all the calls that
go directly into objecter (RadosClient::pool_*) don't need
to hold RadosClient::lock.
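The "only take the lock if it has to wait" pattern can be sketched as follows. SketchMapWaiter and its members are hypothetical and use standard C++ primitives rather than the Ceph Mutex/Cond types; the real wait_for_osdmap consults the Objecter for the map state.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Sketch: callers that already have a map return without touching the
// lock; only callers that must wait take it.
class SketchMapWaiter {
public:
  void wait_for_map() {
    if (epoch_.load() != 0)
      return;                                   // fast path, no lock taken
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [this] { return epoch_.load() != 0; });
  }

  // Called when a map arrives; wakes any waiters.
  void set_epoch(unsigned e) {
    {
      std::lock_guard<std::mutex> l(lock_);
      epoch_.store(e);
    }
    cond_.notify_all();
  }

  unsigned current_epoch() const { return epoch_.load(); }

private:
  std::atomic<unsigned> epoch_{0};              // 0 means "no map yet"
  std::mutex lock_;
  std::condition_variable cond_;
};
```

Storing the epoch while holding the lock ensures a waiter cannot miss the wakeup between checking the predicate and going to sleep.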
John Spray [Thu, 14 Aug 2014 10:56:07 +0000 (11:56 +0100)]
librados: fix race on osdmap initialization
This would cause occasional failures where calls
to lookup_pool immediately after connect() would
fail to find any pool because the OSD map had not
yet been loaded. The wait for the map was lost when
the pool name cache was lost in ce176b827.
To avoid similar issues, the pool_requires_alignment
and pool_required_alignment helpers need the same
wait_for_osdmap before proceeding. Usually callers
would call lookup_pool before these guys but it's
not guaranteed.
John Spray [Wed, 13 Aug 2014 01:19:22 +0000 (02:19 +0100)]
librados: update Objecter shutdown
Previously checking for CONNECTED was equivalent to
checking the objecter had been initialized, but since
the separation between init() and start() that is
no longer the case. Avoid the need to be smart by
just reading Objecter::initialized to learn whether
to call Objecter::shutdown.
Fixes: #9067
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray [Tue, 12 Aug 2014 16:47:01 +0000 (17:47 +0100)]
tools: update for Journaler/Objecter interfaces
Journaler now requires a Finisher: construct one in
MDSUtility.
Objecter now requires separate calls to init() and start(),
do that in MDSUtility and also take advantage of Objecter's
new ability to act as its own dispatcher.
John Spray [Fri, 8 Aug 2014 00:49:26 +0000 (01:49 +0100)]
osdc: Add lock to Filer::Probe
This is necessary now that Objecter can call back
from multiple OSD op completions in parallel: otherwise
we get multiple threads trying to update
the same Probe object.
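A minimal sketch of the kind of protection meant here. SketchProbe and its members are hypothetical and use std::mutex rather than the Ceph Mutex type; every completion callback takes the per-probe lock before touching shared state.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Sketch: state shared by several in-flight ops of one probe.
struct SketchProbe {
  std::mutex lock;                                    // guards fields below
  std::map<std::uint64_t, std::uint64_t> known_size;  // object -> size
  unsigned ops_in_flight = 0;

  // Called from each op completion, possibly from different threads.
  // Returns true when the last outstanding op has reported in.
  bool note_result(std::uint64_t objno, std::uint64_t size) {
    std::lock_guard<std::mutex> l(lock);
    assert(ops_in_flight > 0);
    known_size[objno] = size;
    return --ops_in_flight == 0;
  }
};
```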
John Spray [Thu, 7 Aug 2014 14:56:40 +0000 (15:56 +0100)]
mds: convert IO contexts
As of this change, the only thing in the MDS inheriting
directly from Context is MDSContext.
The only files touching mds_lock explicitly are MDS, MDLog and
MDSContext -- everyone else should be getting their locking behaviour
via the contexts. (one minor exception made for an assertion in
Locker).
John Spray [Thu, 7 Aug 2014 14:52:58 +0000 (15:52 +0100)]
osdc/Journaler: use finisher for public callbacks
This is needed because of occasional lock cycles with
external callers doing e.g. write_head.
We do get some weird-looking multiply-nested
C_OnFinisher(C_OnFinisher(...)) from this approach,
where one finisher exists to protect journaler from
lock cycles wrt objecter, and the other exists
to protect the MDS from lock cycles wrt journaler.