Sage Weil [Fri, 4 Mar 2011 21:59:24 +0000 (13:59 -0800)]
osd: include all up peers in might_have_unfound when desperate
If our might_have_unfound calculation was off (it currently can be, see
#865) we could prematurely give up. Try any up OSD at this stage just to
be sure.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
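The fallback described in this commit can be pictured roughly as follows. This is a minimal sketch, not the OSD code: `build_might_have_unfound`, its parameters, and the use of plain `int` OSD ids are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical sketch: OSD ids are plain ints. The caller supplies the set
// computed by the normal calculation plus the set of currently-up OSDs.
std::set<int> build_might_have_unfound(const std::set<int>& computed,
                                       const std::set<int>& up_osds,
                                       int self,
                                       bool desperate) {
  std::set<int> out = computed;
  if (desperate) {
    // About to give up: widen the search to every up peer, in case the
    // computed set missed someone.
    out.insert(up_osds.begin(), up_osds.end());
  }
  out.erase(self);  // never query ourselves
  return out;
}
```

With `desperate` set to false the computed set is returned unchanged (minus the local OSD), so the wider search only costs extra queries in the last-resort case.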
Sage Weil [Fri, 4 Mar 2011 17:39:59 +0000 (09:39 -0800)]
osd: recover_primary if recover_replicas starts no ops
recover_replicas may fail to start anything if we see an unexpected error.
In that case, try recover_primary immediately instead of waiting for the
PG to (hopefully) get requeued for recovery later.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
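The control flow of this fallback could be sketched as below. The function name `start_recovery` and the callback signatures are assumptions for illustration; the real methods live on the PG and take different arguments.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch: each callback returns the number of recovery ops
// it managed to start.
int start_recovery(const std::function<int()>& recover_replicas,
                   const std::function<int()>& recover_primary) {
  int started = recover_replicas();
  if (started == 0) {
    // Nothing was started (for example after an unexpected error):
    // try the primary right away instead of waiting for a requeue.
    started = recover_primary();
  }
  return started;
}
```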
Sage Weil [Fri, 4 Mar 2011 17:38:47 +0000 (09:38 -0800)]
osd: discover more missing if unfound and do_recovery can't start anything
If we couldn't start any recovery ops and things are still
unfound, see if we can discover more missing object locations.
It may be that our initial locations were bad and we errored
out while trying to pull.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
IoCtx::from_rados_ioctx_t creates an IoCtx out of a rados_ioctx_t.
However, this IoCtx must share ownership of the IoCtxImpl pointer with
the C API user who first called rados_ioctx_create. This must be done
via a reference count inside the IoCtxImpl.
Also add a copy constructor and assignment operator to class IoCtx,
since it's now cheap to have them.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
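The shared-ownership idea can be illustrated with a small intrusive reference count. This is a sketch under stated assumptions: `Impl` and `Handle` are hypothetical stand-ins for IoCtxImpl and IoCtx, and the counter is not thread-safe (a real implementation would presumably use an atomic).

```cpp
#include <cassert>

// Hypothetical stand-in for IoCtxImpl: carries an intrusive reference count.
struct Impl {
  int ref = 1;                        // the creator holds the first reference
  void get() { ++ref; }
  bool put() { return --ref == 0; }   // true when the last reference is dropped
};

// Hypothetical stand-in for IoCtx: a handle that shares ownership of Impl.
class Handle {
 public:
  // Wrap an existing Impl and take an additional reference, so the original
  // owner (e.g. a C API user) keeps its own reference.
  explicit Handle(Impl* p) : impl_(p) { if (impl_) impl_->get(); }
  Handle(const Handle& o) : impl_(o.impl_) { if (impl_) impl_->get(); }
  Handle& operator=(const Handle& o) {
    if (o.impl_) o.impl_->get();      // acquire first: safe for self-assignment
    release();
    impl_ = o.impl_;
    return *this;
  }
  ~Handle() { release(); }
  int refs() const { return impl_ ? impl_->ref : 0; }

 private:
  void release() {
    if (impl_ && impl_->put()) delete impl_;
    impl_ = nullptr;
  }
  Impl* impl_;
};
```

Because copying a handle is just a counter increment, the copy constructor and assignment operator are cheap, which is the point the commit message makes.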
Log a version message whenever we open the dout log, not just the first
time. However, only output it to log files and syslog. Spewing versions
to stderr and stdout was determined to be annoying.
Rename dout_emergency_impl to dout_emergency_to_file_and_syslog to
better reflect its function.
Rename ceph_version_to_string to pretty_version_to_string.
Add get_process_name to do just that. Re-arrange some version.h methods.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Greg Farnum [Thu, 3 Mar 2011 02:52:51 +0000 (18:52 -0800)]
CDir: Don't write out the header on a partial commit.
If we write out the header as part of a partial commit, and then
fail to complete a subsequent commit (network error, we crash, etc)
then the on-disk version of the directory is not correctly versioned.
The fact that some dentries are of a newer version than others
is okay because we will fix it up during journal replay, but if
the header says the directory is fully committed to the end of
the journal that won't happen!
So, take advantage of how messages between two daemons are strictly
ordered, and how messages for a given PG are strictly ordered, and
simply include the partial commit that contains the new header last.
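The ordering rule can be sketched as a small planner that splits a directory's dentries into partial commits and marks only the final one as carrying the header. `PartialCommit`, `plan_commits`, and the use of an entry count as the per-commit limit are assumptions for illustration, not the CDir code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one partial commit.
struct PartialCommit {
  std::vector<std::string> dentries;
  bool includes_header = false;
};

// Split dentries into chunks of at most max_per_commit entries and attach
// the header only to the last chunk. If commits are applied in the order
// they are sent, the header reaches disk only after every dentry has.
std::vector<PartialCommit> plan_commits(const std::vector<std::string>& dentries,
                                        std::size_t max_per_commit) {
  if (max_per_commit == 0) max_per_commit = 1;  // guard against a zero limit
  std::vector<PartialCommit> out;
  for (std::size_t i = 0; i < dentries.size(); ++i) {
    if (out.empty() || out.back().dentries.size() >= max_per_commit)
      out.push_back(PartialCommit{});
    out.back().dentries.push_back(dentries[i]);
  }
  if (out.empty()) out.push_back(PartialCommit{});  // empty dir: header only
  out.back().includes_header = true;
  return out;
}
```

If a crash interrupts the sequence before the last chunk, the on-disk header still carries the old version, so journal replay fixes up the partially updated dentries.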
Greg Farnum [Thu, 3 Mar 2011 02:49:39 +0000 (18:49 -0800)]
CDir: pay attention to the max_dir_commit_size!
Somehow it seems to have been ignoring this previously, which
doesn't make any sense at all since otherwise our tests on it
wouldn't have worked. Perhaps there was a merging error somewhere?
Sage Weil [Thu, 3 Mar 2011 00:13:54 +0000 (16:13 -0800)]
mds: rip out rename linkmerge support
It turns out POSIX says rename(a,b) is a no-op when a and b link to the
same inode. This is super weird but good news because it means we can
rip out a bunch of poorly tested code.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
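The POSIX behaviour can be checked on a local filesystem with a short program. This is only a sketch: the helper name and the temporary path under /tmp are assumptions, and it needs a filesystem that supports hard links.

```cpp
#include <cassert>
#include <cstdio>      // std::rename
#include <stdlib.h>    // mkstemp
#include <string>
#include <sys/stat.h>  // stat
#include <unistd.h>    // link, unlink, close

// Returns true if rename(a, b) succeeds and both names still exist and
// refer to the same inode afterwards, i.e. the rename was a no-op.
bool rename_of_hardlinks_is_noop() {
  char a[] = "/tmp/linkmerge_demo_XXXXXX";
  int fd = mkstemp(a);
  if (fd < 0) return false;
  close(fd);

  std::string b = std::string(a) + ".link";
  if (link(a, b.c_str()) != 0) {  // b becomes a second name for a's inode
    unlink(a);
    return false;
  }

  int rc = std::rename(a, b.c_str());

  struct stat sa, sb;
  bool both_exist = stat(a, &sa) == 0 && stat(b.c_str(), &sb) == 0;
  bool same_inode = both_exist && sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev;

  unlink(a);
  unlink(b.c_str());
  return rc == 0 && same_inode;
}
```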
Greg Farnum [Wed, 2 Mar 2011 22:13:38 +0000 (14:13 -0800)]
tcmalloc: switch the interface.
Previously, we used function pointers. Fun for me to learn about, icky
to actually have!
Now we use our own wrapper functions with two implementations -- one
for builds with tcmalloc and one for builds without. Make those programs which
are tcmalloc-aware build with the appropriate implementation source
at compile-time, but leave the wrapper function stubs in
no matter what.
While we're at it, implement two of the "MallocExtension" calls in
the OSD.
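The wrapper approach might look roughly like this. The names `demo_release_free_heap`, `demo_have_tcmalloc`, and the `DEMO_WITH_TCMALLOC` macro are hypothetical; in the layout the commit describes, the two implementations would sit in separate source files chosen by the build rather than behind one macro. The gperftools header path shown is the current one and is an assumption about the build environment.

```cpp
#include <cassert>

#ifdef DEMO_WITH_TCMALLOC
#include <gperftools/malloc_extension.h>

// tcmalloc-aware implementation: forward to MallocExtension.
void demo_release_free_heap() {
  MallocExtension::instance()->ReleaseFreeMemory();
}
bool demo_have_tcmalloc() { return true; }

#else

// Stub implementation: callers link against the same symbols either way.
void demo_release_free_heap() {}
bool demo_have_tcmalloc() { return false; }

#endif
```

Callers only ever see the wrapper declarations, so no function pointers are needed to decide at run time whether tcmalloc is present.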
Sage Weil [Wed, 2 Mar 2011 21:10:37 +0000 (13:10 -0800)]
msgr: fix chdir after daemonize
We don't care if the mkdir succeeds. It has dubious value anyway, though;
if you specify a unique directory for the daemon the caller may as well
create it.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Alexandre Oliva [Wed, 2 Mar 2011 21:39:09 +0000 (13:39 -0800)]
cmds/cosd: Fix IsHeapProfilerRunning implicit return type cast.
G++ complains about the difference between the return type of tcmalloc's
IsHeapProfilerRunning (int) and the return type of the function that
g_conf.profiler_running is supposed to point to (bool). We could
probably get away with a type-cast, but as a compiler developer and
former C++ language lawyer, I'd rather not take the risk of destroying
the universe by invoking undefined behavior ;-)
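The type mismatch and a safe way around it can be shown without tcmalloc. In this sketch `is_running_int` stands in for a library function returning `int`, and `profiler_running` for a slot that expects a function returning `bool`; all names are hypothetical.

```cpp
#include <cassert>

// Stand-in for a library function that reports a boolean as an int.
int is_running_int() { return 1; }

// The slot expects a pointer to a function returning bool.
using bool_fn = bool (*)();

// A thin wrapper performs a well-defined int -> bool conversion. Casting
// the function pointer itself to bool_fn and calling through it would be
// undefined behavior, because the function types differ.
bool is_running_bool() { return is_running_int() != 0; }

bool_fn profiler_running = is_running_bool;
```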
Sage Weil [Wed, 2 Mar 2011 13:51:11 +0000 (05:51 -0800)]
osd: cache map bufferlists until they are flushed to disk
Another thread may share maps with a peer. Make sure they pull bufferlists
out of our cache if this happens prior to the encoded versions being
written to disk.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
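One way to picture the cache is a lookup that checks not-yet-flushed entries before falling back to the store. `MapCache` and its methods are hypothetical, epochs are plain unsigned integers, a `std::string` stands in for a bufferlist, and locking between threads is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch of a cache for encoded maps awaiting a disk flush.
class MapCache {
 public:
  // Remember the encoded map until the write is known to be on disk.
  void queue_write(unsigned epoch, std::string encoded) {
    pending_[epoch] = std::move(encoded);
  }
  // Called once the write has been flushed: the store can now serve it.
  void flushed(unsigned epoch) {
    std::map<unsigned, std::string>::iterator it = pending_.find(epoch);
    if (it == pending_.end()) return;
    disk_[epoch] = std::move(it->second);
    pending_.erase(it);
  }
  // Readers consult the pending cache first, then the (stand-in) disk store.
  bool get(unsigned epoch, std::string* out) const {
    std::map<unsigned, std::string>::const_iterator it = pending_.find(epoch);
    if (it == pending_.end()) {
      it = disk_.find(epoch);
      if (it == disk_.end()) return false;
    }
    *out = it->second;
    return true;
  }
  std::size_t pending_count() const { return pending_.size(); }

 private:
  std::map<unsigned, std::string> pending_;  // encoded, not yet on disk
  std::map<unsigned, std::string> disk_;     // stand-in for the on-disk store
};
```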
Sage Weil [Tue, 1 Mar 2011 00:05:08 +0000 (16:05 -0800)]
osd: trigger discover_all_missing after replay delay
We were calling discover_all_missing only when we went immediately active,
not after we were in the replay state (which triggers from a timer event
that calls OSD::activate_pg()). Move the call into PG::activate() so that
we catch both callers.
This requires passing in a query_map from the caller. While we're at it,
clean up some other instances where we are defining a new query_map
deep within the call tree.
Fixes: #847 (I hope)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>