Adam Crume [Wed, 13 Aug 2014 18:42:00 +0000 (11:42 -0700)]
lttng: Remove tracing from libcommon
This is a short-term fix for issues caused by tracepoints in libcommon.
Code crashes at runtime if the same tracepoints are linked into the
program multiple times. This happens with libcommon because it is
statically linked into dynamic libraries such as librados, then
statically linked into executables because symbols from libcommon are
not exposed in librados. Therefore, any programs that use librados and
libcommon would crash because of duplicate tracepoints.
Adam Crume [Thu, 7 Aug 2014 16:05:00 +0000 (09:05 -0700)]
rbd-replay: Fix compiler warning in unit tests
Was getting:
test/test_rbd_replay.cc:44:3: warning: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Wconversion-null]
Fixed by changing EXPECT_EQ(false, xxx) to EXPECT_FALSE(xxx).
For completeness, also changed EXPECT_EQ(true, xxx) to EXPECT_TRUE(xxx).
Adam Crume [Thu, 31 Jul 2014 23:22:44 +0000 (16:22 -0700)]
rbd-replay: Support replaying partial traces
Tracing may start after the application is started, and image open calls
may be missed. To support replaying these traces, additional information is
traced, allowing missing open calls to be generated.
Adam Crume [Mon, 28 Jul 2014 23:32:15 +0000 (16:32 -0700)]
lttng: Preload liblttng-ust-fork.so in TESTS_ENVIRONMENT
This adds LD_PRELOAD=liblttng-ust-fork.so to TESTS_ENVIRONMENT.
This prevents lttng from complaining when processes are forked.
The complaints otherwise taint the output and cause tests to fail.
Adam Crume [Thu, 17 Jul 2014 22:01:42 +0000 (15:01 -0700)]
rbd-replay: Switch logging from cout to dout
To enable logs, we also have to use global_init to parse our
command-line args, so we now have other standard Ceph goodies
such as picking up config options from the environment.
This adds objectstore tracepoints for the filestore. It'd be nice to add
these to the objectstore interface somehow so we can get all
implementations for free, but that might be a bit difficult,
especially since each impl will apply transactions in a different way.
Sage Weil [Wed, 13 Aug 2014 17:34:53 +0000 (10:34 -0700)]
osd/ReplicatedPG: only do agent mode calculations for positive values
After a split we can get negative values here. Only do the arithmetic if
we have a valid (positive) value that won't throw the floating point
unit for a loop.
Fixes: #9082
Tested-by: Karan Singh <karan.singh@csc.fi>
Signed-off-by: Sage Weil <sage@redhat.com>
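The guard described above can be sketched roughly as follows. This is an illustration only, not the ReplicatedPG code: the function name full_micro and its parameters are hypothetical, and the parts-per-million scaling is an assumption made for the example.

```cpp
#include <cstdint>

// Hypothetical helper (not from the Ceph tree): returns fullness in
// parts-per-million, or 0 when the inputs are unusable, e.g. a stat that
// went negative after a split.
uint64_t full_micro(int64_t num_bytes, int64_t target_max_bytes) {
  if (num_bytes <= 0 || target_max_bytes <= 0) {
    return 0;  // skip the floating point arithmetic entirely
  }
  double ratio = static_cast<double>(num_bytes) /
                 static_cast<double>(target_max_bytes);
  return static_cast<uint64_t>(ratio * 1000000.0);
}
```

Checking the sign before dividing keeps negative counters from being converted to an unsigned ratio.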
Sage Weil [Wed, 13 Aug 2014 15:30:25 +0000 (08:30 -0700)]
osd: fix require_same_peer_instance from fast_dispatch
The mark-down of old peers needs to take the session_dispatch_lock in order
to safely clear the Session ref cycle. However, for fast dispatch callers,
that lock is already held. Pass a flag down from the callers indicating
whether we need to take the additional lock.
Fixes: #9096
Signed-off-by: Sage Weil <sage@redhat.com>
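A minimal sketch of the flag-passing pattern described above, using std::mutex. Only the name session_dispatch_lock comes from the commit message; SessionTableSketch, mark_down_old_peer, clear_session_locked, and the is_fast_dispatch parameter are hypothetical stand-ins, not the OSD's actual interface.

```cpp
#include <mutex>

// Hypothetical stand-in for the state guarded by session_dispatch_lock.
struct SessionTableSketch {
  std::mutex session_dispatch_lock;
  int cleared = 0;

  // Must be called with session_dispatch_lock held.
  void clear_session_locked() { ++cleared; }

  // is_fast_dispatch tells us whether the caller already holds the lock.
  void mark_down_old_peer(bool is_fast_dispatch) {
    if (is_fast_dispatch) {
      // Fast dispatch path: lock is already held, so do not take it again.
      clear_session_locked();
    } else {
      std::lock_guard<std::mutex> l(session_dispatch_lock);
      clear_session_locked();
    }
  }
};
```

Taking a non-recursive mutex twice on the same thread is undefined behavior, which is why the caller has to say whether it already holds the lock.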
Samuel Just [Tue, 12 Aug 2014 19:20:28 +0000 (12:20 -0700)]
ReplicatedPG: do not pass cop into C_Copyfrom
We do not know when the objecter will finally let go of this Context. Thus, we
cannot know whether it will happen before the flush, at which point the
object_context held by the cop must have been released.
Also, we simply don't need it: process_copy_chunk already works in terms of the
tid!
Fixes: #8894
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
Josh Durgin [Mon, 11 Aug 2014 23:41:26 +0000 (16:41 -0700)]
librbd: fix error path cleanup for opening an image
If the image doesn't exist and caching is enabled, the ObjectCacher
was not being shut down, and the ImageCtx was leaked. The IoCtx could
later be closed while the ObjectCacher was still running, resulting in
a segfault. Simply use the usual cleanup path in open_image(), which
works fine here.
Sage Weil [Mon, 11 Aug 2014 03:22:23 +0000 (20:22 -0700)]
msg/Pipe: do not wait for self in Pipe::stop_and_wait()
The fast dispatch code necessitated adding a wait for the fast dispatch
to complete when taking over sockets back in commit 2d5d3097c3998add1061ce253104154d72879237. This included mark_down()
(although I am not certain mark_down was required to fix the previous set
of races).
In any case, if the fast dispatch thread itself tries to mark down its
own connection, it will deadlock in this method waiting for itself to
return and clear reader_dispatching. Skip this wait if we are in fact
the reader thread. This avoids the deadlock.
Alternatively, we could change mark_down() to not use stop_and_wait(), but
I am less clear about the potential races there, so I'm opting for the
minimal (though ugly) fix.
Fixes: #9057
Signed-off-by: Sage Weil <sage@redhat.com>
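The self-wait check can be sketched as below. This is a simplified illustration under stated assumptions, not the msg/Pipe code: PipeSketch, reader_thread_id, and the boolean return value are hypothetical, and only reader_dispatching and stop_and_wait are names taken from the commit message.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, simplified stand-in for a Pipe.
struct PipeSketch {
  std::mutex lock;
  std::condition_variable cond;
  bool reader_dispatching = false;
  std::thread::id reader_thread_id;  // assumed to be set by the reader thread

  // Returns false if the wait was skipped because the caller is the reader.
  bool stop_and_wait() {
    std::unique_lock<std::mutex> l(lock);
    if (reader_dispatching &&
        std::this_thread::get_id() == reader_thread_id) {
      // We are the thread that would clear reader_dispatching; waiting
      // for it here would never return.
      return false;
    }
    cond.wait(l, [this] { return !reader_dispatching; });
    return true;
  }
};
```

Any other thread still blocks until the reader clears reader_dispatching and notifies the condition variable.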
Sage Weil [Sun, 10 Aug 2014 19:15:38 +0000 (12:15 -0700)]
ceph_test_rados_api: fix cleanup of cache pool
We can't simply try to delete everything in there because some items may
be whiteouts. Instead, flush+evict everything, then remove overlay, and
*then* delete what remains.
Fixes: #9055
Signed-off-by: Sage Weil <sage@redhat.com>
OSD: introduce require_up_osd_peer() function for gating replica ops
This checks both that a Message originates from an OSD, and that the OSD
is up in the given map epoch.
We use it in handle_replica_op so that we don't inadvertently add operations
from down peers, who might or might not know it.
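The two checks described above can be sketched like this. The types and the function name require_up_osd_peer_sketch are hypothetical simplifications; the real function operates on Ceph's own message and OSDMap types, whose signatures are not shown in this log.

```cpp
#include <set>

// Hypothetical, simplified stand-ins for Ceph's message and map types.
enum class EntityType { OSD, CLIENT, MON };

struct MessageSketch {
  EntityType source_type;
  int source_id;
};

struct MapSketch {
  std::set<int> up_osds;  // ids of OSDs marked up in this map epoch
  bool is_up(int osd) const { return up_osds.count(osd) > 0; }
};

// True only if the sender is an OSD and that OSD is up in the given map.
bool require_up_osd_peer_sketch(const MessageSketch& m, const MapSketch& map) {
  if (m.source_type != EntityType::OSD) {
    return false;  // not from an OSD at all
  }
  return map.is_up(m.source_id);  // reject ops from peers that are down
}
```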
Sage Weil [Mon, 4 Aug 2014 21:57:28 +0000 (14:57 -0700)]
osd: reorder OSDService methods under proper dout_prefix macro
The dout_prefix for OSDService uses get_osdmap() to grab a shared_ptr for
the epoch printout. The OSD one does not, and is not safe to run in all
thread contexts.
In particular, update_osd_stat() is run by the heartbeat thread and can
race with the shared_ptr itself being updated with a new map.
Ironically, if this were simply an OSDMap*, there would be no race since
the pointer is a single word and updates atomically.
Fix this, and any similar issues, by moving the OSDService methods up in
OSD.cc so that they use the safe dout macro.
Fixes: #8998
Backport: firefly (in a minimal form, I think!)
Signed-off-by: Sage Weil <sage@redhat.com>
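One way to read a shared_ptr safely from several threads is to copy it under a lock, as sketched below. This assumes that a getter such as get_osdmap() works this way; ServiceSketch, OSDMapSketch, map_lock, and publish_map are hypothetical names, not the OSDService implementation.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for an OSDMap; only the epoch matters here.
struct OSDMapSketch {
  unsigned epoch;
};

class ServiceSketch {
  std::mutex map_lock;  // hypothetical lock guarding the pointer itself
  std::shared_ptr<const OSDMapSketch> osdmap;

public:
  // Readers get their own reference, taken under the lock.
  std::shared_ptr<const OSDMapSketch> get_osdmap() {
    std::lock_guard<std::mutex> l(map_lock);
    return osdmap;
  }

  // The writer swaps in a new map under the same lock.
  void publish_map(std::shared_ptr<const OSDMapSketch> m) {
    std::lock_guard<std::mutex> l(map_lock);
    osdmap = std::move(m);
  }
};
```

A reader that copied the pointer keeps the old map alive even after a newer one is published, so a logging prefix can print the epoch without racing against the update.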