A single counter ( waiting ) accurately reflects the number of
waiters, regardless of the method waiting. It is enough to allow
unit tests to synthetise all situations, including:
T1: x = lookup_or_create(0)
T1: release x part 1 (weak_ptrs now fail to lock)
T2: y = lookup_or_create(0)
T2: block in lookup_or_create (waiting == 1)
T1: z = lookup_or_create(1) (does not block because the key is different)
while holding the lock it waiting++ and waiting == 2
and before returning it waiting-- and waiting is back to == 1
T1: complete release x
T2: complete lookup_or_create(0) (waiting == 0)
The unit tests are modified to add a lookup on an unrelated key to
demonstrate that it does not reset waiting counter.
Covers 100% of the LOC and all the expected behavior, including thread
safety.
The sharedptr_registry is made friend of the test class so that it can
synthetize race conditions. The lookup and lookup_or_create methods
set the new in_method data member before calling cond.Wait() so that
the caller knows it is waiting.
Danny Al-Gaaf [Fri, 26 Jul 2013 21:28:44 +0000 (23:28 +0200)]
rgw/rgw_metadata.cc: delete md_log (RGWMetadataLog) in destructor
Call delete on md_log in the destructor.
CID 1054826 (#1 of 1): Resource leak in object (CTOR_DTOR_LEAK)
1. alloc_new: Allocating memory by calling "new RGWMetadataLog(_cct, _store)".
2. var_assign: Assigning: "this->md_log" = "new RGWMetadataLog(_cct, _store)".
3. ctor_dtor_leak: The constructor allocates field "md_log" of
"RGWMetadataManager" but the destructor and whatever functions it calls
do not free it.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Sage Weil [Fri, 26 Jul 2013 20:58:46 +0000 (13:58 -0700)]
osd: load all classes on startup
This avoid creating a wide window between when ceph-osd is started and
when a request arrives needing a class and it is loaded. In particular,
upgrading the packages in that window may cause linkage errors (if the
class API has changed, for example).
Fixes: #5752 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Dan van der Ster [Fri, 26 Jul 2013 07:36:10 +0000 (09:36 +0200)]
add condrestart to the sysvinit script
We need to be able to condrestart all the ceph services on a
machine, so that we don't restart daemons that are supposed to be
stopped (e.g. broken disks).
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Dan Mick [Fri, 26 Jul 2013 02:40:26 +0000 (19:40 -0700)]
ceph_rest_api.py: return error in nonformatted mode
When a nonformatted request is made, currently the only text in the
response is the (probably empty) response buffer. Add the statusmsg
as well, where the error is likely to be explained. This lets
the http client get a clue what happened.
Sage Weil [Thu, 25 Jul 2013 18:10:53 +0000 (11:10 -0700)]
mon/Paxos: share uncommitted value when leader is/was behind
If the leader has and older lc than we do, and we are sharing states to
bring them up to date, we still want to also share our uncommitted value.
This particular case was broken by b26b7f6e, which was only contemplating
the case where the leader was ahead of us or at the same point as us, but
not the case where the leader was behind. Note that the call to
share_state() a few lines up will bring them fully up to date, so
after they receive and store_state() for this message they will be at the
same lc as we are.
Fixes: #5750
Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
rgw: expose the version of synced items to the poster
To support this, we add an optional out argument to
RGWMetadatManager::put() and fill in the read_version. When the
function returns, that contains whatever the current on-disk version
of the object is (either what already existed or what we just wrote).
Add new STATUS_APPLIED, then specify the RGWX_UPDATE_STATUS header
based on that return code when doing metadata puts.
Add a send_response() function to RGWOp_Metadata_Put in order to
support sending back our new headers. Move the translation from
STATUS_NO_APPLY from set_req_state_err() to this function, so we
can turn different sync results into failures if necessary elsewhere.
rgw: add preliminary support for sync update policies on metadata sync
We want to be able to conditionally apply new updates:
1) if we already have a newer version than the sync is applying for some
reason (replay of logs?), we don't want to go back in time.
2) If both zones were active at the same time, then we'd like to be
able to do a merge based on timestamps.
In order to support this, we add a sync_type flag to the implementations of
RGWMetadataHandler::put, and then check the version or the mtime of the
incoming put to what we have on disk, and refuse the update if needed.
We return the 204 NoContent success code when refusing sync; for the
moment the conversion is automatic but we're going to pull it out in
the next couple commits.
This commit does not complete the feature: we don't provide an interface
for specifying a different sync protocol.
Also increase fd limit defaults to accomodate the larger number
of fds.
Fixes: #5692 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Mark Nelson <mark.nelson@inktank.com>
Samuel Just [Tue, 23 Jul 2013 20:51:26 +0000 (13:51 -0700)]
FileStore::_collection_rename: fix global replay guard
If the replay is being replayed, we might have already
performed the rename, skip it. Also, we must set the
collection replay guard only after we have done the
rename.
Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Danny Al-Gaaf [Thu, 25 Jul 2013 17:12:44 +0000 (19:12 +0200)]
rgw/rgw_metadata.h: init prefix in initialization list
For performance reasons: init 'prefix' with META_LOG_OBJ_PREFIX
in the initialization list of RGWMetadataLog instead of assigning
the value in the constructor body.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Sage Weil [Wed, 24 Jul 2013 18:55:42 +0000 (11:55 -0700)]
mon/OSDMonitor: search for latest full osdmap if record version is missing
In 97462a3213e5e15812c79afc0f54d697b6c498b1 we tried to search for a
recent full osdmap but were looking at the wrong key. If full_0 was
present we could record that the latest full map was last_committed even
though it wasn't present. This is fixed in 76cd7ac1c, but we need to
compensate for when get_version_latest_full() gives us a back version
number by repeating the search.
Fixes: #5737 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Danny Al-Gaaf [Wed, 24 Jul 2013 16:48:14 +0000 (18:48 +0200)]
rgw/rgw_metadata.h: init cur_shard in LogListCtx with 0
CID 1054868 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member "cur_shard" is not
initialized in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Wed, 24 Jul 2013 16:43:00 +0000 (18:43 +0200)]
rgw/rgw_metadata.cc: fix possible null dereferencing
CID 1054827 (#1 of 1): Dereference after null check (FORWARD_NULL)
var_deref_model: Passing null pointer "objv_tracker->read_version"
to function "obj_version::operator =(obj_version const &)", which
dereferences it.
Moved affected 2 cases into the block checking for objv_tracker
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf [Wed, 24 Jul 2013 16:18:59 +0000 (18:18 +0200)]
mon/Monitor.cc: init scrub_version with 0 in constructor
CID 1019623 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member "scrub_version" is not
initialized in this constructor nor in any functions that it calls.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Gary Lowell [Wed, 24 Jul 2013 05:43:59 +0000 (22:43 -0700)]
configure.ac: Remove -rc suffix from the configure version number.
Remove the rc suffix since RPM complains about. For rc release
builds the "rc" in the git describe string is suffcient for
everyhting but RPM. For rc release builds (i.e. not gitbuilder)
add a flag to the spec file.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Sage Weil [Wed, 24 Jul 2013 04:27:50 +0000 (21:27 -0700)]
global/signal_handler: use poll(2) instead of select(2)
Starting with commit 61a298c39c1a6684682e2b749e45a66d073182c8 we delay the
signal handler setup until after lots of other initialization has happened,
which can result in us having very large (>1024) open fds, which will
break the FD_SET macros for select(2). Use poll(2) instead.
Fixes: #5722 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Sage Weil [Mon, 22 Jul 2013 22:21:11 +0000 (15:21 -0700)]
client: signal mds sessions with Contexts instead of Conds
If we try to open an mds session and the MDS responds with close (aka,
"no"), we call _closed_mds_session() which signals the Cond*'s but then
deallocates the list. wait_on_list() then does a use-after-free trying
to remove itself.
Instead, use Context*'s, so that the waiter does not reference the list.
Fixes: #5689 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
Dan Mick [Tue, 23 Jul 2013 04:54:16 +0000 (21:54 -0700)]
mon: "mds stat" must open/close section around dump_info
dump_info() got a new field outside the mdsmap section; it's ok for
the overall "report", but not for "mds stat". Add an enclosing section
in "mds stat". Fix test to expect new level.
Fixes: #5718 Signed-off-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Danny Al-Gaaf [Tue, 23 Jul 2013 19:56:09 +0000 (21:56 +0200)]
ceph.spec.in: obsolete ceph-libs only on the affected distro
The ceph-libs package existed only on Redhat based distro,
there was e.g. never such a package on SUSE. Therefore: make
sure the 'Obsoletes' is only set on these affected distros.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Sage Weil [Tue, 23 Jul 2013 20:32:12 +0000 (13:32 -0700)]
mon/OSDMonitor: fix base case for 7fb3804fb workaround
After cluster creation, we have no full map stored and first_committed ==
1. In that case, there is no need for a full map, since we can get there
from OSDMap() and the incrementals.
Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao@inktank.com>
Danny Al-Gaaf [Tue, 23 Jul 2013 19:56:09 +0000 (21:56 +0200)]
ceph.spec.in: obsolete ceph-libs only on the affected distro
The ceph-libs package existed only on Redhat based distro,
there was e.g. never such a package on SUSE. Therefore: make
sure the 'Obsoletes' is only set on these affected distros.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
mon: OSDMonitor: work around a full version bug introduced in 7fb3804fb
In 7fb3804fb860dcd0340dd3f7c39eec4315f8e4b6 we moved the full version
stashing logic to the encode_trim_extra() function. However, we forgot
to update the osdmap's 'latest_full' key that should always point to
the latest osdmap full version. This eventually degenerated in a missing
full version after a trim. This patch works around this bug by looking
for the latest available full osdmap version in the store and updating
'latest_full' to its proper value.
Related-to: #5704
Backport: cuttlefish
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
mon: OSDMonitor: update the osdmap's latest_full with the new full version
We used to do this on encode_full(), but since [1] we no longer rely on
PaxosService to manage the full maps for us. And we forgot to write down
the latest_full version to the store, leaving it in a truly outdated state.