Otherwise the FDCache will keep a file descriptor to a file that was
removed from the file system. This may create various types of errors
because the OSD checking the FDCache will assume the file that contains
information for an object exists although it does not. For instance in
the following:
* rados put object file
* rm file from the primary
* repair the pg to which the object is mapped
If the FDCache is not cleared, repair will pull a copy from a replica
and incorrectly write it to the now unlinked file. Later on, the OSD will
assume the file exists on the primary and only be partially correct:
the data can still be accessed via the file descriptor but any operation
using the path name will fail.
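The failure mode can be reproduced outside Ceph with plain POSIX calls. The sketch below is not Ceph code: stale_fd_demo, StaleFdResult and the path passed to it are hypothetical names, and it only illustrates that a descriptor kept after unlink() stays writable while the path name is gone.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <string>

// Outcome of the demonstration; all names here are hypothetical.
struct StaleFdResult {
  bool write_via_fd_ok;  // write through the kept descriptor succeeded
  bool path_exists;      // stat() on the path name succeeded
};

// Open a file, unlink it while keeping the descriptor (as a stale cache
// entry would), then compare access through the fd and through the path.
StaleFdResult stale_fd_demo(const std::string& path) {
  StaleFdResult r{false, false};
  int fd = ::open(path.c_str(), O_CREAT | O_RDWR | O_TRUNC, 0600);
  if (fd < 0)
    return r;
  ::unlink(path.c_str());  // stands in for "rm file from the primary"
  const char data[] = "replica copy";
  r.write_via_fd_ok =
      ::write(fd, data, sizeof(data)) == static_cast<ssize_t>(sizeof(data));
  struct stat st;
  r.path_exists = ::stat(path.c_str(), &st) == 0;
  ::close(fd);
  return r;
}
```

On a POSIX file system the write through the descriptor is expected to succeed while stat() on the path fails, which is the "partially correct" state described above.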
osd: subscribe to the newest osdmap when reconnecting to a monitor
This is mostly relevant in testing clusters, but it ensures that an OSD
disconnecting from the monitor at the wrong time will still see any recent
map updates and prevent accidental loss of map injection into the OSD cluster.
Fixes: #9219
Signed-off-by: Greg Farnum <greg@inktank.com>
John Spray [Wed, 10 Sep 2014 13:01:54 +0000 (14:01 +0100)]
mds: limit number of caps inspected in caps_tick
This is to avoid hitting an O(caps) loop in the worst
case scenario. This mechanism is a little crude but
should be superseded at some point by admin socket
functionality to inspect session caps so that we
don't need to spit out this level of detail in logs.
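A minimal sketch of the idea, assuming the limit is a simple per-call bound on inspected entries. Cap, last_revoke_age, count_stale_caps and max_inspect are hypothetical names, not the MDS code.

```cpp
#include <cstddef>
#include <list>

// Hypothetical cap record: only the field this sketch needs.
struct Cap {
  double last_revoke_age;  // seconds since the revoke was issued
};

// Count stale caps, inspecting at most max_inspect entries per call so the
// cost of one tick is bounded instead of O(caps).
std::size_t count_stale_caps(const std::list<Cap>& caps, double timeout,
                             std::size_t max_inspect) {
  std::size_t inspected = 0;
  std::size_t stale = 0;
  for (const Cap& c : caps) {
    if (inspected++ >= max_inspect)
      break;  // give up for this tick; the rest is not examined
    if (c.last_revoke_age > timeout)
      ++stale;
  }
  return stale;
}
```

The trade-off is the crude part mentioned above: entries beyond the bound are simply not reported in that tick.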
John Spray [Wed, 3 Sep 2014 17:30:00 +0000 (18:30 +0100)]
client: more precise cap trimming
Two fixes:
* Client would unlink everything it could, instead of just
meeting its goal, because caps.size() doesn't change until
dentries are cleaned up later. Take account of the trimmed
count in the while() condition to fix that.
* Don't count the root ino as trimmed, as although it has no
dentries (of course), we will never give up the cap.
With this change, the client will now precisely achieve the number
of caps requested in CEPH_SESSION_RECALL_STATE messages.
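The two fixes can be sketched as follows. This is an illustration under stated assumptions, not the client code: Inode, trim_caps and max_caps are hypothetical, and a vector stands in for the cap set whose size does not shrink until later cleanup.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inode holding a cap.
struct Inode {
  bool is_root;
  bool trimmed;  // marked for unlink; the cap itself is released later
};

// Mark inodes for trimming until caps.size() - trimmed <= max_caps.
// caps.size() does not shrink here, mirroring the deferred cleanup, so the
// loop condition must account for `trimmed` explicitly.
std::size_t trim_caps(std::vector<Inode>& caps, std::size_t max_caps) {
  std::size_t trimmed = 0;
  std::size_t i = 0;
  while (caps.size() - trimmed > max_caps && i < caps.size()) {
    Inode& in = caps[i++];
    if (in.is_root)
      continue;  // the root cap is never given up, so do not count it
    in.trimmed = true;
    ++trimmed;
  }
  return trimmed;
}
```

Without the `- trimmed` term the condition would stay true for the whole pass and every non-root entry would be marked, which is the over-trimming the first fix removes.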
John Spray [Wed, 3 Sep 2014 01:00:33 +0000 (02:00 +0100)]
client: fix crash in trim_caps
In a75af4c2, a procedure was added to invalidate root's dentries
if the trimming failed to free enough caps. This would sometimes
crash because root->dir wasn't necessarily open.
Fix by only doing it if root dir is open, though I suspect this
may not be the end of it...
Dan van der Ster [Mon, 15 Sep 2014 09:23:11 +0000 (11:23 +0200)]
doc: osd_backfill_scan_(min|max) are object counts
osd_backfill_scan_min and osd_backfill_scan_max set the number of
items grabbed during a single backfill scan, not an interval in
seconds. Correct the doc.
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
init-radosgw.sysv: Support systemd for starting the gateway
When using RHEL7 the radosgw daemon needs to start under systemd.
Check for systemd running on PID 1. If it is then start
the daemon using: systemd-run -r <cmd>. pidof returns nothing when it
is executed too quickly after the start, so one second of sleep is added
and the script then reports startup correctly.
This might have been the culprit for #9307. Before we were calculating
the hash after the call to processor->handle_data(), however, that
method might have spliced the bufferlist, so we can't be sure that the
pointer that we were holding originally is still valid. Instead, push
the hash calculation down. Added a new explicit complete_hash() call to
the processor, since when we're at complete() it's too late (we need to
have the hash at that point already).
Using a stringstream that is only displayed on error when calling the
erasure code factory, instead of cerr. The user expects the output to be
clean when there is no error. That was done for the encode function but
not the decode function.
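A minimal sketch of the pattern, assuming a factory that reports diagnostics through an output stream. make_erasure_code and run are hypothetical names and do not reflect the real factory signature.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical factory: writes diagnostics to `ss` and returns 0 on success.
int make_erasure_code(const std::string& plugin, std::ostream& ss) {
  ss << "loading plugin " << plugin << "\n";
  if (plugin.empty()) {
    ss << "no plugin name given\n";
    return -1;
  }
  return 0;
}

// Collect diagnostics in a stringstream and forward them to `err` only when
// the factory fails, so a successful run prints nothing.
int run(const std::string& plugin, std::ostream& err) {
  std::stringstream ss;
  int r = make_erasure_code(plugin, ss);
  if (r != 0)
    err << ss.str();
  return r;
}
```

A caller would pass std::cerr as `err`; the informational line written on success is then dropped instead of polluting the output.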
Ma Jianpeng [Fri, 12 Sep 2014 03:21:58 +0000 (11:21 +0800)]
buffer: in rebuild_page_aligned, no need to call rebuild() if the last ptr is page aligned
This only happens for the last ptr. Because rebuild() doesn't change the len
of the ptr, if the last ptr isn't page-size aligned but is page aligned,
rebuild() doesn't change anything.
Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
Using a stringstream that is only displayed on error when calling the
erasure code factory, instead of cerr. The user expects the output to be
clean when there is no error.
Since the erasure code plugin version check has been introduced,
whenever a library/binary that can load plugins needs to be recompiled,
the erasure code plugins must also be considered. If the reason for
recompiling the library/binary is a new commit, the plugins will fail to
load.
The dependency is not based on source compilation and a shared library
dependency on liberasure-code.la is added instead. This library is
uniformly used whenever a plugin is to be loaded and therefore covers
all libraries/binaries that need it.
When replaying EImportFinish/EFragment events, the replay thread may call
MDS::queue_waiters(). MDS::queue_waiters() requires its caller to hold the
mds_lock. Otherwise assert(waiter_mutex == __null || waiter_mutex->is_locked())
in Cond::Signal() will be triggered.
Currently in CrushWrapper, the member "struct crush_map *crush" is public,
so callers can break the encapsulation and manipulate the crush structure directly.
This is bad practice and leads to inconsistency if code
mixes the CrushWrapper API and the crush C API. A simple example:
1. Some code uses crush_add_rule (C API) to add a rule, which does not set the have_rmap flag to false in CrushWrapper.
2. Other code using CrushWrapper to look up the newly added rule by name gets -ENOENT.
This patch makes CrushWrapper::crush private, together with the three reverse maps (type_rmap, name_rmap, rule_name_rmap),
and changes the code that accessed CrushWrapper::crush so that it still compiles.
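The inconsistency and its fix can be sketched with a miniature wrapper. RuleTable and its members are hypothetical and much simpler than CrushWrapper; only the have_rmap idea and the -ENOENT result are taken from the description above.

```cpp
#include <cerrno>
#include <map>
#include <string>

// Hypothetical miniature of the pattern: the raw rule table is private, so
// every mutation goes through the wrapper and invalidates the reverse map.
class RuleTable {
 public:
  int add_rule(const std::string& name) {
    int id = next_id_++;
    names_[id] = name;
    have_rmap_ = false;  // the step a direct C API call would skip
    return id;
  }

  // Returns the rule id, or -ENOENT if no rule has that name.
  int get_rule_id(const std::string& name) {
    build_rmap();
    auto it = rmap_.find(name);
    return it == rmap_.end() ? -ENOENT : it->second;
  }

 private:
  // Rebuild the name -> id map only when it has been invalidated.
  void build_rmap() {
    if (have_rmap_)
      return;
    rmap_.clear();
    for (const auto& p : names_)
      rmap_[p.second] = p.first;
    have_rmap_ = true;
  }

  std::map<int, std::string> names_;
  std::map<std::string, int> rmap_;
  bool have_rmap_ = false;
  int next_id_ = 0;
};
```

If names_ were public, a caller could insert a rule without clearing have_rmap_, and a later lookup by name would miss it. Keeping the table private makes that path impossible.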
Sage Weil [Wed, 10 Sep 2014 00:28:54 +0000 (17:28 -0700)]
osdc/Objecter: drop bad session nref assert
This is a bad assert. Specifically, handle_osd_op_reply may still be
holding the session ref while it is calling the completion for a previous
request. This is safe: it is only holding the session ref after it dropped
the global map rwlock because of the per-session completion locks. The
request in question was already marked completed by the time our thread
took the session lock.
Fixes: #9241
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 8 Sep 2014 20:44:57 +0000 (13:44 -0700)]
osdc/Objecter: revoke rx_buffer on op_cancel
If we cancel a read, revoke the rx buffers to avoid a use-after-free and/or
other undefined badness by using user buffers that may no longer be
present.
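A minimal sketch of the idea, not the Objecter implementation: Reader, ReadOp, start_read, cancel and handle_reply are hypothetical names. Cancelling drops the record that points at the caller's buffer, so a reply arriving afterwards has nothing to write into.

```cpp
#include <map>
#include <string>

// Hypothetical in-flight read: rx_buffer points at caller-owned storage.
struct ReadOp {
  std::string* rx_buffer;
};

class Reader {
 public:
  // Register a read that will fill *out when its reply arrives.
  int start_read(std::string* out) {
    int tid = next_tid_++;
    ops_[tid] = ReadOp{out};
    return tid;
  }

  // Cancel: forget the caller's buffer so a late reply cannot touch it.
  void cancel(int tid) { ops_.erase(tid); }

  // Deliver a reply; returns false if the op was cancelled or is unknown.
  bool handle_reply(int tid, const std::string& data) {
    auto it = ops_.find(tid);
    if (it == ops_.end())
      return false;
    *it->second.rx_buffer = data;
    ops_.erase(it);
    return true;
  }

 private:
  std::map<int, ReadOp> ops_;
  int next_tid_ = 0;
};
```

After cancel() the caller is free to destroy its buffer, because no remaining code path holds the pointer.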