Danny Al-Gaaf [Fri, 19 Sep 2014 10:25:07 +0000 (12:25 +0200)]
rgw_main.cc: add missing virtual destructor for RGWRequest
CID 1160858 (#1 of 1): Non-virtual destructor (VIRTUAL_DTOR)
nonvirtual_dtor: Class RGWLoadGenRequest has a destructor
and a pointer to it is upcast to class RGWRequest which doesn't
have a virtual destructor.
Danny Al-Gaaf [Fri, 19 Sep 2014 10:06:49 +0000 (12:06 +0200)]
os/GenericObjectMap.cc: pass big parameter by reference
CID 1188142 (#1 of 1): Big parameter passed by value (PASS_BY_VALUE)
pass_by_value: Passing parameter header of type
GenericObjectMap::_Header (size 176 bytes) by value.
Danny Al-Gaaf [Wed, 17 Sep 2014 17:31:13 +0000 (19:31 +0200)]
mds/Beacon.*: fix UNINIT_CTOR cases
CID 1238905 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member want_state is not initialized
in this constructor nor in any functions that it calls.
uninit_member: Non-static class member last_send is not initialized
in this constructor nor in any functions that it calls.
CID 1238903 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member data_chunk_count is not
initialized in this constructor nor in any functions that it calls.
If the file size does not fit in 32 bits the (unsigned) cast will
overflow. Cast to uint64_t which is the type of the value returned by
get_total_chunk_size.
We have been setting it to the old head value. This is usually
harmless since the new head will virtually always be ahead of the
old head for claim_log_and_clear_rollback_info, but can cause trouble
in some edge cases.
Fixes: #9481
Backport: firefly Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just [Mon, 15 Sep 2014 23:53:21 +0000 (16:53 -0700)]
PG::find_best_info: let history.last_epoch_started provide a lower bound
If we find a info.history.last_epoch_started above any
info.last_epoch_started, we must be missing updates and
min_last_update_acceptable should provisionally be max().
Fixes: #9482
Backport: firefly Signed-off-by: Samuel Just <sam.just@inktank.com>
Otherwise the FDCache will keep a file descriptor to a file that was
removed from the file system. This may create various type of errors
because the OSD checking the FDCache will assume the file that contains
information for an object exists although it does not. For instance in
the following:
* rados put object file
* rm file from the primary
* repair the pg to which the object is mapped
if the FDCache is not cleared, repair will incorrectly pull a copy from
a replica and write it to the now unlinked file. Later on, it will
assume the file exists on the primary and only be partially correct :
the data can still be accessed via the file descriptor but any operation
using the path name will fail.
osd: subscribe to the newest osdmap when reconnecting to a monitor
This is mostly relevant in testing clusters, but it ensures that an OSD
disconnecting from the monitor at the wrong time will still see any recent
map updates and prevent accidental loss of map injection into the OSD cluster. Fixes: #9219 Signed-off-by: Greg Farnum <greg@inktank.com>
Sage Weil [Mon, 15 Sep 2014 23:45:19 +0000 (16:45 -0700)]
osdc/Objecter: fix command op cancellation race
Cancel the command op timeout event before we clear out the op from the
session struct. This isn't strictly necessary because command_op_cancel
will "gracefully" handle the case where the tid is no longer present, but
this avoids that noise and is cleaner.
Sage Weil [Mon, 15 Sep 2014 23:40:39 +0000 (16:40 -0700)]
osdc/Objecter: cancel timeout before clearing op->session
The C_CancelOp path assumes op->session != NULL. Cancel that op before
we clear it. This fixes a crash like
#0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:39
#1 0x00007fc82690a4b1 in RWLock::get_write (this=0x18, lockdep=<optimized out>) at ./common/RWLock.h:88
#2 0x00007fc8268f4d79 in Objecter::op_cancel (this=0x1f61830, s=0x0, tid=0, r=-110) at osdc/Objecter.cc:1850
#3 0x00007fc8268ba449 in Context::complete (this=0x1f68c20, r=<optimized out>) at ./include/Context.h:64
#4 0x00007fc8269769aa in RWTimer::timer_thread (this=0x1f61950) at common/Timer.cc:268
#5 0x00007fc82697a85d in RWTimerThread::entry (this=<optimized out>) at common/Timer.cc:200
#6 0x00007fc82651ce9a in start_thread (arg=0x7fc7e3fff700) at pthread_create.c:308
Sage Weil [Mon, 15 Sep 2014 22:29:08 +0000 (15:29 -0700)]
ceph-disk: mount xfs with inode64 by default
We did this forever ago with mkcephfs, but ceph-disk didn't. Note that for
modern XFS this option is obsolete, but for older kernels it was not the
default.
Backport: firefly Signed-off-by: Sage Weil <sage@redhat.com>
John Spray [Wed, 10 Sep 2014 13:01:54 +0000 (14:01 +0100)]
mds: limit number of caps inspected in caps_tick
This is to avoid hitting an O(caps) loop in the worst
cast scenario. This mechanism is a little crude but
should be superceded at some point by admin socket
functionality to inspect session caps so that we
don't need to spit out this level of detail in logs.
John Spray [Wed, 3 Sep 2014 17:30:00 +0000 (18:30 +0100)]
client: more precise cap trimming
Two fixes:
* Client would unlink everything it could, instead of just
meeting its goal, because caps.size() doesn't change until
dentries are cleaned up later. Take account of the trimmed
count in the while() condition to fix that.
* Don't count the root ino as trimmed, as although it has no
dentries (of course), we will never give up the cap.
With this change, the client will now precisely achieve the number
of caps requested in CEPH_SESSION_RECALL_STATE messages.
John Spray [Wed, 3 Sep 2014 01:00:33 +0000 (02:00 +0100)]
client: fix crash in trim_caps
In a75af4c2, procedure was added to invalidate root's dentries
if the trimming failed to free enough caps. This would sometimes
crash because root->dir wasn't necessarily open.
Fix by only doing it if root dir is open, though I suspect this
may not be the end of it...
Dan van der Ster [Mon, 15 Sep 2014 09:23:11 +0000 (11:23 +0200)]
doc: osd_backfill_scan_(min|max) are object counts
osd_backfill_scan_min and osd_backfill_scan_max set the number of
items grabbed during a single backfill scan, not an interval in
seconds. Correct the doc.
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
init-radosgw.sysv: Support systemd for starting the gateway
When using RHEL7 the radosgw daemon needs to start under systemd.
Check for systemd running on PID 1. If it is then start
the daemon using: systemd-run -r <cmd>. pidof returns null
as it is executed too quickly, adding one second of sleep and
script reports startup correctly.