git.apps.os.sepia.ceph.com Git - ceph.git/log
10 years agoos: FileStore::lfn_unlink always clears FDCache 2505/head
Loic Dachary [Wed, 3 Sep 2014 12:19:40 +0000 (14:19 +0200)]
os: FileStore::lfn_unlink always clears FDCache

Otherwise the FDCache will keep a file descriptor to a file that was
removed from the file system. This may create various types of errors
because the OSD checking the FDCache will assume the file that contains
information for an object exists although it does not. For instance in
the following:

      * rados put object file
      * rm file from the primary
      * repair the pg to which the object is mapped

If the FDCache is not cleared, repair will incorrectly pull a copy from
a replica and write it to the now unlinked file. Later on, it will
assume the file exists on the primary and only be partially correct:
the data can still be accessed via the file descriptor but any operation
using the path name will fail.

http://tracker.ceph.com/issues/8914 Fixes: #8914

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
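
The failure mode described in this commit rests on ordinary POSIX behaviour: a descriptor that was opened before unlink() keeps working, while the path name stops resolving. A minimal sketch of that behaviour follows. It is not FileStore code; the function name, the struct and the /tmp path template are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// What we observe after unlinking a file while a descriptor is still open.
struct UnlinkProbe {
  bool read_via_fd_ok;   // data still readable through the old descriptor
  bool open_by_path_ok;  // a fresh open() by path name succeeds
};

// Hypothetical helper: create a temp file, unlink it, then probe both access paths.
UnlinkProbe probe_unlinked_file() {
  char path[] = "/tmp/fdcache_demo_XXXXXX";  // assumed writable location
  int fd = mkstemp(path);
  if (fd < 0) return {false, false};

  const char payload[] = "object data";
  if (write(fd, payload, sizeof(payload)) != static_cast<ssize_t>(sizeof(payload))) {
    close(fd);
    unlink(path);
    return {false, false};
  }

  unlink(path);  // the file name is gone, the open descriptor is not

  char buf[sizeof(payload)] = {0};
  ssize_t n = pread(fd, buf, sizeof(buf), 0);
  bool read_ok = n == static_cast<ssize_t>(sizeof(payload)) &&
                 std::string(buf) == payload;

  int fd2 = open(path, O_RDONLY);  // expected to fail with ENOENT
  bool path_ok = fd2 >= 0;
  if (fd2 >= 0) close(fd2);

  close(fd);
  return {read_ok, path_ok};
}
```

A cache that keeps such a descriptor after the unlink sees the first result and concludes the file exists, which is why the commit clears the cache entry unconditionally.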
10 years agotests: set the failure domain to OSD by default
Loic Dachary [Wed, 3 Sep 2014 12:18:48 +0000 (14:18 +0200)]
tests: set the failure domain to OSD by default

So that tests do not need to do it to be able to use the default rbd
pool to store objects.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agotests: add get_osds() and get_pg() helpers
Loic Dachary [Wed, 3 Sep 2014 12:17:17 +0000 (14:17 +0200)]
tests: add get_osds() and get_pg() helpers

To get the ordered list of OSD to which an object is mapped and the name
of the corresponding PG.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2499 from ceph/wip-9219-giant
Sage Weil [Tue, 16 Sep 2014 00:40:28 +0000 (17:40 -0700)]
Merge pull request #2499 from ceph/wip-9219-giant

wip-9219: subscribe to the newest osdmap when reconnecting to a monitor

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoosd: subscribe to the newest osdmap when reconnecting to a monitor 2499/head
Greg Farnum [Tue, 16 Sep 2014 00:07:41 +0000 (17:07 -0700)]
osd: subscribe to the newest osdmap when reconnecting to a monitor

This is mostly relevant in testing clusters, but it ensures that an OSD
disconnecting from the monitor at the wrong time will still see any recent
map updates and prevent accidental loss of map injection into the OSD cluster.
Fixes: #9219
Signed-off-by: Greg Farnum <greg@inktank.com>
10 years agoMerge pull request #2492 from ceph/wip-9284
John Spray [Mon, 15 Sep 2014 22:23:46 +0000 (23:23 +0100)]
Merge pull request #2492 from ceph/wip-9284

#9284 - fix client RECALL handling and add health metrics

Reviewed-by: Greg Farnum <greg@inktank.com>
10 years agoMerge pull request #2476 from ceph/wip-9307
Sage Weil [Mon, 15 Sep 2014 22:19:07 +0000 (15:19 -0700)]
Merge pull request #2476 from ceph/wip-9307

rgw: push hash calculater deeper

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2493 from ceph/wip-rbd-objectcacher-hang
Josh Durgin [Mon, 15 Sep 2014 20:25:33 +0000 (13:25 -0700)]
Merge pull request #2493 from ceph/wip-rbd-objectcacher-hang

rbd: ObjectCacher reads can hang when reading sparse files

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
10 years agoMerge pull request #2495 from dachary/wip-erasure-code-preload
Sage Weil [Mon, 15 Sep 2014 18:26:51 +0000 (11:26 -0700)]
Merge pull request #2495 from dachary/wip-erasure-code-preload

erasure-code: preload fails if < 0

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoerasure-code: preload fails if < 0 2495/head
Loic Dachary [Mon, 15 Sep 2014 18:21:14 +0000 (20:21 +0200)]
erasure-code: preload fails if < 0

And not if < -1.

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2486 from jgalvez/master
Sage Weil [Mon, 15 Sep 2014 16:41:45 +0000 (09:41 -0700)]
Merge pull request #2486 from jgalvez/master

init-radosgw.sysv: Support systemd for starting the gateway

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2472 from dachary/wip-9429-bench
Loic Dachary [Mon, 15 Sep 2014 16:23:08 +0000 (18:23 +0200)]
Merge pull request #2472 from dachary/wip-9429-bench

erasure-code: fix erasure_code_benchmark goop (decode)

Reviewed-by: Janne Grunau <j@jannau.net>
10 years agomds: limit number of caps inspected in caps_tick 2420/head 2492/head
John Spray [Wed, 10 Sep 2014 13:01:54 +0000 (14:01 +0100)]
mds: limit number of caps inspected in caps_tick

This is to avoid hitting an O(caps) loop in the worst
case scenario.  This mechanism is a little crude but
should be superseded at some point by admin socket
functionality to inspect session caps so that we
don't need to spit out this level of detail in logs.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: keep per-client revoking caps list
John Spray [Wed, 10 Sep 2014 12:37:37 +0000 (13:37 +0100)]
mds: keep per-client revoking caps list

...to avoid doing an O(caps) scan to find out
which clients are responsible for any late-revoking
caps during health checks.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoxlist: implement copy constructor
John Spray [Wed, 10 Sep 2014 12:21:42 +0000 (13:21 +0100)]
xlist: implement copy constructor

...so that I can have a std::map of them.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: health metric for late releasing caps
John Spray [Sun, 7 Sep 2014 15:22:37 +0000 (16:22 +0100)]
mds: health metric for late releasing caps

Follow up on Yan Zheng's "mds: warn clients which
aren't revoking cap" to include a health metric
for this condition as well as the clog messages.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomon: trigger transaction on MDS health changes
John Spray [Fri, 5 Sep 2014 11:49:53 +0000 (12:49 +0100)]
mon: trigger transaction on MDS health changes

I think this was previously only working as a side effect
of other MDS map changes.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: add a health metric for failure to recall caps
John Spray [Thu, 4 Sep 2014 15:47:38 +0000 (16:47 +0100)]
mds: add a health metric for failure to recall caps

Fixes: #9284
Signed-off-by: John Spray <john.spray@redhat.com>
10 years agomds: add state for tracking RECALL progress
John Spray [Thu, 4 Sep 2014 12:04:18 +0000 (13:04 +0100)]
mds: add state for tracking RECALL progress

To be used later for generating health metrics
for clients which are failing to promptly service
CEPH_SESSION_RECALL_STATE messages.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoxlist: implement const_iterator
John Spray [Fri, 5 Sep 2014 13:10:40 +0000 (14:10 +0100)]
xlist: implement const_iterator

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoclient: fix trim_caps for inodes in root
John Spray [Mon, 8 Sep 2014 00:14:27 +0000 (01:14 +0100)]
client: fix trim_caps for inodes in root

Previously client would fail to release caps for files
in the root directory in response to CEPH_SESSION_RECALL_STATE
messages.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoclient: failure injection for cap release
John Spray [Fri, 5 Sep 2014 12:16:16 +0000 (13:16 +0100)]
client: failure injection for cap release

Used for simulating a buggy client that trips
the error detection in #9282 (warn clients
which aren't revoking caps)

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoclient: fix potentially invalid read in trim_caps
John Spray [Wed, 3 Sep 2014 18:31:38 +0000 (19:31 +0100)]
client: fix potentially invalid read in trim_caps

trim_dentry can potentially free an inode, so get/put
it around the block where we use the inode's dn_set.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoclient: more precise cap trimming
John Spray [Wed, 3 Sep 2014 17:30:00 +0000 (18:30 +0100)]
client: more precise cap trimming

Two fixes:
 * Client would unlink everything it could, instead of just
   meeting its goal, because caps.size() doesn't change until
   dentries are cleaned up later.  Take account of the trimmed
   count in the while() condition to fix that.
 * Don't count the root ino as trimmed, as although it has no
   dentries (of course), we will never give up the cap.

With this change, the client will now precisely achieve the number
of caps requested in CEPH_SESSION_RECALL_STATE messages.

Signed-off-by: John Spray <john.spray@redhat.com>
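
The two fixes in this commit can be sketched as a loop whose condition subtracts the number of caps already trimmed, and which never counts the root inode. This is only an illustration under simplifying assumptions: the Cap struct and trim_caps_sketch are hypothetical names, and the real client walks inodes and dentries rather than a flat vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a capability held by the client.
struct Cap {
  bool is_root;     // cap on the root inode, never given up
  bool trimmable;   // dentries can be unlinked so the cap can be released
};

// Returns how many caps were marked for trimming. The container size does not
// shrink during the loop (cleanup happens later), so the condition accounts
// for `trimmed` explicitly instead of relying on caps.size() alone.
std::size_t trim_caps_sketch(const std::vector<Cap>& caps, std::size_t max) {
  std::size_t trimmed = 0;
  std::size_t i = 0;
  while (caps.size() - trimmed > max && i < caps.size()) {
    const Cap& c = caps[i++];
    if (c.is_root) continue;      // root is never counted as trimmed
    if (c.trimmable) ++trimmed;
  }
  return trimmed;
}
```

With ten caps (one of them root) and a target of four, the loop stops after marking six caps rather than marking all nine trimmable ones.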
10 years agoclient: fix crash in trim_caps
John Spray [Wed, 3 Sep 2014 01:00:33 +0000 (02:00 +0100)]
client: fix crash in trim_caps

In a75af4c2, procedure was added to invalidate root's dentries
if the trimming failed to free enough caps.  This would sometimes
crash because root->dir wasn't necessarily open.

Fix by only doing it if root dir is open, though I suspect this
may not be the end of it...

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoMerge pull request #2485 from Abioy/master
Loic Dachary [Mon, 15 Sep 2014 13:40:44 +0000 (15:40 +0200)]
Merge pull request #2485 from Abioy/master

bugfix: wrong socket address in log msg of Pipe.cc

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agobugfix: wrong socket address in log msg of Pipe.cc 2485/head
Abioy [Mon, 15 Sep 2014 02:52:47 +0000 (10:52 +0800)]
bugfix: wrong socket address in log msg of Pipe.cc

paddr was not yet set up for the socket address

Signed-off-by: Yongyue Sun <abioy.sun@gmail.com>
10 years agoMerge pull request #2442 from dachary/wip-6754-jerasure-parameters
Loic Dachary [Mon, 15 Sep 2014 10:24:19 +0000 (12:24 +0200)]
Merge pull request #2442 from dachary/wip-6754-jerasure-parameters

erasure-code: fix BlaumRoth sanity check on w

Reviewed-by: Andreas Peters <andreas.joachim.peters@cern.ch>
10 years agoMerge pull request #2488 from cernceph/docfix
Loic Dachary [Mon, 15 Sep 2014 09:39:46 +0000 (11:39 +0200)]
Merge pull request #2488 from cernceph/docfix

doc: osd_backfill_scan_(min|max) are object counts

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agodoc: osd_backfill_scan_(min|max) are object counts 2488/head
Dan van der Ster [Mon, 15 Sep 2014 09:23:11 +0000 (11:23 +0200)]
doc: osd_backfill_scan_(min|max) are object counts

osd_backfill_scan_min and osd_backfill_scan_max set the number of
items grabbed during a single backfill scan, not an interval in
seconds. Correct the doc.

Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
10 years agorbd: ObjectCacher reads can hang when reading sparse files 2493/head
Jason Dillaman [Mon, 15 Sep 2014 04:53:50 +0000 (00:53 -0400)]
rbd: ObjectCacher reads can hang when reading sparse files

The pending read list was not properly flushed when empty objects
were read from a sparse file.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
10 years agoinit-radosgw.sysv: Support systemd for starting the gateway 2486/head
JuanJose 'JJ' Galvez [Mon, 15 Sep 2014 03:38:20 +0000 (20:38 -0700)]
init-radosgw.sysv: Support systemd for starting the gateway

When using RHEL7 the radosgw daemon needs to start under systemd.

Check whether systemd is running as PID 1. If it is, start
the daemon using: systemd-run -r <cmd>. pidof returns nothing
when it is executed too soon after the start, so sleep for one
second first; the script then reports startup correctly.

Signed-off-by: JuanJose 'JJ' Galvez <jgalvez@redhat.com>
10 years agoMerge pull request #2484 from sjahl/master
Loic Dachary [Sun, 14 Sep 2014 15:46:04 +0000 (17:46 +0200)]
Merge pull request #2484 from sjahl/master

doc: Added bucket management commands to ops/crush-map

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agodoc: Added bucket management commands to ops/crush-map 2484/head
Stephen Jahl [Sun, 14 Sep 2014 14:41:16 +0000 (10:41 -0400)]
doc: Added bucket management commands to ops/crush-map

Describes the CLI for adding and removing buckets, in addition to the
'moving' instructions which were already present.

Signed-off-by: Stephen Jahl <stephenjahl@gmail.com>
10 years agoMerge remote-tracking branch 'gh/giant'
Sage Weil [Sun, 14 Sep 2014 04:20:33 +0000 (21:20 -0700)]
Merge remote-tracking branch 'gh/giant'

10 years agoMerge pull request #2481 from sjahl/master
Sage Weil [Sat, 13 Sep 2014 19:46:24 +0000 (12:46 -0700)]
Merge pull request #2481 from sjahl/master

doc: fixes a formatting error on ops/crush-map

10 years agodoc: fixes a formatting error on ops/crush-map 2481/head
Stephen Jahl [Sat, 13 Sep 2014 19:31:53 +0000 (15:31 -0400)]
doc: fixes a formatting error on ops/crush-map

Signed-off-by: Stephen Jahl <stephenjahl@gmail.com>
10 years agoMerge pull request #2467 from majianpeng/fix3
Loic Dachary [Sat, 13 Sep 2014 15:56:16 +0000 (17:56 +0200)]
Merge pull request #2467 from majianpeng/fix3

buffer: In rebuild_page_aligned for the last ptr is page aligned, no need call rebuild().

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2478 from ceph/wip-9445
Loic Dachary [Sat, 13 Sep 2014 15:32:57 +0000 (17:32 +0200)]
Merge pull request #2478 from ceph/wip-9445

global: fix hang when segv happens inside logging code

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2477 from ceph/wip-client-msg-leak
Yan, Zheng [Sat, 13 Sep 2014 00:42:15 +0000 (08:42 +0800)]
Merge pull request #2477 from ceph/wip-client-msg-leak

client: fix a message leak

10 years agomds: update segment references during journal rewrite
John Spray [Thu, 11 Sep 2014 13:07:59 +0000 (14:07 +0100)]
mds: update segment references during journal rewrite

... to avoid leaving log events that reference log
segments by offsets which no longer exist.

Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit 386f2d7c829422695a1b1f41bd3f17ca3eef1f61)
Reviewed-by: Greg Farnum <greg@inktank.com>
10 years agoMerge pull request #2469 from ceph/wip-9427-rewrite
Gregory Farnum [Sat, 13 Sep 2014 00:35:54 +0000 (17:35 -0700)]
Merge pull request #2469 from ceph/wip-9427-rewrite

mds: update segment references during journal rewrite

Reviewed-by: Greg Farnum <greg@inktank.com>
10 years agolog: add simple test to verify an internal SEGV doesn't hang 2478/head
Sage Weil [Sat, 13 Sep 2014 00:18:01 +0000 (17:18 -0700)]
log: add simple test to verify an internal SEGV doesn't hang

Test that the segv injection works.

Test that a segv while logging something doesn't hang when the signal
handlers are installed.  Note that this fails/hangs without the previous
fix.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoclient: fix a message leak 2477/head
John Spray [Fri, 12 Sep 2014 17:42:02 +0000 (18:42 +0100)]
client: fix a message leak

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoglobal/signal_handler: do not log if SEGV originated inside log code
Sage Weil [Fri, 12 Sep 2014 22:25:03 +0000 (15:25 -0700)]
global/signal_handler: do not log if SEGV originated inside log code

Signed-off-by: Sage Weil <sage@redhat.com>
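
The idea behind this commit and Log::is_inside_log_lock() can be sketched as follows: the logger records that the current thread is inside its lock, and the fault handler skips logging when that is the case, since logging again would block on a lock the faulting thread already holds. This is a simplified illustration, not the Ceph implementation: TinyLog, the while_locked hook and handler_may_log are hypothetical, and a thread_local flag stands in for whatever bookkeeping the real Log class uses.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

class TinyLog {
 public:
  // Append a line. `while_locked` is a hypothetical hook that runs with the
  // lock held, standing in for code that might fault inside the logger.
  void submit(const std::string& line,
              const std::function<void()>& while_locked = nullptr) {
    std::lock_guard<std::mutex> guard(lock_);
    inside_ = true;
    entries_.push_back(line);
    if (while_locked) while_locked();
    inside_ = false;
  }

  // True when the calling thread is currently inside submit().
  static bool is_inside_log_lock() { return inside_; }

 private:
  std::mutex lock_;
  std::vector<std::string> entries_;
  static thread_local bool inside_;
};

thread_local bool TinyLog::inside_ = false;

// Decision a fault handler would make before trying to write a log entry.
bool handler_may_log() { return !TinyLog::is_inside_log_lock(); }
```

A handler that consults handler_may_log() skips the log call when the fault happened inside the logger, which avoids the self-deadlock the test in the following commit checks for.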
10 years agolog: add Log::is_inside_log_lock()
Sage Weil [Fri, 12 Sep 2014 22:24:50 +0000 (15:24 -0700)]
log: add Log::is_inside_log_lock()

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomds: update segment references during journal rewrite 2469/head
John Spray [Thu, 11 Sep 2014 13:07:59 +0000 (14:07 +0100)]
mds: update segment references during journal rewrite

... to avoid leaving log events that reference log
segments by offsets which no longer exist.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agorgw: push hash calculater deeper 2476/head
Yehuda Sadeh [Fri, 12 Sep 2014 21:07:44 +0000 (14:07 -0700)]
rgw: push hash calculater deeper

This might have been the culprit for #9307. Previously we calculated
the hash after the call to processor->handle_data(); however, that
method might have spliced the bufferlist, so we can't be sure that the
pointer that we were holding originally is still valid. Instead, push
the hash calculation down. Added a new explicit complete_hash() call to
the processor, since when we're at complete() it's too late (we need to
have the hash at that point already).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
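
The ordering problem described here can be illustrated with a small sketch: once a consumer has taken the contents of a buffer, hashing through the caller's old view of it is no longer meaningful, so the hash has to be taken before the hand-off. Everything below is a hypothetical model, not rgw code: Processor, put_chunk and the use of std::hash in place of the real digest are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical consumer that takes the chunk's contents, loosely modelling
// a handle_data() that splices the buffer away from the caller.
struct Processor {
  std::string stored;
  void handle_data(std::string& chunk) {
    stored.append(chunk);
    chunk.clear();  // caller's buffer no longer holds the data
  }
};

// Hash first, then hand the data over. std::hash is only a stand-in for
// the real digest calculation.
std::size_t put_chunk(Processor& p, std::string chunk) {
  std::size_t h = std::hash<std::string>{}(chunk);
  p.handle_data(chunk);
  return h;
}
```

Swapping the two statements in put_chunk would hash an empty string, which is the kind of mismatch the commit avoids by moving the calculation down into the processor.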
10 years agoMerge pull request #2471 from ceph/wip-9446
John Spray [Fri, 12 Sep 2014 15:47:52 +0000 (16:47 +0100)]
Merge pull request #2471 from ceph/wip-9446

mon: fix MDS health detail output

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoerasure-code: fix erasure_code_benchmark goop (decode) 2472/head
Loic Dachary [Fri, 12 Sep 2014 15:36:35 +0000 (17:36 +0200)]
erasure-code: fix erasure_code_benchmark goop (decode)

Using a stringstream that is only displayed on error when calling the
erasure code factory, instead of cerr. The user expects the output to be
clean when there is no error. That was done for the encode function but
not the decode function.

http://tracker.ceph.com/issues/9429 Fixes: #9429

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agomon: fix MDS health detail output 2471/head
John Spray [Fri, 5 Sep 2014 11:49:09 +0000 (12:49 +0100)]
mon: fix MDS health detail output

I fat fingered a couple of things here.

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agobuffer: Add a test for bufferlist::rebuild_page_aligned 2467/head
Ma Jianpeng [Fri, 12 Sep 2014 13:52:56 +0000 (21:52 +0800)]
buffer: Add a test for bufferlist::rebuild_page_aligned

Check that when the last ptr of a bufferlist is page aligned,
rebuild_page_aligned does not change anything.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
10 years agobuffer: In rebuild_page_aligned for the last ptr is page aligned, no need call rebuild().
Ma Jianpeng [Fri, 12 Sep 2014 03:21:58 +0000 (11:21 +0800)]
buffer: In rebuild_page_aligned for the last ptr is page aligned, no need call rebuild().

This only happens for the last ptr. Because rebuild() doesn't change the
length of a ptr, if the last ptr is page aligned but its length is not a
multiple of the page size, rebuild() doesn't change anything.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
10 years agoMerge pull request #2468 from dachary/wip-always-create-pidfile
Loic Dachary [Fri, 12 Sep 2014 14:00:27 +0000 (16:00 +0200)]
Merge pull request #2468 from dachary/wip-always-create-pidfile

daemons: write pid file even when told not to daemonize

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agodaemons: write pid file even when told not to daemonize 2468/head
Alexandre Oliva [Thu, 31 Jul 2014 02:08:43 +0000 (23:08 -0300)]
daemons: write pid file even when told not to daemonize

systemd wants to run daemons in foreground, but daemons wouldn't write
out the pid file with -f.  Fixed.

Signed-off-by: Alexandre Oliva <oliva@gnu.org>
10 years agoMerge pull request #2464 from dachary/wip-9429-bench
Loic Dachary [Fri, 12 Sep 2014 09:30:31 +0000 (11:30 +0200)]
Merge pull request #2464 from dachary/wip-9429-bench

erasure-code: fix erasure_code_benchmark goop

Reviewed-by: Janne Grunau <j@jannau.net>
10 years agoMerge pull request #2416 from xiaoxichen/make_crush_private
Sage Weil [Fri, 12 Sep 2014 03:41:07 +0000 (20:41 -0700)]
Merge pull request #2416 from xiaoxichen/make_crush_private

Change CrushWrapper::crush to private

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2450 from dachary/wip-9413-erasure-code-version-check
Sage Weil [Fri, 12 Sep 2014 02:56:04 +0000 (19:56 -0700)]
Merge pull request #2450 from dachary/wip-9413-erasure-code-version-check

erasure-code: mon, osd etc. depend on the plugins

10 years agoerasure-code: fix erasure_code_benchmark goop 2464/head
Loic Dachary [Thu, 11 Sep 2014 20:07:33 +0000 (22:07 +0200)]
erasure-code: fix erasure_code_benchmark goop

Using a stringstream that is only displayed on error when calling the
erasure code factory, instead of cerr. The user expects the output to be
clean when there is no error.

http://tracker.ceph.com/issues/9429 Fixes: #9429

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
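
The pattern described in this commit, collecting diagnostics in a stringstream and printing them only when the call fails, can be sketched as follows. The function names and the fake factory are hypothetical; only the buffering pattern is taken from the commit message.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a factory call that always writes some
// diagnostics to the stream it is given, even on success.
int make_codec(const std::string& plugin, std::ostream& diag) {
  diag << "loading plugin " << plugin << "\n";
  if (plugin.empty()) {
    diag << "no plugin name given\n";
    return -1;
  }
  return 0;
}

// Buffer the diagnostics and forward them to `err` only on failure, so a
// successful run produces no noise.
int make_codec_quiet(const std::string& plugin, std::ostream& err) {
  std::stringstream messages;
  int code = make_codec(plugin, messages);
  if (code != 0) err << messages.str();
  return code;
}
```

Passing std::cerr as the second argument gives the behaviour the commit describes: clean output on success, full diagnostics on error.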
10 years agoMerge pull request #2409 from apeters1971/wip-ec-isa-table-cache-refac-master
Loic Dachary [Thu, 11 Sep 2014 19:56:19 +0000 (21:56 +0200)]
Merge pull request #2409 from apeters1971/wip-ec-isa-table-cache-refac-master

EC-ISA: add intelligent table cache

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agodoc: Fixed syntax error.
John Wilkins [Thu, 11 Sep 2014 17:50:42 +0000 (10:50 -0700)]
doc: Fixed syntax error.

Signed-off-by: John Wilkins <jowilki@redhat.com>
10 years agodoc: Updated authentication notes. Fixed syntax error.
John Wilkins [Thu, 11 Sep 2014 17:50:22 +0000 (10:50 -0700)]
doc: Updated authentication notes. Fixed syntax error.

Signed-off-by: John Wilkins <jowilki@redhat.com>
10 years agoMerge pull request #2459 from ceph/wip-7934
Loic Dachary [Thu, 11 Sep 2014 17:33:46 +0000 (19:33 +0200)]
Merge pull request #2459 from ceph/wip-7934

test: Fix ceph_test_rados_watch_notify to delete the pools it creates

Reviewed-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2463 from ceph/wip-mds-beacon
John Spray [Thu, 11 Sep 2014 15:45:21 +0000 (16:45 +0100)]
Merge pull request #2463 from ceph/wip-mds-beacon

mds: a couple fixes for the beacons

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agomds: sleep in progress thread if laggy and waiting_for_nolaggy waiters 2463/head
Sage Weil [Thu, 11 Sep 2014 05:51:20 +0000 (22:51 -0700)]
mds: sleep in progress thread if laggy and waiting_for_nolaggy waiters

If we have nolaggy waiters but are laggy we want to sleep.  Otherwise,
we will just spin and spam the log ...

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomds/Beacon: do not reconnect to mon in quick succession
Sage Weil [Thu, 11 Sep 2014 05:13:42 +0000 (22:13 -0700)]
mds/Beacon: do not reconnect to mon in quick succession

Wait at least one beacon interval between mon session resets.

Fixes: #9428
Signed-off-by: Sage Weil <sage@redhat.com>
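
The rule "wait at least one beacon interval between mon session resets" amounts to a simple rate limiter. A minimal sketch, assuming a steady clock and a hypothetical ReconnectLimiter class (the real Beacon code is not shown in this log and may be structured differently):

```cpp
#include <cassert>
#include <chrono>

class ReconnectLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  explicit ReconnectLimiter(std::chrono::milliseconds interval)
      : interval_(interval) {}

  // Returns true and records the time if a reset is allowed at `now`;
  // returns false if the previous reset was less than one interval ago.
  bool try_reset(Clock::time_point now) {
    if (has_reset_ && now - last_reset_ < interval_) return false;
    last_reset_ = now;
    has_reset_ = true;
    return true;
  }

 private:
  std::chrono::milliseconds interval_;
  Clock::time_point last_reset_{};
  bool has_reset_ = false;
};
```

Taking the time point as a parameter rather than calling Clock::now() inside keeps the sketch easy to exercise without sleeping.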
10 years agoMerge pull request #2460 from ceph/wip-client-ll-ref
Yan, Zheng [Thu, 11 Sep 2014 09:06:39 +0000 (17:06 +0800)]
Merge pull request #2460 from ceph/wip-client-ll-ref

client: include ll_ref when printing inode

10 years agoclient: include ll_ref when printing inode 2460/head
Yan, Zheng [Thu, 11 Sep 2014 09:03:55 +0000 (17:03 +0800)]
client: include ll_ref when printing inode

Signed-off-by: Yan, Zheng <zyan@redhat.com>
10 years agoMerge pull request #2444 from wonzhq/read-recency
Sage Weil [Thu, 11 Sep 2014 03:44:13 +0000 (20:44 -0700)]
Merge pull request #2444 from wonzhq/read-recency

osd: set min_read_recency_for_promote to default 1 when doing upgrade

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2449 from majianpeng/fix3
Sage Weil [Thu, 11 Sep 2014 03:37:03 +0000 (20:37 -0700)]
Merge pull request #2449 from majianpeng/fix3

fix two bugs about perfcounter

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agotest: Fix ceph_test_rados_watch_notify to delete the pools it creates 2459/head
David Zafman [Thu, 11 Sep 2014 02:19:08 +0000 (19:19 -0700)]
test: Fix ceph_test_rados_watch_notify to delete the pools it creates

Fixes: #7934
Signed-off-by: David Zafman <dzafman@redhat.com>
10 years agoReplicatedPG: Make perfcounter record the read-size for 2449/head
Ma Jianpeng [Thu, 11 Sep 2014 00:32:06 +0000 (08:32 +0800)]
ReplicatedPG: Make perfcounter record the read-size for async-read.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
10 years agoReplicatedPG: record correctly subop for perfcounter.
Ma Jianpeng [Thu, 11 Sep 2014 00:09:47 +0000 (08:09 +0800)]
ReplicatedPG: record correctly subop for perfcounter.

log_subop_stats omitted recording the subop counter.

Signed-off-by: Ma Jianpeng <jianpeng.ma@intel.com>
10 years agoMerge pull request #2454 from athanatos/wip-9269
Sage Weil [Wed, 10 Sep 2014 19:15:19 +0000 (12:15 -0700)]
Merge pull request #2454 from athanatos/wip-9269

FileStore: report l_os_j_lat as commit latency

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2453 from athanatos/wip-9220
Sage Weil [Wed, 10 Sep 2014 19:09:34 +0000 (12:09 -0700)]
Merge pull request #2453 from athanatos/wip-9220

Objecter::_recalc_linger_op: resend for any acting set change

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2443 from ceph/wip-9241
Samuel Just [Wed, 10 Sep 2014 19:09:04 +0000 (12:09 -0700)]
Merge pull request #2443 from ceph/wip-9241

osdc/Objecter: drop bad session nref assert

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2293 from ceph/wip-hitset-bytes
Samuel Just [Wed, 10 Sep 2014 19:02:56 +0000 (12:02 -0700)]
Merge pull request #2293 from ceph/wip-hitset-bytes

osd: improve agent calculation by factoring out hit_set bytes used properly

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoerasure-code: mon, osd etc. depend on the plugins 2450/head
Loic Dachary [Wed, 10 Sep 2014 15:58:45 +0000 (17:58 +0200)]
erasure-code: mon, osd etc. depend on the plugins

Since the erasure code plugin version check has been introduced,
whenever a library/binary that can load plugin needs to be recompiled,
the erasure code plugins must also be considered. If the reason for
recompiling the library/binary is a new commit, the plugins will fail to
load.

The dependency is not based on source compilation and a shared library
dependency on liberasure-code.la is added instead. This library is
uniformly used whenever a plugin is to be loaded and therefore covers
all library/binaries that need it.

http://tracker.ceph.com/issues/9413 Fixes: #9413

Signed-off-by: Loic Dachary <loic-201408@dachary.org>
10 years agoMerge pull request #2451 from ceph/wip-osdc-leak
Sage Weil [Wed, 10 Sep 2014 18:31:27 +0000 (11:31 -0700)]
Merge pull request #2451 from ceph/wip-osdc-leak

osdc/Objecter: fix leak of MStatfsReply

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agoMerge pull request #2447 from reclosedev/s3_colon_in_access_key
Yehuda Sadeh [Wed, 10 Sep 2014 16:59:53 +0000 (09:59 -0700)]
Merge pull request #2447 from reclosedev/s3_colon_in_access_key

[rgw][s3] Allow colon ':' in access key

Reviewed-by: Yehuda Sadeh <yehuda@redhat.com>
10 years ago[rgw][s3] Allow colon ':' in access key 2447/head
Roman Haritonov [Wed, 10 Sep 2014 08:31:56 +0000 (12:31 +0400)]
[rgw][s3] Allow colon ':' in access key

When access key contains ':', e.g. `some_info:for_user',
authorization header looks like:

"AWS some_info:for_user:request_signature"

so `auth_str.find(':')` result in auth_id = "some_info",
auth_sign = "for_user:request_signature".

auth_str.rfind(':') solves this issue.

Signed-off-by: Roman Haritonov <reclosedev@gmail.com>
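
The difference between find(':') and rfind(':') on the example above can be reproduced with a short sketch. The helper names are hypothetical, and the input is assumed to be the part of the header after the "AWS " prefix; the sketch also assumes the signature itself contains no colon.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split at the FIRST colon: breaks when the access key contains ':'.
std::pair<std::string, std::string> split_auth_first(const std::string& auth_str) {
  std::string::size_type pos = auth_str.find(':');
  if (pos == std::string::npos) return {auth_str, ""};
  return {auth_str.substr(0, pos), auth_str.substr(pos + 1)};
}

// Split at the LAST colon: the access key keeps any embedded colons.
std::pair<std::string, std::string> split_auth_last(const std::string& auth_str) {
  std::string::size_type pos = auth_str.rfind(':');
  if (pos == std::string::npos) return {auth_str, ""};
  return {auth_str.substr(0, pos), auth_str.substr(pos + 1)};
}
```

For "some_info:for_user:request_signature", the first variant yields the id "some_info", while the second yields "some_info:for_user" with the signature "request_signature".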
10 years agoosdc/Objecter: fix leak of MStatfsReply 2451/head
Sage Weil [Wed, 10 Sep 2014 13:57:12 +0000 (06:57 -0700)]
osdc/Objecter: fix leak of MStatfsReply

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2448 from ceph/wip-replay-locking
John Spray [Wed, 10 Sep 2014 13:56:31 +0000 (14:56 +0100)]
Merge pull request #2448 from ceph/wip-replay-locking

mds: fix replay locking

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agoEC-ISA: avoid usage of table cache lock outside the class implementation by introduci... 2409/head
Andreas-Joachim Peters [Tue, 9 Sep 2014 07:37:41 +0000 (09:37 +0200)]
EC-ISA: avoid usage of table cache lock outside the class implementation by introducing the setEncodingTable/setEncodingCoefficient methods

10 years agoEC-ISA: add intelligent table cache
Andreas-Joachim Peters [Wed, 3 Sep 2014 14:19:49 +0000 (16:19 +0200)]
EC-ISA: add intelligent table cache

10 years agomds: fix replay locking 2448/head
Yan, Zheng [Wed, 10 Sep 2014 05:44:58 +0000 (13:44 +0800)]
mds: fix replay locking

When replaying EImportFinish/EFragment event, the replay thread may call
MDS::queue_waiters. MDS::queue_waiters() requires its caller to hold the
mds_lock. Otherwise assert(waiter_mutex == __null || waiter_mutex->is_locked())
in Cond::Signal() will be triggered.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
10 years agoosd: set min_read_recency_for_promote to default 1 when doing upgrade 2444/head
Zhiqiang Wang [Wed, 10 Sep 2014 03:58:32 +0000 (11:58 +0800)]
osd: set min_read_recency_for_promote to default 1 when doing upgrade

When upgrading from a build without the promotion on 2nd read feature,
should set min_read_recency_for_promote to the default value 1, instead
of 0.

Signed-off-by: Zhiqiang Wang <wonzhq@hotmail.com>
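
One common way to apply a default during an upgrade is to branch on the encoding version while decoding: data written by an older build gets the documented default instead of a zero-initialised field. The sketch below illustrates that idea only. The version constant, struct and function are hypothetical, and the log does not show how the actual patch implements it.

```cpp
#include <cassert>

// Hypothetical encoding version at which the field was first written.
constexpr unsigned kFirstVersionWithRecency = 2;

struct PoolSettings {
  unsigned min_read_recency_for_promote = 0;
};

// Decode the field if the encoding carries it; otherwise fall back to the
// default value 1 named in the commit message rather than 0.
PoolSettings decode_settings(unsigned struct_v, unsigned encoded_value) {
  PoolSettings s;
  if (struct_v >= kFirstVersionWithRecency) {
    s.min_read_recency_for_promote = encoded_value;
  } else {
    s.min_read_recency_for_promote = 1;
  }
  return s;
}
```

An explicitly stored 0 from a newer build is preserved; only data that predates the field receives the default.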
10 years agoChange CrushWrapper::crush to private 2416/head
Xiaoxi Chen [Fri, 5 Sep 2014 02:56:36 +0000 (10:56 +0800)]
Change CrushWrapper::crush to private

Currently in CrushWrapper, the member "struct crush_map *crush" is public,
so callers can break the encapsulation and manipulate the crush structure directly.

This is bad practice and leads to inconsistencies if code mixes the
CrushWrapper API and the crush C API. A simple example:
1. Some code uses crush_add_rule (C API) to add a rule, which does not set the have_rmap flag to false in CrushWrapper.
2. Other code using CrushWrapper then tries to look up the newly added rule by name and gets -ENOENT.

This patch makes CrushWrapper::crush private, together with the three reverse maps (type_rmap, name_rmap, rule_name_rmap),
and changes the code that accessed CrushWrapper::crush so that it still compiles.

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
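
The inconsistency described in this commit comes from a cached reverse map that is only invalidated when changes go through the wrapper. A minimal sketch of the encapsulated form follows. RuleTable and its members are hypothetical stand-ins rather than the CrushWrapper API, and the lookup returns -1 where the real code returns -ENOENT.

```cpp
#include <cassert>
#include <map>
#include <string>

class RuleTable {
 public:
  // Every mutation goes through the wrapper, so the cache flag is always reset.
  void add_rule(int id, const std::string& name) {
    rules_[id] = name;
    have_rmap_ = false;  // cached reverse map is now stale
  }

  // Look up a rule id by name; returns -1 if the name is unknown.
  int get_rule_id(const std::string& name) {
    build_rmap();
    std::map<std::string, int>::const_iterator it = rmap_.find(name);
    return it == rmap_.end() ? -1 : it->second;
  }

 private:
  // Rebuild the name-to-id map only when it has been invalidated.
  void build_rmap() {
    if (have_rmap_) return;
    rmap_.clear();
    for (const auto& kv : rules_) rmap_[kv.second] = kv.first;
    have_rmap_ = true;
  }

  std::map<int, std::string> rules_;   // private: no direct outside writes
  std::map<std::string, int> rmap_;    // cached reverse map
  bool have_rmap_ = false;
};
```

If rules_ were public and a caller inserted into it directly after a lookup, have_rmap_ would stay true and the new rule would not be found by name, which is the situation the numbered example describes.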
10 years agoosdc/Objecter: drop bad session nref assert 2443/head
Sage Weil [Wed, 10 Sep 2014 00:28:54 +0000 (17:28 -0700)]
osdc/Objecter: drop bad session nref assert

This is a bad assert.  Specifically, handle_osd_op_reply may still be
holding the session ref while it is calling the completion for a previous
request.  This is safe: it is only holding the session ref after it dropped
the global map rwlock because of the per-session completion locks.  The
request in question was already marked completed by the time our thread
took the session lock.

Fixes: #9241
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2433 from ceph/wip-rbd-force-write-back
Josh Durgin [Tue, 9 Sep 2014 23:40:07 +0000 (16:40 -0700)]
Merge pull request #2433 from ceph/wip-rbd-force-write-back

rbd should use write-back when caching is enabled

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
10 years agoosd/ClassHandler: fix build
Sage Weil [Tue, 9 Sep 2014 21:45:28 +0000 (14:45 -0700)]
osd/ClassHandler: fix build

Broken by 70ce400a8b4e0f5a20e6ea9877784998cdbb9a2d.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoFileStore: report l_os_j_lat as commit latency 2454/head
Samuel Just [Tue, 9 Sep 2014 21:03:50 +0000 (14:03 -0700)]
FileStore: report l_os_j_lat as commit latency

l_os_commit_lat is actually the commit cycle latency.

Fixes: #9269
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2441 from ceph/wip-9365
Samuel Just [Tue, 9 Sep 2014 20:53:11 +0000 (13:53 -0700)]
Merge pull request #2441 from ceph/wip-9365

osd/ClassHandler: improve error logging

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoosd/ClassHandler: improve error logging 2441/head
Sage Weil [Tue, 9 Sep 2014 20:38:49 +0000 (13:38 -0700)]
osd/ClassHandler: improve error logging

Fixes: #9365
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2412 from dachary/wip-9370-flush-logs
Sage Weil [Tue, 9 Sep 2014 20:36:53 +0000 (13:36 -0700)]
Merge pull request #2412 from dachary/wip-9370-flush-logs

tests: flush logs before grepping them

10 years agoMerge pull request #2434 from dachary/wip-9381-erasure-code-rpm
Sage Weil [Tue, 9 Sep 2014 20:14:42 +0000 (13:14 -0700)]
Merge pull request #2434 from dachary/wip-9381-erasure-code-rpm

packaging: add to RPM packages isa and lrc

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2427 from ceph/wip-9362
Gregory Farnum [Tue, 9 Sep 2014 20:11:04 +0000 (13:11 -0700)]
Merge pull request #2427 from ceph/wip-9362

librados: do not write to user buffer after timeout

Reviewed-by: Greg Farnum <greg@inktank.com>
10 years agoMerge pull request #2437 from athanatos/wip-9339
Samuel Just [Tue, 9 Sep 2014 20:10:29 +0000 (13:10 -0700)]
Merge pull request #2437 from athanatos/wip-9339

ReplicatedPG: create max hitset size

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoosdc/Objecter: revoke rx_buffer on op_cancel 2427/head
Sage Weil [Mon, 8 Sep 2014 20:44:57 +0000 (13:44 -0700)]
osdc/Objecter: revoke rx_buffer on op_cancel

If we cancel a read, revoke the rx buffers to avoid a use-after-free and/or
other undefined badness by using user buffers that may no longer be
present.

Fixes: #9362
Backport: firefly, dumpling
Reported-by: Matthias Kiefer <matthias.kiefer@1und1.de>
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph_test_rados_api_io: add read timeout test
Sage Weil [Mon, 8 Sep 2014 20:45:52 +0000 (13:45 -0700)]
ceph_test_rados_api_io: add read timeout test

Verify we don't receive data after a timeout.

Based on reproducer for #9362 written by
Matthias Kiefer <matthias.kiefer@1und1.de>.

Signed-off-by: Sage Weil <sage@redhat.com>