git.apps.os.sepia.ceph.com Git - ceph.git/log
Sage Weil [Thu, 8 Mar 2012 22:55:21 +0000 (14:55 -0800)]
doc: explain how unfound objects happen
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 8 Mar 2012 22:55:08 +0000 (14:55 -0800)]
doc: make osd failure example include >3 osds
More realistic.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 8 Mar 2012 22:46:56 +0000 (14:46 -0800)]
testrados: fix omap_get_vals_by_keys call
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 8 Mar 2012 22:29:42 +0000 (14:29 -0800)]
osd: add zero_to field to PG::OndiskLog; track zeroed region of pg log
Track which region of the log has been zeroed on disk. This may be
different from tail if 'osd preserved trimmed log = false' in the config.
Only zero the portion of the log we need to. This avoids rezeroing regions
or missing bits when 'osd preserved trimmed log' was off and is then turned
on.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Thu, 8 Mar 2012 22:30:06 +0000 (14:30 -0800)]
filestore: use FL_ALLOC_PUNCH_HOLE to zero, when available
First try the FL_ALLOC_PUNCH_HOLE fallocate() flag. If we get EOPNOTSUPP,
fall back to writing zeros.
Check for fallocate(2) with configure. Also, avoid this if we are not
Linux, since I'm not sure about the hard-coded FL_ALLOC_PUNCH_HOLE being
correct on other platforms.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
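The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration and not the filestore code: `zero_range` is a hypothetical helper name, and the flag is spelled `FALLOC_FL_PUNCH_HOLE` here because that is the name Linux headers use. It is Linux-only, matching the commit's caution about other platforms.

```cpp
#include <fcntl.h>         // open(), fallocate() (glibc, _GNU_SOURCE is on by default in g++)
#include <linux/falloc.h>  // FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE
#include <unistd.h>        // pwrite()
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical helper: zero the byte range [off, off+len) of fd.
// Returns 0 on success or a negative errno value.
int zero_range(int fd, off_t off, off_t len) {
  // First try to punch a hole; PUNCH_HOLE must be combined with KEEP_SIZE.
  if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, off, len) == 0)
    return 0;
  if (errno != EOPNOTSUPP)
    return -errno;

  // The filesystem cannot punch holes: fall back to writing zeros.
  std::vector<char> zeros(std::min<off_t>(len, 65536), 0);
  while (len > 0) {
    ssize_t r = pwrite(fd, zeros.data(), std::min<off_t>(len, zeros.size()), off);
    if (r < 0)
      return -errno;
    off += r;
    len -= r;
  }
  return 0;
}
```

One difference worth keeping in mind: the hole-punching path never changes the file size, while the write fallback extends the file if the range reaches past its end.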
Sage Weil [Thu, 8 Mar 2012 22:16:59 +0000 (14:16 -0800)]
osd: fix op_wq vs pg->lock ordering
map_lock
-> pg->lock
-> op_wq
Fixes: #2153
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
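The ordering listed in this commit (map_lock, then pg->lock, then op_wq) can be illustrated with a small sketch. The `PG` and `OSD` structs below are hypothetical stand-ins using plain `std::mutex`; only the acquisition order is taken from the commit message.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical stand-ins for the OSD's data structures.
struct PG {
  std::mutex lock;
  int queued = 0;
};

struct OSD {
  std::mutex map_lock;
  std::mutex op_wq_lock;   // protects op_wq
  std::deque<PG*> op_wq;

  // Every path takes the locks in the same order:
  //   map_lock -> pg->lock -> op_wq
  // so two threads can never wait on each other in a cycle.
  void enqueue(PG& pg) {
    std::lock_guard<std::mutex> m(map_lock);
    std::lock_guard<std::mutex> p(pg.lock);
    std::lock_guard<std::mutex> q(op_wq_lock);
    op_wq.push_back(&pg);
    ++pg.queued;
  }
};
```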
Yehuda Sadeh [Thu, 8 Mar 2012 06:53:32 +0000 (22:53 -0800)]
Merge branch 'wip-rgw-new-atomic'
Yehuda Sadeh [Thu, 8 Mar 2012 06:52:24 +0000 (22:52 -0800)]
rgw: append the correct bucket marker when removing bucket
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 8 Mar 2012 06:35:40 +0000 (22:35 -0800)]
Merge branch 'wip-rgw-omap'
Yehuda Sadeh [Thu, 8 Mar 2012 06:25:47 +0000 (22:25 -0800)]
cls_rgw: fix rgw_bucket_init_index
was failing to error in case header already existed
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 8 Mar 2012 06:19:25 +0000 (22:19 -0800)]
rgw: remove extra unused params from omap_get()
and also rename it to omap_get_all()
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 8 Mar 2012 06:18:57 +0000 (22:18 -0800)]
rgw: add cls_cxx_map_clear
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Samuel Just [Thu, 8 Mar 2012 05:59:30 +0000 (21:59 -0800)]
leveldb: drop compaction unit test
Signed-off-by: Samuel Just <rexludorum@gmail.com>
Samuel Just [Wed, 7 Mar 2012 21:08:36 +0000 (13:08 -0800)]
ReplicatedPG,librados: add filter_prefix to omap_get_vals
Signed-off-by: Samuel Just <rexludorum@gmail.com>
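What a prefix filter over sorted omap keys does can be sketched as below. This is an illustration only: `get_vals_with_prefix` and the `OMap` alias are hypothetical names, not the librados or ReplicatedPG signatures, and a `std::map` stands in for the object's key/value store.

```cpp
#include <cassert>
#include <map>
#include <string>

using OMap = std::map<std::string, std::string>;

// Hypothetical helper: return only the entries whose key starts with prefix.
// Because keys are sorted, matching keys form one contiguous run that
// begins at lower_bound(prefix).
OMap get_vals_with_prefix(const OMap& omap, const std::string& prefix) {
  OMap out;
  for (auto it = omap.lower_bound(prefix);
       it != omap.end() && it->first.compare(0, prefix.size(), prefix) == 0;
       ++it) {
    out.insert(*it);
  }
  return out;
}
```

Seeking with `lower_bound` avoids scanning keys that sort before the prefix, which is presumably why a server-side filter helps bucket listing.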
Yehuda Sadeh [Thu, 8 Mar 2012 01:10:18 +0000 (17:10 -0800)]
rgw: use prefix filter for bucket listing
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 8 Mar 2012 01:03:45 +0000 (17:03 -0800)]
objclass, cls_rgw: add prefix to omap_get_vals()
Yehuda Sadeh [Thu, 8 Mar 2012 01:02:57 +0000 (17:02 -0800)]
librados: add higher level call for omap_get_keys() with prefix
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 8 Mar 2012 00:46:18 +0000 (16:46 -0800)]
Merge remote-tracking branch 'origin/wip_prefix' into wip-rgw-omap
Josh Durgin [Wed, 7 Mar 2012 23:12:03 +0000 (15:12 -0800)]
rbd: pass all mon addrs when mapping devices
Previously this repeated the address of the first monitor.
Fixes: #2152
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
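The class of bug fixed here (repeating the first monitor's address instead of listing each one) is easy to picture with a small sketch. `join_mon_addrs` is a hypothetical helper, not the rbd tool's code, and the comma-separated output format is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build one comma-separated string from all
// monitor addresses.
std::string join_mon_addrs(const std::vector<std::string>& addrs) {
  std::string out;
  for (std::size_t i = 0; i < addrs.size(); ++i) {
    if (i > 0)
      out += ',';
    out += addrs[i];  // indexing with 0 here would repeat the first monitor
  }
  return out;
}
```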
Greg Farnum [Sat, 3 Mar 2012 00:13:04 +0000 (16:13 -0800)]
msgr: remove declaration of undefined SimpleMessenger::write_pid_file
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Sat, 3 Mar 2012 00:08:15 +0000 (16:08 -0800)]
msgr: remove SimpleMessenger::get_ms_addr() in favor of Messenger::get_myaddr
And fix the comments on set_ip.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 22:07:38 +0000 (14:07 -0800)]
objectstore: fix collection_move() encoding
This was broken in the original f43c3d958fe5c32ae647ffa715390ada51ae2650.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Samuel Just [Wed, 7 Mar 2012 21:08:36 +0000 (13:08 -0800)]
ReplicatedPG,librados: add filter_prefix to omap_get_vals
Signed-off-by: Samuel Just <rexludorum@gmail.com>
Yehuda Sadeh [Wed, 7 Mar 2012 20:34:35 +0000 (12:34 -0800)]
rgw: some minor cleanups
following a review
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Wed, 7 Mar 2012 18:45:13 +0000 (10:45 -0800)]
objclass: fix cls_cxx_map_write_header
Claiming the buffer instead of encoding it.
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
Yehuda Sadeh [Wed, 7 Mar 2012 18:44:43 +0000 (10:44 -0800)]
cls_rgw: fix debug message
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 18:32:32 +0000 (10:32 -0800)]
Merge remote-tracking branch 'gh/wip-doc'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 16:56:17 +0000 (08:56 -0800)]
osd: make degraded pgs count missing replicas as degraded objects
If a PG is smaller than it should be, make sure the missing replicas are
included in the degraded object count. This makes the overall degraded
percentage consistently meaningful even for PGs that aren't mid-recovery
or mid-backfill.
Fixes: #2137
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
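One way to read this accounting is sketched below. The function name, parameters, and exact formula are assumptions for illustration, not the OSD's actual stats code: every object is counted once for each replica the PG lacks, on top of objects still missing on the replicas that do exist.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accounting:
//   num_objects          objects in the PG
//   target_size          how many replicas the PG should have
//   acting_size          how many replicas it currently has
//   missing_on_existing  objects missing on the replicas that exist
std::uint64_t degraded_objects(std::uint64_t num_objects,
                               unsigned target_size,
                               unsigned acting_size,
                               std::uint64_t missing_on_existing) {
  std::uint64_t degraded = missing_on_existing;
  if (acting_size < target_size) {
    // Each absent replica makes every object degraded once.
    degraded += num_objects * (target_size - acting_size);
  }
  return degraded;
}
```

Under these assumptions, a PG with 100 objects that should have 3 replicas but has 2 reports 100 degraded objects even when no recovery is in progress.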
Sage Weil [Wed, 7 Mar 2012 05:03:39 +0000 (21:03 -0800)]
mon: fix full osd detail
And use a helper to avoid dup code.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 04:55:11 +0000 (20:55 -0800)]
mon: assign severity to each health summary/detail item
These can be included in the detail dump in the future.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 04:35:33 +0000 (20:35 -0800)]
doc: fix misc typos, bad phrasing
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Yehuda Sadeh [Wed, 7 Mar 2012 01:17:03 +0000 (17:17 -0800)]
objclass, cls_rgw: update to use omap
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
Sage Weil [Wed, 7 Mar 2012 00:18:13 +0000 (16:18 -0800)]
doc: 2 words about radosgw failures
- restarting the daemon.
- using the admin socket
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 7 Mar 2012 00:09:42 +0000 (16:09 -0800)]
doc: talk about mon failures a bit
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 7 Mar 2012 00:09:32 +0000 (16:09 -0800)]
doc: fix link
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:45:29 +0000 (15:45 -0800)]
doc: slow osd requests
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:39:28 +0000 (15:39 -0800)]
doc: diagnose full osd cluster
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:38:31 +0000 (15:38 -0800)]
mon: list nearfull/full osd detail
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:31:29 +0000 (15:31 -0800)]
doc: describe 'stuck' states we check for
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:27:02 +0000 (15:27 -0800)]
doc: document some osd failure recovery scenarios
- simple osd failure
- ceph health [detail]
- peering failure ('down') state
- unfound objects
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:26:18 +0000 (15:26 -0800)]
osd: list might_have_unfound locations in query result
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 23:17:33 +0000 (15:17 -0800)]
mon: include unfound count in health detail
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 7 Mar 2012 01:05:22 +0000 (17:05 -0800)]
mon: refactor health, include optional detail
'ceph health' to get the usual summary, 'ceph health detail' to
additionally get a comprehensive list of problems found.
Eventually we can format this as yaml, json, whatever, too.
Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Wed, 7 Mar 2012 00:05:21 +0000 (16:05 -0800)]
Merge branch 'wip-collmove'
Yehuda Sadeh [Tue, 6 Mar 2012 23:48:23 +0000 (15:48 -0800)]
rgw: switch to omap api
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Samuel Just [Tue, 6 Mar 2012 23:15:33 +0000 (15:15 -0800)]
leveldb: remove flawed unit test for now
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Yehuda Sadeh [Tue, 6 Mar 2012 22:53:38 +0000 (14:53 -0800)]
librados: rename omap_get_vals_by_key to omap_get_vals_by_keys
merge fail
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Tue, 6 Mar 2012 21:40:17 +0000 (13:40 -0800)]
librados: add high level omap calls
also rename get_vals_by_key to get_vals_by_keys
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Tue, 6 Mar 2012 21:22:57 +0000 (13:22 -0800)]
rgw: fix warning
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Tue, 6 Mar 2012 19:15:43 +0000 (11:15 -0800)]
rgw: read bucket through tmap_get
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Samuel Just [Tue, 6 Mar 2012 19:46:24 +0000 (11:46 -0800)]
Merge branch 'wip_omap'
Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
Samuel Just [Tue, 6 Mar 2012 19:32:04 +0000 (11:32 -0800)]
test_rados_api_aio: add omap
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 6 Mar 2012 18:35:24 +0000 (10:35 -0800)]
osd: testing for tmap auto upgrade
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Fri, 2 Mar 2012 17:25:13 +0000 (09:25 -0800)]
ReplicatedPG: transparently upgrade TMAP
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 7 Feb 2012 16:57:19 +0000 (08:57 -0800)]
RadosModel: Add omap operations to RadosModel
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Fri, 2 Mar 2012 00:22:27 +0000 (16:22 -0800)]
ReplicatedPG: Add omap ops to ReplicatedPG
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Thu, 1 Mar 2012 22:52:20 +0000 (14:52 -0800)]
librados: Added omap operations to librados
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Thu, 1 Mar 2012 20:33:33 +0000 (12:33 -0800)]
osdc: Add omap operation stubs to Objecter::ObjectOperation
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Fri, 2 Mar 2012 19:12:56 +0000 (11:12 -0800)]
ReplicatedPG: add omap_header to recovery
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Tue, 6 Mar 2012 18:34:21 +0000 (10:34 -0800)]
librados: add tmap_put to ObjectWriteOperation
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 19:03:01 +0000 (11:03 -0800)]
Merge branch 'wip-1796'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Sat, 3 Mar 2012 22:28:21 +0000 (14:28 -0800)]
mds: respawn when blacklisted
If we are blacklisted by the OSD cluster, it's because we were too slow
and were replaced by another ceph-mds. Respawn and re-register as a
standby.
If we get some other write error, shut down.
Fixes: #1796
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 3 Mar 2012 22:25:25 +0000 (14:25 -0800)]
journaler: add generic write error handler
Specify a generic callback for any write error the journaler encounters.
This is more helpful than passing up write errors to specific callers
because
- there are several of them
- journaler initiates writes on its own (like the head)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
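The idea of a single generic write-error callback can be sketched as follows. `MiniJournaler` and its method names are hypothetical and much simpler than the real Journaler; the sketch only shows that one handler sees every failed write, regardless of which caller (or the journaler itself) started it.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical miniature of a journaler with one generic error callback.
class MiniJournaler {
  std::function<void(int)> on_write_error;

public:
  void set_write_error_handler(std::function<void(int)> handler) {
    on_write_error = std::move(handler);
  }

  // Completion path shared by all writes, including ones the journaler
  // initiates on its own (such as rewriting the head).
  void finish_write(int result) {
    if (result < 0 && on_write_error)
      on_write_error(result);
  }
};
```

A daemon using this shape could register one handler that decides, based on the error code, whether to respawn or shut down.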
Sage Weil [Tue, 6 Mar 2012 18:49:18 +0000 (10:49 -0800)]
Merge remote-tracking branch 'gh/wip-2105'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 18:24:04 +0000 (10:24 -0800)]
.gitignore: src/ocf/rbd
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 17:19:32 +0000 (09:19 -0800)]
filestore: create snap_0 on mkfs
If we create a new filestore, apply one transaction, and then crash, we
want to make sure we roll back to a consistent reference point--empty. The
simplest solution is to create that snap_0 during mkfs. This avoids
strangeness like
2012-02-27 00:42:00.336703 7fb1381ef780 filestore(/ceph/osd.0) mkfs in /ceph/osd.0
2012-02-27 00:42:00.341399 7fb1381ef780 journal _open /ceph/osd.0.journal fd 10: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
2012-02-27 00:42:00.349705 7fb1381ef780 filestore(/ceph/osd.0) mkjournal created journal on /ceph/osd.0.journal
2012-02-27 00:42:00.349728 7fb1381ef780 filestore(/ceph/osd.0) mkfs done in /ceph/osd.0
2012-02-27 00:42:00.349787 7fb1381ef780 filestore(/ceph/osd.0) mount FIEMAP ioctl is NOT supported
2012-02-27 00:42:00.349800 7fb1381ef780 filestore(/ceph/osd.0) mount detected btrfs
2012-02-27 00:42:00.349813 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs CLONE_RANGE ioctl is supported
2012-02-27 00:42:00.357023 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs SNAP_CREATE is supported
2012-02-27 00:42:00.405174 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs SNAP_DESTROY is supported
2012-02-27 00:42:00.405214 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs START_SYNC got (25) Inappropriate ioctl for device
2012-02-27 00:42:00.405228 7fb1381ef780 filestore(/ceph/osd.0) mount btrfs START_SYNC is NOT supported: (25) Inappropriate ioctl for device
2012-02-27 00:42:00.405235 7fb1381ef780 filestore(/ceph/osd.0) mount WARNING: btrfs snaps enabled, but no SNAP_CREATE_V2 ioctl (from kernel 2.6.37+)
2012-02-27 00:42:00.405561 7fb1381ef780 filestore(/ceph/osd.0) mount found snaps <>
2012-02-27 00:42:00.405576 7fb1381ef780 filestore(/ceph/osd.0) mount WARNING: no consistent snaps found, store may be in inconsistent state
and subsequent badness if we fail before a proper commit is made.
Fixes: #2105
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 17:19:16 +0000 (09:19 -0800)]
filestore: drop useless read_op_seq() arg
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 17:14:25 +0000 (09:14 -0800)]
Merge pull request #9 from fghaas/ocf-ra
OCF resource agents: add rbd
Reviewed-by: Sage Weil <sage@newdream.net>
Reviewed-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Florian Haas [Tue, 6 Mar 2012 08:58:42 +0000 (09:58 +0100)]
rbd OCF RA: fix whitespace inconsistency
Signed-off-by: Florian Haas <florian@hastexo.com>
Sage Weil [Tue, 6 Mar 2012 06:48:07 +0000 (22:48 -0800)]
Merge remote branch 'gh/wip-msgr-interface'
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 6 Mar 2012 05:42:44 +0000 (21:42 -0800)]
osd: use new collection_move() operation
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:42:23 +0000 (21:42 -0800)]
filestore: implement OP_COLL_MOVE
Equivalent to OP_COLL_ADD, OP_COLL_REMOVE.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
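The stated equivalence (a move is an add to the destination followed by a remove from the source) can be sketched with an in-memory model. The `Collections` type and the three function names are hypothetical; a real object store works on on-disk collections, not sets of strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model: collection name -> object names.
using Collections = std::map<std::string, std::set<std::string>>;

// Link an object that exists in src into dst as well.
void coll_add(Collections& c, const std::string& dst,
              const std::string& src, const std::string& obj) {
  if (c[src].count(obj))
    c[dst].insert(obj);
}

// Unlink an object from one collection.
void coll_remove(Collections& c, const std::string& coll,
                 const std::string& obj) {
  c[coll].erase(obj);
}

// A move expressed as add-to-destination plus remove-from-source.
void coll_move(Collections& c, const std::string& dst,
               const std::string& src, const std::string& obj) {
  coll_add(c, dst, src, obj);
  coll_remove(c, src, obj);
}
```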
Sage Weil [Tue, 6 Mar 2012 05:41:49 +0000 (21:41 -0800)]
objectstore: OP_COLL_MOVE
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:41:05 +0000 (21:41 -0800)]
objectstore: use enum for OP_*
Enforce no dups.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:15:31 +0000 (21:15 -0800)]
objectstore: remove _fake_writes, _get_frag_stat
Also only implemented by ebofs.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:12:20 +0000 (21:12 -0800)]
filestore: drop trim_from_cache, is_cached
These were used for read optimizations in ebofs; I don't think they'll
come back.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:09:54 +0000 (21:09 -0800)]
objectstore: remove cruft
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Tue, 6 Mar 2012 05:09:05 +0000 (21:09 -0800)]
filestore: remove collection, attr faking
Useless functionality from the dark ages of development, when xattrs were
scarce.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Yehuda Sadeh [Mon, 5 Mar 2012 20:37:05 +0000 (12:37 -0800)]
rgw: make sure correct locator is used
Or more correct: locator is not used where not needed.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Sat, 3 Mar 2012 01:05:40 +0000 (17:05 -0800)]
rgw: implement copy using new scheme
For some reason the target tail uses a locator; this needs to be
fixed.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Fri, 2 Mar 2012 23:08:09 +0000 (15:08 -0800)]
rgw: don't use locator for multipart uploads
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Fri, 2 Mar 2012 22:56:22 +0000 (14:56 -0800)]
rgw: multipart object working with manifest
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Fri, 2 Mar 2012 18:10:54 +0000 (10:10 -0800)]
rgw: manifest object contains source offset info
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Fri, 2 Mar 2012 01:13:43 +0000 (17:13 -0800)]
rgw: basic functionality of new atomic get/put works
get/put of objects works. Stuff that is known to be broken:
copy object
Also, going through the code, we can probably improve object
reading (use aio). We can also keep the manifest information on
the handle so that we don't need to get_obj_state every iteration.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 1 Mar 2012 22:41:50 +0000 (14:41 -0800)]
rgw: get_obj uses manifest
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Thu, 1 Mar 2012 22:03:57 +0000 (14:03 -0800)]
rgw: atomic objects hold manifest header
When writing an object we update where all the chunks of this object
reside.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
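A manifest that records where each chunk of an object resides might look roughly like the sketch below. The `Chunk` and `Manifest` types, their fields, and the location strings are all hypothetical; the real rgw manifest encoding is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical chunk record: where the data lives and how long it is.
struct Chunk {
  std::string location;
  std::uint64_t size;
};

// Hypothetical manifest: logical start offset -> chunk.
struct Manifest {
  std::map<std::uint64_t, Chunk> chunks;

  // Record a chunk while the object is being written.
  void add(std::uint64_t ofs, std::string location, std::uint64_t size) {
    chunks[ofs] = Chunk{std::move(location), size};
  }

  // Return the chunk covering logical offset ofs, or nullptr if none does.
  const Chunk* find(std::uint64_t ofs) const {
    auto it = chunks.upper_bound(ofs);  // first chunk starting after ofs
    if (it == chunks.begin())
      return nullptr;
    --it;                               // last chunk starting at or before ofs
    return ofs < it->first + it->second.size ? &it->second : nullptr;
  }
};
```

A reader would consult `find` for each offset it needs, which is the kind of lookup that a get_obj driven by a manifest implies.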
Yehuda Sadeh [Thu, 1 Mar 2012 19:02:19 +0000 (11:02 -0800)]
rgw: atomic processor writes to shadow object
And the first chunk is going to the head object in the end
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Sage Weil [Mon, 5 Mar 2012 22:35:30 +0000 (14:35 -0800)]
Merge remote branch 'gh/wip-swift-acls'
Lightly-reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 5 Mar 2012 22:21:31 +0000 (14:21 -0800)]
osd: delay non-replayed ops during replay
If we get new (non-replayed) ops during replay, those need to wait until
after the replayed ops are ordered and applied. Otherwise we break the op
ordering completely, particularly with something like
- pg not active
- get op 1, put on waiting_for_active
- pg enters replay
- get op 2, apply immediately
- finish replay, requeue op 1
Fixes: #2082
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
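The ordering problem and its fix can be sketched with a toy model. `MiniPG` is hypothetical and ignores the replayed ops themselves; it only shows that an op arriving during replay is queued behind earlier waiting ops instead of being applied immediately.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical miniature of a PG's handling of new (non-replayed) ops.
struct MiniPG {
  enum State { INACTIVE, REPLAY, ACTIVE };
  State state = INACTIVE;
  std::deque<int> waiting;   // ops delayed until the PG is fully active
  std::vector<int> applied;  // ops in the order they were applied

  void handle_op(int op) {
    if (state != ACTIVE)
      waiting.push_back(op);  // delayed during replay too, not only when inactive
    else
      applied.push_back(op);
  }

  // Replay is done: requeue the delayed ops in arrival order.
  void finish_replay() {
    state = ACTIVE;
    for (int op : waiting)
      applied.push_back(op);
    waiting.clear();
  }
};
```

With the buggy behaviour from the commit message, op 2 would be applied during replay and op 1 afterwards; in this sketch both wait and come out as 1, 2.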
Sage Weil [Mon, 5 Mar 2012 22:21:12 +0000 (14:21 -0800)]
librados: close narrow shutdown race
timer.shutdown() will drop and retake the lock, so set DISCONNECTED first
to avoid a message slipping in and reaching the objecter like so:
INFO:teuthology.task.rados.rados.0.err:osdc/Objecter.cc: In function 'void Objecter::handle_osd_op_reply(MOSDOpReply*)' thread 7f0bc2b1b700 time 2012-03-03 18:35:25.302135
INFO:teuthology.task.rados.rados.0.err:osdc/Objecter.cc: 1151: FAILED assert(initialized)
INFO:teuthology.task.rados.rados.0.err: ceph version 0.43-46-g2e57997 (commit:2e57997894944696fcc737aae9b57e30b6bb5bdc)
INFO:teuthology.task.rados.rados.0.err: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xb3) [0x7f0bc59bd66f]
INFO:teuthology.task.rados.rados.0.err: 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x82) [0x7f0bc58e885e]
INFO:teuthology.task.rados.rados.0.err: 3: (librados::RadosClient::_dispatch(Message*)+0x66) [0x7f0bc58a2674]
INFO:teuthology.task.rados.rados.0.err: 4: (librados::RadosClient::ms_dispatch(Message*)+0x130) [0x7f0bc58a246e]
INFO:teuthology.task.rados.rados.0.err: 5: (Messenger::ms_deliver_dispatch(Message*)+0x8b) [0x7f0bc5a4e859]
INFO:teuthology.task.rados.rados.0.err: 6: (SimpleMessenger::dispatch_entry()+0x7c2) [0x7f0bc5a377fc]
INFO:teuthology.task.rados.rados.0.err: 7: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x7f0bc58b5512]
INFO:teuthology.task.rados.rados.0.err: 8: (Thread::_entry_func(void*)+0x23) [0x7f0bc5ac4c75]
INFO:teuthology.task.rados.rados.0.err: 9: (()+0x7971) [0x7f0bc5110971]
INFO:teuthology.task.rados.rados.0.err: 10: (clone()+0x6d) [0x7f0bc495092d]
Fixes: #2135
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
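The race and the fix can be sketched with a toy client. `MiniClient` and its members are hypothetical; the unlock/lock pair stands in for timer.shutdown() dropping and retaking the lock, and the order in which the real client tears down its objecter is an assumption.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of a client with a dispatch path and a shutdown path.
struct MiniClient {
  enum State { CONNECTED, DISCONNECTED };
  State state = CONNECTED;
  std::mutex lock;
  bool objecter_initialized = true;
  int delivered = 0;

  // Incoming message: drop it once we are disconnected.
  bool dispatch() {
    std::lock_guard<std::mutex> l(lock);
    if (state == DISCONNECTED)
      return false;
    assert(objecter_initialized);  // the assertion that fired in the trace
    ++delivered;
    return true;
  }

  void shutdown() {
    std::unique_lock<std::mutex> l(lock);
    state = DISCONNECTED;          // set before the lock is ever released
    objecter_initialized = false;
    l.unlock();                    // stands in for timer.shutdown() dropping the lock
    // A dispatch() that slips in here sees DISCONNECTED and returns early.
    l.lock();
  }
};
```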
Sage Weil [Mon, 5 Mar 2012 22:21:00 +0000 (14:21 -0800)]
osd: don't trust pusher's data_complete
The pusher doesn't know what clone_overlap we'll see, so it has no idea
if we are data_complete from our perspective, making this check useless.
In particular, we screw up if we race with a recalculation of
clone_overlap.
Fixes: #2133
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 5 Mar 2012 22:20:48 +0000 (14:20 -0800)]
osd: warn if recovery still has missing at end
We shouldn't get to this point. If we do, recover_primary didn't do what
it needed to. Dump the remaining missing set and hope we can debug.
Signed-off-by: Sage Weil <sage@newdream.net>
Florian Haas [Sat, 3 Mar 2012 23:40:55 +0000 (00:40 +0100)]
OCF resource agents: add rbd
Add a resource agent for mapping, unmapping and monitoring RBD devices.
Maps an RBD on start, unmaps it on stop. Checks "rbd showmapped"
output for monitoring whether the device is mapped, thus does not
rely on the ceph-rbdnamer udev magic to be enabled.
This RA is cloneable and essentially allows people to use RBD devices
as a drop-in replacement for
- iSCSI devices,
- host-based mirrored devices using md RAID-1,
- DRBD devices
in Pacemaker clusters.
Sage Weil [Sun, 4 Mar 2012 05:01:45 +0000 (21:01 -0800)]
DBObjectMap: remove stray ;
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Sat, 3 Mar 2012 22:28:55 +0000 (14:28 -0800)]
LevelDBStore: #include types.h
This fixes some compile errors on one of my boxes (squeeze).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 2 Mar 2012 22:59:51 +0000 (14:59 -0800)]
.gitignore: *.tar.bz2
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Fri, 2 Mar 2012 22:46:06 +0000 (14:46 -0800)]
msgr: start re-ordering functions into a better order
This is the start of making the SimpleMessenger interface legible
to users. In addition to moving the configuration and accessor
functions to the top of the file, it adds virtual to the functions
which are part of the defined Messenger interface.
You can tell from some of the comments that work remains.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Fri, 2 Mar 2012 21:45:03 +0000 (13:45 -0800)]
Merge branch 'stable'
Greg Farnum [Fri, 2 Mar 2012 19:08:51 +0000 (11:08 -0800)]
msgr: remove refcounting of Messengers.
This was pretty pointless since each Messenger has a well-defined
exit point and shutdown process.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Fri, 2 Mar 2012 02:48:46 +0000 (18:48 -0800)]
msgr: make nonce a required part of the SimpleMessenger constructor.
With that, remove the set_nonce function and the gratuitous passing
of nonce around through layers of functions.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>