git.apps.os.sepia.ceph.com Git - ceph.git/log
Samuel Just [Wed, 29 Feb 2012 02:02:34 +0000 (18:02 -0800)]
Added LevelDBStore
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Wed, 29 Feb 2012 02:03:18 +0000 (18:03 -0800)]
Added leveldb submodule
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Thu, 1 Mar 2012 04:28:05 +0000 (20:28 -0800)]
Makefile: make check-local relative to $(srcdir)
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Yehuda Sadeh [Wed, 29 Feb 2012 21:51:45 +0000 (13:51 -0800)]
rgw: don't check for ECANCELED in the _impl() functions
We already check it in the outer functions.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Yehuda Sadeh [Wed, 29 Feb 2012 19:34:33 +0000 (11:34 -0800)]
rgw: don't retry certain operations if we raced
The atomic get/put scheme was retrying writes in cases where it lost
races (the head object was rewritten by another client). Instead we can
just back off and return success.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
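The back-off described in the commit above can be sketched roughly as follows. This is a toy model, not the rgw code: `HeadObject`, `write_if_tag` and `put_obj` are hypothetical names, and the integer tag stands in for whatever the real head object uses to detect a concurrent rewrite.

```python
import errno


class HeadObject:
    """Hypothetical head object guarded by a version tag."""

    def __init__(self):
        self.tag = 0
        self.data = None

    def write_if_tag(self, expected_tag, data):
        # Conditional write: refuse if someone rewrote the head since we read it.
        if self.tag != expected_tag:
            return -errno.ECANCELED
        self.tag += 1
        self.data = data
        return 0


def put_obj(head, seen_tag, data):
    ret = head.write_if_tag(seen_tag, data)
    if ret == -errno.ECANCELED:
        # We lost the race to another client. Do not retry: back off and
        # report success, leaving the other writer's data in place.
        return 0
    return ret
```

One way to read the change: a lost race looks the same to the client as a write that landed and was immediately overwritten, so retrying buys nothing.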
Sage Weil [Wed, 29 Feb 2012 21:22:34 +0000 (13:22 -0800)]
msgr: fix race in learned_addr()
- two connect() threads
- both hit if (need_addr) check
- one takes lock, sets addr, need_addr = false, unlocks
- continues to ::encode(ms_addr, ...);
- meanwhile, second thread set ms_addr _again_, but copies peer port into
place before adjusting it. racing ::encode() sees bad port and sends it
to the peer.
Fix this two ways:
- don't copy bad port into place; set it first
- re-check need_addr after taking lock
Fixes: #1747
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
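Both parts of the fix above follow a common pattern, sketched here in Python rather than the messenger's C++. The class and field names are illustrative only; `peer_addr_for_me` is assumed to be the (ip, port) pair the peer reports for us, whose port must not be trusted.

```python
import threading


class Messenger:
    """Toy stand-in for a messenger that learns its own address from a peer."""

    def __init__(self, my_port):
        self.lock = threading.Lock()
        self.need_addr = True
        self.my_port = my_port
        self.ms_addr = None  # only ever holds a complete, correct address

    def learned_addr(self, peer_addr_for_me):
        if not self.need_addr:       # cheap check without the lock
            return
        with self.lock:
            if not self.need_addr:   # re-check: another thread may have won
                return
            # Build the final value first, with our own port, and only then
            # publish it. The peer's port is never stored, even briefly.
            addr = (peer_addr_for_me[0], self.my_port)
            self.ms_addr = addr
            self.need_addr = False
```

A concurrent reader of `ms_addr` then sees either nothing or the finished address, never a half-adjusted one.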
Sage Weil [Wed, 29 Feb 2012 20:28:19 +0000 (12:28 -0800)]
msgr: print existing->state before failing assert
May help with #1378.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 29 Feb 2012 19:07:03 +0000 (11:07 -0800)]
Merge remote-tracking branch 'gh/wip-2121'
Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 17:46:13 +0000 (09:46 -0800)]
osd: unregister signal handlers on shutdown
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 17:46:06 +0000 (09:46 -0800)]
mon: unregister signal handlers on shutdown
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 17:45:56 +0000 (09:45 -0800)]
mds: unregister SIGHUP too
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 17:45:46 +0000 (09:45 -0800)]
radosgw: handle SIGHUP
Fixes: #2121
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 17:23:22 +0000 (09:23 -0800)]
init-radosgw: add 'reload' command to send SIGHUP
Fixes: #2121
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 29 Feb 2012 17:21:22 +0000 (09:21 -0800)]
osd: fix typo in recovery_state query dump
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 29 Feb 2012 17:17:07 +0000 (09:17 -0800)]
osd: add missing space to scrub error
[ERR] 18.5 osd.3: soid 8a5e37ad/rb.0.0.000000002b99/headextra attr _, extra attr snapset
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Wed, 29 Feb 2012 01:30:23 +0000 (17:30 -0800)]
msgr: discard the local_pipe's queue on shutdown.
To facilitate this, we do two things:
1) actually identify the number of special code values we pass around
2) use that to prevent trying to put() those non-pointer values in
Pipe::discard_queue().
Then we just call local_pipe.discard_queue() in wait(), as happens
(indirectly, via reaping) with all the normal Pipes in rank_pipe.
But this does make me think that we may be approaching the point
where it's appropriate to create a subclass LocalPipe (against a
RemotePipe like our current Pipe implementation is mostly intended
to be).
Should fix #2086.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 29 Feb 2012 17:10:57 +0000 (09:10 -0800)]
osd: remove down OSDs from peer_info on reset
If an OSD goes down, remove it from peer_info. In particular, I saw
2012-02-28 11:04:25.851038 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3599 mlcod 0'0 peering] state<Started/Primary/Peering>: Peering advmap
2012-02-28 11:04:25.851491 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3599 mlcod 0'0 peering] PriorSet: affected_by_map osd.1 now down
...
2012-02-28 11:04:25.998186 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] PriorSet: build_prior interval(3587-3597 [3,1]/[3,1] maybe_went_rw)
2012-02-28 11:04:25.998636 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] PriorSet: build_prior prior osd.1 is down
2012-02-28 11:04:25.999106 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] PriorSet: build_prior final: probe 3,5 down 1 blocked_by {}
...
2012-02-28 11:04:26.001723 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] enter Started/Primary/Peering/GetLog
2012-02-28 11:04:26.002428 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] calc_acting osd.1 1.15( v 10'1 (0'0,10'1] n=1 ec=1 les/c 0/3587 3598/3598/3598)
2012-02-28 11:04:26.003000 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] calc_acting osd.3 1.15( v 10'1 (0'0,10'1] n=1 ec=1 les/c 0/3587 3598/3598/3598)
2012-02-28 11:04:26.003528 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] calc_acting osd.5 1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598)
2012-02-28 11:04:26.004109 12e53700 osd.5 3602 pg[1.15( empty n=0 ec=1 les/c 0/3587 3598/3598/3598) [5,3] r=0 lpr=3602 mlcod 0'0 peering] calc_acting newest update on osd.1 with 1.15( v 10'1 (0'0,10'1] n=1 ec=1 les/c 0/3587 3598/3598/3598)
Any time an osd goes down we want to ensure we remove it from peer_info.
Handling this in Reset and Started states captures all of the nested
states, which forward the event (or re-post transit to Reset). We can
also drop the Primary reaction, which is now superfluous.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Wed, 29 Feb 2012 01:04:55 +0000 (17:04 -0800)]
Merge branch 'next'
Josh Durgin [Tue, 28 Feb 2012 01:49:13 +0000 (17:49 -0800)]
mon: report pgs stuck inactive/unclean/stale in health check
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Tue, 28 Feb 2012 20:28:47 +0000 (12:28 -0800)]
mon: fix slurp_latest to fill in any missing incrementals
Fixes #1789.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Tue, 28 Feb 2012 17:33:18 +0000 (09:33 -0800)]
test_osd_types: fix unit test for new pg_t::is_split() prototype
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 28 Feb 2012 17:30:38 +0000 (09:30 -0800)]
Makefile: drop separate libjson_spirit.la
automake seems to have difficulty with the .la dependency on another .la.
Since libjson_spirit.la is only used by libcommon.la anyway, just build it
directly into that. Sigh.
...
CXXLD libjson_spirit.la
AR libmds.a
CXXLD libcls_rbd.la
CXXLD libcls_rgw.la
CXXLD cephfs
CCLD test_ioctls
CC libcommon_la-ceph_ver.lo
CXX libcommon_la-version.lo
CXX ceph_dencoder.o
CCLD mount.ceph
CC ceph_ver.o
CXX test_libhadoopcephfs_build-version.o
CXXLD test_libhadoopcephfs_build
CXXLD libcommon.la
libtool: link: cannot find the library `libjson_spirit.la' or unhandled argument `libjson_spirit.la'
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 28 Feb 2012 17:26:04 +0000 (09:26 -0800)]
osd: drop useless ENOMEM check
new throws exception; doesn't return NULL.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 28 Feb 2012 17:11:59 +0000 (09:11 -0800)]
ceph-osd: clarify error messages
So we know where the error came from. And use real error codes in init().
Signed-off-by: Sage Weil <sage@newdream.net>
Wido den Hollander [Tue, 28 Feb 2012 11:41:42 +0000 (12:41 +0100)]
init: Actually do start the daemons when 'service ceph start <type>' is specified
A bug in my previous patch prevented any daemon with auto_start set to false from starting.
This patch allows:
* /etc/init.d/ceph start osd|mds|mon
* service ceph start osd|mds|mon
It however does not start daemons if auto_start is disabled when you invoke:
* /etc/init.d/ceph start
* service ceph start
Signed-off-by: Wido den Hollander <wido@widodh.nl>
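The start-up rule described in the commit above boils down to a small decision function. The sketch below is in Python rather than the init script's shell, and `should_start` and its parameters are hypothetical names used only to restate the rule.

```python
def should_start(daemon_type, auto_start, requested_types):
    """Decide whether the init script starts one daemon.

    requested_types holds the daemon types named on the command line,
    e.g. ["osd"] for 'service ceph start osd'; it is empty for a bare
    'service ceph start'.
    """
    if requested_types:
        # An explicit request overrides auto_start.
        return daemon_type in requested_types
    # With no explicit type, only auto_start daemons come up.
    return auto_start
```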
Sage Weil [Mon, 27 Feb 2012 23:41:57 +0000 (15:41 -0800)]
doc: beginnings of documentation of stuck pgs and pg states
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 27 Feb 2012 23:13:13 +0000 (15:13 -0800)]
filestore: make less noise on ENOENT
Don't generate high-level log spam on every open error.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Greg Farnum [Mon, 27 Feb 2012 22:49:18 +0000 (14:49 -0800)]
pg: use get_cluster_inst instead of get_inst in activate
This was mistakenly broken in 4b3bb5ab37a05fa001d59f24da7d9c30d650321b
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Sam Just <sam.just@dreamhost.com>
Sage Weil [Mon, 27 Feb 2012 22:37:41 +0000 (14:37 -0800)]
Merge branch 'wip-split2'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Mon, 27 Feb 2012 22:35:21 +0000 (14:35 -0800)]
osd: pg_t::is_split(): make children out param a pointer, and optional
Also unit test it.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 27 Feb 2012 22:18:21 +0000 (14:18 -0800)]
osd: bypass split code
Until it is fully implemented. It's also disabled in the monitor
currently, but just in case it gets into the OSDMap, do nothing for now.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 00:46:03 +0000 (16:46 -0800)]
osd: fix pg locking flags
Two things we need to handle:
- callers who already hold map_lock (split_pg())
- callers who already hold another pg->lock, and want to skip the lockdep
check for this one.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 27 Feb 2012 22:04:22 +0000 (14:04 -0800)]
osd: partially refactor pg split
This partially refactors the OSD split code to do the split synchronously
when processing a new OSDMap. It is incomplete in that it does not yet
do anything useful for the PG. The full solution needs to:
- Do the split synchronously when applying the map update.
- Reset the parent pg so that it repeers. This will cause problems until
we consistently consider this a new interval when looking backwards in
time; this needs to be fixed. Anybody doing generate_past_intervals()
or similar will need to consider a split/merge event as an interval
boundary.
- The recovery state machine should trigger appropriately when this
happens.
- The old PG that was split should probably be handled identically to the
new children. That means deleting the old PG instance and creating a new
PG object for the newly-split child. Ditto for merge.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 20 Feb 2012 23:59:00 +0000 (15:59 -0800)]
osd: implement pg_t::is_split()
Test to determine if a pg has split between two pool sizes, and if so,
what its children are.
Signed-off-by: Sage Weil <sage@newdream.net>
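The split test described in the commit above can be illustrated with a brute-force sketch. It is written in Python for brevity and is not the C++ implementation: it assumes PG seeds are placed with a "stable mod" rule, and `mask_for` and the optional `children` set are illustrative choices.

```python
def stable_mod(x, b, bmask):
    """Map x into [0, b) so that growing b only moves values into new slots."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def mask_for(pg_num):
    """Smallest 2^k - 1 that covers seeds 0 .. pg_num - 1 (assumed convention)."""
    return (1 << (pg_num - 1).bit_length()) - 1


def is_split(seed, old_pg_num, new_pg_num, children=None):
    """Return True if PG `seed` gains children when pg_num grows.

    If `children` is a set, the child seeds are added to it; passing None
    skips collecting them.
    """
    assert seed < old_pg_num
    found = False
    old_mask = mask_for(old_pg_num)
    # Every seed that exists only under the new pg_num is a child of the
    # old PG it would have mapped to before the change.
    for s in range(old_pg_num, new_pg_num):
        if stable_mod(s, old_pg_num, old_mask) == seed:
            found = True
            if children is not None:
                children.add(s)
    return found
```

For example, growing a pool from 2 to 8 PGs gives seed 0 the children 2, 4 and 6, while shrinking or leaving pg_num unchanged never reports a split.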
Sage Weil [Mon, 20 Feb 2012 22:12:16 +0000 (14:12 -0800)]
osd: factor hobject key into child pgid calc during split
When we calculate the object's new pg, take the locator key into
consideration, to avoid a crash like
osd/OSD.cc: In function 'void OSD::split_pg(PG*, std::map<pg_t, PG*>&,ObjectStore::Transaction&)' thread 7fe3df8c4700 time 2012-02-20 18:22:19.900886
osd/OSD.cc: 4066: FAILED assert(child)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 27 Feb 2012 19:39:53 +0000 (11:39 -0800)]
journaler: log on unexpected objecter error
This will help with #2110, #1796, #1640.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Mon, 27 Feb 2012 17:56:21 +0000 (09:56 -0800)]
osd: fix recursive map_lock via check_replay_queue()
Also drop activate_pg() helper while we're at it, so it's clear that we
are the only user.
recursive lock of OSD::map_lock (33)
ceph version 0.42-146-g7ad35ce (commit:7ad35ce489cc5f9169eb838e1196fa2ca4d6e985)
2012-02-24 12:30:16.541416 1: (PG::lock(bool)+0x2a) [0xa09348]
2012-02-24 12:30:16.541424 2: (OSD::_lookup_lock_pg(pg_t)+0xbd) [0x84b8df]
2012-02-24 12:30:16.541431 3: (OSD::activate_pg(pg_t, utime_t)+0x9f) [0x87463b]
2012-02-24 12:30:16.541442 4: (OSD::check_replay_queue()+0x12f) [0x87452d]
2012-02-24 12:30:16.541450 5: (OSD::tick()+0x23c) [0x8535ea]
2012-02-24 12:30:16.541456 6: (OSD::C_Tick::finish(int)+0x1f) [0x881671]
2012-02-24 12:30:16.541462 7: (SafeTimer::timer_thread()+0x2d5) [0x8f8211]
2012-02-24 12:30:16.541468 8: (SafeTimerThread::entry()+0x1c) [0x8f923c]
2012-02-24 12:30:16.541475 9: (Thread::_entry_func(void*)+0x23) [0x9c8109]
2012-02-24 12:30:16.541485 10: (()+0x68ba) [0x7f9dbed838ba]
2012-02-24 12:30:16.541491 11: (clone()+0x6d) [0x7f9dbd66f02d]
2012-02-24 12:30:16.541495 common/lockdep.cc: In function 'int lockdep_will_lock(const char*, int)' thread 7f9db9d98700 time 2012-02-24 12:30:16.541504
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Sam Just <samuel.just@dreamhost.com>
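The kind of check that produces a "recursive lock" report like the one in the trace above can be sketched with a small wrapper. This is a Python illustration of the idea only, not lockdep itself; `CheckedLock` is a hypothetical name.

```python
import threading


class CheckedLock:
    """Non-recursive lock that fails loudly when its owner re-acquires it."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            # Without this check the thread would deadlock against itself.
            raise RuntimeError("recursive lock of %s" % self.name)
        self._lock.acquire()
        self._owner = me

    def release(self):
        self._owner = None
        self._lock.release()
```

In the trace, the timer thread already held map_lock in tick() and reached a second acquisition through check_replay_queue(), which is the situation the wrapper rejects.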
Sage Weil [Mon, 27 Feb 2012 04:56:05 +0000 (20:56 -0800)]
init-ceph: stick with /var/run for the time being
/run isn't present on older systems. Stick with the old location until it
is more pervasive, or we add an autoconf option to control it.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:47:53 +0000 (20:47 -0800)]
debian: /var/run/ceph -> /run/ceph
/run/ceph should exist for creating UNIX domain sockets
ceph uses UNIX domain sockets for internal communication. Create their
directory on startup as /run is on a virtual filesystem.
Last-Update: <2012-02-26>
Bug-Debian: http://bugs.debian.org/660238
Forwarded: <ceph-devel@vger.kernel.org>
Signed-off-by: Laszlo Boszormenyi (GCS) <gcs@debian.hu>
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:45:52 +0000 (20:45 -0800)]
debian: build-{indep,arch}
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:45:06 +0000 (20:45 -0800)]
debian: sdparm|hdparm, new standards version
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
Yehuda Sadeh [Sat, 25 Feb 2012 01:00:35 +0000 (17:00 -0800)]
rgw: initialize bucket_id in bucket structure
might make valgrind a little bit less noisy.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Sage Weil [Fri, 24 Feb 2012 23:23:44 +0000 (15:23 -0800)]
rgw: _exit(0) on SIGTERM
We need to do something a bit smarter to get coverage information, but this
is a start.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 21:52:32 +0000 (13:52 -0800)]
Merge remote branch 'gh/wip-crush-adjust'
Reviewed-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 21:48:06 +0000 (13:48 -0800)]
Merge remote branch 'gh/wip-mds-resetter'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 21:43:43 +0000 (13:43 -0800)]
Merge branch 'wip-pg-query'
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 21:22:49 +0000 (13:22 -0800)]
Merge branch 'stable'
Sage Weil [Fri, 24 Feb 2012 20:59:53 +0000 (12:59 -0800)]
v0.42.2
Sage Weil [Fri, 24 Feb 2012 21:00:33 +0000 (13:00 -0800)]
Merge remote-tracking branch 'gh/stable' into stable
Sage Weil [Fri, 24 Feb 2012 20:54:41 +0000 (12:54 -0800)]
Merge branch 'stable'
Sage Weil [Fri, 24 Feb 2012 20:40:34 +0000 (12:40 -0800)]
osd: fix array index
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 20:39:44 +0000 (12:39 -0800)]
lockdep: don't make noise on startup
Who cares!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 20:38:13 +0000 (12:38 -0800)]
formatter: fix trailing dump_stream()
Flush a previous dump_stream() if it was the last thing prior to a
close_section().
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 20:04:29 +0000 (12:04 -0800)]
osd: include timestamps in state json dumps
Include the time we entered this state in the dump.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 20:00:00 +0000 (12:00 -0800)]
Merge branch 'wip-2007'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 19:59:20 +0000 (11:59 -0800)]
osd: use blocks for readability in list_missing
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 19:33:48 +0000 (11:33 -0800)]
osd: dump recovery_state states in json
Use a formatter. Present a vector of states, inner to outer.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 00:30:42 +0000 (16:30 -0800)]
osd: query Peering substates
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 00:22:08 +0000 (16:22 -0800)]
osd: query recovery state machine
For now, just append this to the end of the pg <pgid> query json dump.
We definitely want to do something smarter here, but I'm not sure whether
json or plaintext is the way to go.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 17:27:49 +0000 (09:27 -0800)]
osd: add tunable for number of records in osd command replies
e.g., 'pg <pgid> list_missing [offset]'.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 04:30:44 +0000 (20:30 -0800)]
osd: 'pg <pgid> list_missing <json hobject_t offset>'
Dump missing objects in json. If the 'more' key is non-zero, the user should
ask for more by passing the last object as the offset for the next request.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
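The paging protocol described in the commit above can be sketched as a client loop. Only the 'more' key and the "last object as offset" rule come from the commit message; the `objects` key, the in-memory `list_missing` stand-in and `max_records` (assumed to be at least 1) are hypothetical.

```python
def list_missing(missing, offset, max_records):
    """Hypothetical server side: one page of a sorted list, starting after offset."""
    start = 0 if offset is None else missing.index(offset) + 1
    page = missing[start:start + max_records]
    more = 1 if start + max_records < len(missing) else 0
    return {"objects": page, "more": more}


def fetch_all(missing, max_records):
    """Client side: keep asking until the reply says there is no more."""
    result, offset = [], None
    while True:
        reply = list_missing(missing, offset, max_records)
        result.extend(reply["objects"])
        if not reply["more"]:
            return result
        # Resume after the last object we were given.
        offset = reply["objects"][-1]
```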
Sage Weil [Fri, 24 Feb 2012 14:07:40 +0000 (06:07 -0800)]
hobject_t: decode json
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 14:04:14 +0000 (06:04 -0800)]
add libjson_spirit.la
This is lightweight and relies on boost spirit, which we already use, so
there are no new dependencies.
There were some other libraries that also looked good, but they weren't
already packaged for existing Debian distros like squeeze or even wheezy.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 04:16:05 +0000 (20:16 -0800)]
osd: pass in data to do_command
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 19:24:04 +0000 (11:24 -0800)]
osd: 'tell osd.N mark_unfound_lost revert' -> 'pg <pgid> mark_unfound_lost revert'
More consistent interface.
Fixes: #2030
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 15:06:51 +0000 (07:06 -0800)]
lockdep: warn on stderr (via derr), not stdout
Otherwise we screw up ceph-conf output and the like.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 17:44:05 +0000 (09:44 -0800)]
do_autogen.sh: -T for --without-tcmalloc
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 24 Feb 2012 02:58:35 +0000 (18:58 -0800)]
ceph: fix help.t
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 02:46:30 +0000 (18:46 -0800)]
v0.42.1
Sage Weil [Tue, 21 Feb 2012 19:12:37 +0000 (11:12 -0800)]
debian: add ceph-dencoder
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 19:12:30 +0000 (11:12 -0800)]
ceph.spec.in: add ceph-dencoder
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 21 Feb 2012 19:12:13 +0000 (11:12 -0800)]
ceph-dencoder: man page
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Fri, 24 Feb 2012 02:13:29 +0000 (18:13 -0800)]
ceph-tool: remove reference to "stop" command
This doesn't exist any more, and I don't think it
ever "cleanly shut down the filesystem" -- certainly not
within my recent lifetime!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Dan Mick <dan.mick@dreamhost.com>
Greg Farnum [Thu, 23 Feb 2012 23:40:20 +0000 (15:40 -0800)]
mds: remove unused MDBalancer dump_pop_map() function.
Commenting it out is not the right answer. ;)
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Dan Mick <dan.mick@dreamhost.com>
Sage Weil [Fri, 24 Feb 2012 00:35:22 +0000 (16:35 -0800)]
mds: clean up useless block
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 05:15:19 +0000 (21:15 -0800)]
mds: fix Resetter locking
We need to hold the lock for ms_dispatch, esp calls into objecter. We
should only drop it when blocking; use distinct naming for the on-stack
mutex used for that.
Reported-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Thu, 23 Feb 2012 23:33:39 +0000 (15:33 -0800)]
Merge remote branch 'origin/wip-mds-old-inodes'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 23 Feb 2012 23:06:32 +0000 (15:06 -0800)]
Merge remote branch 'origin/wip-dencoder'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Thu, 23 Feb 2012 23:06:15 +0000 (15:06 -0800)]
Merge remote branch 'origin/wip-1820'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 23:05:46 +0000 (15:05 -0800)]
osd: only set CLEAN when we are not remapped (up == acting)
If we have a temporary mapping for this PG, consider that unclean. This
makes CLEAN and REMAPPED mutually exclusive. For example, a 2 node cluster
with 2x replication and one osd marked out will make the pgs all
active+remapped, not active+clean+remapped.
Fixes: #2094
Signed-off-by: Sage Weil <sage@newdream.net>
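The rule in the commit above, that CLEAN and REMAPPED are mutually exclusive, can be restated as a small function. This Python sketch models only the three flags mentioned; `pg_state` and `fully_recovered` are hypothetical names.

```python
def pg_state(up, acting, fully_recovered):
    """Build a state string from the up and acting OSD sets."""
    flags = ["active"]
    remapped = up != acting  # a temporary mapping is in effect
    if remapped:
        flags.append("remapped")
    if fully_recovered and not remapped:
        # Only a PG served by its intended (up) set counts as clean.
        flags.append("clean")
    return "+".join(flags)
```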
Sage Weil [Thu, 23 Feb 2012 22:56:54 +0000 (14:56 -0800)]
Merge remote-tracking branch 'gh/wip-pg-query'
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Greg Farnum [Thu, 23 Feb 2012 22:55:48 +0000 (14:55 -0800)]
osd: conditionally encode old pg_pool_t when no CEPH_FEATURE_OSDENC
This fixes OSDMap compatibility between v0.42 and <v0.42.
For MOSDMap, reencode maps if OSDENC feature is missing. Also rev the
message version. We don't use COMPAT version here because v3 can't be
understood by v2 (that's why we're checking feature bits). (It will be
possible to do that later when our constituent types can be decoded by
multiple versions.)
Fixes: #2095
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 22:38:03 +0000 (14:38 -0800)]
Merge remote-tracking branch 'gh/wip-dump-ops-in-flight'
Reviewed-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 22:24:12 +0000 (14:24 -0800)]
mon: use pending_mdsmap for deactivate
We should always look at the proposed map to avoid weird races.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 22:27:34 +0000 (14:27 -0800)]
doc: 'deactivate mds' instead of 'stop mds'
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 20:16:59 +0000 (12:16 -0800)]
mon: mds "stop" -> "deactivate"
See #1820.
Signed-off-by: Sage Weil <sage@newdream.net>
Greg Farnum [Thu, 23 Feb 2012 20:11:27 +0000 (12:11 -0800)]
test: add basic test for the OSD's dump_ops_in_flight adminsocket command
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum [Wed, 15 Feb 2012 02:53:49 +0000 (18:53 -0800)]
osd: add "dump_ops_in_flight" to the AdminSocket.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 20:08:52 +0000 (12:08 -0800)]
mon: refuse to stop mds if max_mds will make it rejoin
Otherwise the MDS will leave the cluster and immediately rejoin, which is
useless and confusing to users. See #1820.
Signed-off-by: Sage Weil <sage@newdream.net>
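The guard described in the commit above might look like the following check. The exact comparison used by the monitor is an assumption here: the sketch refuses whenever removing one active MDS would leave fewer than max_mds, since the departing daemon would then be pulled straight back in.

```python
def can_deactivate(num_active_mds, max_mds):
    """Return (allowed, reason) for a request to deactivate one MDS.

    Assumed rule: after the MDS leaves, num_active_mds - 1 daemons remain;
    if that is below max_mds the cluster would immediately refill the slot.
    """
    if num_active_mds - 1 < max_mds:
        return (False, "decrease max_mds first, or the MDS will rejoin")
    return (True, "ok")
```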
Sage Weil [Thu, 23 Feb 2012 19:42:08 +0000 (11:42 -0800)]
crushtool: add --reweight-item cli tests
Test list, tree, and straw buckets.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 19:03:44 +0000 (11:03 -0800)]
crush: fix weight adjust for list, tree buckets
Fix the typo. Code now matches that for straw buckets.
Reported-by: ZhuRongze <zrz4ceph@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 19:16:17 +0000 (11:16 -0800)]
Merge branch 'wip-2090'
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 04:49:04 +0000 (20:49 -0800)]
mon: unlock mon before msgr shutdown
The ceph_mon.cc main() will delete mon when the msgr dispatch thread
completes. Make sure we unlock before we shut down the messenger, and
avoid touching this after messenger->shutdown().
Fixes: #2090
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 04:43:20 +0000 (20:43 -0800)]
mon: deprecate mon 'stop' command
Send SIGTERM.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 04:37:40 +0000 (20:37 -0800)]
msgr: join dispatch_thread after it completes
This is just for completeness. No change in behavior, since we don't
get here until the thread has signaled it is done.
Drop the destroy() overload, since we join earlier.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 19:04:30 +0000 (11:04 -0800)]
Merge remote-tracking branch 'gh/wip-stop'
Sage Weil [Thu, 23 Feb 2012 17:51:31 +0000 (09:51 -0800)]
filestore: use IOC_CLONERANGE instead of IOC_CLONE ioctl
This is functionally equivalent, except that valgrind doesn't complain
about a bad pointer passed to an ioctl.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Thu, 23 Feb 2012 17:43:03 +0000 (09:43 -0800)]
osd: drop "stop" command
Send SIGINT.
Fixes: #1820
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 17:42:11 +0000 (09:42 -0800)]
osd: drop unused "stop" check
This is never reached: both callers handle "stop" explicitly.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 23 Feb 2012 17:39:50 +0000 (09:39 -0800)]
osd: don't complete recovery if unfound
Otherwise we fail the !needs_recovery() assert, because we aren't
recovered. For example,
2012-02-21 16:16:13.104665 1685c700 osd.5 1217 pg[0.16( v 1215'337 lc 19'2 (0'0,1215'337] n=25 ec=1 les/c 0/1061 1210/1210/1210) [5,3] r=0 lpr=1210 mlcod 0'0 active m=23 u=23 snaptrimq=[1~99,9b~e,aa~72,11d~3d,15b~e,16a~f,17a~5,180~4,185~1a,1a0~a,1ac~10,1bd~4,1c2~8,1cb~1,1cd~1,1cf~1a,1ea~10,1fb~6,202~2,205~2,209~2,20c~8,215~2,218~5,21e~1,220~1,222~9,22c~4,231~3,235~2,238~3,23e~2,241~4,246~1,248~1,24b~1,24d~9,257~6,25e~1,263~1,265~2,268~3,26e~1,273~1,275~5,27e~1,280~2]] needs_recovery osd.3 has 23 missing
osd/PG.cc: In function 'boost::statechart::result PG::RecoveryState::Active::react(const PG::RecoveryState::RecoveryComplete&)' thread 1685c700 time 2012-02-21 16:16:13.108923
osd/PG.cc: 4070: FAILED assert(!pg->needs_recovery())
ceph version 0.42-70-g0e4367a (commit:0e4367aaac88b99c36386b6ce5e8d816fdd4ada0)
1: (PG::RecoveryState::Active::react(PG::RecoveryState::RecoveryComplete const&)+0x1b3) [0x6a1173]
2: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x121) [0x6c7301]
3: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x6bfc6b]
4: (PG::RecoveryState::handle_recovery_complete(PG::RecoveryCtx*)+0x10c) [0x67c03c]
5: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x241) [0x4f83c1]
6: (OSD::do_recovery(PG*)+0x345) [0x54b3e5]
7: (ThreadPool::worker()+0xa26) [0x619e66]
8: (ThreadPool::WorkThread::entry()+0xd) [0x57ad5d]
9: (()+0x7971) [0x5037971]
10: (clone()+0x6d) [0x679f92d]
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>