git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoosd: drop useless ENOMEM check
Sage Weil [Tue, 28 Feb 2012 17:26:04 +0000 (09:26 -0800)]
osd: drop useless ENOMEM check

new throws exception; doesn't return NULL.

Signed-off-by: Sage Weil <sage@newdream.net>
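The point about new can be illustrated with a small sketch. The function names below are hypothetical and are not taken from the OSD code; the block only contrasts plain new, which reports failure by throwing std::bad_alloc, with the std::nothrow form, which is the one that can return a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Plain `new` signals failure by throwing std::bad_alloc, so a NULL
// check after it can never fire. Returns true if allocation succeeded.
bool allocate_with_throwing_new(std::size_t n) {
  assert(n > 0);
  try {
    char *buf = new char[n];  // never yields NULL; throws on failure
    delete[] buf;
    return true;
  } catch (const std::bad_alloc &) {
    return false;
  }
}

// The nothrow form is the variant where a null check is meaningful.
bool allocate_with_nothrow_new(std::size_t n) {
  assert(n > 0);
  char *buf = new (std::nothrow) char[n];
  if (buf == nullptr)
    return false;
  delete[] buf;
  return true;
}
```

An ENOMEM branch guarded by a null check after plain new is therefore unreachable, which appears to be why the check could be dropped.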
13 years agodoc: beginnings of documentation of stuck pgs and pg states
Sage Weil [Mon, 27 Feb 2012 23:41:57 +0000 (15:41 -0800)]
doc: beginnings of documentation of stuck pgs and pg states

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-split2'
Sage Weil [Mon, 27 Feb 2012 22:37:41 +0000 (14:37 -0800)]
Merge branch 'wip-split2'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: pg_t::is_split(): make children out param a pointer, and optional
Sage Weil [Mon, 27 Feb 2012 22:35:21 +0000 (14:35 -0800)]
osd: pg_t::is_split(): make children out param a pointer, and optional

Also unit test it.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: bypass split code
Sage Weil [Mon, 27 Feb 2012 22:18:21 +0000 (14:18 -0800)]
osd: bypass split code

Until it is fully implemented.  It's also disabled in the monitor
currently, but just in case it gets into the OSDMap, do nothing for now.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix pg locking flags
Sage Weil [Tue, 21 Feb 2012 00:46:03 +0000 (16:46 -0800)]
osd: fix pg locking flags

Two things we need to handle:

 - callers who already hold map_lock (split_pg())
 - callers who already hold another pg->lock, and want to skip the lockdep
   check for this one.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: partially refactor pg split
Sage Weil [Mon, 27 Feb 2012 22:04:22 +0000 (14:04 -0800)]
osd: partially refactor pg split

This partially refactors the OSD split code to do the split synchronously
when processing a new OSDMap.  It is incomplete in that it does not yet
do anything useful for the PG.  The full solution needs to:

- Do the split synchronously when applying the map update.
- Reset the parent pg so that it repeers.  This will cause problems until
  we consistently consider this a new interval when looking backwards in
  time; this needs to be fixed.  Anybody doing generate_past_intervals()
  or similar will need to consider a split/merge event as an interval
  boundary.
- The recovery state machine should trigger appropriately when this
  happens.
- The old PG that was split should probably be handled identically to the
  new children.  That means deleting the old PG instance and creating a new
  PG object for the newly-split child.  Ditto for merge.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: implement pg_t::is_split()
Sage Weil [Mon, 20 Feb 2012 23:59:00 +0000 (15:59 -0800)]
osd: implement pg_t::is_split()

Test to determine if a pg has split between two pool sizes, and if so,
what its children are.

Signed-off-by: Sage Weil <sage@newdream.net>
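A simplified sketch of such a split test follows. It is not the pg_t::is_split() implementation: the function name, the signature, and the assumption that both pg counts are powers of two (so an object hash h lands in pg h & (pg_num - 1)) are all introduced here for illustration. The children argument is an optional pointer, so callers that only need the yes/no answer can pass nullptr.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a split test on a pg seed.
// Assumes old_pg_num and new_pg_num are powers of two.
bool is_split_sketch(unsigned seed, unsigned old_pg_num, unsigned new_pg_num,
                     std::set<unsigned> *children /* may be nullptr */) {
  assert(seed < old_pg_num);
  bool split = false;
  // With power-of-two counts, objects in `seed` can only move to
  // seed + k * old_pg_num for k >= 1, as long as that is below new_pg_num.
  for (unsigned child = seed + old_pg_num; child < new_pg_num;
       child += old_pg_num) {
    split = true;
    if (children)
      children->insert(child);
  }
  return split;
}
```

Under these assumptions, growing a pool from 4 to 16 pgs splits seed 1 into children 5, 9 and 13, while an unchanged pg count reports no split.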
13 years agoosd: factor hobject key into child pgid calc during split
Sage Weil [Mon, 20 Feb 2012 22:12:16 +0000 (14:12 -0800)]
osd: factor hobject key into child pgid calc during split

When we calculate the object's new pg, take the locator key into
consideration, to avoid a crash like

osd/OSD.cc: In function 'void OSD::split_pg(PG*, std::map<pg_t, PG*>&,ObjectStore::Transaction&)' thread 7fe3df8c4700 time 2012-02-20 18:22:19.900886
osd/OSD.cc: 4066: FAILED assert(child)

Signed-off-by: Sage Weil <sage@newdream.net>
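As a rough illustration of why the locator key matters, the sketch below hashes the key when one is set and the object name otherwise. ObjectRefSketch and placement_seed_sketch are hypothetical names, and std::hash is only a stand-in for whatever hash the real code uses.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical object descriptor; `key` is an optional locator key.
struct ObjectRefSketch {
  std::string name;
  std::string key;
};

// Placement hashes the locator key when present, else the object name.
// If the split code hashed only the name, an object stored under a key
// could be assigned to a child pg that was never created.
unsigned placement_seed_sketch(const ObjectRefSketch &o, unsigned pg_num) {
  assert(pg_num > 0);
  const std::string &input = o.key.empty() ? o.name : o.key;
  return static_cast<unsigned>(std::hash<std::string>{}(input) % pg_num);
}
```

Two objects with different names but the same key land in the same pg under this scheme, before and after a split.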
13 years agojournaler: log on unexpected objecter error
Sage Weil [Mon, 27 Feb 2012 19:39:53 +0000 (11:39 -0800)]
journaler: log on unexpected objecter error

This will help with #2110, #1796, #1640.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix recursive map_lock via check_replay_queue()
Sage Weil [Mon, 27 Feb 2012 17:56:21 +0000 (09:56 -0800)]
osd: fix recursive map_lock via check_replay_queue()

Also drop activate_pg() helper while we're at it, so it's clear that we
are the only user.

recursive lock of OSD::map_lock (33)
 ceph version 0.42-146-g7ad35ce (commit:7ad35ce489cc5f9169eb838e1196fa2ca4d6e985)
2012-02-24 12:30:16.541416 1: (PG::lock(bool)+0x2a) [0xa09348]
2012-02-24 12:30:16.541424 2: (OSD::_lookup_lock_pg(pg_t)+0xbd) [0x84b8df]
2012-02-24 12:30:16.541431 3: (OSD::activate_pg(pg_t, utime_t)+0x9f) [0x87463b]
2012-02-24 12:30:16.541442 4: (OSD::check_replay_queue()+0x12f) [0x87452d]
2012-02-24 12:30:16.541450 5: (OSD::tick()+0x23c) [0x8535ea]
2012-02-24 12:30:16.541456 6: (OSD::C_Tick::finish(int)+0x1f) [0x881671]
2012-02-24 12:30:16.541462 7: (SafeTimer::timer_thread()+0x2d5) [0x8f8211]
2012-02-24 12:30:16.541468 8: (SafeTimerThread::entry()+0x1c) [0x8f923c]
2012-02-24 12:30:16.541475 9: (Thread::_entry_func(void*)+0x23) [0x9c8109]
2012-02-24 12:30:16.541485 10: (()+0x68ba) [0x7f9dbed838ba]
2012-02-24 12:30:16.541491 11: (clone()+0x6d) [0x7f9dbd66f02d]
2012-02-24 12:30:16.541495 common/lockdep.cc: In function 'int lockdep_will_lock(const char*, int)' thread 7f9db9d98700 time 2012-02-24 12:30:16.541504

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Sam Just <samuel.just@dreamhost.com>
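The kind of check that produces a "recursive lock" report like the one above can be sketched as follows. This is a single-threaded toy, not the lockdep code in common/lockdep.cc; the class and method names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Minimal sketch of a lock tracker keyed by lock name.
class LockTrackerSketch {
  std::set<std::string> held_;

public:
  // Returns false if `name` is already held, i.e. a recursive acquisition.
  bool will_lock(const std::string &name) {
    return held_.insert(name).second;
  }

  // Records that `name` was released; it must currently be held.
  void unlocked(const std::string &name) {
    assert(held_.count(name) == 1);
    held_.erase(name);
  }
};
```

A caller that already holds OSD::map_lock and reaches a helper that takes it again would get false from the second will_lock() call.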
13 years agoinit-ceph: stick with /var/run for the time being
Sage Weil [Mon, 27 Feb 2012 04:56:05 +0000 (20:56 -0800)]
init-ceph: stick with /var/run for the time being

/run isn't present on older systems.  Stick with the old location until it
is more pervasive, or we add an autoconf option to control it.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agodebian: /var/run/ceph -> /run/ceph
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:47:53 +0000 (20:47 -0800)]
debian: /var/run/ceph -> /run/ceph

/run/ceph should exist for creating UNIX domain sockets.
ceph uses UNIX domain sockets for internal communication. Create their
directory on startup as /run is on a virtual filesystem.

Last-Update: <2012-02-26>
Bug-Debian: http://bugs.debian.org/660238
Forwarded: <ceph-devel@vger.kernel.org>
Signed-off-by: Laszlo Boszormenyi (GCS) <gcs@debian.hu>
13 years agodebian: build-{indep,arch}
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:45:52 +0000 (20:45 -0800)]
debian: build-{indep,arch}

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years agodebian: sdparm|hdparm, new standards version
Laszlo Boszormenyi [Mon, 27 Feb 2012 04:45:06 +0000 (20:45 -0800)]
debian: sdparm|hdparm, new standards version

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years agorgw: initialize bucket_id in bucket structure
Yehuda Sadeh [Sat, 25 Feb 2012 01:00:35 +0000 (17:00 -0800)]
rgw: initialize bucket_id in bucket structure

might make valgrind a little bit less noisy.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: _exit(0) on SIGTERM
Sage Weil [Fri, 24 Feb 2012 23:23:44 +0000 (15:23 -0800)]
rgw: _exit(0) on SIGTERM

We need to do something a bit smarter to get coverage information, but this
is a start.

Signed-off-by: Sage Weil <sage@newdream.net>
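A minimal sketch of exiting on SIGTERM, assuming a POSIX system. The handler and installer names are hypothetical and this is not the rgw code. _exit() is async-signal-safe, but it ends the process immediately without running atexit handlers or flushing stdio buffers.

```cpp
#include <cassert>
#include <csignal>
#include <unistd.h>

// Signal handler: terminate immediately with status 0.
extern "C" void exit_on_sigterm_sketch(int signum) {
  (void)signum;
  _exit(0);
}

// Installs the handler; returns false if signal() reported an error.
bool install_sigterm_handler_sketch() {
  return std::signal(SIGTERM, exit_on_sigterm_sketch) != SIG_ERR;
}
```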
13 years agoMerge remote branch 'gh/wip-crush-adjust'
Sage Weil [Fri, 24 Feb 2012 21:52:32 +0000 (13:52 -0800)]
Merge remote branch 'gh/wip-crush-adjust'

Reviewed-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoMerge remote branch 'gh/wip-mds-resetter'
Sage Weil [Fri, 24 Feb 2012 21:48:06 +0000 (13:48 -0800)]
Merge remote branch 'gh/wip-mds-resetter'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge branch 'wip-pg-query'
Sage Weil [Fri, 24 Feb 2012 21:43:43 +0000 (13:43 -0800)]
Merge branch 'wip-pg-query'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoMerge branch 'stable'
Sage Weil [Fri, 24 Feb 2012 21:22:49 +0000 (13:22 -0800)]
Merge branch 'stable'

13 years agov0.42.2 v0.42.2
Sage Weil [Fri, 24 Feb 2012 20:59:53 +0000 (12:59 -0800)]
v0.42.2

13 years agoMerge remote-tracking branch 'gh/stable' into stable
Sage Weil [Fri, 24 Feb 2012 21:00:33 +0000 (13:00 -0800)]
Merge remote-tracking branch 'gh/stable' into stable

13 years agoMerge branch 'stable'
Sage Weil [Fri, 24 Feb 2012 20:54:41 +0000 (12:54 -0800)]
Merge branch 'stable'

13 years agoosd: fix array index
Sage Weil [Fri, 24 Feb 2012 20:40:34 +0000 (12:40 -0800)]
osd: fix array index

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolockdep: don't make noise on startup
Sage Weil [Fri, 24 Feb 2012 20:39:44 +0000 (12:39 -0800)]
lockdep: don't make noise on startup

Who cares!

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoformatter: fix trailing dump_stream()
Sage Weil [Fri, 24 Feb 2012 20:38:13 +0000 (12:38 -0800)]
formatter: fix trailing dump_stream()

Flush a previous dump_stream() if it was the last thing prior to a
close_section().

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
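The flush-on-close behaviour can be sketched with a toy formatter. TinyFormatterSketch is hypothetical, handles a single flat section only, and is not Ceph's Formatter class. A streamed value is buffered until the next call, so close_section() has to flush it or the last value would be lost.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class TinyFormatterSketch {
  std::ostringstream out_;      // finished output
  std::ostringstream pending_;  // value the caller is still streaming
  std::string pending_name_;
  bool has_pending_ = false;
  bool need_comma_ = false;

  // Emit the buffered streamed value, if there is one.
  void flush_pending() {
    if (!has_pending_)
      return;
    if (need_comma_)
      out_ << ",";
    out_ << "\"" << pending_name_ << "\":\"" << pending_.str() << "\"";
    pending_.str("");
    has_pending_ = false;
    need_comma_ = true;
  }

public:
  void open_section() {
    out_ << "{";
    need_comma_ = false;
  }

  // The caller writes the value into the returned stream afterwards.
  std::ostream &dump_stream(const std::string &name) {
    flush_pending();
    pending_name_ = name;
    has_pending_ = true;
    return pending_;
  }

  // Flush a trailing dump_stream() before closing the section.
  void close_section() {
    flush_pending();
    out_ << "}";
  }

  std::string str() const { return out_.str(); }
};
```

Without the flush_pending() call in close_section(), the last streamed field of a section would be dropped from the output.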
13 years agoosd: include timestamps in state json dumps
Sage Weil [Fri, 24 Feb 2012 20:04:29 +0000 (12:04 -0800)]
osd: include timestamps in state json dumps

Include the time we entered this state in the dump.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'wip-2007'
Sage Weil [Fri, 24 Feb 2012 20:00:00 +0000 (12:00 -0800)]
Merge branch 'wip-2007'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosd: use blocks for readability in list_missing
Sage Weil [Fri, 24 Feb 2012 19:59:20 +0000 (11:59 -0800)]
osd: use blocks for readability in list_missing

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: dump recovery_state states in json
Sage Weil [Fri, 24 Feb 2012 19:33:48 +0000 (11:33 -0800)]
osd: dump recovery_state states in json

Use a formatter.  Present a vector of states, inner to outer.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: query Peering substates
Sage Weil [Fri, 24 Feb 2012 00:30:42 +0000 (16:30 -0800)]
osd: query Peering substates

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: query recovery state machine
Sage Weil [Fri, 24 Feb 2012 00:22:08 +0000 (16:22 -0800)]
osd: query recovery state machine

For now, just append this to the end of the pg <pgid> query json dump.
We definitely want to do something smarter here, but I'm not sure whether
json or plaintext is the way to go.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: add tunable for number of records in osd command replies
Sage Weil [Fri, 24 Feb 2012 17:27:49 +0000 (09:27 -0800)]
osd: add tunable for number of records in osd command replies

e.g., 'pg <pgid> list_missing [offset]'.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: 'pg <pgid> list_missing <json hobject_t offset>'
Sage Weil [Fri, 24 Feb 2012 04:30:44 +0000 (20:30 -0800)]
osd: 'pg <pgid> list_missing <json hobject_t offset>'

Dump missing objects in json.  If the 'more' key is non-zero, the user
should ask for more by passing the last object as the offset for the next
request.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
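The paging scheme described above can be sketched like this. The names are hypothetical, plain strings stand in for hobject_t, and the offset is assumed to be exclusive (the page starts after it); max_records stands in for a per-reply record limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct MissingPageSketch {
  std::vector<std::string> objects;
  bool more = false;  // true when entries remain past this page
};

// `missing` must be sorted. Returns entries strictly after `offset`,
// at most `max_records` of them.
MissingPageSketch list_missing_sketch(const std::vector<std::string> &missing,
                                      const std::string &offset,
                                      std::size_t max_records) {
  assert(std::is_sorted(missing.begin(), missing.end()));
  MissingPageSketch page;
  auto it = std::upper_bound(missing.begin(), missing.end(), offset);
  for (; it != missing.end() && page.objects.size() < max_records; ++it)
    page.objects.push_back(*it);
  page.more = (it != missing.end());
  return page;
}
```

A client would loop, passing page.objects.back() as the next offset for as long as page.more is set.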
13 years agohobject_t: decode json
Sage Weil [Fri, 24 Feb 2012 14:07:40 +0000 (06:07 -0800)]
hobject_t: decode json

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoadd libjson_spirit.la
Sage Weil [Fri, 24 Feb 2012 14:04:14 +0000 (06:04 -0800)]
add libjson_spirit.la

This is lightweight and relies on boost spirit, which we already use, so
there are no new dependencies.

There were some other libraries that also looked good, but they weren't
already packaged for existing Debian distros like squeeze or even wheezy.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: pass in data to do_command
Sage Weil [Fri, 24 Feb 2012 04:16:05 +0000 (20:16 -0800)]
osd: pass in data to do_command

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: 'tell osd.N mark_unfound_lost revert' -> 'pg <pgid> mark_unfound_lost revert'
Sage Weil [Fri, 24 Feb 2012 19:24:04 +0000 (11:24 -0800)]
osd: 'tell osd.N mark_unfound_lost revert' -> 'pg <pgid> mark_unfound_lost revert'

More consistent interface.

Fixes: #2030
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agolockdep: warn on stderr (via derr), not stdout
Sage Weil [Fri, 24 Feb 2012 15:06:51 +0000 (07:06 -0800)]
lockdep: warn on stderr (via derr), not stdout

Otherwise we screw up ceph-conf output and the like.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agodo_autogen.sh: -T for --without-tcmalloc
Sage Weil [Thu, 23 Feb 2012 17:44:05 +0000 (09:44 -0800)]
do_autogen.sh: -T for --without-tcmalloc

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph: fix help.t
Sage Weil [Fri, 24 Feb 2012 02:58:35 +0000 (18:58 -0800)]
ceph: fix help.t

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agov0.42.1 v0.42.1
Sage Weil [Fri, 24 Feb 2012 02:46:30 +0000 (18:46 -0800)]
v0.42.1

13 years agodebian: add ceph-dencoder
Sage Weil [Tue, 21 Feb 2012 19:12:37 +0000 (11:12 -0800)]
debian: add ceph-dencoder

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph.spec.in: add ceph-dencoder
Sage Weil [Tue, 21 Feb 2012 19:12:30 +0000 (11:12 -0800)]
ceph.spec.in: add ceph-dencoder

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph-dencoder: man page
Sage Weil [Tue, 21 Feb 2012 19:12:13 +0000 (11:12 -0800)]
ceph-dencoder: man page

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph-tool: remove reference to "stop" command
Greg Farnum [Fri, 24 Feb 2012 02:13:29 +0000 (18:13 -0800)]
ceph-tool: remove reference to "stop" command

This doesn't exist any more, and I don't think it
ever "cleanly shut down the filesystem" -- certainly not
within my recent lifetime!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Dan Mick <dan.mick@dreamhost.com>
13 years agomds: remove unused MDBalancer dump_pop_map() function.
Greg Farnum [Thu, 23 Feb 2012 23:40:20 +0000 (15:40 -0800)]
mds: remove unused MDBalancer dump_pop_map() function.

Commenting it out is not the right answer. ;)

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by: Dan Mick <dan.mick@dreamhost.com>
13 years agomds: clean up useless block
Sage Weil [Fri, 24 Feb 2012 00:35:22 +0000 (16:35 -0800)]
mds: clean up useless block

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomds: fix Resetter locking
Sage Weil [Thu, 23 Feb 2012 05:15:19 +0000 (21:15 -0800)]
mds: fix Resetter locking

We need to hold the lock for ms_dispatch, esp calls into objecter.  We
should only drop it when blocking; use distinct naming for the on-stack
mutex used for that.

Reported-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote branch 'origin/wip-mds-old-inodes'
Greg Farnum [Thu, 23 Feb 2012 23:33:39 +0000 (15:33 -0800)]
Merge remote branch 'origin/wip-mds-old-inodes'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge remote branch 'origin/wip-dencoder'
Greg Farnum [Thu, 23 Feb 2012 23:06:32 +0000 (15:06 -0800)]
Merge remote branch 'origin/wip-dencoder'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge remote branch 'origin/wip-1820'
Greg Farnum [Thu, 23 Feb 2012 23:06:15 +0000 (15:06 -0800)]
Merge remote branch 'origin/wip-1820'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoosd: only set CLEAN when we are not remapped (up == acting)
Sage Weil [Thu, 23 Feb 2012 23:05:46 +0000 (15:05 -0800)]
osd: only set CLEAN when we are not remapped (up == acting)

If we have a temporary mapping for this PG, consider that unclean.  This
makes CLEAN and REMAPPED mutually exclusive.  For example, a 2 node cluster
with 2x replication and one osd marked out will make the pgs all
active+remapped, not active+clean+remapped.

Fixes: #2094
Signed-off-by: Sage Weil <sage@newdream.net>
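The rule can be sketched as a small state computation. The flag values and the function are hypothetical and only model the two states named above; the real state mask has more bits.

```cpp
#include <cassert>
#include <vector>

enum : unsigned {
  PG_ACTIVE_SKETCH = 1u << 0,
  PG_CLEAN_SKETCH = 1u << 1,
  PG_REMAPPED_SKETCH = 1u << 2,
};

// A pg whose acting set differs from its up set is remapped and is
// never reported clean, even when it has nothing left to recover.
unsigned pg_state_sketch(const std::vector<int> &up,
                         const std::vector<int> &acting,
                         bool fully_recovered) {
  assert(!acting.empty());
  unsigned state = PG_ACTIVE_SKETCH;
  if (up != acting)
    state |= PG_REMAPPED_SKETCH;
  else if (fully_recovered)
    state |= PG_CLEAN_SKETCH;
  return state;
}
```

With one of two OSDs marked out, up is {0} while acting stays {0, 1}, so the sketch yields active+remapped rather than active+clean+remapped.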
13 years agoMerge remote-tracking branch 'gh/wip-pg-query'
Sage Weil [Thu, 23 Feb 2012 22:56:54 +0000 (14:56 -0800)]
Merge remote-tracking branch 'gh/wip-pg-query'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosd: conditionally encode old pg_pool_t when no CEPH_FEATURE_OSDENC
Greg Farnum [Thu, 23 Feb 2012 22:55:48 +0000 (14:55 -0800)]
osd: conditionally encode old pg_pool_t when no CEPH_FEATURE_OSDENC

This fixes OSDMap compatibility between v0.42 and <v0.42.

For MOSDMap, reencode maps if OSDENC feature is missing.  Also rev the
message version.  We don't use COMPAT version here because v3 can't be
understood by v2 (that's why we're checking feature bits).  (It will be
possible to do that later when our constituent types can be decoded by
multiple versions.)

Fixes: #2095
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
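Feature-gated encoding of this kind can be sketched as below. The feature bit value, the struct, and the text layouts are all made up for illustration; they are not the real CEPH_FEATURE_OSDENC value or the pg_pool_t wire format.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical feature bit, standing in for CEPH_FEATURE_OSDENC.
constexpr std::uint64_t FEATURE_OSDENC_SKETCH = 1ull << 0;

struct PoolSketch {
  int size;
  int pg_num;
};

// Peers that do not advertise the feature get the legacy layout;
// everyone else gets the new one.
std::string encode_pool_sketch(const PoolSketch &p,
                               std::uint64_t peer_features) {
  assert(p.size > 0 && p.pg_num > 0);
  if ((peer_features & FEATURE_OSDENC_SKETCH) == 0)
    return "v2 " + std::to_string(p.size) + " " + std::to_string(p.pg_num);
  return "v3 size=" + std::to_string(p.size) +
         " pg_num=" + std::to_string(p.pg_num);
}
```

The decision is made per peer from its feature bits, since an old decoder cannot tell from a version number alone how to read the new layout.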
13 years agoMerge remote-tracking branch 'gh/wip-dump-ops-in-flight'
Sage Weil [Thu, 23 Feb 2012 22:38:03 +0000 (14:38 -0800)]
Merge remote-tracking branch 'gh/wip-dump-ops-in-flight'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agomon: use pending_mdsmap for deactivate
Sage Weil [Thu, 23 Feb 2012 22:24:12 +0000 (14:24 -0800)]
mon: use pending_mdsmap for deactivate

We should always look at the proposed map to avoid weird races.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: 'deactivate mds' instead of 'stop mds'
Sage Weil [Thu, 23 Feb 2012 22:27:34 +0000 (14:27 -0800)]
doc: 'deactivate mds' instead of 'stop mds'

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: mds "stop" -> "deactivate"
Sage Weil [Thu, 23 Feb 2012 20:16:59 +0000 (12:16 -0800)]
mon: mds "stop" -> "deactivate"

See #1820.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agotest: add basic test for the OSD's dump_ops_in_flight adminsocket command
Greg Farnum [Thu, 23 Feb 2012 20:11:27 +0000 (12:11 -0800)]
test: add basic test for the OSD's dump_ops_in_flight adminsocket command

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoosd: add "dump_ops_in_flight" to the AdminSocket.
Greg Farnum [Wed, 15 Feb 2012 02:53:49 +0000 (18:53 -0800)]
osd: add "dump_ops_in_flight" to the AdminSocket.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agomon: refuse to stop mds if max_mds will make it rejoin
Sage Weil [Thu, 23 Feb 2012 20:08:52 +0000 (12:08 -0800)]
mon: refuse to stop mds if max_mds will make it rejoin

Otherwise the MDS will leave the cluster and immediately rejoin, which is
useless and confusing to users.  See #1820.

Signed-off-by: Sage Weil <sage@newdream.net>
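One way to express this check, as a sketch: the function name and the exact comparison are assumptions, not taken from the monitor code.

```cpp
#include <cassert>

// Stopping one MDS leaves num_in_mds - 1 ranks. If that is below
// max_mds, the monitor would immediately fill the slot again, so the
// request is refused until max_mds is lowered.
bool may_deactivate_mds_sketch(unsigned num_in_mds, unsigned max_mds) {
  assert(num_in_mds > 0);
  return num_in_mds > max_mds;
}
```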
13 years agocrushtool: add --reweight-item cli tests
Sage Weil [Thu, 23 Feb 2012 19:42:08 +0000 (11:42 -0800)]
crushtool: add --reweight-item cli tests

Test list, tree, and straw buckets.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agocrush: fix weight adjust for list, tree buckets
Sage Weil [Thu, 23 Feb 2012 19:03:44 +0000 (11:03 -0800)]
crush: fix weight adjust for list, tree buckets

Fix the typo.  Code now matches that for straw buckets.

Reported-by: ZhuRongze <zrz4ceph@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-2090'
Sage Weil [Thu, 23 Feb 2012 19:16:17 +0000 (11:16 -0800)]
Merge branch 'wip-2090'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agomon: unlock mon before msgr shutdown
Sage Weil [Thu, 23 Feb 2012 04:49:04 +0000 (20:49 -0800)]
mon: unlock mon before msgr shutdown

The ceph_mon.cc main() will delete mon when the msgr dispatch thread
completes.  Make sure we unlock before we shut down the messenger, and
avoid touching this after messenger->shutdown().

Fixes: #2090
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: deprecate mon 'stop' command
Sage Weil [Thu, 23 Feb 2012 04:43:20 +0000 (20:43 -0800)]
mon: deprecate mon 'stop' command

Send SIGTERM.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomsgr: join dispatch_thread after it completes
Sage Weil [Thu, 23 Feb 2012 04:37:40 +0000 (20:37 -0800)]
msgr: join dispatch_thread after it completes

This is just for completeness.  No change in behavior, since we don't
get here until the thread has signaled it is done.

Drop the destroy() overload, since we join earlier.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote-tracking branch 'gh/wip-stop'
Sage Weil [Thu, 23 Feb 2012 19:04:30 +0000 (11:04 -0800)]
Merge remote-tracking branch 'gh/wip-stop'

13 years agofilestore: use IOC_CLONERANGE instead of IOC_CLONE ioctl
Sage Weil [Thu, 23 Feb 2012 17:51:31 +0000 (09:51 -0800)]
filestore: use IOC_CLONERANGE instead of IOC_CLONE ioctl

This is functionally equivalent, except that valgrind doesn't complain
about a bad pointer passed to an ioctl.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: drop "stop" command
Sage Weil [Thu, 23 Feb 2012 17:43:03 +0000 (09:43 -0800)]
osd: drop "stop" command

Send SIGINT.

Fixes: #1820
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: drop unused "stop" check
Sage Weil [Thu, 23 Feb 2012 17:42:11 +0000 (09:42 -0800)]
osd: drop unused "stop" check

This is never reached: both callers handle "stop" explicitly.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: don't complete recovery if unfound
Sage Weil [Thu, 23 Feb 2012 17:39:50 +0000 (09:39 -0800)]
osd: don't complete recovery if unfound

Otherwise we fail the !needs_recovery() assert, because we aren't
recovered.  For example,

2012-02-21 16:16:13.104665 1685c700 osd.5 1217 pg[0.16( v 1215'337 lc 19'2 (0'0,1215'337] n=25 ec=1 les/c 0/1061 1210/1210/1210) [5,3] r=0 lpr=1210 mlcod 0'0 active m=23 u=23 snaptrimq=[1~99,9b~e,aa~72,11d~3d,15b~e,16a~f,17a~5,180~4,185~1a,1a0~a,1ac~10,1bd~4,1c2~8,1cb~1,1cd~1,1cf~1a,1ea~10,1fb~6,202~2,205~2,209~2,20c~8,215~2,218~5,21e~1,220~1,222~9,22c~4,231~3,235~2,238~3,23e~2,241~4,246~1,248~1,24b~1,24d~9,257~6,25e~1,263~1,265~2,268~3,26e~1,273~1,275~5,27e~1,280~2]] needs_recovery osd.3 has 23 missing
osd/PG.cc: In function 'boost::statechart::result PG::RecoveryState::Active::react(const PG::RecoveryState::RecoveryComplete&)' thread 1685c700 time 2012-02-21 16:16:13.108923
osd/PG.cc: 4070: FAILED assert(!pg->needs_recovery())
 ceph version 0.42-70-g0e4367a (commit:0e4367aaac88b99c36386b6ce5e8d816fdd4ada0)
 1: (PG::RecoveryState::Active::react(PG::RecoveryState::RecoveryComplete const&)+0x1b3) [0x6a1173]
 2: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x121) [0x6c7301]
 3: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x6bfc6b]
 4: (PG::RecoveryState::handle_recovery_complete(PG::RecoveryCtx*)+0x10c) [0x67c03c]
 5: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x241) [0x4f83c1]
 6: (OSD::do_recovery(PG*)+0x345) [0x54b3e5]
 7: (ThreadPool::worker()+0xa26) [0x619e66]
 8: (ThreadPool::WorkThread::entry()+0xd) [0x57ad5d]
 9: (()+0x7971) [0x5037971]
 10: (clone()+0x6d) [0x679f92d]

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agomds: make EMetaBlob::fullbit::old_inodes non-ptr
Sage Weil [Thu, 23 Feb 2012 05:24:27 +0000 (21:24 -0800)]
mds: make EMetaBlob::fullbit::old_inodes non-ptr

No need to put this separately on the heap, as a static map<> isn't much
more expensive than a pointer.  Also, this ensures we unconditionally
reset in->old_inodes to a potentially empty value if we replay the same
inode multiple times and lose old inodes in subsequent versions.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
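The replay behaviour can be sketched with two toy structs. The names are hypothetical and the map's value type is a placeholder; the point is that a by-value member is always assigned on replay, including when it is empty.

```cpp
#include <cassert>
#include <map>
#include <string>

// Journaled copy of an inode; old_inodes is held by value, not by pointer.
struct JournalEntrySketch {
  std::map<int, std::string> old_inodes;
};

// In-memory inode being rebuilt during replay.
struct InodeSketch {
  std::map<int, std::string> old_inodes;
};

// Always overwrite, so a later entry with no old inodes clears any
// state left behind by an earlier version of the same inode.
void replay_sketch(const JournalEntrySketch &entry, InodeSketch &in) {
  in.old_inodes = entry.old_inodes;
}
```

With a pointer member that is only copied when non-null, the second replay in such a sequence could leave stale entries behind.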
13 years agomds: Add old_inodes to emetablob
Alexandre Oliva [Tue, 21 Feb 2012 09:22:01 +0000 (07:22 -0200)]
mds: Add old_inodes to emetablob

Add information about old inodes to the mds journal.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoFix ceph-mds --journal-reset
Alexandre Oliva [Tue, 21 Feb 2012 09:10:22 +0000 (07:10 -0200)]
Fix ceph-mds --journal-reset

Complete configuration initialization for special actions, and
hold Resetter lock while running reset.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMakefile: include encoding check scripts in dist tarball
Sage Weil [Wed, 22 Feb 2012 01:11:02 +0000 (17:11 -0800)]
Makefile: include encoding check scripts in dist tarball

This makes 'make distcheck' happy.  Well, more happy at least; it's still
cranky but I can't tell why.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agodebian: add ceph-dencoder
Sage Weil [Tue, 21 Feb 2012 19:12:37 +0000 (11:12 -0800)]
debian: add ceph-dencoder

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph.spec.in: add ceph-dencoder
Sage Weil [Tue, 21 Feb 2012 19:12:30 +0000 (11:12 -0800)]
ceph.spec.in: add ceph-dencoder

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph-dencoder: man page
Sage Weil [Tue, 21 Feb 2012 19:12:13 +0000 (11:12 -0800)]
ceph-dencoder: man page

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: make object_info_t::dump using hobject_t and object_locator_t dumpers
Sage Weil [Tue, 21 Feb 2012 23:08:26 +0000 (15:08 -0800)]
osd: make object_info_t::dump using hobject_t and object_locator_t dumpers

Makes the output more readable.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote-tracking branch 'gh/wip-dump-stuck-pgs'
Sage Weil [Tue, 21 Feb 2012 22:46:00 +0000 (14:46 -0800)]
Merge remote-tracking branch 'gh/wip-dump-stuck-pgs'

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote-tracking branch 'gh/wip-osd-write'
Sage Weil [Tue, 21 Feb 2012 22:44:44 +0000 (14:44 -0800)]
Merge remote-tracking branch 'gh/wip-osd-write'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosdmap: dump embedded crush map in Incremental::dump()
Sage Weil [Tue, 21 Feb 2012 22:43:23 +0000 (14:43 -0800)]
osdmap: dump embedded crush map in Incremental::dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-crush'
Sage Weil [Tue, 21 Feb 2012 22:39:16 +0000 (14:39 -0800)]
Merge branch 'wip-crush'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agocrush: write CrushWrapper::dump()
Sage Weil [Tue, 21 Feb 2012 22:37:50 +0000 (14:37 -0800)]
crush: write CrushWrapper::dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agotest/rados-api/misc: fix LibRadosMisc.Operate1PP test
Sage Weil [Tue, 21 Feb 2012 05:12:21 +0000 (21:12 -0800)]
test/rados-api/misc: fix LibRadosMisc.Operate1PP test

It's a mutation, so we get a result of 0 (or error).

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: refuse to return data payload if request wrote anything
Sage Weil [Tue, 21 Feb 2012 05:11:46 +0000 (21:11 -0800)]
osd: refuse to return data payload if request wrote anything

Write operations aren't allowed to return a data payload because
we can't do so reliably. If the client has to resend the request
and it has already been applied, we will return 0 with no
payload.  Non-deterministic behavior is no good.

See #1765.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
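The rule can be sketched as a reply filter. The struct and function are hypothetical, and clamping a positive result to 0 is an assumption made here so that a first attempt and a resend look the same to the client.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct ReplySketch {
  int result;
  std::string data;
};

// If the request wrote anything, drop the payload and report 0 on
// success; negative error codes pass through unchanged.
ReplySketch finish_op_sketch(int result, bool wrote, std::string data) {
  ReplySketch reply{result, std::move(data)};
  if (wrote) {
    reply.data.clear();
    if (reply.result > 0)
      reply.result = 0;
  }
  return reply;
}
```

Read-only requests are unaffected and keep both their result and their payload.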
13 years agoMerge branch 'wip-osdmap'
Sage Weil [Tue, 21 Feb 2012 21:51:27 +0000 (13:51 -0800)]
Merge branch 'wip-osdmap'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosdmap: dump fullmap from dump()
Sage Weil [Tue, 21 Feb 2012 21:50:34 +0000 (13:50 -0800)]
osdmap: dump fullmap from dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'wip-1821'
Sage Weil [Tue, 21 Feb 2012 21:43:36 +0000 (13:43 -0800)]
Merge branch 'wip-1821'

Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agorgw: accepted access key chars should be url safe
Yehuda Sadeh [Tue, 21 Feb 2012 20:11:26 +0000 (12:11 -0800)]
rgw: accepted access key chars should be url safe

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoceph: if 'pg <pgid> ..' doesn't parse a pgid, send to mon
Sage Weil [Tue, 21 Feb 2012 04:40:35 +0000 (20:40 -0800)]
ceph: if 'pg <pgid> ..' doesn't parse a pgid, send to mon

E.g., 'pg dump'.  Sigh.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
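The routing decision can be sketched as follows. The enum, the function, and the assumed pgid form of "<pool>.<hex seed>" (for example 0.16) are illustrative and are not lifted from the ceph tool.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

enum class TargetSketch { Mon, Osd };

// If the word after "pg" parses completely as <pool>.<hex seed>, the
// command is pg-specific; anything else (e.g. "dump") goes to the monitor.
TargetSketch route_pg_command_sketch(const std::string &arg) {
  unsigned pool = 0, seed = 0;
  int consumed = 0;
  int matched = std::sscanf(arg.c_str(), "%u.%x%n", &pool, &seed, &consumed);
  if (matched == 2 && consumed == static_cast<int>(arg.size()))
    return TargetSketch::Osd;
  return TargetSketch::Mon;
}
```

Requiring the whole argument to be consumed keeps words that merely start with digits from being mistaken for a pgid.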
13 years agoMakefile: fix misplaced unit tests
Sage Weil [Tue, 21 Feb 2012 00:01:34 +0000 (16:01 -0800)]
Makefile: fix misplaced unit tests

These weren't run on make check because they were defined in the wrong
spot.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agohobject_t: remove unused back_up_to_bounding_key()
Sage Weil [Mon, 20 Feb 2012 19:25:37 +0000 (11:25 -0800)]
hobject_t: remove unused back_up_to_bounding_key()

This was a path not taken in the backfill code.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: sched_scrub() outside of map_lock
Sage Weil [Mon, 20 Feb 2012 22:41:28 +0000 (14:41 -0800)]
osd: sched_scrub() outside of map_lock

Inside sched_scrub() we call _lookup_lock_pg(), which takes
map_lock.get_read().  That's technically okay because RWLock read side is
recursive, but lockdep doesn't know that, and we don't need map_lock
because we hold osd_lock.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoglobal: resurrect lockdep
Sage Weil [Mon, 20 Feb 2012 22:38:20 +0000 (14:38 -0800)]
global: resurrect lockdep

Add 'lockdep' config option, and initialize g_lockdep from that in
global_init().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: disable pg_num adjustment
Sage Weil [Mon, 20 Feb 2012 21:00:14 +0000 (13:00 -0800)]
mon: disable pg_num adjustment

Until #1515 is fixed/reimplemented.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: use encode function for new Incremental
Sage Weil [Mon, 20 Feb 2012 19:02:49 +0000 (11:02 -0800)]
mon: use encode function for new Incremental

When we encode an Incremental, use the encode wrapper function, so that
we can capture the encoded struct when building with ENCODE_DUMP.  Set
all features (the default when encode() is called directly).

Signed-off-by: Sage Weil <sage@newdream.net>