git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years agoMerge remote branch 'gh/master' into wip-backfill
Sage Weil [Wed, 11 Jan 2012 18:34:35 +0000 (10:34 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years agoosd: limit size of log sent to reset backfill targets
Sage Weil [Wed, 11 Jan 2012 14:41:13 +0000 (06:41 -0800)]
osd: limit size of log sent to reset backfill targets

Need to replace magic number with new tunable, once that is merged.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoclient: start caching readdir results after readdir_start
Alexandre Oliva [Tue, 10 Jan 2012 03:41:45 +0000 (01:41 -0200)]
client: start caching readdir results after readdir_start

Use upper_bound rather than lower_bound to compute the initial pd within
insert_trace, so that we don't attempt to remove it if it happens to be
in the same frag as the new reply.

Fixes: #1774
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
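
The difference between the two bounds mentioned above can be sketched in a few lines. This is only an illustration of the general lower_bound/upper_bound semantics using Python's bisect module, not the insert_trace code itself; the `names` list and the helper names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sorted list of cached dentry names in one directory fragment.
names = ["a", "c", "e"]

def lower_bound(keys, key):
    """Index of the first element >= key (like C++ std::lower_bound)."""
    return bisect_left(keys, key)

def upper_bound(keys, key):
    """Index of the first element > key (like C++ std::upper_bound)."""
    return bisect_right(keys, key)

# When the key is present, lower_bound points at it and upper_bound points
# just past it, so an upper_bound start never selects the entry itself.
at_entry = lower_bound(names, "c")
past_entry = upper_bound(names, "c")
```

For a key that is absent, both helpers return the same index, so the choice only matters when the key already exists in the map.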
13 years agomonclient: fix resolve_addrs() call
Sage Weil [Wed, 11 Jan 2012 00:39:23 +0000 (16:39 -0800)]
monclient: fix resolve_addrs() call

It looks like this was broken in def36668a13459d9c0851e4d4da440a288f9a34f.
We were passing uninitialized memory to resolve_addrs() and needlessly
allocating a buffer.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoresolve_addrs: return ipv4 and ipv6 addrs
Sage Weil [Wed, 11 Jan 2012 00:35:40 +0000 (16:35 -0800)]
resolve_addrs: return ipv4 and ipv6 addrs

Fixes: #1891
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoReplicatedPG: fix typo in stats accounting in _rollback_to
Samuel Just [Wed, 11 Jan 2012 00:21:13 +0000 (16:21 -0800)]
ReplicatedPG: fix typo in stats accounting in _rollback_to

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoosd: send log with backfill restart
Sage Weil [Wed, 11 Jan 2012 00:14:13 +0000 (16:14 -0800)]
osd: send log with backfill restart

This makes backfill restart less of a special case: we send an info AND
log, just like we do normally.  Code paths are more similar than before.

The main change here is that the backfill target gets a pg log with recent
history, which allows it to more reliably detect dup operations.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fail to peer if interval lacks any !incomplete replicas
Sage Weil [Tue, 10 Jan 2012 21:23:00 +0000 (13:23 -0800)]
osd: fail to peer if interval lacks any !incomplete replicas

We need at least one non-incomplete replica during a rw interval in order
to peer.  The backfilling/incomplete replicas get log entries, but not
all object writes, so they are (mostly) excluded from the peering process
(find_best_info(), in particular).

We can't do this during the PriorSet calculation because we don't have
their PG::Info yet.  But, once we get it, we need to make sure at least one
of the replicas during the last rw interval is not incomplete, or else we
should mark the pg DOWN (just like the PriorSet calculation does).

This logic mostly mirrors that of PriorSet, but additionally requires
the replicas be !incomplete.

Signed-off-by: Sage Weil <sage@newdream.net>
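
A minimal sketch of the rule described above, assuming each replica from the last rw interval is represented by a dict with a hypothetical `incomplete` flag. The function name and data layout are illustrative and are not taken from the OSD code.

```python
def can_peer(last_rw_interval_replicas):
    """Return True if at least one replica from the last rw interval
    is not incomplete; otherwise the PG would be marked DOWN."""
    return any(not r["incomplete"] for r in last_rw_interval_replicas)

# Only backfilling/incomplete replicas survive: peering must fail.
only_incomplete = [{"osd": 1, "incomplete": True},
                   {"osd": 2, "incomplete": True}]
# One complete replica is enough to proceed.
one_complete = [{"osd": 1, "incomplete": True},
                {"osd": 3, "incomplete": False}]
```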
13 years agomon: allow specifying pg_num and pgp_num when creating new pools.
Greg Farnum [Tue, 10 Jan 2012 19:25:25 +0000 (11:25 -0800)]
mon: allow specifying pg_num and pgp_num when creating new pools.

Right now this is only exposed via the monitor command interface:
osd pool create <poolname> [pg_num [pgp_num]]
but it can be expanded to other interfaces as appropriate.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
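
The optional-argument shape of that command can be sketched as a small parser. This is an assumption-laden illustration: the default of 8 placement groups and the choice to let pgp_num fall back to pg_num are hypothetical, and `parse_pool_create` is not a real monitor function.

```python
USAGE = "usage: osd pool create <poolname> [pg_num [pgp_num]]"

def parse_pool_create(args, default_pg_num=8):
    """Parse a tokenized 'osd pool create' command.

    default_pg_num is a hypothetical fallback; pgp_num falling back to
    pg_num is also an assumption made for this sketch.
    """
    if len(args) < 4 or len(args) > 6 or args[:3] != ["osd", "pool", "create"]:
        raise ValueError(USAGE)
    name = args[3]
    pg_num = int(args[4]) if len(args) > 4 else default_pg_num
    pgp_num = int(args[5]) if len(args) > 5 else pg_num
    return name, pg_num, pgp_num
```

Because pgp_num is nested inside the pg_num brackets in the usage string, it can only be given when pg_num is given as well, which the positional parsing above preserves.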
13 years agoauth: Fix Doxygen warnings.
Tommi Virtanen [Tue, 10 Jan 2012 19:11:35 +0000 (11:11 -0800)]
auth: Fix Doxygen warnings.

Match prototype and implementation argument names and types
(textually; that is, use the std:: prefix).

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoFix several doxygen warnings, to minimize noise. Only changes comments.
Tommi Virtanen [Tue, 10 Jan 2012 18:08:52 +0000 (10:08 -0800)]
Fix several doxygen warnings, to minimize noise. Only changes comments.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agolibrados: Make API docs use @note instead of @bug for now.
Tommi Virtanen [Tue, 10 Jan 2012 18:07:18 +0000 (10:07 -0800)]
librados: Make API docs use @note instead of @bug for now.

Asphyxiate doesn't yet support all of the Doxygen markup.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agoFileStore: assert on ENOSPC even for SETXATTR
Samuel Just [Tue, 10 Jan 2012 19:16:21 +0000 (11:16 -0800)]
FileStore: assert on ENOSPC even for SETXATTR

Otherwise we can get corrupt object attributes on ext*.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years agomds: initiate monitor reconnect if beacon acks take too long
Greg Farnum [Tue, 10 Jan 2012 18:41:36 +0000 (10:41 -0800)]
mds: initiate monitor reconnect if beacon acks take too long

If it takes 2*mds_beacon_grace seconds (30 seconds total by default)
to get an ack back, maybe it's the monitor and not us. Try a reconnect,
which will just add the teensiest bit of load if we're wrong.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
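
The timeout check described above might look roughly like this. The function and parameter names are hypothetical; the 15 second grace value is inferred from the "30 seconds total" figure in the message rather than read from the configuration code.

```python
def should_reconnect(now, last_acked_stamp, mds_beacon_grace=15.0):
    """Return True when the last beacon ack is older than twice the grace
    period, suggesting the monitor (not the MDS) may be the slow party."""
    return (now - last_acked_stamp) > 2 * mds_beacon_grace

# 20 seconds without an ack: still within 2 * 15 seconds, keep waiting.
waiting = should_reconnect(now=120.0, last_acked_stamp=100.0)
# 31 seconds without an ack: try reconnecting to a monitor.
reconnect = should_reconnect(now=131.0, last_acked_stamp=100.0)
```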
13 years agomds: remove beacon_killer code.
Greg Farnum [Tue, 10 Jan 2012 18:32:43 +0000 (10:32 -0800)]
mds: remove beacon_killer code.

This no longer does *anything* except print out
useless warning messages.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoosd: make less noise when filestore is already up to date
Sage Weil [Tue, 10 Jan 2012 17:49:41 +0000 (09:49 -0800)]
osd: make less noise when filestore is already up to date

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agodoc: add librados C aio example
Josh Durgin [Tue, 10 Jan 2012 03:02:05 +0000 (19:02 -0800)]
doc: add librados C aio example

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: describe some rados_pool_stat_t members
Josh Durgin [Tue, 10 Jan 2012 02:58:27 +0000 (18:58 -0800)]
doc: describe some rados_pool_stat_t members

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: add librados pool creation defaults
Josh Durgin [Tue, 10 Jan 2012 02:29:04 +0000 (18:29 -0800)]
doc: add librados pool creation defaults

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: add short section on documenting code
Josh Durgin [Thu, 29 Dec 2011 23:57:33 +0000 (15:57 -0800)]
doc: add short section on documenting code

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: clarify librados return codes
Josh Durgin [Thu, 29 Dec 2011 00:00:25 +0000 (16:00 -0800)]
doc: clarify librados return codes

Adding a second @returns for specific error codes makes the sphinx output more readable.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: @return -> @returns to match the sphinx output
Josh Durgin [Wed, 28 Dec 2011 22:26:10 +0000 (14:26 -0800)]
doc: @return -> @returns to match the sphinx output

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: standardize rados_tmap_* docs
Josh Durgin [Wed, 28 Dec 2011 22:17:39 +0000 (14:17 -0800)]
doc: standardize rados_tmap_* docs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: fix rados_version todo formatting
Josh Durgin [Wed, 28 Dec 2011 22:04:28 +0000 (14:04 -0800)]
doc: fix rados_version todo formatting

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: add a prefix to group names in librados.h
Josh Durgin [Wed, 28 Dec 2011 22:01:25 +0000 (14:01 -0800)]
doc: add a prefix to group names in librados.h

doxygen groups are in a global namespace.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: Put rados_ioctx_locator_set_key in a group so it can be cross-referenced
Josh Durgin [Wed, 28 Dec 2011 21:46:44 +0000 (13:46 -0800)]
doc: Put rados_ioctx_locator_set_key in a group so it can be cross-referenced

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: move rados_ioctx_get_id to the pool group
Josh Durgin [Wed, 28 Dec 2011 21:29:44 +0000 (13:29 -0800)]
doc: move rados_ioctx_get_id to the pool group

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: fix some typos in librados C API
Josh Durgin [Wed, 28 Dec 2011 21:25:05 +0000 (13:25 -0800)]
doc: fix some typos in librados C API

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: Switch doxygen integration from breathe to asphyxiate.
Tommi Virtanen [Fri, 23 Dec 2011 01:17:20 +0000 (17:17 -0800)]
doc: Switch doxygen integration from breathe to asphyxiate.

TODO: path of librados.h is now just the basename

TODO: no enum support for now

TODO: no @bug support for now

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agolibrados: Avoid using "crush_rule" as name of function argument.
Tommi Virtanen [Fri, 23 Dec 2011 01:09:23 +0000 (17:09 -0800)]
librados: Avoid using "crush_rule" as name of function argument.

"struct crush_rule" exists already, using the same identifier
confuses Doxygen.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agodoxygen: Use first sentence as brief description.
Tommi Virtanen [Fri, 23 Dec 2011 00:45:24 +0000 (16:45 -0800)]
doxygen: Use first sentence as brief description.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years agodoc: add configuration and connecting to librados C api example
Josh Durgin [Tue, 20 Dec 2011 22:47:02 +0000 (14:47 -0800)]
doc: add configuration and connecting to librados C api example

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agodoc: add librados C api docs
Josh Durgin [Fri, 16 Dec 2011 21:57:38 +0000 (13:57 -0800)]
doc: add librados C api docs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoceph: add a new "run_uml.sh" script to manage running a UML client
Alex Elder [Tue, 10 Jan 2012 02:13:41 +0000 (18:13 -0800)]
ceph: add a new "run_uml.sh" script to manage running a UML client

This script is used to automate most of what's required to run a
User-Mode Linux (UML) instance.  This is mainly of interest for
ceph client developers who might benefit from the debugger access
that UML affords.  It was written for ceph development but isn't
really dependent on ceph.  It basically makes a few assumptions and
follows some conventions, and in doing so is able to encapsulate
most of the "tricky parts" of setting up to run a UML instance.

Signed-off-by: Alex Elder <elder@dreamhost.com>
13 years agorgw: adjust log level
Yehuda Sadeh [Mon, 9 Jan 2012 19:40:44 +0000 (11:40 -0800)]
rgw: adjust log level

13 years agorgw: some cleanup
Yehuda Sadeh [Mon, 9 Jan 2012 18:31:15 +0000 (10:31 -0800)]
rgw: some cleanup

13 years agorgw: only use plain PUT processor when !chunked_upload
Yehuda Sadeh [Mon, 9 Jan 2012 18:15:00 +0000 (10:15 -0800)]
rgw: only use plain PUT processor when !chunked_upload

13 years agoMerge remote branch 'gh/master' into wip-backfill
Sage Weil [Mon, 9 Jan 2012 17:13:03 +0000 (09:13 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years agoosd: populate_obc_watchers when object pulled to primary
Sage Weil [Mon, 9 Jan 2012 00:23:55 +0000 (16:23 -0800)]
osd: populate_obc_watchers when object pulled to primary

We don't care about degraded state, only whether the object is on the
primary so that we can load the object_info_t.

In particular, this avoids problems with backfill, where an object is
not degraded and populated, is then degraded while we backfill to the
target, and then not degraded again, and populate_obc_watchers() is called
a second time.

Fixes: #1903
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: handle case where no acceptable info exists
Sage Weil [Sun, 8 Jan 2012 23:15:18 +0000 (15:15 -0800)]
osd: handle case where no acceptable info exists

This happens when the only available replica has last_backfill != MAX.

In that case, revert to up, and then set the DOWN state bit.

Instead of waiting for a new map, we should actually wait for a new info
to show up...

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/wip-osd-retry-attempt'
Sage Weil [Sun, 8 Jan 2012 18:15:41 +0000 (10:15 -0800)]
Merge remote branch 'gh/wip-osd-retry-attempt'

13 years agoMerge remote branch 'gh/wip-admin-socket'
Sage Weil [Sun, 8 Jan 2012 16:16:56 +0000 (08:16 -0800)]
Merge remote branch 'gh/wip-admin-socket'

13 years agoperfcounters: fix unittest for new admin_socket interface
Sage Weil [Sat, 7 Jan 2012 03:09:10 +0000 (19:09 -0800)]
perfcounters: fix unittest for new admin_socket interface

Broken by b389685afa1be00b5147855bf71c50042bfbfa6c.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMakefile: disable unittest_interval_tree
Sage Weil [Sat, 7 Jan 2012 04:39:05 +0000 (20:39 -0800)]
Makefile: disable unittest_interval_tree

Segfaults. Valgrind errors. Accessing uninitialized memory.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agounittest_interval_tree: make it compile
Sage Weil [Sat, 7 Jan 2012 04:38:33 +0000 (20:38 -0800)]
unittest_interval_tree: make it compile

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: clean up src_oid, src_obc map key calculation
Sage Weil [Sat, 7 Jan 2012 01:18:01 +0000 (17:18 -0800)]
osd: clean up src_oid, src_obc map key calculation

Be consistent about how we generate the src_oid and src_oloc, so that we
feed good values into find_object_context and use a consistent key for
the src_obc map<>.  This fixes a crash in do_osd_ops() due to a missing
src_obc key when get_src_oloc() normalizes the key in do_op() but not
in do_osd_ops().

Also use a nicer name.

Fixes: #1897
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: read op should claim_append data instead of claim
Yehuda Sadeh [Sat, 7 Jan 2012 00:55:52 +0000 (16:55 -0800)]
osd: read op should claim_append data instead of claim

13 years agorgw: remove object before writing both xattrs and data
Yehuda Sadeh [Sat, 7 Jan 2012 00:51:23 +0000 (16:51 -0800)]
rgw: remove object before writing both xattrs and data

Otherwise we'll leak xattrs from the previous incarnation.

13 years agorgw: create plain processor for small objects
Yehuda Sadeh [Sat, 7 Jan 2012 00:16:06 +0000 (16:16 -0800)]
rgw: create plain processor for small objects

13 years agorgw: fix multipart PUT
Yehuda Sadeh [Fri, 6 Jan 2012 23:07:08 +0000 (15:07 -0800)]
rgw: fix multipart PUT

latest revamp broke it, missed calling RGWPutObjProcessor::prepare(s)
where needed.

13 years agorgw: rearrange PutObj::execute()
Yehuda Sadeh [Fri, 6 Jan 2012 20:41:33 +0000 (12:41 -0800)]
rgw: rearrange PutObj::execute()

groundwork for different handling of small object PUTs

13 years agorgw: different atomic handling for small objects
Yehuda Sadeh [Thu, 5 Jan 2012 20:51:27 +0000 (12:51 -0800)]
rgw: different atomic handling for small objects

13 years agoMerge remote branch 'gh/master' into wip-backfill
Sage Weil [Sat, 7 Jan 2012 00:44:11 +0000 (16:44 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years agomon: fix uninitialized cluster_logger_registered
Sage Weil [Fri, 6 Jan 2012 22:32:16 +0000 (14:32 -0800)]
mon: fix uninitialized cluster_logger_registered

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: ignore replies from old request attempts
Sage Weil [Fri, 6 Jan 2012 19:38:15 +0000 (11:38 -0800)]
objecter: ignore replies from old request attempts

If we know the request attempt, ignore old attempts.

If we do not know the attempt (because the server is old), accept the
reply.  This could lead to doing some ACK callbacks we shouldn't in
extreme failure/recovery scenarios, but that is better than doing
the callbacks out of order.

Partially fixes: #1490
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: encode retry attempt in MOSDOp[Reply]
Sage Weil [Fri, 6 Jan 2012 20:49:13 +0000 (12:49 -0800)]
osd: encode retry attempt in MOSDOp[Reply]

In addition to the boolean flag, also encode the exact retry attempt.

Return -1 if we don't know.

Signed-off-by: Sage Weil <sage@newdream.net>
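
Taken together with the objecter change above, the attempt check can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the constant name, and the exact comparison are hypothetical, and only the "-1 means unknown" convention comes from the commit message.

```python
UNKNOWN_ATTEMPT = -1  # the reply came from a server that does not report attempts

def should_accept_reply(current_attempt, reply_attempt):
    """Decide whether a reply belongs to the request attempt still in flight."""
    if reply_attempt == UNKNOWN_ATTEMPT:
        # Old server: accept, since dropping could reorder callbacks.
        return True
    # Replies to earlier attempts are stale and get ignored.
    return reply_attempt >= current_attempt
```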
13 years agomon: document quorum_status, mon_status
Sage Weil [Fri, 6 Jan 2012 20:20:18 +0000 (12:20 -0800)]
mon: document quorum_status, mon_status

Fixes: #1824
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: fix misplaced else
Sage Weil [Fri, 6 Jan 2012 20:19:59 +0000 (12:19 -0800)]
mon: fix misplaced else

Broken by 435c29448a10ec343f5a2b7195d94c72de5b1a25.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/wip-mon-timeouts'
Sage Weil [Fri, 6 Jan 2012 18:20:55 +0000 (10:20 -0800)]
Merge remote branch 'gh/wip-mon-timeouts'

13 years agoceph: speak new admin socket protocol
Sage Weil [Thu, 5 Jan 2012 21:58:05 +0000 (13:58 -0800)]
ceph: speak new admin socket protocol

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoadmin_socket: fix, extend admin_socket unit tests
Sage Weil [Fri, 6 Jan 2012 17:30:54 +0000 (09:30 -0800)]
admin_socket: fix, extend admin_socket unit tests

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoadmin_socket: string commands
Sage Weil [Thu, 5 Jan 2012 21:57:58 +0000 (13:57 -0800)]
admin_socket: string commands

Commands are strings.  Old __be32 works too.  'help' to list available
commands.

Signed-off-by: Sage Weil <sage@newdream.net>
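
A string-keyed command table with a built-in 'help' entry could be sketched like this. It only illustrates the idea of dispatching on command strings; the class, its methods, and the reply format are hypothetical and do not describe the actual wire protocol.

```python
class AdminSocketSketch:
    """Toy dispatcher: maps command strings to handlers."""

    def __init__(self):
        self.commands = {}
        self.register("help", self._help, "list available commands")

    def register(self, name, handler, description):
        self.commands[name] = (handler, description)

    def handle(self, command):
        entry = self.commands.get(command.strip())
        if entry is None:
            return "unknown command; try 'help'"
        handler, _description = entry
        return handler()

    def _help(self):
        # One "name: description" line per registered command, sorted by name.
        return "\n".join("%s: %s" % (name, desc)
                         for name, (_h, desc) in sorted(self.commands.items()))

sock = AdminSocketSketch()
sock.register("ping", lambda: "pong", "hypothetical liveness check")
```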
13 years agomon: elector needs to reset leader_acked on every election start
Greg Farnum [Thu, 5 Jan 2012 23:29:32 +0000 (15:29 -0800)]
mon: elector needs to reset leader_acked on every election start

Otherwise you never reset the leader_acked after a failed
election attempt, so if mon 0 is available on the first round
but then fails, you never make progress!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
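
The fix amounts to clearing per-election state at the start of every round. A minimal sketch, assuming `leader_acked` holds the rank we last deferred to and -1 means "nobody"; the class and method names are illustrative, not the real Elector interface.

```python
class ElectorSketch:
    def __init__(self, rank):
        self.rank = rank
        self.leader_acked = -1  # rank we deferred to; -1 means none
        self.electing = False

    def defer(self, who):
        """Acknowledge a lower-ranked monitor as the likely leader."""
        self.leader_acked = who

    def start(self):
        """Begin a new election round, forgetting any earlier deferral."""
        self.leader_acked = -1
        self.electing = True

elector = ElectorSketch(rank=1)
elector.defer(0)   # mon 0 looked alive in the first round...
elector.start()    # ...then failed, so a new round must start clean
```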
13 years agomon: instrument elector so you can stop participating in the quorum
Greg Farnum [Thu, 5 Jan 2012 23:36:37 +0000 (15:36 -0800)]
mon: instrument elector so you can stop participating in the quorum

Add new monitor commands "quorum exit" and "quorum enter" to use it.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agomon: kill client sessions when we're not in quorum
Greg Farnum [Thu, 5 Jan 2012 22:03:43 +0000 (14:03 -0800)]
mon: kill client sessions when we're not in quorum

After a timeout of 2*mon_lease length (ie, two election rounds),
kill existing client sessions so they can reconnect to a
monitor that's (hopefully) remained in the quorum. Let any
new client sessions stick around for a mon_lease interval, then
do the same to them.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
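
One way to read the two timeouts above is sketched below. The session representation, the function name, and the 5 second mon_lease default are assumptions made for illustration; the real monitor logic may differ in detail.

```python
def sessions_to_kill(sessions, now, out_of_quorum_since, mon_lease=5.0):
    """sessions maps a session name to the time it was opened.

    Nothing is killed until we have been out of quorum for 2 * mon_lease.
    After that, any session that has been open for at least one mon_lease
    is dropped so the client can reconnect elsewhere.
    """
    if now - out_of_quorum_since < 2 * mon_lease:
        return []
    return sorted(name for name, opened in sessions.items()
                  if now - opened >= mon_lease)

sessions = {"client.a": 0.0, "client.b": 12.0}
```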
13 years agoOCF RA: fix variable name
Florian Haas [Thu, 5 Jan 2012 21:33:32 +0000 (22:33 +0100)]
OCF RA: fix variable name

13 years agodebian: build ceph-resource-agents
Florian Haas [Thu, 5 Jan 2012 21:33:31 +0000 (22:33 +0100)]
debian: build ceph-resource-agents

13 years agoosd: parameterize min/max values for backfill scanning
Sage Weil [Tue, 3 Jan 2012 17:30:42 +0000 (09:30 -0800)]
osd: parameterize min/max values for backfill scanning

For local scans, use the optimal value for the local filestore.

For remote scans, make it configurable, so we can control how frequently
we need to wait for scan requests over the wire.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoadmin_socket: refactor
Sage Weil [Thu, 5 Jan 2012 18:46:31 +0000 (10:46 -0800)]
admin_socket: refactor

Combine AdminSocketConfigObs with AdminSocket so that we can interact
with it via the cct.  Simpler class structure.  Less pointer indirection.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorbd: add a command to delete all snapshots of an image
Josh Durgin [Thu, 5 Jan 2012 01:07:07 +0000 (17:07 -0800)]
rbd: add a command to delete all snapshots of an image

This makes deleting images with many snapshots easier.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoadmin_socket: whitespace
Sage Weil [Thu, 5 Jan 2012 17:32:49 +0000 (09:32 -0800)]
admin_socket: whitespace

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agocommon: default 'mon osd auto mark in = false'
Sage Weil [Thu, 5 Jan 2012 17:30:33 +0000 (09:30 -0800)]
common: default 'mon osd auto mark in = false'

This way an osd that was explicitly marked out will stay out, even when
it is restarted.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: log backfill restart
Sage Weil [Thu, 5 Jan 2012 17:26:12 +0000 (09:26 -0800)]
osd: log backfill restart

This is interesting, particularly in determining when a peer that was
partially backfilled needs to be restarted.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibrbd: don't remove an image that still has snapshots
Josh Durgin [Thu, 5 Jan 2012 01:05:49 +0000 (17:05 -0800)]
librbd: don't remove an image that still has snapshots

Return -EBUSY instead. After the header is removed, the snapshots
can't be removed or read, so make sure they're gone before proceeding.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
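
The guard described above can be sketched as a check that runs before any destructive step. The image dictionary and `remove_image` are hypothetical stand-ins; only the -EBUSY return convention comes from the commit message.

```python
import errno

def remove_image(image):
    """Refuse to remove an image that still has snapshots.

    Returns 0 on success or a negative errno value, mirroring the
    C-style return convention.
    """
    if image["snapshots"]:
        return -errno.EBUSY
    image["removed"] = True  # stand-in for removing the header and data
    return 0

with_snaps = {"snapshots": ["snap1"], "removed": False}
without_snaps = {"snapshots": [], "removed": False}
```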
13 years agoSimpleMessenger: clarify when ms_bind_ipv6 is used
Josh Durgin [Thu, 5 Jan 2012 01:34:17 +0000 (17:34 -0800)]
SimpleMessenger: clarify when ms_bind_ipv6 is used

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoqa: add a slightly more stressful anchortable test
Greg Farnum [Thu, 5 Jan 2012 01:08:45 +0000 (17:08 -0800)]
qa: add a slightly more stressful anchortable test

This creates more than 8 links.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoqa: fix mdstable script for proper injectargs use.
Greg Farnum [Wed, 4 Jan 2012 23:36:08 +0000 (15:36 -0800)]
qa: fix mdstable script for proper injectargs use.

This script is fairly primitive, but somebody might find it useful...

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoMerge remote branch 'gh/master' into wip-backfill
Sage Weil [Thu, 5 Jan 2012 00:38:23 +0000 (16:38 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years agorados: fix run-length option parsing for rados load-gen
Sage Weil [Thu, 5 Jan 2012 00:25:29 +0000 (16:25 -0800)]
rados: fix run-length option parsing for rados load-gen

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomove cluster protocol definitions out of ceph_fs.h
Sage Weil [Wed, 4 Jan 2012 21:54:45 +0000 (13:54 -0800)]
move cluster protocol definitions out of ceph_fs.h

Among other things, we don't recompile the whole system when we touch
these.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomsgr: explicitly specify internal cluster protocol
Sage Weil [Wed, 4 Jan 2012 21:54:11 +0000 (13:54 -0800)]
msgr: explicitly specify internal cluster protocol

Replace case statement based on my_type.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: rev cluster protocol
Sage Weil [Wed, 4 Jan 2012 21:21:36 +0000 (13:21 -0800)]
mon: rev cluster protocol

The OSDMap NEW and AUTOOUT bit additions subtly change the decoding of
the incremental maps in a reasonably harmless way in that the bits get
implicitly cleared whenever the OSD weight changes from non-zero.  The
monitors need to agree on this behavior to avoid odd behavior.  We don't
care what clients see, since those bits are informational only.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosdmap: include state names in dump()
Sage Weil [Wed, 4 Jan 2012 20:30:42 +0000 (12:30 -0800)]
osdmap: include state names in dump()

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: separately control auto-mark-in of new OSDs
Sage Weil [Wed, 4 Jan 2012 20:30:23 +0000 (12:30 -0800)]
mon: separately control auto-mark-in of new OSDs

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: maintain CEPH_OSD_NEW bit for new, unused OSDs
Sage Weil [Wed, 4 Jan 2012 20:29:14 +0000 (12:29 -0800)]
mon: maintain CEPH_OSD_NEW bit for new, unused OSDs

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: independently control whether AUTOOUT OSDs are marked in on boot
Sage Weil [Wed, 4 Jan 2012 19:31:35 +0000 (11:31 -0800)]
mon: independently control whether AUTOOUT OSDs are marked in on boot

Add separate config option to control whether the monitor will mark
AUTOOUT OSDs in on boot.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agomon: track auto-marked-out osds
Sage Weil [Wed, 4 Jan 2012 20:56:15 +0000 (12:56 -0800)]
mon: track auto-marked-out osds

Mark OSDs that were automatically marked OUT by the monitor because they
were down for too long.  Clear the bit as soon as they are no longer out,
i.e., as soon as the weight is changed from 0.

Signed-off-by: Sage Weil <sage@newdream.net>
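
The lifetime of that bit can be sketched with a small state object. The flag value and the class are illustrative assumptions; only the rule (set on automatic mark-out, clear when the weight leaves 0) comes from the commit message.

```python
AUTOOUT = 1 << 2  # illustrative bit position for this sketch

class OsdStateSketch:
    def __init__(self):
        self.weight = 1.0
        self.flags = 0

    def auto_mark_out(self):
        """Monitor marks the OSD out after it was down for too long."""
        self.weight = 0.0
        self.flags |= AUTOOUT

    def set_weight(self, weight):
        """Any change away from weight 0 clears the AUTOOUT bit."""
        if self.weight == 0.0 and weight != 0.0:
            self.flags &= ~AUTOOUT
        self.weight = weight

osd = OsdStateSketch()
osd.auto_mark_out()
was_autoout = bool(osd.flags & AUTOOUT)
osd.set_weight(1.0)
still_autoout = bool(osd.flags & AUTOOUT)
```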
13 years agoosd: don't add all strays in calc_acting()
Sage Weil [Wed, 4 Jan 2012 21:43:17 +0000 (13:43 -0800)]
osd: don't add all strays in calc_acting()

We weren't counting up usable strays, which meant we added all of them.
This could result in acting sets with more active replicas than we wanted.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: mark degraded only when < desired replica count
Sage Weil [Wed, 4 Jan 2012 21:38:03 +0000 (13:38 -0800)]
osd: mark degraded only when < desired replica count

Having extra replicas is not 'degraded' per se.  Although it's weird that
we ever do that!

Signed-off-by: Sage Weil <sage@newdream.net>
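
The two replica-count rules from this commit and the calc_acting() change above can be sketched together. This is a heavily simplified illustration: the function names and the list-based representation are hypothetical, and the real calc_acting() considers much more than a count.

```python
def pick_acting(up, strays, want):
    """Fill the acting set from 'up' first, then add usable strays only
    until the desired replica count is reached."""
    acting = list(up[:want])
    for osd in strays:
        if len(acting) >= want:
            break  # count usable strays instead of adding all of them
        if osd not in acting:
            acting.append(osd)
    return acting

def is_degraded(acting, want):
    """Degraded only when there are fewer replicas than desired."""
    return len(acting) < want
```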
13 years agoosd: avoid querying missing set from (full) backfill target
Sage Weil [Wed, 4 Jan 2012 20:32:10 +0000 (12:32 -0800)]
osd: avoid querying missing set from (full) backfill target

If we are doing a complete backfill, we don't care about missing; it will
clearly all be below last_backfill anyway and get ignored.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: fix backfill reset on activate
Sage Weil [Wed, 4 Jan 2012 20:31:30 +0000 (12:31 -0800)]
osd: fix backfill reset on activate

Look at peer's info, not our own!

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge pull request #8 from kylemarsh/master
Sage Weil [Wed, 4 Jan 2012 23:53:51 +0000 (15:53 -0800)]
Merge pull request #8 from kylemarsh/master

Remove cloudfiles requirement from obsync.

13 years agoqa: load-gen-mix-small-long
Sage Weil [Wed, 4 Jan 2012 22:21:01 +0000 (14:21 -0800)]
qa: load-gen-mix-small-long

30 minutes

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobsync: make obsync run without cloudfiles installed 8/head
Kyle Marsh [Wed, 4 Jan 2012 22:14:17 +0000 (14:14 -0800)]
obsync: make obsync run without cloudfiles installed

Cloudfiles probably shouldn't be a requirement for running obsync, so this
commit makes it optional.

13 years agoosd: initialize backfill_target; include in PG operator<<
Sage Weil [Wed, 4 Jan 2012 17:52:35 +0000 (09:52 -0800)]
osd: initialize backfill_target; include in PG operator<<

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: initialize backfill_pos on activate
Sage Weil [Wed, 4 Jan 2012 17:42:02 +0000 (09:42 -0800)]
osd: initialize backfill_pos on activate

Handling of writes depends on backfill_pos being initialized (to know what
is between the leading and trailing edge of the backfill), so it needs to
be initialized at activate time to avoid badness on writes prior to
recovery starting.

- initialize during activate to last_backfill
- update on receiving the digest to maintain the invariant that
  backfill_pos = min(peer_backfill_info.start, backfill_info.start)
  in recover_backfill().

Signed-off-by: Sage Weil <sage@newdream.net>
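
The invariant in the list above can be sketched with plain strings standing in for object positions. The class and method names are hypothetical; real positions are ordered object identifiers, not strings.

```python
class BackfillStateSketch:
    def __init__(self):
        self.backfill_pos = None

    def activate(self, last_backfill):
        """Initialize at activate time so early writes see a valid position."""
        self.backfill_pos = last_backfill

    def on_digest(self, peer_start, local_start):
        """Maintain backfill_pos = min(peer start, local start)."""
        self.backfill_pos = min(peer_start, local_start)

state = BackfillStateSketch()
state.activate("obj_100")
pos_after_activate = state.backfill_pos
state.on_digest(peer_start="obj_140", local_start="obj_120")
```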
13 years agoosd: do not use incomplete peer for best info/log
Sage Weil [Sun, 1 Jan 2012 04:44:05 +0000 (20:44 -0800)]
osd: do not use incomplete peer for best info/log

For one, their stats are incomplete; if we use them we'll screw up everyone
else.  For another, it doesn't do us any good if they are a bit ahead of
the peers: we/they may not even have the objects their newer log says were
updated.  The only real use is if their log extends farther back in time,
but that is a problem in general that we'll eventually solve in other ways.

On the other hand, having the pg_stats sum only through last_backfill may
not have been the best choice; we could avoid that part of things by adding
an objects_backfilled field.  But this is probably a good idea anyway.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix misdirect check for requests with old epochs
Sage Weil [Wed, 4 Jan 2012 18:46:35 +0000 (10:46 -0800)]
osd: fix misdirect check for requests with old epochs

get_map() assumes the epoch passed is valid.  Check here in the caller.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: check that we're supposed to be getting a PG before waitlisting requests.
Sage Weil [Wed, 4 Jan 2012 18:44:12 +0000 (10:44 -0800)]
osd: check that we're supposed to be getting a PG before waitlisting requests.

This was broken in fa722de6708d3e92037df6289cc29ece12c8ea66.

Fix it by checking if the mapping was correct in the sender's epoch, and
either drop it (if valid) or handle_misdirected_request() if not.

Also fix the documentation for op_is_queueable() to not be a gigantic lie.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorados: gracefully report errors from 'ls'
Sage Weil [Wed, 4 Jan 2012 18:40:06 +0000 (10:40 -0800)]
rados: gracefully report errors from 'ls'

Catch the exception thrown by the iterator when the OSD returns errors.

Signed-off-by: Sage Weil <sage@newdream.net>
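
The pattern described above, catching an error raised partway through iteration and reporting it instead of crashing, can be sketched as follows. The generator simulates a listing that fails; none of these names belong to the rados tool or to librados.

```python
def failing_listing():
    """Simulated object listing that fails after two entries."""
    yield "obj1"
    yield "obj2"
    raise OSError(5, "simulated OSD error")

def list_objects(iterator):
    """Collect names; on error, return what was listed plus a message."""
    names = []
    try:
        for name in iterator:
            names.append(name)
    except OSError as err:
        return names, "error listing objects: %s" % err.strerror
    return names, None

listed, message = list_objects(failing_listing())
```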