git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
13 years ago rados: load-gen: wake up on reply
Sage Weil [Mon, 16 Jan 2012 18:30:38 +0000 (10:30 -0800)]
rados: load-gen: wake up on reply

So we can send requests more than once per second.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rados: fix load-gen 'max-ops'
Sage Weil [Mon, 16 Jan 2012 18:25:00 +0000 (10:25 -0800)]
rados: fix load-gen 'max-ops'

This was mixed up with min/max_op_len.  And max_ops wasn't being used in
the initial object creation stage (flooding the OSDs), or during run().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: recover_primary_got() -> recover_got()
Sage Weil [Mon, 16 Jan 2012 17:46:10 +0000 (09:46 -0800)]
osd: recover_primary_got() -> recover_got()

This is called on primary and replicas alike.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: clear missing set on replica when restarting backfill
Sage Weil [Mon, 16 Jan 2012 17:34:47 +0000 (09:34 -0800)]
osd: clear missing set on replica when restarting backfill

The primary does the same in PG::activate().

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: rev osd internal cluster protocol
Sage Weil [Sat, 14 Jan 2012 01:13:34 +0000 (17:13 -0800)]
osd: rev osd internal cluster protocol

Prevent backfill code from talking to pre-backfill code.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago ReplicatedPG: munge truncate_seq 1/truncate_size -1 to seq 0/size 0
Samuel Just [Fri, 13 Jan 2012 19:15:42 +0000 (11:15 -0800)]
ReplicatedPG: munge truncate_seq 1/truncate_size -1 to seq 0/size 0

Truncate with seq 1 and size -1 is a noop.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote branch 'gh/master' into wip-backfill
Sage Weil [Fri, 13 Jan 2012 16:35:47 +0000 (08:35 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years ago ReplicatedPG: Update stat accounting for truncate during write
Samuel Just [Fri, 13 Jan 2012 01:07:35 +0000 (17:07 -0800)]
ReplicatedPG: Update stat accounting for truncate during write

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago rgw: wrap cls_cxx_map_* with try/catch around decoding
Yehuda Sadeh [Fri, 13 Jan 2012 00:39:30 +0000 (16:39 -0800)]
rgw: wrap cls_cxx_map_* with try/catch around decoding

13 years ago rgw: bucket index creation and init in a single operation
Yehuda Sadeh [Fri, 13 Jan 2012 00:22:20 +0000 (16:22 -0800)]
rgw: bucket index creation and init in a single operation

13 years ago librados: add ObjectOperation::exec
Yehuda Sadeh [Fri, 13 Jan 2012 00:17:56 +0000 (16:17 -0800)]
librados: add ObjectOperation::exec

13 years ago secret: move null check before strlen(key_name) deref
Sage Weil [Thu, 12 Jan 2012 23:25:21 +0000 (15:25 -0800)]
secret: move null check before strlen(key_name) deref

Coverity cid: 98
Signed-off-by: Sage Weil <sage@newdream.net>
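
The ordering this commit describes can be sketched as follows. This is a minimal illustration, not the code the commit touched: `checked_key_len` is a hypothetical helper, and only the `key_name` parameter name comes from the commit title.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not the function changed by the commit:
// returns the length of key_name, or 0 when the pointer is null.
std::size_t checked_key_len(const char *key_name) {
  if (key_name == nullptr)       // test the pointer first...
    return 0;
  return std::strlen(key_name);  // ...and only then dereference it
}
```

Calling `std::strlen` on a null pointer is undefined behavior, which is why the check has to come before the call rather than after it.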
13 years ago osd: stat op, don't compare in memory state to object
Yehuda Sadeh [Fri, 13 Jan 2012 00:10:02 +0000 (16:10 -0800)]
osd: stat op, don't compare in memory state to object

It might be that the object is being created by the current compound request.

13 years ago osd: fill in empty item in peer_missing for strays
Sage Weil [Thu, 12 Jan 2012 23:09:18 +0000 (15:09 -0800)]
osd: fill in empty item in peer_missing for strays

If we search_for_missing() on a host, make a corresponding entry in our
peer_missing map (if it isn't already there).  This ensures we get (empty)
entries for strays, which makes all_unfound_are_queried_or_lost() happy.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rgw: don't crash when copying a zero sized object
Yehuda Sadeh [Thu, 12 Jan 2012 23:02:09 +0000 (15:02 -0800)]
rgw: don't crash when copying a zero sized object

13 years ago ReplicatedPG: Do a write even for 0 length operation
Samuel Just [Thu, 12 Jan 2012 21:13:47 +0000 (13:13 -0800)]
ReplicatedPG: Do a write even for 0 length operation

Otherwise, a 0 length write to an offset past the end of the file will
cause the internal accounting to reflect the full size of the file, but
not the file on disk.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago ReplicatedPG: fix stat accounting error in CEPH_OSD_OP_WRITEFULL
Samuel Just [Thu, 12 Jan 2012 21:12:55 +0000 (13:12 -0800)]
ReplicatedPG: fix stat accounting error in CEPH_OSD_OP_WRITEFULL

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago qa/client/gen-1774.sh
Sage Weil [Thu, 12 Jan 2012 20:59:07 +0000 (12:59 -0800)]
qa/client/gen-1774.sh

Capture Alexandre's script for reproducing #1774 here for posterity, until
we write a properly harnessed test for this.  Currently, workunits can't
mount/unmount, and we don't have a way to make ceph-fuse drop its cache.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: fix PG::Log::copy_up_to() tail
Sage Weil [Thu, 12 Jan 2012 19:46:27 +0000 (11:46 -0800)]
osd: fix PG::Log::copy_up_to() tail

The tail needs to refer to the entry preceding the first entry in the
log.  This updates copy_up_to() to match the basic structure of the other
copy_*() methods.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: reset last_complete on backfill restart
Sage Weil [Thu, 12 Jan 2012 19:07:02 +0000 (11:07 -0800)]
osd: reset last_complete on backfill restart

Since last_backfill is hobject_t(), we can set this equal to last_update.
This fixes a problem where last_complete precedes the abbreviated log we
send to the replica below.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago client: avoid taking inode ref in case of nonexistent dir
Andrey Stepachev [Thu, 12 Jan 2012 15:26:34 +0000 (19:26 +0400)]
client: avoid taking inode ref in case of nonexistent dir

Signed-off-by: Andrey Stepachev <octo@yandex-team.ru>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge branch 'wip-makefile'
Sage Weil [Thu, 12 Jan 2012 18:35:03 +0000 (10:35 -0800)]
Merge branch 'wip-makefile'

13 years ago COPYING: note licenses for all files, not just the default
Sage Weil [Thu, 12 Jan 2012 18:01:40 +0000 (10:01 -0800)]
COPYING: note licenses for all files, not just the default

This (mostly) copies debian/copyright for now, but there are format
restrictions for that file.  Suggestions for a cleaner way to handle this
are welcome.  In the meantime, this is better...

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago debian/copyright: note acx_pthread.m4 license
Sage Weil [Thu, 12 Jan 2012 17:58:21 +0000 (09:58 -0800)]
debian/copyright: note acx_pthread.m4 license

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Makefile: Add headers that were omitted in make dist and prevented tests from building
Kacper Kowalik (Xarthisius) [Sat, 7 Jan 2012 15:10:43 +0000 (16:10 +0100)]
Makefile: Add headers that were omitted in make dist and prevented tests from building

Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org>
13 years ago Makefile: Handle corner case of crypto++ correctly
Kacper Kowalik (Xarthisius) [Sat, 7 Jan 2012 15:02:45 +0000 (16:02 +0100)]
Makefile: Handle corner case of crypto++ correctly

i.e. use c++ while compiling, append to CRYPTO_LIBS instead of LIBS

Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org>
13 years ago Makefile: Use ACX_PTHREAD in configure.ac and resulting flags in src/Makefile.am
Kacper Kowalik (Xarthisius) [Sat, 7 Jan 2012 14:32:17 +0000 (15:32 +0100)]
Makefile: Use ACX_PTHREAD in configure.ac and resulting flags in src/Makefile.am

instead of hardcoded flags

Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org>
13 years ago Makefile: Add recent acx_pthread.m4 that has a fix for nostdlib issue.
Kacper Kowalik (Xarthisius) [Sat, 7 Jan 2012 13:43:22 +0000 (14:43 +0100)]
Makefile: Add recent acx_pthread.m4 that has a fix for nostdlib issue.

See http://code.google.com/p/protobuf/issues/detail?id=188 for details

Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org>
13 years ago PG: gen_prefix should grab a map reference atomically
Samuel Just [Wed, 11 Jan 2012 21:20:17 +0000 (13:20 -0800)]
PG: gen_prefix should grab a map reference atomically

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago rgw-admin: add pool rm and pools list
Yehuda Sadeh [Wed, 11 Jan 2012 21:37:37 +0000 (13:37 -0800)]
rgw-admin: add pool rm and pools list

13 years ago rgw-admin: clean up unused commands
Yehuda Sadeh [Wed, 11 Jan 2012 21:05:47 +0000 (13:05 -0800)]
rgw-admin: clean up unused commands

13 years ago osd: bound log we send when restarting backfill
Sage Weil [Wed, 11 Jan 2012 21:04:11 +0000 (13:04 -0800)]
osd: bound log we send when restarting backfill

Use the new tunable from b1da5115aa0756aefa4f0aad36395911e82fce28.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rados.py: avoid getting return value of void function
Josh Durgin [Wed, 11 Jan 2012 20:20:47 +0000 (12:20 -0800)]
rados.py: avoid getting return value of void function

rados_ioctx_locator_set_key is void. The return value seems to have
been uninitialized, so the tests failed rarely.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago pg: remove unnecessary guard from calc_trim_to()
Josh Durgin [Tue, 10 Jan 2012 22:19:12 +0000 (14:19 -0800)]
pg: remove unnecessary guard from calc_trim_to()

The num_objects check doesn't make sense, and could only make trimming
happen more often than it should. Sage did not remember why it was
added.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago pg: add a configurable lower bound on log size
Josh Durgin [Tue, 10 Jan 2012 22:16:41 +0000 (14:16 -0800)]
pg: add a configurable lower bound on log size

This helps prevent problems with retrying requests being detected as
duplicates. The problem occurs when the log is trimmed too
aggressively, and an earlier tid is removed from the log, while a
later one is not. The later request will be detected as a duplicate
and responded to immediately, possibly violating the ordering of the
requests.

Partially fixes #1490.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
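
One way to picture a lower bound on log size is a trim calculation that never lets the log shrink below a configured minimum. The sketch below is an assumption made for illustration: `trimmable_entries` and `min_entries` are hypothetical names, and the actual calc_trim_to() logic is not shown in this log.

```cpp
#include <cstddef>

// Hypothetical sketch: how many of the oldest entries may be trimmed
// while still keeping at least min_entries in the log.
std::size_t trimmable_entries(std::size_t log_size, std::size_t min_entries) {
  // Never trim when the log is already at or below the lower bound.
  return log_size > min_entries ? log_size - min_entries : 0;
}
```

Keeping a minimum number of recent entries makes it less likely that an earlier tid is trimmed while a later one survives, which is the situation the commit message describes.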
13 years ago Merge remote branch 'gh/master' into wip-backfill
Sage Weil [Wed, 11 Jan 2012 18:34:35 +0000 (10:34 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years ago osd: limit size of log sent to reset backfill targets
Sage Weil [Wed, 11 Jan 2012 14:41:13 +0000 (06:41 -0800)]
osd: limit size of log sent to reset backfill targets

Need to replace magic number with new tunable, once that is merged.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago client: start caching readdir results after readdir_start
Alexandre Oliva [Tue, 10 Jan 2012 03:41:45 +0000 (01:41 -0200)]
client: start caching readdir results after readdir_start

Use upper_bound rather than lower_bound to compute the initial pd within
insert_trace, so that we don't attempt to remove it if it happens to be
in the same frag as the new reply.

Fixes: #1774
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
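
The difference between the two lookups named in this commit can be shown on a plain ordered container. This is an illustrative sketch only: `DirMap`, its integer keys, and the helper names are made up, and the real container used by insert_trace is not shown in this log.

```cpp
#include <map>
#include <string>

// Illustrative stand-in for an ordered map of directory entries.
using DirMap = std::map<int, std::string>;

// lower_bound returns the first entry with key >= k, so an entry stored
// exactly at k is the one returned. Returns -1 when there is no such entry.
int first_key_lower(const DirMap &m, int k) {
  auto it = m.lower_bound(k);
  return it == m.end() ? -1 : it->first;
}

// upper_bound returns the first entry with key > k, so an entry stored
// exactly at k is skipped. Returns -1 when there is no such entry.
int first_key_upper(const DirMap &m, int k) {
  auto it = m.upper_bound(k);
  return it == m.end() ? -1 : it->first;
}
```

With keys 10, 20 and 30, a lookup at 20 yields 20 through `lower_bound` and 30 through `upper_bound`; the two agree only when the probed key is absent from the map.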
13 years ago monclient: fix resolve_addrs() call
Sage Weil [Wed, 11 Jan 2012 00:39:23 +0000 (16:39 -0800)]
monclient: fix resolve_addrs() call

This was broken in def36668a13459d9c0851e4d4da440a288f9a34f it looks like.
Passing uninitialized memory to resolve_addrs(), and needlessly
allocating a buffer.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago resolve_addrs: return ipv4 and ipv6 addrs
Sage Weil [Wed, 11 Jan 2012 00:35:40 +0000 (16:35 -0800)]
resolve_addrs: return ipv4 and ipv6 addrs

Fixes: #1891
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago ReplicatedPG: fix typo in stats accounting in _rollback_to
Samuel Just [Wed, 11 Jan 2012 00:21:13 +0000 (16:21 -0800)]
ReplicatedPG: fix typo in stats accounting in _rollback_to

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago osd: send log with backfill restart
Sage Weil [Wed, 11 Jan 2012 00:14:13 +0000 (16:14 -0800)]
osd: send log with backfill restart

This makes backfill restart less of a special case: we send an info AND
log, just like we do normally.  Code paths are more similar than before.

The main change here is that the backfill target gets a pg log with recent
history, which allows it to more reliably detect dup operations.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: fail to peer if interval lacks any !incomplete replicas
Sage Weil [Tue, 10 Jan 2012 21:23:00 +0000 (13:23 -0800)]
osd: fail to peer if interval lacks any !incomplete replicas

We need at least one non-incomplete replica during a rw interval in order
to peer.  The backfilling/incomplete replicas get log entries, but not
all object writes, so they are (mostly) excluded from the peering process
(find_best_info(), in particular).

We can't do this during the PriorSet calculation because we don't have
their PG::Info yet.  But, once we get it, we need to make sure at least one
of the replicas during the last rw interval is not incomplete, or else we
should mark the pg DOWN (just like the PriorSet calculation does).

This logic mostly mirrors that of PriorSet, but additionally requires
the replicas be !incomplete.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: allow specifying pg_num and pgp_num when creating new pools.
Greg Farnum [Tue, 10 Jan 2012 19:25:25 +0000 (11:25 -0800)]
mon: allow specifying pg_num and pgp_num when creating new pools.

Right now this is only exposed via the monitor command interface:
osd pool create <poolname> [pg_num [pgp_num]]
but it can be expanded to other interfaces as appropriate.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago auth: Fix Doxygen warnings.
Tommi Virtanen [Tue, 10 Jan 2012 19:11:35 +0000 (11:11 -0800)]
auth: Fix Doxygen warnings.

Match prototype and implementation argument names and types
(textually; that is, use the std:: prefix).

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago Fix several doxygen warnings, to minimize noise. Only changes comments.
Tommi Virtanen [Tue, 10 Jan 2012 18:08:52 +0000 (10:08 -0800)]
Fix several doxygen warnings, to minimize noise. Only changes comments.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago librados: Make API docs use @note instead of @bug for now.
Tommi Virtanen [Tue, 10 Jan 2012 18:07:18 +0000 (10:07 -0800)]
librados: Make API docs use @note instead of @bug for now.

Asphyxiate doesn't yet support all of the Doxygen markup.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago FileStore: assert on ENOSPC even for SETXATTR
Samuel Just [Tue, 10 Jan 2012 19:16:21 +0000 (11:16 -0800)]
FileStore: assert on ENOSPC even for SETXATTR

Otherwise we can get corrupt object attributes on ext*.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago mds: initiate monitor reconnect if beacon acks take too long
Greg Farnum [Tue, 10 Jan 2012 18:41:36 +0000 (10:41 -0800)]
mds: initiate monitor reconnect if beacon acks take too long

If it takes 2*mds_beacon_grace seconds (30 seconds total by default)
to get an ack back, maybe it's the monitor and not us. Try a reconnect,
which will just add the teensiest bit of load if we're wrong.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
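
The timeout rule in this message can be sketched as a single comparison. The function and parameter names are hypothetical; `beacon_grace` stands in for the mds_beacon_grace setting, and the 15 second default is inferred from the "30 seconds total" figure in the message rather than read from the code.

```cpp
// Hypothetical sketch of the reconnect decision; not the MDS code itself.
// beacon_grace mirrors mds_beacon_grace; 15 s is inferred from the commit
// message's "30 seconds total" for 2 * grace.
bool should_reconnect_to_monitor(double seconds_since_last_ack,
                                 double beacon_grace = 15.0) {
  // Reconnect only once the ack has been outstanding for more than
  // twice the grace period.
  return seconds_since_last_ack > 2.0 * beacon_grace;
}
```

Under these assumptions, an ack that is 29 seconds overdue does not trigger a reconnect, while one that is 31 seconds overdue does.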
13 years ago mds: remove beacon_killer code.
Greg Farnum [Tue, 10 Jan 2012 18:32:43 +0000 (10:32 -0800)]
mds: remove beacon_killer code.

This no longer does *anything* except print out
useless warning messages.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago osd: make less noise when filestore is already up to date
Sage Weil [Tue, 10 Jan 2012 17:49:41 +0000 (09:49 -0800)]
osd: make less noise when filestore is already up to date

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago doc: add librados C aio example
Josh Durgin [Tue, 10 Jan 2012 03:02:05 +0000 (19:02 -0800)]
doc: add librados C aio example

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: describe some rados_pool_stat_t members
Josh Durgin [Tue, 10 Jan 2012 02:58:27 +0000 (18:58 -0800)]
doc: describe some rados_pool_stat_t members

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: add librados pool creation defaults
Josh Durgin [Tue, 10 Jan 2012 02:29:04 +0000 (18:29 -0800)]
doc: add librados pool creation defaults

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: add short section on documenting code
Josh Durgin [Thu, 29 Dec 2011 23:57:33 +0000 (15:57 -0800)]
doc: add short section on documenting code

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: clarify librados return codes
Josh Durgin [Thu, 29 Dec 2011 00:00:25 +0000 (16:00 -0800)]
doc: clarify librados return codes

Adding a second @returns for specific error codes makes the sphinx output more readable.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: @return -> @returns to match the sphinx output
Josh Durgin [Wed, 28 Dec 2011 22:26:10 +0000 (14:26 -0800)]
doc: @return -> @returns to match the sphinx output

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: standardize rados_tmap_* docs
Josh Durgin [Wed, 28 Dec 2011 22:17:39 +0000 (14:17 -0800)]
doc: standardize rados_tmap_* docs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: fix rados_version todo formatting
Josh Durgin [Wed, 28 Dec 2011 22:04:28 +0000 (14:04 -0800)]
doc: fix rados_version todo formatting

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: add a prefix to group names in librados.h
Josh Durgin [Wed, 28 Dec 2011 22:01:25 +0000 (14:01 -0800)]
doc: add a prefix to group names in librados.h

doxygen groups are in a global namespace.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: Put rados_ioctx_locator_set_key in a group so it can be cross-referenced
Josh Durgin [Wed, 28 Dec 2011 21:46:44 +0000 (13:46 -0800)]
doc: Put rados_ioctx_locator_set_key in a group so it can be cross-referenced

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: move rados_ioctx_get_id to the pool group
Josh Durgin [Wed, 28 Dec 2011 21:29:44 +0000 (13:29 -0800)]
doc: move rados_ioctx_get_id to the pool group

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: fix some typos in librados C API
Josh Durgin [Wed, 28 Dec 2011 21:25:05 +0000 (13:25 -0800)]
doc: fix some typos in librados C API

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: Switch doxygen integration from breathe to asphyxiate.
Tommi Virtanen [Fri, 23 Dec 2011 01:17:20 +0000 (17:17 -0800)]
doc: Switch doxygen integration from breathe to asphyxiate.

TODO: path of librados.h is now just the basename

TODO: no enum support for now

TODO: no @bug support for now

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago librados: Avoid using "crush_rule" as name of function argument.
Tommi Virtanen [Fri, 23 Dec 2011 01:09:23 +0000 (17:09 -0800)]
librados: Avoid using "crush_rule" as name of function argument.

"struct crush_rule" exists already, using the same identifier
confuses Doxygen.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago doxygen: Use first sentence as brief description.
Tommi Virtanen [Fri, 23 Dec 2011 00:45:24 +0000 (16:45 -0800)]
doxygen: Use first sentence as brief description.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
13 years ago doc: add configuration and connecting to librados C api example
Josh Durgin [Tue, 20 Dec 2011 22:47:02 +0000 (14:47 -0800)]
doc: add configuration and connecting to librados C api example

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago doc: add librados C api docs
Josh Durgin [Fri, 16 Dec 2011 21:57:38 +0000 (13:57 -0800)]
doc: add librados C api docs

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago ceph: add a new "run_uml.sh" script to manage running a UML client
Alex Elder [Tue, 10 Jan 2012 02:13:41 +0000 (18:13 -0800)]
ceph: add a new "run_uml.sh" script to manage running a UML client

This script is used to automate most of what's required to run a
User-Mode Linux (UML) instance.  This is mainly of interest for
ceph client developers who might benefit from the debugger access
that UML affords.  It was written for ceph development but isn't
really dependent on ceph.  It basically makes a few assumptions and
follows some conventions, and in doing so is able to encapsulate
most of the "tricky parts" of setting up to run a UML instance.

Signed-off-by: Alex Elder <elder@dreamhost.com>
13 years ago rgw: adjust log level
Yehuda Sadeh [Mon, 9 Jan 2012 19:40:44 +0000 (11:40 -0800)]
rgw: adjust log level

13 years ago rgw: some cleanup
Yehuda Sadeh [Mon, 9 Jan 2012 18:31:15 +0000 (10:31 -0800)]
rgw: some cleanup

13 years ago rgw: only use plain PUT processor when !chunked_upload
Yehuda Sadeh [Mon, 9 Jan 2012 18:15:00 +0000 (10:15 -0800)]
rgw: only use plain PUT processor when !chunked_upload

13 years ago Merge remote branch 'gh/master' into wip-backfill
Sage Weil [Mon, 9 Jan 2012 17:13:03 +0000 (09:13 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years ago osd: populate_obc_watchers when object pulled to primary
Sage Weil [Mon, 9 Jan 2012 00:23:55 +0000 (16:23 -0800)]
osd: populate_obc_watchers when object pulled to primary

We don't care about degraded state, only whether the object is on the
primary so that we can load the object_info_t.

In particular, this avoids problems with backfill, where an object is
not degraded and populated, is then degraded while we backfill to the
target, and then not degraded again, and populate_obc_watchers() is called
a second time.

Fixes: #1903
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: handle case where no acceptable info exists
Sage Weil [Sun, 8 Jan 2012 23:15:18 +0000 (15:15 -0800)]
osd: handle case where no acceptable info exists

This happens when the only available replica has last_backfill != MAX.

In that case, revert to up, and then set the DOWN state bit.

Instead of waiting for a new map, we should actually wait for a new info
to show up...

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote branch 'gh/wip-osd-retry-attempt'
Sage Weil [Sun, 8 Jan 2012 18:15:41 +0000 (10:15 -0800)]
Merge remote branch 'gh/wip-osd-retry-attempt'

13 years ago Merge remote branch 'gh/wip-admin-socket'
Sage Weil [Sun, 8 Jan 2012 16:16:56 +0000 (08:16 -0800)]
Merge remote branch 'gh/wip-admin-socket'

13 years ago perfcounters: fix unittest for new admin_socket interface
Sage Weil [Sat, 7 Jan 2012 03:09:10 +0000 (19:09 -0800)]
perfcounters: fix unittest for new admin_socket interface

Broken by b389685afa1be00b5147855bf71c50042bfbfa6c.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Makefile: disable untitest_interval_tree
Sage Weil [Sat, 7 Jan 2012 04:39:05 +0000 (20:39 -0800)]
Makefile: disable untitest_interval_tree

Segfaults. Valgrind errors. Accessing uninitialized memory.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago unittest_interval_tree: make it compile
Sage Weil [Sat, 7 Jan 2012 04:38:33 +0000 (20:38 -0800)]
unittest_interval_tree: make it compile

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: clean up src_oid, src_obc map key calculation
Sage Weil [Sat, 7 Jan 2012 01:18:01 +0000 (17:18 -0800)]
osd: clean up src_oid, src_obc map key calculation

Be consistent about how we generate the src_oid and src_oloc, so that we
feed a good value into find_object_context and use a consistent key for
the src_obc map<>.  This fixes a crash in do_osd_ops() due to a missing
src_obc key when get_src_oloc() normalizes the key in do_op() but not
in do_osd_ops().

Also use a nicer name.

Fixes: #1897
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: read op should claim_append data instead of claim
Yehuda Sadeh [Sat, 7 Jan 2012 00:55:52 +0000 (16:55 -0800)]
osd: read op should claim_append data instead of claim

13 years ago rgw: remove object before writing both xattrs and data
Yehuda Sadeh [Sat, 7 Jan 2012 00:51:23 +0000 (16:51 -0800)]
rgw: remove object before writing both xattrs and data

otherwise we'll leak xattrs from previous incarnation

13 years ago rgw: create plain processor for small objects
Yehuda Sadeh [Sat, 7 Jan 2012 00:16:06 +0000 (16:16 -0800)]
rgw: create plain processor for small objects

13 years ago rgw: fix multipart PUT
Yehuda Sadeh [Fri, 6 Jan 2012 23:07:08 +0000 (15:07 -0800)]
rgw: fix multipart PUT

latest revamp broke it, missed calling RGWPutObjProcessor::prepare(s)
where needed.

13 years ago rgw: rearrange PutObj::execute()
Yehuda Sadeh [Fri, 6 Jan 2012 20:41:33 +0000 (12:41 -0800)]
rgw: rearrange PutObj::execute()

groundwork for different handling of small object PUTs

13 years ago rgw: different atomic handling for small objects
Yehuda Sadeh [Thu, 5 Jan 2012 20:51:27 +0000 (12:51 -0800)]
rgw: different atomic handling for small objects

13 years ago Merge remote branch 'gh/master' into wip-backfill
Sage Weil [Sat, 7 Jan 2012 00:44:11 +0000 (16:44 -0800)]
Merge remote branch 'gh/master' into wip-backfill

13 years ago mon: fix uninitialized cluster_logger_registered
Sage Weil [Fri, 6 Jan 2012 22:32:16 +0000 (14:32 -0800)]
mon: fix uninitialized cluster_logger_registered

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago objecter: ignore replies from old request attempts
Sage Weil [Fri, 6 Jan 2012 19:38:15 +0000 (11:38 -0800)]
objecter: ignore replies from old request attempts

If we know the request attempt, ignore old attempts.

If we do not know the attempt (because the server is old), accept the
reply.  This could lead to doing some ACK callbacks we shouldn't in
extreme failure/recovery scenarios, but that is better than doing
the callbacks out of order.

Partially fixes: #1490
Signed-off-by: Sage Weil <sage@newdream.net>
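
The accept-or-ignore rule described in this message can be sketched as below. All names are hypothetical and this is not the Objecter's actual code; the -1 sentinel for an unknown attempt is borrowed from the adjacent commit "osd: encode retry attempt in MOSDOp[Reply]".

```cpp
// Hypothetical sketch; names are illustrative, not Objecter fields.
// -1 means the reply carries no attempt number (an old server).
constexpr int ATTEMPT_UNKNOWN = -1;

// current_attempt: attempt number of the request as most recently sent.
// reply_attempt:   attempt number echoed in the reply, or ATTEMPT_UNKNOWN.
bool should_accept_reply(int current_attempt, int reply_attempt) {
  if (reply_attempt == ATTEMPT_UNKNOWN)
    return true;  // cannot tell which attempt this answers: accept it
  // Ignore replies that answer an earlier attempt of the same request.
  return reply_attempt >= current_attempt;
}
```

Accepting replies of unknown attempt matches the trade-off stated in the message: a few extra ACK callbacks in extreme failure scenarios are preferred over callbacks delivered out of order.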
13 years ago osd: encode retry attempt in MOSDOp[Reply]
Sage Weil [Fri, 6 Jan 2012 20:49:13 +0000 (12:49 -0800)]
osd: encode retry attempt in MOSDOp[Reply]

In addition to the boolean flag, also encode the exact retry attempt.

Return -1 if we don't know.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: document quorum_status, mon_status
Sage Weil [Fri, 6 Jan 2012 20:20:18 +0000 (12:20 -0800)]
mon: document quorum_status, mon_status

Fixes: #1824
Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: fix misplaced else
Sage Weil [Fri, 6 Jan 2012 20:19:59 +0000 (12:19 -0800)]
mon: fix misplaced else

Broken by 435c29448a10ec343f5a2b7195d94c72de5b1a25.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote branch 'gh/wip-mon-timeouts'
Sage Weil [Fri, 6 Jan 2012 18:20:55 +0000 (10:20 -0800)]
Merge remote branch 'gh/wip-mon-timeouts'

13 years ago ceph: speak new admin socket protocol
Sage Weil [Thu, 5 Jan 2012 21:58:05 +0000 (13:58 -0800)]
ceph: speak new admin socket protocol

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago admin_socket: fix, extend admin_socket unit tests
Sage Weil [Fri, 6 Jan 2012 17:30:54 +0000 (09:30 -0800)]
admin_socket: fix, extend admin_socket unit tests

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago admin_socket: string commands
Sage Weil [Thu, 5 Jan 2012 21:57:58 +0000 (13:57 -0800)]
admin_socket: string commands

Commands are strings.  Old __be32 works too.  'help' to list available
commands.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: elector needs to reset leader_acked on every election start
Greg Farnum [Thu, 5 Jan 2012 23:29:32 +0000 (15:29 -0800)]
mon: elector needs to reset leader_acked on every election start

Otherwise you never reset the leader_acked after a failed
election attempt, so if mon 0 is available on the first round
but then fails, you never make progress!

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago mon: instrument elector so you can stop participating in the quorum
Greg Farnum [Thu, 5 Jan 2012 23:36:37 +0000 (15:36 -0800)]
mon: instrument elector so you can stop participating in the quorum

Add new monitor commands "quorum exit" and "quorum enter" to use it.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago mon: kill client sessions when we're not in quorum
Greg Farnum [Thu, 5 Jan 2012 22:03:43 +0000 (14:03 -0800)]
mon: kill client sessions when we're not in quorum

After a timeout of 2*mon_lease length (i.e., two election rounds),
kill existing client sessions so they can reconnect to a
monitor that's (hopefully) remained in the quorum. Let any
new client sessions stick around for a mon_lease interval, then
do the same to them.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>