git.apps.os.sepia.ceph.com Git - ceph.git/log
13 years agoosd: restart peering if requesting acting osd goes down
Sage Weil [Tue, 31 Jan 2012 17:53:32 +0000 (09:53 -0800)]
osd: restart peering if requesting acting osd goes down

If we request an acting set, we need to restart peering if one of the
requested nodes goes down.  This prevents a deadlock where we get stuck
in WaitActingChange because we have [a,b], want [a,b,c], but c is down and
our up and acting don't actually change.
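
A minimal sketch of the check described above, assuming a hypothetical
should_restart_peering() helper and plain integer OSD ids (neither is taken
from the actual PG code):

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper, not Ceph's real implementation: returns true when any
// OSD in the acting set we requested is no longer up. In that case the
// request can never be satisfied and peering should start over.
bool should_restart_peering(const std::vector<int>& want_acting,
                            const std::set<int>& up_osds) {
  return std::any_of(want_acting.begin(), want_acting.end(),
                     [&](int osd) { return up_osds.count(osd) == 0; });
}
```

With want = [a,b,c] and only a and b up, the helper reports that peering
should restart instead of waiting for a map change that never arrives.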

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: rename recovery event NeedNewMap -> NeedActingChange
Sage Weil [Tue, 31 Jan 2012 17:40:23 +0000 (09:40 -0800)]
osd: rename recovery event NeedNewMap -> NeedActingChange

This is more precise.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: use RecoveryContext transaction, finishers on recovery completion
Sage Weil [Tue, 31 Jan 2012 15:23:10 +0000 (07:23 -0800)]
osd: use RecoveryContext transaction, finishers on recovery completion

We should use the enclosing transaction and finisher list here.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoqa: test_backfill.sh: limit pg log length so we trigger backfill
Sage Weil [Tue, 31 Jan 2012 15:16:37 +0000 (07:16 -0800)]
qa: test_backfill.sh: limit pg log length so we trigger backfill

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: fix divergent backfill targets
Sage Weil [Tue, 31 Jan 2012 15:25:04 +0000 (07:25 -0800)]
osd: fix divergent backfill targets

During peering, a previous backfill target may have a slightly newer
last_update than the other options, but it will not be chosen because it
is incomplete.  That caused a failed assert during activate() (#1983).

To fix, we remove the bad assert, and then fix merge_log() so that the
replica/backfill target will trim its divergent entries when it gets the
activation MLogRec.  We also fix the handling of MInfoRec, as that can
trigger an analogous condition.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: implement filestore_blackhole hook
Sage Weil [Tue, 31 Jan 2012 01:39:23 +0000 (17:39 -0800)]
filestore: implement filestore_blackhole hook

If true, we'll drop any new transactions on the floor. Useful for
triggering failure conditions (e.g., prior to killing ceph-osd itself, to
ensure some operations don't reach the local disk).
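
As a rough illustration of the idea (a toy model only; ToyStore and its
members are hypothetical names, not the FileStore interface), a blackhole
flag simply short-circuits the queueing path:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a transaction queue; all names are illustrative.
struct ToyStore {
  bool blackhole = false;            // plays the role of filestore_blackhole
  std::vector<std::string> applied;  // transactions that "reached disk"

  // Returns true if the transaction was queued, false if it was dropped.
  bool queue_transaction(const std::string& txn) {
    if (blackhole) return false;  // drop it on the floor
    applied.push_back(txn);
    return true;
  }
};
```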

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: disable clone overlap for push/pull
Sage Weil [Mon, 30 Jan 2012 22:27:24 +0000 (14:27 -0800)]
osd: disable clone overlap for push/pull

There is a bug in the push/pull code.  Disable the recovery smarts by
default until we fix #2002.

There is currently a race (in the callers) where:
 - an adjacent clone is missing
 - we (calculate some clone overlap? and) start pulling
 - we get adjacent clone
 - we get push, calc a different overlap, and then get confused.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/wip-warnings'
Sage Weil [Mon, 30 Jan 2012 21:42:45 +0000 (13:42 -0800)]
Merge remote branch 'gh/wip-warnings'

13 years agomon: make 'osd [out|in|down]' succeed if already whatever
Sage Weil [Mon, 30 Jan 2012 05:46:53 +0000 (21:46 -0800)]
mon: make 'osd [out|in|down]' succeed if already whatever

If we want something out and it is already out, succeed.  This makes the
client command succeed if there is a transient error and it gets resent.
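
The idempotent behaviour can be sketched like this, assuming a hypothetical
mark_out() helper that tracks out OSDs in a std::set (the monitor's real
data structures differ):

```cpp
#include <cassert>
#include <set>

// Hypothetical sketch: marking an OSD "out" succeeds whether or not it was
// already out, so a resent command gets the same answer as the first one.
// Returns 0 on success, which is the only outcome modelled here.
int mark_out(std::set<int>& out_osds, int osd) {
  out_osds.insert(osd);  // no-op if the OSD is already in the set
  return 0;
}
```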

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoqa: encoding: silence warning
Sage Weil [Mon, 30 Jan 2012 05:05:08 +0000 (21:05 -0800)]
qa: encoding: silence warning

This is cheating, but we always use this class with int types, so it makes
this go away:

warning: test/encoding.cc:79:20: ‘*((void*)(& tu)+4).ConstructorCounter::data’ may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoqa: test/gather fix warning
Sage Weil [Mon, 30 Jan 2012 04:56:03 +0000 (20:56 -0800)]
qa: test/gather fix warning

warning: test/gather.cc:29:222: passing NULL to non-pointer argument 3 of ‘static testing::AssertionResult testing::internal::EqHelper::Compare(const char*, const char*, const T1&, T2*) [with T1 = long int, T2 = C_Gather]’ [-Wconversion-null]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoqa: test/rados-api/list fix warning
Sage Weil [Mon, 30 Jan 2012 04:54:18 +0000 (20:54 -0800)]
qa: test/rados-api/list fix warning

warning: test/rados-api/list.cc:43:156: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Wconversion-null]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agotest_ipaddr: reverse ASSERT_EQ order
Sage Weil [Mon, 30 Jan 2012 04:36:46 +0000 (20:36 -0800)]
test_ipaddr: reverse ASSERT_EQ order

Make these warnings go away:

warning: test/test_ipaddr.cc:217:156: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal::IsNullLiteralHelper(testing::internal::Secret*)’ [-Wconversion-null]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: remove unused var
Sage Weil [Mon, 30 Jan 2012 01:26:55 +0000 (17:26 -0800)]
osd: remove unused var

warning: osd/PG.cc:1331:20: variable 'plu' set but not used [-Wunused-but-set-variable]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoadmin_socket: fix uninit warning
Sage Weil [Mon, 30 Jan 2012 01:26:14 +0000 (17:26 -0800)]
admin_socket: fix uninit warning

warning: common/admin_socket_client.cc:166:19: 'socket_fd' may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: trim old auth states
Sage Weil [Sun, 29 Jan 2012 17:26:28 +0000 (09:26 -0800)]
mon: trim old auth states

These aren't exposed outside the monitor, so we really only keep them
around to assist in mon recovery.  Give ourselves a healthy margin over
the max join drift for that.

Fixes: #2000
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: fix rollback when current/ missing entirely
Sage Weil [Sun, 29 Jan 2012 16:48:22 +0000 (08:48 -0800)]
filestore: fix rollback when current/ missing entirely

This can happen when, while rolling back at startup, we remove current/ and
then fail before we snapshot a snap_ into place.

Most of the logic was already in place for this; we tried to fix it in
cd2dedd7d190a43a6be50a7f18849fe0123c72bc but missed this piece.

Fixes: #1999
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: reset pgstats timer when we reopen monitor session
Sage Weil [Sat, 28 Jan 2012 01:32:28 +0000 (17:32 -0800)]
osd: reset pgstats timer when we reopen monitor session

Otherwise we'll reopen every second from here on out, without giving the
new session a chance to start up and do its thing.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoclock: ignore clock_offset if cct is NULL
Sage Weil [Sat, 28 Jan 2012 19:40:08 +0000 (11:40 -0800)]
clock: ignore clock_offset if cct is NULL

This is helpful e.g. from assert.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoclient: do not send release to down mds
Sage Weil [Sat, 28 Jan 2012 17:38:46 +0000 (09:38 -0800)]
client: do not send release to down mds

We can have a session with state where the mds is not up; don't blindly
send a message or we can get

./mds/MDSMap.h: In function 'const entity_inst_t MDSMap::get_inst(int)', in thread '0x7f092aad1910'
./mds/MDSMap.h: 465: FAILED assert(up.count(m))
 ceph version 0.35-6-g6eb8862 (commit:6eb8862e91d142451e256aaa02b34c81a4f21dea)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x70) [0x71f11a]
 2: (MDSMap::get_inst(int)+0x4b) [0x6dc191]
 3: (Client::flush_cap_releases()+0x94) [0x677e60]
 4: (Client::tick()+0x1f0) [0x690adc]
 5: (C_C_Tick::finish(int)+0x1c) [0x6f3fbe]
 6: (SafeTimer::timer_thread()+0x2c5) [0x6fbfe5]
 7: (SafeTimerThread::entry()+0x19) [0x6fe399]
 8: (Thread::_entry_func(void*)+0x20) [0x72e944]
 9: /lib/libpthread.so.0 [0x7f092dea573a]
 10: (clone()+0x6d) [0x7f092cba169d]

with a map like

$ ./ceph mds dump 85
2012-01-28 09:37:19.251946 mon <- [mds,dump,85]
2012-01-28 09:37:19.252618 mon.1 -> 'dumped mdsmap epoch 85' (0)
epoch   85
flags   0
created 2012-01-28 09:24:42.411202
modified        2012-01-28 09:28:45.093301
tableserver     0
root    0
session_timeout 60
session_autoclose       300
last_failure    0
last_failure_osd_epoch  18
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object}
max_mds 1
in      0
up      {}
failed  0
stopped
data_pools      [0]
metadata_pool   1
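
A sketch of the kind of guard this implies, using a toy up map keyed by MDS
rank. UpMap and release_targets() are hypothetical names for illustration,
not the Client or MDSMap API:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model: maps an MDS rank to its address, loosely mirroring the "up"
// field in the dump above.
using UpMap = std::map<int, std::string>;

// Returns the addresses that release messages would be sent to, skipping any
// session whose MDS rank is not currently up rather than asserting.
std::vector<std::string> release_targets(const UpMap& up,
                                         const std::vector<int>& sessions) {
  std::vector<std::string> targets;
  for (int rank : sessions) {
    auto it = up.find(rank);
    if (it == up.end()) continue;  // MDS is down: nothing to send
    targets.push_back(it->second);
  }
  return targets;
}
```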

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'stable'
Sage Weil [Sat, 28 Jan 2012 18:04:45 +0000 (10:04 -0800)]
Merge branch 'stable'

13 years agosignal: use _exit() on SIGTERM
Sage Weil [Sat, 28 Jan 2012 17:26:46 +0000 (09:26 -0800)]
signal: use _exit() on SIGTERM

No need to call onexit handlers, static dtors, whatever.

This may help with #1996 and #1549.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agotest: add script for checking admin socket 'objecter_requests' output
Josh Durgin [Fri, 27 Jan 2012 19:45:26 +0000 (11:45 -0800)]
test: add script for checking admin socket 'objecter_requests' output

Just a couple internal consistency checks for now. More specific ones
would depend on workload.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoobjecter: add an admin socket command to get in-flight requests
Josh Durgin [Mon, 23 Jan 2012 21:04:14 +0000 (13:04 -0800)]
objecter: add an admin socket command to get in-flight requests

Fixes: #1881
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoadmin socket: increase debug level for successful requests
Josh Durgin [Mon, 23 Jan 2012 23:04:17 +0000 (15:04 -0800)]
admin socket: increase debug level for successful requests

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoadmin socket: add include guard
Josh Durgin [Sat, 21 Jan 2012 01:02:52 +0000 (17:02 -0800)]
admin socket: add include guard

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoCephContext: add method for retrieving admin socket
Josh Durgin [Fri, 20 Jan 2012 23:58:37 +0000 (15:58 -0800)]
CephContext: add method for retrieving admin socket

This is needed to allow higher layers in the stack to add admin socket
commands.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoMerge branch 'wip-pg-stale'
Sage Weil [Sat, 28 Jan 2012 00:40:53 +0000 (16:40 -0800)]
Merge branch 'wip-pg-stale'

13 years agomon: stale pgs -> HEALTH_WARN
Sage Weil [Fri, 27 Jan 2012 21:27:27 +0000 (13:27 -0800)]
mon: stale pgs -> HEALTH_WARN

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agomon: mark pgs stale in pg_map if primary osd is down
Sage Weil [Fri, 27 Jan 2012 21:21:39 +0000 (13:21 -0800)]
mon: mark pgs stale in pg_map if primary osd is down

This alerts the administrator when all OSDs for a PG have failed and the
monitor doesn't receive any further updates.  Otherwise we may continue
to think a pg is active+clean when it is in fact offline.
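
A minimal sketch of the marking step, assuming a hypothetical ToyPGStat
record and a set of down OSD ids (the real pg_map entries carry far more
state):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical per-PG record used only for this illustration.
struct ToyPGStat {
  int primary = -1;    // OSD id of the PG's primary
  bool stale = false;  // set once the primary can no longer report
};

// Flags every PG whose primary is down; returns how many were newly flagged.
int mark_stale_pgs(std::map<std::string, ToyPGStat>& pgs,
                   const std::set<int>& down_osds) {
  int marked = 0;
  for (auto& kv : pgs) {
    ToyPGStat& st = kv.second;
    if (!st.stale && down_osds.count(st.primary)) {
      st.stale = true;
      ++marked;
    }
  }
  return marked;
}
```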

Fixes: #1993
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: add STALE pg state bit
Sage Weil [Fri, 27 Jan 2012 21:02:28 +0000 (13:02 -0800)]
osd: add STALE pg state bit

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agov0.41 v0.41
Sage Weil [Fri, 27 Jan 2012 18:42:21 +0000 (10:42 -0800)]
v0.41

13 years agoobjecter: document Objecter::init_ops()
Sage Weil [Fri, 27 Jan 2012 20:23:33 +0000 (12:23 -0800)]
objecter: document Objecter::init_ops()

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoobjecter: fix out_* initialization
Sage Weil [Fri, 27 Jan 2012 20:23:23 +0000 (12:23 -0800)]
objecter: fix out_* initialization

This looks more like the real cause for #1986.  Op ctor gets a vector of
ops but out_* aren't initialized to match.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: dump offending transaction on any error
Sage Weil [Fri, 27 Jan 2012 18:41:50 +0000 (10:41 -0800)]
filestore: dump offending transaction on any error

Clean this code up to explicitly whitelist what is ok so that the flow is
less annoying to follow/maintain, and so that we dump the transaction
contents on whitelisted errors.

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoobjecter: warn when OSD returns mismatched op vector
Sage Weil [Fri, 27 Jan 2012 18:40:14 +0000 (10:40 -0800)]
objecter: warn when OSD returns mismatched op vector

The osd shouldn't do this (even though we should tolerate it).

Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoobjecter: fix bounds checking on op reply demuxing
Sage Weil [Fri, 27 Jan 2012 18:39:49 +0000 (10:39 -0800)]
objecter: fix bounds checking on op reply demuxing

We can't assume that the size of out_ops (from the reply) matches the
op->out_* vectors from our request state.  In particular, the out_ops might
be shorter than what we sent the OSD if the OSD was sloppy.  Check them.

We can assume that op->ops and op->out_* all match; assert as much in
op_submit().
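
The bounds check can be sketched as follows. demux_reply() and the
string-based payloads are assumptions made for illustration; the real code
works on OSDOp vectors and bufferlists:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Copies reply payloads into the caller's output slots without indexing past
// the end of either vector. Returns the number of reply entries consumed.
std::size_t demux_reply(const std::vector<std::string>& reply_ops,
                        std::vector<std::string*>& out_slots) {
  const std::size_t n = std::min(reply_ops.size(), out_slots.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (out_slots[i]) *out_slots[i] = reply_ops[i];  // null slot: caller did not ask
  }
  return n;
}
```

If the reply is shorter than the request, the trailing output slots are
simply left untouched.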

Fixes: #1986
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agomds: remove test assert
Sage Weil [Fri, 27 Jan 2012 18:01:47 +0000 (10:01 -0800)]
mds: remove test assert

Grr!

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoassert: include timestamp
Sage Weil [Fri, 27 Jan 2012 14:32:29 +0000 (06:32 -0800)]
assert: include timestamp

Also drop quotes around thread id.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: fix typo
Sage Weil [Wed, 25 Jan 2012 22:07:06 +0000 (14:07 -0800)]
filestore: fix typo

Grr

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge branch 'wip-kb'
Sage Weil [Wed, 25 Jan 2012 22:03:18 +0000 (14:03 -0800)]
Merge branch 'wip-kb'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agofilestore: zero btrfs vol_args prior to ioctl
Sage Weil [Wed, 25 Jan 2012 21:52:32 +0000 (13:52 -0800)]
filestore: zero btrfs vol_args prior to ioctl

Just to be paranoid.  Nothing we haven't set *should* affect the ABI,
but...

Always do this immediately after declaration so that we catch everything.
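
A sketch of the pattern, using a made-up toy_vol_args struct (the real btrfs
ioctl argument structure has a different layout):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for an ioctl argument struct.
struct toy_vol_args {
  long long fd;
  char name[64];
};

// Zero the struct immediately after declaring it, then set only the fields
// we actually need, so no uninitialized bytes are handed to the kernel.
toy_vol_args make_vol_args(long long fd, const char* name) {
  toy_vol_args args;
  std::memset(&args, 0, sizeof(args));
  args.fd = fd;
  std::strncpy(args.name, name, sizeof(args.name) - 1);  // stays NUL-terminated
  return args;
}
```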

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge remote branch 'upstream/wip-osd-clone-obc'
Samuel Just [Wed, 25 Jan 2012 21:58:58 +0000 (13:58 -0800)]
Merge remote branch 'upstream/wip-osd-clone-obc'

13 years agomon: num_kb -> num_bytes in cluster perfcounters
Sage Weil [Wed, 25 Jan 2012 20:38:59 +0000 (12:38 -0800)]
mon: num_kb -> num_bytes in cluster perfcounters

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: remove num_kb from object_stat_sum_t stats
Sage Weil [Wed, 25 Jan 2012 20:38:06 +0000 (12:38 -0800)]
osd: remove num_kb from object_stat_sum_t stats

This is redundant--we can just use num_bytes.  If we're worried about the
per-object overhead or rounding, we can factor in some overhead based on
num_objects.

And, the kb accounting has a bug (#1988).

Avoid changing the encoding at all for now.  Next time the encoding changes
we'll drop the old field.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoosd: improve object context debug output
Sage Weil [Wed, 25 Jan 2012 17:56:58 +0000 (09:56 -0800)]
osd: improve object context debug output

Include pointer.  This may help with #1979.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: track obc for clone from log replay
Sage Weil [Wed, 25 Jan 2012 06:03:51 +0000 (22:03 -0800)]
osd: track obc for clone from log replay

We need to keep an in-memory obc to track the state of the in-flight io
to disk.  This is analogous to when an object is pushed + written, and we
can share the same completion function.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: set object_info_t::oid properly when recovering clones
Sage Weil [Wed, 25 Jan 2012 05:34:27 +0000 (21:34 -0800)]
osd: set object_info_t::oid properly when recovering clones

I saw a case (#1973) where the clone had the oid set to the head.  That is
clearly wrong.  Not sure what damage this caused.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/wip-filestore-errors'
Sage Weil [Wed, 25 Jan 2012 05:19:44 +0000 (21:19 -0800)]
Merge remote branch 'gh/wip-filestore-errors'

13 years agopackage *.py* files
Alexandre Oliva [Tue, 17 Jan 2012 19:22:17 +0000 (17:22 -0200)]
package *.py* files

Some post-install rpmbuild defaults byte-compile all packaged python
files, so don't bother removing the .pyc files, and package .py* to
get both .pyo and .pyc.  It wastes a tiny little bit of space, but it
makes the spec file portable across a wider range of rpm and python
configurations.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicam.br>
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agolibrbd: don't infinite loop when header is too large
Josh Durgin [Wed, 25 Jan 2012 00:52:27 +0000 (16:52 -0800)]
librbd: don't infinite loop when header is too large

Since snapshots are currently stored at the end of the header, having
many snapshots made the header larger than the read size, resulting in
an infinite loop when the offset was not changed.
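
For illustration, a chunked read loop that cannot spin forever because the
offset advances on every pass. read_all() and its in-memory source are
hypothetical; librbd reads the header from an object, not a string:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical chunked reader: src stands in for the stored header and
// chunk for the per-request read size.
std::string read_all(const std::string& src, std::size_t chunk) {
  assert(chunk > 0);
  std::string out;
  std::size_t off = 0;
  while (off < src.size()) {
    const std::size_t n = std::min(chunk, src.size() - off);
    out.append(src, off, n);
    off += n;  // advancing the offset is what guarantees termination
  }
  return out;
}
```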

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agoReplicatedPG: data_subset may be empty during sub_op_push
Samuel Just [Tue, 24 Jan 2012 22:57:07 +0000 (14:57 -0800)]
ReplicatedPG: data_subset may be empty during sub_op_push

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agofilestore: fix non-::-prefixed close
Josh Durgin [Tue, 24 Jan 2012 21:23:21 +0000 (13:23 -0800)]
filestore: fix non-::-prefixed close

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agofilestore: add debugging to each error case in lfn_open
Josh Durgin [Tue, 24 Jan 2012 21:20:20 +0000 (13:20 -0800)]
filestore: add debugging to each error case in lfn_open

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agofilestore: TEMP_FAILURE_RETRY on ::close(2)
Sage Weil [Tue, 24 Jan 2012 21:16:30 +0000 (13:16 -0800)]
filestore: TEMP_FAILURE_RETRY on ::close(2)

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: return -errno from lfn_open
Sage Weil [Tue, 24 Jan 2012 17:31:39 +0000 (09:31 -0800)]
filestore: return -errno from lfn_open

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agofilestore: audit + clean up error checks
Sage Weil [Tue, 24 Jan 2012 17:31:33 +0000 (09:31 -0800)]
filestore: audit + clean up error checks

- use temp var for errno
- in general return -errno from helpers
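
The two points above can be sketched as follows. open_readonly() is a
hypothetical helper, not a FileStore function; it assumes a POSIX system:

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns a file descriptor >= 0 on success, or a
// negative errno value on failure.
int open_readonly(const char* path) {
  int fd = ::open(path, O_RDONLY);
  if (fd < 0) {
    int err = errno;  // save it before any later call can overwrite errno
    return -err;
  }
  return fd;
}
```

Callers can then test for r < 0 and pass the value up unchanged.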

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agoMerge commit '9dc7b9233b985bf859751fc89a5b02253e829836'
Sage Weil [Mon, 23 Jan 2012 21:50:19 +0000 (13:50 -0800)]
Merge commit '9dc7b9233b985bf859751fc89a5b02253e829836'

Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agorgw: fix warning
Sage Weil [Mon, 23 Jan 2012 20:48:46 +0000 (12:48 -0800)]
rgw: fix warning

rgw/rgw_rest.cc:258: warning: comparison between signed and unsigned integer expressions

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph: bail out on first failing command
Sage Weil [Mon, 23 Jan 2012 20:43:19 +0000 (12:43 -0800)]
ceph: bail out on first failing command

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph: don't write output on error
Sage Weil [Mon, 23 Jan 2012 20:43:03 +0000 (12:43 -0800)]
ceph: don't write output on error

Accumulate all output, and write it at the end.  This way we can avoid
writing it if any of the commands fail.
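
A minimal sketch of the accumulate-then-write approach, assuming commands are
modelled as callables that append to a buffer and return 0 on success
(Command and run_all() are illustrative names, not the ceph tool's code):

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Each command appends its output to the buffer and returns 0 on success.
using Command = std::function<int(std::string&)>;

// Runs commands in order and stops at the first failure. The sink receives
// output only when every command succeeded.
int run_all(const std::vector<Command>& cmds, std::string& sink) {
  std::string buffered;
  for (const auto& cmd : cmds) {
    int r = cmd(buffered);
    if (r != 0) return r;  // bail out; sink is left untouched
  }
  sink += buffered;
  return 0;
}
```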

Fixes: #1954
Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: ignore MInfoRec, MNotifyRec in WaitActingChange
Sage Weil [Mon, 23 Jan 2012 18:21:04 +0000 (10:21 -0800)]
osd: ignore MInfoRec, MNotifyRec in WaitActingChange

We should ignore logs, infos, and notifies while we are waiting for the
map to change.  Peering has reached a dead-end (we need acting to change)
and we will redo our work when that happens.  That includes the replicas
resending notifies.

Fixes: #1958
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
13 years agorgw: fix warning in 32bit arch
Yehuda Sadeh [Mon, 23 Jan 2012 17:50:56 +0000 (09:50 -0800)]
rgw: fix warning in 32bit arch

13 years agopg: unindex entries when clearing or removing from the log
Josh Durgin [Thu, 19 Jan 2012 01:34:50 +0000 (17:34 -0800)]
pg: unindex entries when clearing or removing from the log

Leaving the index around could cause use of the indexes to access
freed memory.
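
A toy version of the invariant, where every removal from the log also drops
the matching index slot. ToyLog is a hypothetical structure that assumes
unique keys; it is not the PG log class:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>

// Toy log with a pointer index into its entries. Assumes keys are unique.
struct ToyLog {
  std::list<std::string> entries;
  std::unordered_map<std::string, const std::string*> index;

  void add(const std::string& key) {
    entries.push_back(key);
    index[key] = &entries.back();  // list nodes keep a stable address
  }

  // Remove the oldest entry, erasing its index slot first so that no pointer
  // to freed memory is left behind.
  void trim_front() {
    if (entries.empty()) return;
    index.erase(entries.front());
    entries.pop_front();
  }

  // Clearing the log must clear the index as well.
  void clear() {
    index.clear();
    entries.clear();
  }
};
```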

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoosd: do not clobber log on backfill progress update
Sage Weil [Thu, 19 Jan 2012 02:01:09 +0000 (18:01 -0800)]
osd: do not clobber log on backfill progress update

This is unnecessary and counterproductive, since the log is used to detect
dup ops.  It's an artifact of an earlier backfill iteration that didn't
preserve the log on the backfill target.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: read_user_buckets() fix redone
Yehuda Sadeh [Fri, 20 Jan 2012 20:54:14 +0000 (12:54 -0800)]
rgw: read_user_buckets() fix redone

The problem with the original fix is that it wasn't atomic. Going back
to the original inefficient (though atomic) method. We should limit
the number of buckets per user anyway, and shouldn't get into a point
where this code is actually exercised.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoosd: implement --dump-journal
Sage Weil [Sun, 15 Jan 2012 05:15:02 +0000 (21:15 -0800)]
osd: implement --dump-journal

Dump the contents of the journal to stdout in text form.  Useful for
debugging.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: read large bucket directory correctly
Yehuda Sadeh [Fri, 20 Jan 2012 18:46:31 +0000 (10:46 -0800)]
rgw: read large bucket directory correctly

Issue #1955. When there were too many buckets, we failed to read
the bucket directory.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: fix warning
Yehuda Sadeh [Thu, 19 Jan 2012 17:11:09 +0000 (09:11 -0800)]
rgw: fix warning

Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agoMerge remote branch 'gh/wip-op-data-mux'
Sage Weil [Thu, 19 Jan 2012 04:41:04 +0000 (20:41 -0800)]
Merge remote branch 'gh/wip-op-data-mux'

Reviewed-by: Greg Farnum <greg.farnum@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
13 years agoConvert mount.ceph to use KEY_SPEC_PROCESS_KEYRING
Neil Horman [Wed, 18 Jan 2012 17:00:14 +0000 (12:00 -0500)]
Convert mount.ceph to use KEY_SPEC_PROCESS_KEYRING

Having mount.ceph use KEY_SPEC_USER_KEYRING to pass keys to the kernel has
several disadvantages:

1) It leaves the key sitting in the uid_keyring, which is reachable from the
session keyring via a link (see keyctl list <root session keyring ref>).  This
means it's accessible to other processes in the same session that don't need
access to it, even after the kernel is done with it.

2) The user keyring has some very counter-intuitive semantics as far as keyring
permissions go.  The user keyring is accessed via a link from the session
keyring, which a process may not have permission to access in some situations.
For instance, if mount.ceph is executed via su without having started a new
session, mount.ceph will not have access to the uid keyring unless the calling
process (in this case su) has granted access permission.  The result is an -EPERM
error when executing mount.ceph against a cephx-enabled server.  If the same
command is attempted in a new root session (e.g. su - or su -l), the mount
command will work fine.

Switching the mount.ceph command to use the KEY_SPEC_PROCESS_KEYRING solves both
of these problems.  By using this keyring, accessibility is guaranteed regardless
of the session specifics, because the key is added and accessed in the same
process context in both user space and the kernel.  It also ensures that the key
is cleaned up automatically after the mount.ceph process exits, since there is
no longer a need for it (the kernel clones the key during the mount process and
releases it on unmount).

I've tested this here on my local ceph cluster, and it works properly under both
su and su -l .

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Josh Durgin <josh.durgin@dreamhost.com>
13 years agoMerge branch 'wip-rgw-simplelog'
Yehuda Sadeh [Wed, 18 Jan 2012 19:46:24 +0000 (11:46 -0800)]
Merge branch 'wip-rgw-simplelog'

13 years agorgw: adjust high level debug level
Yehuda Sadeh [Wed, 18 Jan 2012 19:37:59 +0000 (11:37 -0800)]
rgw: adjust high level debug level

setting it to 2 instead of 1

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMerge remote branch 'gh/wip-rgw-simplelog'
Sage Weil [Wed, 18 Jan 2012 19:25:13 +0000 (11:25 -0800)]
Merge remote branch 'gh/wip-rgw-simplelog'

* gh/wip-rgw-simplelog:
  rgw: add timestamp to high level log
  rgw: log host_bucket, http status
  rgw: simple request logging

Reviewed-by: Sage Weil <sage@newdream.net>
13 years agorgw: fix intent log processing
Yehuda Sadeh [Wed, 18 Jan 2012 07:42:08 +0000 (23:42 -0800)]
rgw: fix intent log processing

Intent log processing was completely broken. First, it wasn't
parsing the date correctly (due to failure to initialize tm before calling
strptime). Second, it was trying to load the entire log into memory in one
piece (and in a racy way). This fixes bug #1948.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: initialize tm before calling strptime
Yehuda Sadeh [Wed, 18 Jan 2012 07:40:52 +0000 (23:40 -0800)]
rgw: initialize tm before calling strptime

strptime assumes tm is already initialized.
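
A sketch of the pattern, assuming a POSIX strptime() and a hypothetical
parse_timestamp() wrapper with a fixed format string (the format rgw actually
parses is not shown here):

```cpp
#include <cassert>
#include <cstring>
#include <time.h>

// Parses "YYYY-MM-DD HH:MM:SS" into out. Returns true only if the whole
// string matched the format.
bool parse_timestamp(const char* s, struct tm* out) {
  // strptime only sets the fields named in the format, so clear the rest.
  std::memset(out, 0, sizeof(*out));
  const char* end = strptime(s, "%Y-%m-%d %H:%M:%S", out);
  return end != nullptr && *end == '\0';
}
```

Without the memset, fields such as tm_isdst would hold whatever happened to
be on the stack, which can skew a later conversion.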

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoobjecter: some helpful multiop result debug output
Sage Weil [Wed, 18 Jan 2012 05:59:32 +0000 (21:59 -0800)]
objecter: some helpful multiop result debug output

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: make getxattrs set rval on decode error
Sage Weil [Wed, 18 Jan 2012 05:32:11 +0000 (21:32 -0800)]
objecter: make getxattrs set rval on decode error

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: add stat ops to op vector!
Sage Weil [Wed, 18 Jan 2012 05:31:56 +0000 (21:31 -0800)]
objecter: add stat ops to op vector!

They work better that way.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoobjecter: gift reply data to outbl _after_ demuxing
Sage Weil [Wed, 18 Jan 2012 05:10:05 +0000 (21:10 -0800)]
objecter: gift reply data to outbl _after_ demuxing

Divvy up the result bl first, then gift the whole shebang to outbl.  If
we gift it first, there's nothing to demux (since we move instead of copy
the bufferlist ptrs).
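
The ordering issue can be illustrated with a toy buffer whose claim() moves
segments out of its source. ToyBuffer and handle_reply() are hypothetical and
only loosely modelled on the move semantics described here:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy buffer: claim() takes the source's segments and leaves it empty.
struct ToyBuffer {
  std::vector<std::string> segments;
  void claim(ToyBuffer& src) {
    for (auto& s : src.segments) segments.push_back(std::move(s));
    src.segments.clear();
  }
};

// Splits the reply into per-op outputs first, then hands the whole reply to
// outbl. Reversing the two steps would leave nothing to split.
void handle_reply(ToyBuffer& reply, std::vector<std::string>& per_op,
                  ToyBuffer& outbl) {
  per_op = reply.segments;  // demux: copy each op's data while it is present
  outbl.claim(reply);       // only now give the data away
}
```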

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge remote branch 'gh/master' into wip-op-data-mux
Sage Weil [Wed, 18 Jan 2012 01:33:57 +0000 (17:33 -0800)]
Merge remote branch 'gh/master' into wip-op-data-mux

13 years agoosd: make in/outdata split/merge helpers static OSDOp methods
Sage Weil [Wed, 18 Jan 2012 01:33:37 +0000 (17:33 -0800)]
osd: make in/outdata split/merge helpers static OSDOp methods

Avoid defining new global functions.

Also add basic doxygen descriptions.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years agorgw: log_show_next() fix reading of the next buffer
Yehuda Sadeh [Tue, 17 Jan 2012 23:10:58 +0000 (15:10 -0800)]
rgw: log_show_next() fix reading of the next buffer

Bug #1939. Failed reading large logs.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agoMerge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
Yehuda Sadeh [Tue, 17 Jan 2012 23:05:38 +0000 (15:05 -0800)]
Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph

13 years agoMerge remote branch 'gh/wip-backfill'
Sage Weil [Tue, 17 Jan 2012 22:23:58 +0000 (14:23 -0800)]
Merge remote branch 'gh/wip-backfill'

Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Conflicts:
src/ceph_mds.cc
src/ceph_osd.cc

13 years agofilestore: overwrite fsid during --mkfs
Sage Weil [Tue, 17 Jan 2012 19:41:15 +0000 (11:41 -0800)]
filestore: overwrite fsid during --mkfs

This mainly matters because read_fsid() now looks at the file size to
determine if it's an old- or new-style fsid, and not overwriting means a
downgrade confuses things.  Not that anyone would do that, but...

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorgw: reset timestamp when processing starts
Yehuda Sadeh [Tue, 17 Jan 2012 21:39:43 +0000 (13:39 -0800)]
rgw: reset timestamp when processing starts

Otherwise we'd also count the time spent waiting for the request.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agohadoop: fix unix timestamp calculation in hadoop lib
Andrey Stepachev [Fri, 13 Jan 2012 15:12:24 +0000 (19:12 +0400)]
hadoop: fix unix timestamp calculation in hadoop lib

Hadoop always sees wrong dates due to a wrong timestamp calculation. Properly
convert nanoseconds to millis when adding.
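
The arithmetic can be sketched as below, assuming the timestamp arrives as
whole seconds plus a nanosecond remainder. to_millis() is a hypothetical
helper, not the function in the hadoop bindings:

```cpp
#include <cassert>
#include <cstdint>

// Combines whole seconds and a nanosecond remainder into milliseconds since
// the epoch: seconds scale up by 1000, nanoseconds scale down by 1,000,000.
std::int64_t to_millis(std::int64_t sec, std::int64_t nsec) {
  return sec * 1000 + nsec / 1000000;
}
```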
Possibly fixes #1666.

Signed-off-by: Andrey Stepachev <octo@yandex-team.ru>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agohadoop: check for valid filehandler, before using in next calls
Andrey Stepachev [Fri, 13 Jan 2012 11:58:36 +0000 (15:58 +0400)]
hadoop: check for valid filehandler, before using in next calls

For a nonexistent file, calling Client::replication()
triggers an assert.

Signed-off-by: Andrey Stepachev <octo@yandex-team.ru>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agodoc: update control file for setting pg num on pool create
Greg Farnum [Tue, 10 Jan 2012 19:33:20 +0000 (11:33 -0800)]
doc: update control file for setting pg num on pool create

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years agoTestRados: fix {min,max}_stride_size initialization
Sage Weil [Tue, 17 Jan 2012 19:43:04 +0000 (11:43 -0800)]
TestRados: fix {min,max}_stride_size initialization

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMerge branch 'master' of ssh://ceph.newdream.net/git/ceph
Yehuda Sadeh [Tue, 17 Jan 2012 18:54:57 +0000 (10:54 -0800)]
Merge branch 'master' of ssh://ceph.newdream.net/git/ceph

13 years agoosd: fix bind error checks
Sage Weil [Tue, 17 Jan 2012 18:51:00 +0000 (10:51 -0800)]
osd: fix bind error checks

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoMakefile: fix testkeys non-tcmalloc linkage
Sage Weil [Tue, 17 Jan 2012 18:44:17 +0000 (10:44 -0800)]
Makefile: fix testkeys non-tcmalloc linkage

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agorgw: add timestamp to high level log
Yehuda Sadeh [Tue, 17 Jan 2012 17:54:31 +0000 (09:54 -0800)]
rgw: add timestamp to high level log

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: log host_bucket, http status
Yehuda Sadeh [Tue, 17 Jan 2012 01:46:49 +0000 (17:46 -0800)]
rgw: log host_bucket, http status

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agorgw: simple request logging
Yehuda Sadeh [Tue, 17 Jan 2012 01:03:19 +0000 (17:03 -0800)]
rgw: simple request logging

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
13 years agomds: abort startup if we fail to bind
Sage Weil [Mon, 16 Jan 2012 20:00:55 +0000 (12:00 -0800)]
mds: abort startup if we fail to bind

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoosd: abort on startup if we fail to bind to a port
Sage Weil [Mon, 16 Jan 2012 19:54:26 +0000 (11:54 -0800)]
osd: abort on startup if we fail to bind to a port

Signed-off-by: Sage Weil <sage@newdream.net>
13 years agoceph: fix "run_uml.sh" script
Alex Elder [Tue, 17 Jan 2012 16:21:16 +0000 (10:21 -0600)]
ceph: fix "run_uml.sh" script

Last-minute cleverness prior to checkin broke the "run-uml.sh" script.
Rearrange where a few definitions are made to make it work again.

Signed-off-by: Alex Elder <elder@dreamhost.com>