git.apps.os.sepia.ceph.com Git - ceph.git/log
13 years ago hadoop: simplify workingDir handling; add home directory
Noah Watkins [Wed, 2 Nov 2011 04:52:48 +0000 (21:52 -0700)]
hadoop: simplify workingDir handling; add home directory

1. Simplifies the handling of paths by allowing them to be passed
around and manipulated in their fully qualified form. Before
paths are passed into native Ceph calls, the path-only portion
is extracted.

2. Sets the initial working directory to be the default home
directory for a user (e.g. /user/<username>/).

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
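
The two points above can be illustrated with a short Python sketch (not the Java code from this commit). The ceph:// URI, the host name mon-host, and both helper names are assumptions made for the example: the path stays fully qualified until the native call, where only its path component is used.

```python
from urllib.parse import urlparse


def path_only(qualified):
    """Return just the path component of a fully qualified URI."""
    return urlparse(qualified).path


def home_directory(username):
    """Default home directory, following the /user/<username> convention."""
    return "/user/" + username


# Hypothetical URI; scheme, host and port are dropped before the native call.
native_path = path_only("ceph://mon-host:6789/user/alice/data.txt")
initial_working_dir = home_directory("alice")
```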
13 years ago hadoop: emulate Ceph file owner as current user
Noah Watkins [Wed, 2 Nov 2011 00:25:49 +0000 (17:25 -0700)]
hadoop: emulate Ceph file owner as current user

Make CephFileSystem tell Hadoop that the owner
of all files is the current user. This provides
zero security or isolation, but allows Hadoop
to be used with its default security settings.

A future solution will need to be developed that
provides some isolation, and gives a better user
experience.

This fixes tracker issue #1663

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
13 years ago hadoop: use standard log4j logging facility
Noah Watkins [Tue, 1 Nov 2011 23:35:12 +0000 (16:35 -0700)]
hadoop: use standard log4j logging facility

Replace ceph.debug(msg, level) with LOG.level(msg)
provided by the log4j facility used by Hadoop. The
level can now be provided on a class-by-class basis
by modifying conf/log4j.properties.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
13 years ago PG: mark scrubmap entry as not absent when we see an update
Samuel Just [Wed, 2 Nov 2011 18:50:29 +0000 (11:50 -0700)]
PG: mark scrubmap entry as not absent when we see an update

Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Merge branch 'wip-freebsd'
Sage Weil [Wed, 2 Nov 2011 15:45:36 +0000 (08:45 -0700)]
Merge branch 'wip-freebsd'

Conflicts:
src/osd/OSD.cc

13 years ago debian: empty dependency_libs in *.la files
Laszlo Boszormenyi [Tue, 1 Nov 2011 19:57:11 +0000 (12:57 -0700)]
debian: empty dependency_libs in *.la files

Per policy and multiarch support.

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years ago add missingok to logrotate
Laszlo Boszormenyi [Tue, 1 Nov 2011 19:56:34 +0000 (12:56 -0700)]
add missingok to logrotate

When ceph is not running, it has no logs. Thus logrotate has nothing to
rotate. The missingok directive handles this situation.

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years ago debian: update VCS sources
Laszlo Boszormenyi [Tue, 1 Nov 2011 19:55:47 +0000 (12:55 -0700)]
debian: update VCS sources

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years ago debian: fix libceph1 -> libcephfs1 rename
Laszlo Boszormenyi [Tue, 1 Nov 2011 19:55:17 +0000 (12:55 -0700)]
debian: fix libceph1 -> libcephfs1 rename

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years ago debian: add watch
Laszlo Boszormenyi [Tue, 1 Nov 2011 19:54:27 +0000 (12:54 -0700)]
debian: add watch

Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
13 years ago osdmaptool: test --create-with-conf with racks
Sage Weil [Wed, 2 Nov 2011 04:20:56 +0000 (21:20 -0700)]
osdmaptool: test --create-with-conf with racks

Make sure we generate a map that will map (and not assert about bad
max_osd/max_device mismatch).

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osdmap: assert that osdmap max_osds >= crushmap max_devices
Sage Weil [Wed, 2 Nov 2011 04:14:19 +0000 (21:14 -0700)]
osdmap: assert that osdmap max_osds >= crushmap max_devices

This will catch potential array overruns before they happen.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osdmap: fix off-by-one in build_simple_from_conf
Sage Weil [Wed, 2 Nov 2011 04:11:11 +0000 (21:11 -0700)]
osdmap: fix off-by-one in build_simple_from_conf

maxosd is the highest osd id.  Call set_max_osd(maxosd + 1), since
set_max_osd sets the array size.  This fixes references off the end of
that array.

Signed-off-by: Sage Weil <sage@newdream.net>
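
The off-by-one can be shown with a small Python sketch (an illustration only, not the C++ from this commit; array_size_for and the sample ids are made up): an array indexed by osd id needs highest id + 1 slots.

```python
def array_size_for(osd_ids):
    """Number of slots needed so every id in osd_ids is a valid index."""
    highest = max(osd_ids)
    return highest + 1  # sizing by `highest` alone leaves the last id out of range


ids = [0, 1, 2, 5]
state = [None] * array_size_for(ids)
state[5] = "up"  # in range only because the size is highest + 1
```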
13 years ago osd: fix assert include
Sage Weil [Wed, 2 Nov 2011 03:04:25 +0000 (20:04 -0700)]
osd: fix assert include

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rgw: swift prefix and path params fixes
Yehuda Sadeh [Tue, 1 Nov 2011 23:01:25 +0000 (16:01 -0700)]
rgw: swift prefix and path params fixes

13 years ago .gitignore: test_str_list
Sage Weil [Tue, 1 Nov 2011 20:12:21 +0000 (13:12 -0700)]
.gitignore: test_str_list

13 years ago Makefile: include/compat.h in tarball
Sage Weil [Tue, 1 Nov 2011 20:10:15 +0000 (13:10 -0700)]
Makefile: include/compat.h in tarball

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge branch 'master' into wip-freebsd
Sage Weil [Tue, 1 Nov 2011 19:35:47 +0000 (12:35 -0700)]
Merge branch 'master' into wip-freebsd

13 years ago Merge remote-tracking branch 'gh/wip-auth'
Sage Weil [Tue, 1 Nov 2011 18:49:36 +0000 (11:49 -0700)]
Merge remote-tracking branch 'gh/wip-auth'

13 years ago common: get_str_list unit tests
Sage Weil [Tue, 1 Nov 2011 18:42:45 +0000 (11:42 -0700)]
common: get_str_list unit tests

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago common: make get_str_list work with other delimiters, and skip the
Sage Weil [Tue, 1 Nov 2011 18:42:25 +0000 (11:42 -0700)]
common: make get_str_list work with other delimiters, and skip the

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago monclient: fix else formatting
Josh Durgin [Tue, 1 Nov 2011 17:43:50 +0000 (10:43 -0700)]
monclient: fix else formatting

If one branch has braces, the other should too.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago monclient: fail fast when our auth protocols aren't supported
Josh Durgin [Tue, 1 Nov 2011 17:40:41 +0000 (10:40 -0700)]
monclient: fail fast when our auth protocols aren't supported

This handles the case where the server does not support any of the
authentication protocols that the client does. Previously this error
would never be propagated, and you'd only know something went wrong
when the optional timeout expired. Now, monclient->authenticate()
fails as soon as it gets the first response from the monitor.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
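
A minimal Python sketch of the fail-fast behaviour described above, under simplifying assumptions: replies are modelled as a list with one slot per polling tick (None meaning nothing has arrived yet), and the function name and signature are hypothetical rather than the monclient API.

```python
import errno


def authenticate(replies, timeout_ticks):
    """Return 0 on success or a negative errno.

    An error in the first reply is returned at once instead of being
    swallowed until the timeout expires.
    """
    for tick, reply in enumerate(replies):
        if tick >= timeout_ticks:
            break
        if reply is None:
            continue  # nothing from the monitor yet, keep waiting
        if reply < 0:
            return reply  # fail fast, e.g. no common auth protocol
        return 0
    return -errno.ETIMEDOUT
```

With this shape, a caller that passes an unsupported protocol sees -ENOTSUP on the first reply; the timeout path remains only for a monitor that never answers.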
13 years ago PG: set_last_peering_reset in Reset constructor
Samuel Just [Tue, 1 Nov 2011 18:16:53 +0000 (11:16 -0700)]
PG: set_last_peering_reset in Reset constructor

If an osd in the prior set comes up, we can restart peering without a
new peering interval starting.  However, we still want to ignore
anything we previously requested from replicas.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
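
One way to picture the effect, as a Python sketch under the assumption that each reply carries the epoch at which its request was sent (is_stale_reply is a hypothetical helper, not a Ceph function): anything requested before the last peering reset is ignored, even if no new peering interval has started.

```python
def is_stale_reply(request_epoch, last_peering_reset):
    """True if the reply answers a request sent before the last peering reset."""
    return request_epoch < last_peering_reset


last_peering_reset = 42
# Replies tagged with the epoch of the request they answer (made-up values).
replies = [(40, "log from osd.1"), (42, "info from osd.2"), (43, "log from osd.3")]
kept = [body for epoch, body in replies
        if not is_stale_reply(epoch, last_peering_reset)]
```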
13 years ago monclient: fix else formatting
Josh Durgin [Tue, 1 Nov 2011 17:43:50 +0000 (10:43 -0700)]
monclient: fix else formatting

If one branch has braces, the other should too.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago monclient: fail fast when our auth protocols aren't supported
Josh Durgin [Tue, 1 Nov 2011 17:40:41 +0000 (10:40 -0700)]
monclient: fail fast when our auth protocols aren't supported

This handles the case where the server does not support any of the
authentication protocols that the client does. Previously this error
would never be propagated, and you'd only know something went wrong
when the optional timeout expired. Now, monclient->authenticate()
fails as soon as it gets the first response from the monitor.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
13 years ago osd: kill unused on_osd_failure() hook
Sage Weil [Mon, 31 Oct 2011 22:03:29 +0000 (15:03 -0700)]
osd: kill unused on_osd_failure() hook

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago RadosModel.h: use default conf location
Samuel Just [Mon, 31 Oct 2011 22:00:43 +0000 (15:00 -0700)]
RadosModel.h: use default conf location

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago Revert "PG: call set_last_peering_reset in Started contructor"
Samuel Just [Mon, 31 Oct 2011 20:56:32 +0000 (13:56 -0700)]
Revert "PG: call set_last_peering_reset in Started contructor"

Unfortunately, the Started constructor doesn't occur until map
activation.  We need to reset last_peering_reset exactly when the acting
set changes.

This reverts commit 6d123067ce1ba99522281d5c72623bd5ba3e0fc8.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago hadoop: Return NULL when the path does not exist.
Noah Watkins [Mon, 31 Oct 2011 18:15:26 +0000 (11:15 -0700)]
hadoop: Return NULL when the path does not exist.

Although unspecified in the declaration header, other file
systems return a single result when the path is a file.

This fixes tracker issue #1661

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
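
The listing behaviour described above can be sketched in Python (an illustration only; the in-memory tree and list_status are hypothetical stand-ins, not the Hadoop or Ceph API): a missing path gives None, a file gives a single result, and a directory gives its children.

```python
def list_status(tree, path):
    """tree maps a path to None (file) or to a list of child names (directory)."""
    if path not in tree:
        return None  # path does not exist
    children = tree[path]
    if children is None:
        return [path]  # a file lists as a single result
    return [path.rstrip("/") + "/" + name for name in children]


tree = {"/": ["a.txt", "dir"], "/a.txt": None, "/dir": []}
```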
13 years ago osdmap: fix g_ceph_context reference
Sage Weil [Sun, 30 Oct 2011 00:42:17 +0000 (17:42 -0700)]
osdmap: fix g_ceph_context reference

Use cct.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago ReplicatedPG: check for peering restart before share_pg_info
Samuel Just [Fri, 28 Oct 2011 22:34:27 +0000 (15:34 -0700)]
ReplicatedPG: check for peering restart before share_pg_info

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago mkcephfs: build initial osdmap from information in ceph.conf
Sage Weil [Fri, 28 Oct 2011 21:33:38 +0000 (14:33 -0700)]
mkcephfs: build initial osdmap from information in ceph.conf

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osdmaptool: build initial map from ceph.conf
Sage Weil [Fri, 28 Oct 2011 21:32:30 +0000 (14:32 -0700)]
osdmaptool: build initial map from ceph.conf

This builds the initial osd and crush maps from what is in the ceph.conf,
taking advantage of host or rack tags that are present there.

If there are >1 hosts, separate replicas across hosts.  If there are >2
racks, separate across racks.  Semi-arbitrary, but should capture most
use cases.

Signed-off-by: Sage Weil <sage@newdream.net>
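
The rule above, written out as a Python sketch. Two things here are assumptions rather than statements from the commit: that the rack rule takes precedence when both conditions hold, and that the fallback is to separate across individual osds. The function name is hypothetical.

```python
def replica_separation(num_hosts, num_racks):
    """Pick the level at which replicas are kept apart."""
    if num_racks > 2:
        return "rack"
    if num_hosts > 1:
        return "host"
    return "osd"  # assumed fallback for a single-host setup
```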
13 years ago crush: make insert_item take float for weight
Sage Weil [Fri, 28 Oct 2011 20:59:25 +0000 (13:59 -0700)]
crush: make insert_item take float for weight

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago ReplicatedPG: Clean up old snap links when recovering a clone
Samuel Just [Fri, 28 Oct 2011 21:18:57 +0000 (14:18 -0700)]
ReplicatedPG: Clean up old snap links when recovering a clone

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago PG: Create new snap directories independently on replica
Samuel Just [Fri, 28 Oct 2011 21:18:12 +0000 (14:18 -0700)]
PG: Create new snap directories independently on replica

Previously, we shipped over the collection creation as part
of the transaction.  However, the snap directory on the
replica might or might not exist already due to recovery
progress.

Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
13 years ago rgw: canonical resource should use unencoded url
Yehuda Sadeh [Fri, 28 Oct 2011 21:03:38 +0000 (14:03 -0700)]
rgw: canonical resource should use unencoded url

13 years ago Merge pull request #4 from vzctl/master
Sage Weil [Fri, 28 Oct 2011 20:00:42 +0000 (13:00 -0700)]
Merge pull request #4 from vzctl/master

fix error: 'snprintf' was not declared in this scope

13 years ago rgw: cleanup, remove unused user_id
Yehuda Sadeh [Fri, 28 Oct 2011 18:48:38 +0000 (11:48 -0700)]
rgw: cleanup, remove unused user_id

Some access methods required a user_id param, but it was never really used. At
this point we should just remove it.

13 years ago mkcephfs: skip non-btrfs osds even with --mkbtrfs
Sage Weil [Fri, 28 Oct 2011 18:42:04 +0000 (11:42 -0700)]
mkcephfs: skip non-btrfs osds even with --mkbtrfs

This lets you mix btrfs and non-btrfs file systems.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge branch 'stable'
Sage Weil [Fri, 28 Oct 2011 17:39:15 +0000 (10:39 -0700)]
Merge branch 'stable'

13 years ago debian: break redundant dependencies
Sage Weil [Fri, 28 Oct 2011 17:38:51 +0000 (10:38 -0700)]
debian: break redundant dependencies

They confuse APT it seems.

 ceph-common -> librbd1 -> librados2
 radosgw -> ceph-common -> librados2

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago MOSDMap: do not leave {oldest,newest}_map uninitialized when decoding old messages
Sage Weil [Fri, 28 Oct 2011 17:05:20 +0000 (10:05 -0700)]
MOSDMap: do not leave {oldest,newest}_map uninitialized when decoding old messages

This leads to badness like

  osd_map(295..296 src has 74308224..0) v1

Signed-off-by: Sage Weil <sage@newdream.net>
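
A Python sketch of the general pattern, using a made-up wire layout (four little-endian 32-bit integers, with the last two present only from a hypothetical version 2 onward; this is not the real MOSDMap encoding): fields that an old encoding does not carry get defined defaults instead of being left unset.

```python
import struct


def decode_osd_map_msg(version, payload):
    """Decode a message, defaulting fields that old encodings do not carry."""
    first, last = struct.unpack_from("<II", payload, 0)
    oldest_map = newest_map = 0  # defined values for old messages
    if version >= 2:
        oldest_map, newest_map = struct.unpack_from("<II", payload, 8)
    return {"first": first, "last": last,
            "oldest_map": oldest_map, "newest_map": newest_map}


old_msg = struct.pack("<II", 295, 296)
new_msg = struct.pack("<IIII", 295, 296, 1, 296)
```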
13 years ago include stdio in order to fix snprintf compilation error
Alexey Lapitsky [Fri, 28 Oct 2011 13:37:09 +0000 (15:37 +0200)]
include stdio in order to fix snprintf compilation error

Signed-off-by: Alexey Lapitsky <lex@realisticgroup.com>
13 years ago ceph: fix snprintf warning
Sage Weil [Fri, 28 Oct 2011 03:28:57 +0000 (20:28 -0700)]
ceph: fix snprintf warning

warning: tools/ceph.cc:146: format not a string literal and no format arguments

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago auth: return unknown if no supported auth is found
Josh Durgin [Fri, 28 Oct 2011 01:11:28 +0000 (18:11 -0700)]
auth: return unknown if no supported auth is found

If NONE is supported, it will already be in the list of supported
protocols, so there's no need to default to it here. This prevents
clients that request the NONE protocol from authenticating when the
server only accepts CEPHX. Instead, they get -ENOTSUP from the
AuthMonitor.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
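
The selection logic can be sketched in Python as follows. The constant values and both function names are illustrative assumptions, not the actual definitions: the server picks the first protocol the client asks for that it also supports, and reports "unknown" rather than silently defaulting to NONE.

```python
import errno

# Illustrative protocol ids for this sketch only.
AUTH_UNKNOWN, AUTH_NONE, AUTH_CEPHX = 0, 1, 2


def pick_protocol(client_wants, server_supports):
    """First client-requested protocol the server supports, else AUTH_UNKNOWN."""
    for proto in client_wants:
        if proto in server_supports:
            return proto
    return AUTH_UNKNOWN  # no fallback to AUTH_NONE


def start_session(client_wants, server_supports):
    """Return the chosen protocol id, or -ENOTSUP if nothing matches."""
    proto = pick_protocol(client_wants, server_supports)
    if proto == AUTH_UNKNOWN:
        return -errno.ENOTSUP
    return proto
```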
13 years ago rgw: swift related adjustments
Yehuda Sadeh [Thu, 27 Oct 2011 21:31:16 +0000 (14:31 -0700)]
rgw: swift related adjustments

13 years ago Merge branch 'master' of github.com:NewDreamNetwork/ceph
Sage Weil [Thu, 27 Oct 2011 21:26:53 +0000 (14:26 -0700)]
Merge branch 'master' of github.com:NewDreamNetwork/ceph

13 years ago fixed graphic reference and headings
Sondra.Menthers [Thu, 27 Oct 2011 21:04:56 +0000 (14:04 -0700)]
fixed graphic reference and headings

13 years ago fixed image reference
Sondra.Menthers [Thu, 27 Oct 2011 21:00:57 +0000 (14:00 -0700)]
fixed image reference

13 years ago fixed architecture document
Sondra.Menthers [Thu, 27 Oct 2011 20:54:31 +0000 (13:54 -0700)]
fixed architecture document

13 years ago add images for documentation
Sondra.Menthers [Thu, 27 Oct 2011 20:43:05 +0000 (13:43 -0700)]
add images for documentation

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 19:51:57 +0000 (12:51 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 19:44:37 +0000 (12:44 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 18:20:41 +0000 (11:20 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 18:20:41 +0000 (11:20 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 18:16:51 +0000 (11:16 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: handle swift PUT with incorrect etag
Sondra.Menthers [Thu, 27 Oct 2011 18:02:23 +0000 (11:02 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago ceph: refactor for generic --admin-daemon <sock> <cmd> too
Sage Weil [Thu, 27 Oct 2011 17:02:42 +0000 (10:02 -0700)]
ceph: refactor for generic --admin-daemon <sock> <cmd> too

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago ceph: --dump-perf-counters[-schema] sockpath
Sage Weil [Thu, 27 Oct 2011 16:48:08 +0000 (09:48 -0700)]
ceph: --dump-perf-counters[-schema] sockpath

Quick and dirty way to dump perfcounters stats.  Not documenting this until
we decide this is where it should live.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago filejournal: journal_replay_from
Sage Weil [Thu, 27 Oct 2011 16:47:20 +0000 (09:47 -0700)]
filejournal: journal_replay_from

Force journal replay from a point other than the op_seq recorded by the
fs.  This is useful if you want to skip bad entries in the journal (e.g.,
because they were non-idempotent and you know they were applied and the fs
operations were fully ordered).

Signed-off-by: Sage Weil <sage@newdream.net>
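
A Python sketch of what the override means for replay. The names are hypothetical, and the exact boundary (whether the entry at replay_from itself is replayed) is an assumption; here it is included.

```python
def entries_to_replay(journal, fs_op_seq, replay_from=None):
    """journal is a list of (seq, op) pairs in order.

    By default replay starts just after the op_seq recorded by the fs.
    replay_from, if given, overrides that starting point.
    """
    start = replay_from if replay_from is not None else fs_op_seq + 1
    return [(seq, op) for seq, op in journal if seq >= start]


journal = [(5, "a"), (6, "b"), (7, "c"), (8, "d")]
normal = entries_to_replay(journal, fs_op_seq=5)
skipping_bad = entries_to_replay(journal, fs_op_seq=5, replay_from=8)
```

In the second call, entries 6 and 7 are skipped, which is the "skip bad entries" use case mentioned above.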
13 years ago Merge branch 'stable'
Sage Weil [Thu, 27 Oct 2011 16:26:08 +0000 (09:26 -0700)]
Merge branch 'stable'

13 years ago rados: improve error message
Sage Weil [Wed, 26 Oct 2011 21:56:25 +0000 (14:56 -0700)]
rados: improve error message

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago radosgw-admin: fix key create check
Sage Weil [Thu, 27 Oct 2011 04:20:18 +0000 (21:20 -0700)]
radosgw-admin: fix key create check

Also fixes warning

warning: rgw/rgw_admin.cc:812: suggest parentheses around ‘&&’ within ‘||’

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: guard checks for writes
Josh Durgin [Thu, 27 Oct 2011 00:05:34 +0000 (17:05 -0700)]
osd: guard checks for writes

fa722de6708d3e92037df6289cc29ece12c8ea66 moved these checks, and
accidentally removed the may_write() guard. This caused reading from
snapshots to fail.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
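
To illustrate why the guard matters, here is a Python sketch with a hypothetical write-only check (rejecting ops aimed at a snapshot id). The commit does not list which checks were moved, so the check, the NOSNAP sentinel value, and the names are assumptions. Without the may_write condition, a plain read from a snapshot would be rejected too.

```python
import errno

NOSNAP = -1  # illustrative sentinel for "no snapshot", i.e. the head object


def validate_op(may_write, snapid):
    """Return 0 if the op may proceed, or a negative errno."""
    if may_write:
        # Checks that only make sense for writes live inside this guard.
        if snapid != NOSNAP:
            return -errno.EINVAL  # cannot write to a snapshot
    return 0
```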
13 years ago rgw: handle swift PUT with incorrect etag
Yehuda Sadeh [Thu, 27 Oct 2011 00:20:51 +0000 (17:20 -0700)]
rgw: handle swift PUT with incorrect etag

13 years ago rgw: rgw-admin --skip-zero-entries
Yehuda Sadeh [Wed, 26 Oct 2011 23:07:04 +0000 (16:07 -0700)]
rgw: rgw-admin --skip-zero-entries

13 years ago perfcounters: fix accessor name
Sage Weil [Wed, 26 Oct 2011 23:00:45 +0000 (16:00 -0700)]
perfcounters: fix accessor name

FreakingCamelCaps

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago objecter: instrument with perfcounter
Sage Weil [Wed, 26 Oct 2011 22:54:15 +0000 (15:54 -0700)]
objecter: instrument with perfcounter

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago rgw: rgw-admin generate-key/access-key=false fix
Yehuda Sadeh [Wed, 26 Oct 2011 22:34:52 +0000 (15:34 -0700)]
rgw: rgw-admin generate-key/access-key=false fix

13 years ago rgw: rgw-admin can show log summation
Yehuda Sadeh [Wed, 26 Oct 2011 22:34:18 +0000 (15:34 -0700)]
rgw: rgw-admin can show log summation

13 years ago osd: read_log: only list the collection once
Sage Weil [Wed, 26 Oct 2011 21:56:08 +0000 (14:56 -0700)]
osd: read_log: only list the collection once

After upgrading, we may need to list the collection to recover the hash
value for an old collection.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago rgw: fix bucket suspension
Yehuda Sadeh [Wed, 26 Oct 2011 21:30:26 +0000 (14:30 -0700)]
rgw: fix bucket suspension

13 years ago rgw: fix uninitialized variable warnings
Sage Weil [Wed, 26 Oct 2011 04:34:07 +0000 (21:34 -0700)]
rgw: fix uninitialized variable warnings

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
Yehuda Sadeh [Tue, 25 Oct 2011 23:29:40 +0000 (16:29 -0700)]
Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph

Conflicts:
src/rgw/rgw_rados.cc

13 years ago hadoop: bring back Java changes.
Greg Farnum [Mon, 10 Oct 2011 15:19:47 +0000 (08:19 -0700)]
hadoop: bring back Java changes.

These convert the Hadoop stuff to work on the branch-0.20 API.

Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
13 years ago rgw: fix attr cache
Yehuda Sadeh [Tue, 25 Oct 2011 23:23:08 +0000 (16:23 -0700)]
rgw: fix attr cache

13 years ago common/ceph_extattr.[ch] > common/xattr.[ch]
Sage Weil [Tue, 25 Oct 2011 22:08:38 +0000 (15:08 -0700)]
common/ceph_extattr.[ch] > common/xattr.[ch]

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge branch 'master' into wip-freebsd
Sage Weil [Tue, 25 Oct 2011 21:54:16 +0000 (14:54 -0700)]
Merge branch 'master' into wip-freebsd

13 years ago fix osdmaptool clitests
Sage Weil [Tue, 25 Oct 2011 21:15:13 +0000 (14:15 -0700)]
fix osdmaptool clitests

13 years ago Merge branch 'wip-pools'
Sage Weil [Tue, 25 Oct 2011 21:02:42 +0000 (14:02 -0700)]
Merge branch 'wip-pools'

13 years ago mon: reencode routed messages
Sage Weil [Tue, 25 Oct 2011 17:52:06 +0000 (10:52 -0700)]
mon: reencode routed messages

The message encoding may depend on the target features.  Clear the
payload so that the Message gets reencoded appropriately.

Signed-off-by: Sage Weil <sage@newdream.net>
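
A toy Python model of the problem (the Message class, the "new_encoding" feature name, and the string payloads are all made up for illustration): the encoded payload is cached, so a message first encoded for one peer keeps that encoding unless the cache is cleared before it is routed to a peer with different features.

```python
class Message:
    def __init__(self, value):
        self.value = value
        self._payload = None  # cached encoding

    def encode(self, features):
        """Encode for the given feature set, reusing a cached payload if present."""
        if self._payload is None:
            prefix = "v2:" if "new_encoding" in features else "v1:"
            self._payload = prefix + self.value
        return self._payload

    def clear_payload(self):
        """Drop the cached encoding so the next encode() starts fresh."""
        self._payload = None


msg = Message("osdmap")
for_new_peer = msg.encode({"new_encoding"})
stale_for_old_peer = msg.encode(set())   # still the cached v2 encoding
msg.clear_payload()
fresh_for_old_peer = msg.encode(set())   # reencoded for the old peer
```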
13 years ago MOSDMap: reencode full map embedded in Incremental, as needed
Sage Weil [Tue, 25 Oct 2011 17:51:21 +0000 (10:51 -0700)]
MOSDMap: reencode full map embedded in Incremental, as needed

The Incremental may have a bufferlist containing a full map; reencode
that too if we are reencoding for old clients.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote-tracking branch 'gh/wip-rbd-tool'
Sage Weil [Tue, 25 Oct 2011 17:13:44 +0000 (10:13 -0700)]
Merge remote-tracking branch 'gh/wip-rbd-tool'

13 years ago mon: fix rare races with pool updates
Sage Weil [Mon, 24 Oct 2011 18:41:29 +0000 (11:41 -0700)]
mon: fix rare races with pool updates

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago mon: parse 0 values properly
Sage Weil [Mon, 24 Oct 2011 18:41:13 +0000 (11:41 -0700)]
mon: parse 0 values properly

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago Merge remote branch 'gh/wip-osd-queue'
Sage Weil [Tue, 25 Oct 2011 05:51:15 +0000 (22:51 -0700)]
Merge remote branch 'gh/wip-osd-queue'

13 years ago osd: fix last_complete adjustment after recovering an object
Sage Weil [Mon, 24 Oct 2011 20:55:29 +0000 (13:55 -0700)]
osd: fix last_complete adjustment after recovering an object

After we recover each object, we try to raise the last_complete value
(and matching complete_to iterator).  If our log was purely a backlog, this
won't necessarily end up bringing last_complete all the way up to the
last_update value, and we'll fail an assert later.

If complete_to does reach the end of the log, then we fast-forward
last_complete to last_update.

The crash we were hitting was in finish_recovery(), and looked something
like

osd/PG.cc: In function 'void PG::finish_recovery(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)', in thread '0x7f4573df7700'
osd/PG.cc: 1800: FAILED assert(info.last_complete == info.last_update)
 ceph version 0.36-251-g6e29c28 (commit:6e29c2826066a7723ed05b60b8ac0433a04c3c13)
 1: (PG::finish_recovery(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)+0x8d) [0x6ff0ed]
 2: (PG::RecoveryState::Active::react(PG::RecoveryState::ActMap const&)+0x316) [0x729196]
 3: (boost::statechart::simple_state<PG::RecoveryState::Active, PG::RecoveryState::Primary, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x21b) [0x759c0b]
 4: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x8d) [0x7423dd]
 5: (PG::RecoveryState::handle_activate_map(PG::RecoveryCtx*)+0x183) [0x711f43]
 6: (OSD::activate_map(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)+0x674) [0x579884]
 7: (OSD::handle_osd_map(MOSDMap*)+0x2270) [0x57bd50]
 8: (OSD::_dispatch(Message*)+0x4d0) [0x596bb0]
 9: (OSD::ms_dispatch(Message*)+0x17b) [0x59803b]
 10: (SimpleMessenger::dispatch_entry()+0x9c2) [0x617562]
 11: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x4a3dec]
 12: (Thread::_entry_func(void*)+0x12) [0x611a92]
 13: (()+0x7971) [0x7f457f87b971]
 14: (clone()+0x6d) [0x7f457e10b92d]

Fixes: #1609
Signed-off-by: Sage Weil <sage@newdream.net>
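
The adjustment described above, as a simplified Python sketch. Versions are plain integers, the log is a list of (version, object) pairs, and the function name is hypothetical; the real code walks a complete_to iterator over the PG log.

```python
def advance_last_complete(log, missing, last_complete, last_update):
    """Raise last_complete past every log entry whose object is no longer missing.

    log: list of (version, oid), oldest first.
    missing: set of oids still to be recovered.
    """
    i = 0
    # Position complete_to just after last_complete.
    while i < len(log) and log[i][0] <= last_complete:
        i += 1
    # Walk forward while entries refer to objects we already have.
    while i < len(log) and log[i][1] not in missing:
        last_complete = log[i][0]
        i += 1
    # The fix: reaching the end of the log means we are fully caught up.
    if i == len(log):
        last_complete = last_update
    return last_complete


log = [(1, "a"), (2, "b"), (3, "c")]
```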
13 years ago osd: fix generate_past_intervals maybe_went_rw on oldest interval
Sage Weil [Sun, 23 Oct 2011 06:07:10 +0000 (23:07 -0700)]
osd: fix generate_past_intervals maybe_went_rw on oldest interval

We stop working backwards when we hit last_epoch_clean, which means for the
oldest interval first_epoch may not be the _real_ first_epoch.  (We can't
continue working backward because we may have thrown out those maps
entirely.)

However, if the last_epoch_clean epoch is contained within that interval,
we know that the OSD did in fact go rw because it had to have completed
recovery (and thus peering) to set last_epoch_clean in the first place.

This fixes cases where two different nodes have slightly different
past intervals, generate different prior probe sets as a result, and
flip/flop on the acting set choice.  (It may have eventually resolved when
the wrongly excluded node's notify races and arrives in time to be
considered, but that's still clearly no good.)

This does leave the start epoch for that oldest interval incorrect.  That
doesn't currently matter except that it's confusing, but I'm not sure how
to mark it properly, or if it's worth the effort.

Signed-off-by: Sage Weil <sage@newdream.net>
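
A simplified Python sketch of the decision. The "ordinary" rule used here (the primary's up_thru reaching the interval's first epoch) and the function signature are assumptions for illustration; the point is only the extra case for an interval that contains last_epoch_clean.

```python
def maybe_went_rw(first_epoch, last_epoch, primary_up_thru, last_epoch_clean):
    """Could the PG have accepted writes during [first_epoch, last_epoch]?"""
    if first_epoch <= last_epoch_clean <= last_epoch:
        # The PG became clean inside this interval, so it must have gone rw,
        # even if first_epoch is not the interval's real start.
        return True
    return primary_up_thru >= first_epoch  # assumed ordinary rule
```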
13 years ago osd: MOSDPGNotify: print prettier
Sage Weil [Sun, 23 Oct 2011 05:43:33 +0000 (22:43 -0700)]
osd: MOSDPGNotify: print prettier

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: print useful debug info from choose_acting
Sage Weil [Sun, 23 Oct 2011 05:43:21 +0000 (22:43 -0700)]
osd: print useful debug info from choose_acting

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: make proc_replica_log missing dump include useful information
Sage Weil [Fri, 21 Oct 2011 16:57:52 +0000 (09:57 -0700)]
osd: make proc_replica_log missing dump include useful information

I needed to see the have/need versions to debug a weird unfound issue
turned up by thrashing.

Signed-off-by: Sage Weil <sage@newdream.net>
13 years ago osd: fix/simplify op discard checks
Sage Weil [Tue, 25 Oct 2011 05:21:43 +0000 (22:21 -0700)]
osd: fix/simplify op discard checks

Use a helper to determine when we should discard an op due to the client
being disconnected.  Use this when the op is first received, (re)queued,
and dequeued.

Fix the check to keep ops that are replayed ACKs, as we should make every
effort to reapply those even when the client goes away.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
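
The shape of such a helper, as a Python sketch (the Op fields and the function name are hypothetical, not the OSD's actual types): one predicate, called when the op is received, (re)queued, and dequeued, that never discards a replayed ACK.

```python
from dataclasses import dataclass


@dataclass
class Op:
    client_connected: bool
    replayed_ack: bool = False


def op_must_be_discarded(op):
    """Discard ops from disconnected clients, except replayed ACKs."""
    if op.replayed_ack:
        return False  # always make the effort to reapply these
    return not op.client_connected


queue = [Op(True), Op(False), Op(False, replayed_ack=True)]
survivors = [op for op in queue if not op_must_be_discarded(op)]
```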
13 years ago osd: move queue checks into enqueue_op, kill _handle_ helpers
Sage Weil [Tue, 25 Oct 2011 05:13:59 +0000 (22:13 -0700)]
osd: move queue checks into enqueue_op, kill _handle_ helpers

This simplifies things, and renames the checks to make it clear that we are
doing validation checks only, with no side-effects allowed.

Also move some checks into the parent handle_op() to further simplify the
(re)queue checks.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: move op cap check into helper
Sage Weil [Tue, 25 Oct 2011 04:59:49 +0000 (21:59 -0700)]
osd: move op cap check into helper

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: drop useless PG hooks
Sage Weil [Tue, 25 Oct 2011 04:48:50 +0000 (21:48 -0700)]
osd: drop useless PG hooks

These no longer need to be exposed to the generic OSD code.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: drop ability to disable op queue entirely
Sage Weil [Tue, 25 Oct 2011 04:46:56 +0000 (21:46 -0700)]
osd: drop ability to disable op queue entirely

This is pretty useless, and broken wrt requeueing anyway.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: handle missing/degraded in op thread
Sage Weil [Tue, 25 Oct 2011 04:44:36 +0000 (21:44 -0700)]
osd: handle missing/degraded in op thread

The _handle_op() method (and friends) are called when an op is initially
queued and when it is requeued.  In the requeue case we have to be more
careful because the caller may be in the middle of doing all sorts of
random stuff.  That means we need to limit ourselves to queueing or
discarding the op, and refrain from doing anything else with dangerous
side effects.

This fixes a crash like

osd/ReplicatedPG.cc: In function 'void ReplicatedPG::recover_primary_got(hobject_t, eversion_t)', in thread '7f21d0189700'
osd/ReplicatedPG.cc: 4109: FAILED assert(missing.num_missing() == 0)
 ceph version 0.37-105-gc2069eb (commit:c2069eb1e562ba7d753c9b5ce5c904f4f5ef6abe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x76) [0x8ab95a]
 2: (ReplicatedPG::recover_primary_got(hobject_t, eversion_t)+0x62e) [0x767eea]
 3: (ReplicatedPG::sub_op_push(MOSDSubOp*)+0x2b79) [0x76abeb]
 4: (ReplicatedPG::do_sub_op(MOSDSubOp*)+0x1ab) [0x74761b]
 5: (OSD::dequeue_op(PG*)+0x47d) [0x820ac3]
 6: (OSD::OpWQ::_process(PG*)+0x27) [0x82cc8b]

due to an object being pushed to a replica before it is activated.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
13 years ago osd: set reqid on push/pull ops
Sage Weil [Tue, 25 Oct 2011 03:54:26 +0000 (20:54 -0700)]
osd: set reqid on push/pull ops

Not strictly necessary, but makes logs easier to follow.

Signed-off-by: Sage Weil <sage.weil@dreamhost.com>