git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years agodoc: update to describe new OSD version support as it actually exists
Greg Farnum [Tue, 27 Aug 2013 22:21:49 +0000 (15:21 -0700)]
doc: update to describe new OSD version support as it actually exists

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: add OpContext::user_at_version
Greg Farnum [Wed, 28 Aug 2013 00:24:24 +0000 (17:24 -0700)]
ReplicatedPG: add OpContext::user_at_version

Set this up with the existing at_version member, but only increase
it for user_modify ops. Use this when logging the PG's user_version. In
order to maintain compatibility with old clients on classic pools, we
force user_version to follow at_version whenever it's updated.

Now that we have and are maintaining this PG user version, use it
for the user version on ops that get ENOENT back, when short-circuiting
replies as part of reply_op_error()[1], or when replying to repops
in eval_repop; further use it for the cls_current_version() function. This
is a small semantic change for that function, as previously it would
generally return the same value as the user would get sent back via
MOSDOpReply -- but I don't think it was something you could count on.
We now define it as being the user version of the PG at the start of the
op, and as a bonus it is defined even for read ops (the at_version is
only filled in on write operations).

[1]: We tweak PGLog to make it easier to retrieve both user and PG versions.

Signed-off-by: Greg Farnum <greg@inktank.com>
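
The versioning rule in this commit can be sketched as follows. This is a minimal illustration with hypothetical stand-in names (`PGVersions`, `advance`, the `classic_pool` flag), not the actual ReplicatedPG code: the PG version advances on every write, while the user version advances only on user-modify ops and, on classic pools, is forced to follow at_version.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the PG's version bookkeeping.
struct PGVersions {
  uint64_t at_version = 0;    // advanced for every logged write
  uint64_t user_version = 0;  // advanced only for user-visible modifications
};

// Advance the versions for one write op.
// user_modify:  the op changes user-visible object state.
// classic_pool: assumption for this sketch, a pool where old clients expect
//               the user version to follow at_version whenever it is updated.
inline void advance(PGVersions& v, bool user_modify, bool classic_pool) {
  ++v.at_version;
  if (!user_modify)
    return;                           // user_version stays where it was
  if (classic_pool)
    v.user_version = v.at_version;    // forced to follow at_version
  else
    ++v.user_version;
}
```

A non-user-modify write leaves user_version untouched, so the two counters drift apart on non-classic pools and stay in step on classic ones.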
12 years agoMOSDOpReply: stop filling in replay_version from the MOSDOp to begin with
Greg Farnum [Tue, 27 Aug 2013 19:55:52 +0000 (12:55 -0700)]
MOSDOpReply: stop filling in replay_version from the MOSDOp to begin with

It's just asking for trouble.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: switch to comprehensive instead of individual version setters
Greg Farnum [Tue, 27 Aug 2013 21:06:49 +0000 (14:06 -0700)]
MOSDOpReply: switch to comprehensive instead of individual version setters

There's little point to updating versions individually when we can
do so en masse and avoid mistakes in duplication.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: add enough fields to be backwards compatible.
Greg Farnum [Tue, 27 Aug 2013 18:02:44 +0000 (11:02 -0700)]
MOSDOpReply: add enough fields to be backwards compatible.

The system we've been building up works out very nicely for new clients,
but they could not have interoperated with old clients that were only
referring to our replay_version. In order to deal with this, we add
a bad_replay_version to MOSDOpReply which is encoded where we used
to encode replay_version. bad_replay_version will follow the same semantics
as reassert_version used to (except that it is filled in on reads), but
is not accessible to new clients, who can see only our properly-controlled
replay_version and user_version. This will let old and new clients
interoperate correctly when communicating about watches, etc.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoosd: actually fill in user_version in pg_log_entry_t
Greg Farnum [Wed, 28 Aug 2013 00:14:56 +0000 (17:14 -0700)]
osd: actually fill in user_version in pg_log_entry_t

We now require it when creating a pg_log_entry_t. The user_version
is the version which info.last_user_version should be set to
after the transaction is applied, which for everything except for
a user-modify op is going to be the version it was already at.
For now, a user-modify op's new user_version is filled in as
ctx->at_version.version.

Signed-off-by: Greg Farnum <greg@inktank.com>
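
A rough sketch of the relationship described above, using hypothetical simplified types (`LogEntry`, `Info`, `make_entry`, `apply_entry`) rather than the real pg_log_entry_t and pg_info_t: each log entry carries the user version the PG should have once the entry is applied, which only changes for user-modify ops.

```cpp
#include <cstdint>

// Hypothetical stand-ins for pg_log_entry_t and pg_info_t.
struct LogEntry {
  uint64_t version;       // PG version assigned to this entry
  uint64_t user_version;  // value last_user_version takes after applying it
};

struct Info {
  uint64_t last_update = 0;
  uint64_t last_user_version = 0;
};

// Build the entry for the next op. For a user-modify op the user version
// becomes the entry's own version; otherwise it keeps its current value.
inline LogEntry make_entry(const Info& info, bool user_modify) {
  LogEntry e;
  e.version = info.last_update + 1;
  e.user_version = user_modify ? e.version : info.last_user_version;
  return e;
}

// Apply an entry: both versions are taken straight from the log entry.
inline void apply_entry(Info& info, const LogEntry& e) {
  info.last_update = e.version;
  info.last_user_version = e.user_version;
}
```

Because the value is stored in every entry, replaying a log is enough to recover last_user_version without recomputing it.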
12 years agoosd: add last_user_version to pg_info_t
Greg Farnum [Wed, 21 Aug 2013 18:26:28 +0000 (11:26 -0700)]
osd: add last_user_version to pg_info_t

We add a corresponding user_version to pg_log_entry_t, and the logic
to assign from one to the other and to recover last_user_version from
a master's log. We aren't yet setting it to anything, though.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: remove OpContext::reply_user_version
Greg Farnum [Wed, 21 Aug 2013 00:11:14 +0000 (17:11 -0700)]
ReplicatedPG: remove OpContext::reply_user_version

ctx->new_obs.oi.user_version is initialized to ctx->obs.oi.user_version,
and for read ops it won't be changed. That means
reply_user_version == ctx->new_obs.oi.user_version in all cases, which
means we don't need it.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoosd: switch object_info_t::user_version to be a version_t
Greg Farnum [Wed, 21 Aug 2013 00:13:53 +0000 (17:13 -0700)]
osd: switch object_info_t::user_version to be a version_t

We never expose the full eversion_t data to users, and do not want to.
However, we pull some tricks in the encode/decode functions to avoid
having to change the object_info_t disk format for this change.
When we can break compatibility, we should simplify this.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: Fill in the MOSDOpReply's user_version
Greg Farnum [Tue, 20 Aug 2013 23:22:27 +0000 (16:22 -0700)]
ReplicatedPG: Fill in the MOSDOpReply's user_version

As part of this, rename OpContext::reply_version->reply_user_version.
The semantics that necessitate the reply_version are only for user versions,
so rename it for clarity. Then use the reply_user_version in
set_user_version() (if the op succeeded).
For now we use the PG version for ENOENT (preserving the previous
semantics), but that will get changed to the pg's user_version soon
as well.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: set the replay version based on the at_version
Greg Farnum [Tue, 20 Aug 2013 23:18:18 +0000 (16:18 -0700)]
ReplicatedPG: set the replay version based on the at_version

The replay version is not for users to consume, so we don't want
to use the user_version for it.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoObjecter: expose MOSDOp's new user_version instead of the replay_version
Greg Farnum [Tue, 20 Aug 2013 20:55:54 +0000 (13:55 -0700)]
Objecter: expose MOSDOp's new user_version instead of the replay_version

We don't want users to ever see the replay_version, which is about
to become private RADOS data.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoObjecter: librados: mass switch from eversion_t to version_t
Greg Farnum [Tue, 20 Aug 2013 21:21:04 +0000 (14:21 -0700)]
Objecter: librados: mass switch from eversion_t to version_t

There are a lot of pointers throughout our request infrastructure used solely
for exporting the version to users. The interfaces we actually expose only
provide a uint64_t (leaving off eversion_t's epoch), and that's all we're
going to maintain in our new user_version scheme, so don't pretend we'll
have more in our internal interfaces.

I audited this pretty carefully; in particular:
 - Op::objver is only used for passing data back to users via the calling
   functions IoCtxImpl::last_objver, etc.
 - IoCtxImpl::last_objver is used only for the set_sync_op_version() call, which
   provides data only for the uint64_t get_last_version() and
   rados_get_last_version() calls.
 - AioCompletionImpl::objver is used only for the uint64_t get_version() call.
 - LingerOp::pobjver is used only for referencing things that are now version_t.

Signed-off-by: Greg Farnum <greg@inktank.com>
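
To illustrate why the epoch can be dropped from these interfaces, here is a small sketch with simplified stand-in definitions (the real eversion_t has more to it, and `to_user_version` is a hypothetical helper): only the 64-bit version half ever reaches users.

```cpp
#include <cstdint>

// Simplified stand-ins for the types named in the commit message.
typedef uint64_t version_t;
typedef uint32_t epoch_t;

struct eversion_t {
  epoch_t epoch;      // internal to RADOS, never shown to users
  version_t version;  // the only part the public calls return
};

// Hypothetical helper: what a uint64_t-returning call such as
// get_last_version() can expose from a full eversion_t.
inline version_t to_user_version(const eversion_t& ev) {
  return ev.version;
}
```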
12 years agoObjecter: rename Op::version to Op::replay_version
Greg Farnum [Tue, 20 Aug 2013 21:21:32 +0000 (14:21 -0700)]
Objecter: rename Op::version to Op::replay_version

This is used for replay, so let's be more precise!

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: add user_version field
Greg Farnum [Wed, 28 Aug 2013 00:02:15 +0000 (17:02 -0700)]
MOSDOpReply: add user_version field

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agodoc: include plan for new user_version support
Greg Farnum [Tue, 27 Aug 2013 22:16:29 +0000 (15:16 -0700)]
doc: include plan for new user_version support

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: do not do a redundant set of ctx->new_obs.oi.version
Greg Farnum [Thu, 22 Aug 2013 21:54:19 +0000 (14:54 -0700)]
ReplicatedPG: do not do a redundant set of ctx->new_obs.oi.version

We set this in the if below for writes, and for reads it doesn't need to
be updated (and isn't). Remove the confusing double-set so future code
inspectors don't get concerned there's a bug like I did.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoReplicatedPG: remove long-dead branch
Greg Farnum [Mon, 26 Aug 2013 21:38:30 +0000 (14:38 -0700)]
ReplicatedPG: remove long-dead branch

This was confusing the heck out of me when trying to figure out
why I was hitting an assert. So replace the if-else block with
a more appropriate assert and don't include any misleading calls
to prepare_transaction() from sub_op_modify().

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: rename *_version() -> *_replay_version()
Greg Farnum [Wed, 28 Aug 2013 00:00:38 +0000 (17:00 -0700)]
MOSDOpReply: rename *_version() -> *_replay_version()

We have been returning the object's "user version" and using that
for replay, but that is in fact incorrect. In preparation for fixing
up the user version semantics, rename get_version to get_replay_version
and set_version to set_replay_version.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: rename reassert_version -> replay_version
Greg Farnum [Tue, 27 Aug 2013 23:56:40 +0000 (16:56 -0700)]
MOSDOpReply: rename reassert_version -> replay_version

Because that's what it's for. reassert_version is a bit ambiguous.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agodocs: document how the current OSD PG/object versions work
Greg Farnum [Tue, 27 Aug 2013 22:08:28 +0000 (15:08 -0700)]
docs: document how the current OSD PG/object versions work

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMOSDOpReply: set reassert_version for very old clients
Greg Farnum [Thu, 22 Aug 2013 22:28:15 +0000 (15:28 -0700)]
MOSDOpReply: set reassert_version for very old clients

I think leaving it unset must make every sufficiently-old client fail on
replay -- very bad!

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #516 from dachary/master
athanatos [Tue, 20 Aug 2013 17:34:32 +0000 (10:34 -0700)]
Merge pull request #516 from dachary/master

erasure code : plugin, interface and glossary documentation updates

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoerasure code : plugin, interface and glossary documentation updates 516/head
Loic Dachary [Tue, 20 Aug 2013 14:17:10 +0000 (16:17 +0200)]
erasure code : plugin, interface and glossary documentation updates

* replace the erasure code plugin abstract interface with a doxygen link
  that will be populated when the header shows in master
* update the plugin documentation to reflect the current draft implementation
* fix broken link to PGBackend-h
* add a glossary to define chunk, stripe, shard and strip with a drawing

http://tracker.ceph.com/issues/4929 refs #4929

Signed-off-by: Loic Dachary <loic@dachary.org>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Tue, 20 Aug 2013 05:53:28 +0000 (22:53 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoPG: remove old log when we upgrade log version
Samuel Just [Tue, 20 Aug 2013 00:23:44 +0000 (17:23 -0700)]
PG: remove old log when we upgrade log version

Otherwise the log_oid will be non-empty and the next
boot will cause us to try to upgrade again.

Fixes: #6057
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'wip-fallocate'
Sage Weil [Tue, 20 Aug 2013 05:50:11 +0000 (22:50 -0700)]
Merge branch 'wip-fallocate'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoceph-fuse: fallocate appears in fuse 2.9.1, not 2.9
Sage Weil [Tue, 20 Aug 2013 04:46:29 +0000 (21:46 -0700)]
ceph-fuse: fallocate appears in fuse 2.9.1, not 2.9

There is no macro to differentiate 2.9 from 2.9.1, so we have to wait
to use this until 3.0.  :(

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: do not mark_caps_dirty for generic fallocate
Sage Weil [Fri, 16 Aug 2013 06:05:17 +0000 (23:05 -0700)]
client: do not mark_caps_dirty for generic fallocate

A normal fallocate in which the size is not changed is still a no-op.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoclient: guard fallocate with #ifdefs
Sage Weil [Fri, 16 Aug 2013 06:01:59 +0000 (23:01 -0700)]
client: guard fallocate with #ifdefs

Only include linux header if it's linux.  Only implement the fallocate
method if FALLOC_FL_PUNCH_HOLE is defined.

Signed-off-by: Sage Weil <sage@inktank.com>
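
The guard pattern described here might look roughly like the following. It is a sketch only: `have_punch_hole` is a hypothetical helper, and the real client code guards its fallocate method rather than a boolean.

```cpp
// Only pull in the Linux header on Linux; elsewhere the macro stays undefined.
#if defined(__linux__)
#include <linux/falloc.h>
#endif

// Hypothetical helper: reports whether this build can offer hole punching.
// The real code would compile the fallocate method in or out the same way.
inline bool have_punch_hole() {
#ifdef FALLOC_FL_PUNCH_HOLE
  return true;
#else
  return false;
#endif
}
```

Keying on the feature macro instead of the platform means a Linux build with older headers simply omits the method rather than failing to compile.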
12 years agoCeph-fuse: Fallocate and punch hole support
Li Wang [Thu, 15 Aug 2013 04:04:03 +0000 (12:04 +0800)]
Ceph-fuse: Fallocate and punch hole support

This patch implements fallocate and punch hole support for Ceph fuse client.

Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoPGLog: add a config to disable PGLog::check()
Samuel Just [Mon, 19 Aug 2013 07:02:24 +0000 (00:02 -0700)]
PGLog: add a config to disable PGLog::check()

This is a debug check which may be causing excessive
cpu usage.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agodoc: Title change.
John Wilkins [Tue, 20 Aug 2013 00:27:10 +0000 (17:27 -0700)]
doc: Title change.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoosd/ReplicatedPG: remove broken AccessMode logic
Sage Weil [Mon, 19 Aug 2013 05:34:24 +0000 (22:34 -0700)]
osd/ReplicatedPG: remove broken AccessMode logic

The original intent here was to handle reads in two modes.  For
workloads with read/modify/write ops, the RMW mode would:

 - queue writes for local store and replicas immediately
 - block reads until the write commits to all replicas

For mixed read/write workloads without read/modify/write ops, the
DELAYED mode would:

 - queue writes for replicas
 - allow local reads
 - once replicas commit, queue write locally
 - block local reads until local write completes

In reality, we never use the DELAYED mode.  It's untested and possibly
broken, and it is unlikely we will see a workload where it is important
in the near to mid term.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoMerge pull request #508 from ceph/wip-5905
Gregory Farnum [Mon, 19 Aug 2013 22:14:40 +0000 (15:14 -0700)]
Merge pull request #508 from ceph/wip-5905

examples: add a librados/hello_world program

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoexamples: add a librados/hello_world program 508/head
Greg Farnum [Thu, 15 Aug 2013 23:16:37 +0000 (16:16 -0700)]
examples: add a librados/hello_world program

This is a simple program with lots of explanatory comments people
can use as a model for using librados.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoceph: parse CEPH_ARGS environment variable
Sage Weil [Mon, 19 Aug 2013 19:48:50 +0000 (12:48 -0700)]
ceph: parse CEPH_ARGS environment variable

Fixes: #6052
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
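
As an illustration of the idea (the ceph tool itself is a Python script, so this C++ sketch is not the actual change), arguments from an environment variable can be split on whitespace and placed ahead of the command-line arguments. `split_args` and `args_with_env` are hypothetical helpers, quoting is not handled, and putting the environment arguments first is an assumption.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split a whitespace-separated string into arguments.
// Assumption: no quoting or escaping is supported in this sketch.
inline std::vector<std::string> split_args(const std::string& s) {
  std::vector<std::string> out;
  std::istringstream in(s);
  std::string tok;
  while (in >> tok)
    out.push_back(tok);
  return out;
}

// Combine arguments from the environment variable `var` (if set) with the
// command-line arguments. Environment arguments come first here, so that
// explicit command-line options appear later in the list.
inline std::vector<std::string> args_with_env(
    const char* var, const std::vector<std::string>& argv) {
  std::vector<std::string> out;
  if (const char* env = std::getenv(var))
    out = split_args(env);
  out.insert(out.end(), argv.begin(), argv.end());
  return out;
}
```

For example, with CEPH_ARGS set to `--id admin`, calling `args_with_env("CEPH_ARGS", {"status"})` would yield `--id`, `admin`, `status`.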
12 years agorados pybind: add conf_parse_env()
Sage Weil [Mon, 19 Aug 2013 19:48:40 +0000 (12:48 -0700)]
rados pybind: add conf_parse_env()

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Mon, 19 Aug 2013 19:41:54 +0000 (12:41 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agodoc/release-notes: v0.61.8
Sage Weil [Mon, 19 Aug 2013 19:41:26 +0000 (12:41 -0700)]
doc/release-notes: v0.61.8

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #513 from dalgaaf/fix/wip-da-documentation
Sage Weil [Mon, 19 Aug 2013 19:32:30 +0000 (12:32 -0700)]
Merge pull request #513 from dalgaaf/fix/wip-da-documentation

Fix documentation issues

12 years agofilestore-config-ref.rst: mark some filestore keys as deprecated 513/head
Danny Al-Gaaf [Mon, 19 Aug 2013 18:56:48 +0000 (20:56 +0200)]
filestore-config-ref.rst: mark some filestore keys as deprecated

Marked the following keys as deprecated since v0.65:
- filestore flusher
- filestore flusher max fds
- filestore sync flush

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoMerge pull request #512 from ceph/wip-5988
Sage Weil [Mon, 19 Aug 2013 18:16:57 +0000 (11:16 -0700)]
Merge pull request #512 from ceph/wip-5988

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'wip-erasure-coded-doc'
Samuel Just [Mon, 19 Aug 2013 18:02:45 +0000 (11:02 -0700)]
Merge branch 'wip-erasure-coded-doc'

12 years agolibrados: synchronous commands should return on commit instead of ack 512/head
Greg Farnum [Mon, 19 Aug 2013 17:29:49 +0000 (10:29 -0700)]
librados: synchronous commands should return on commit instead of ack

This is unlikely to be noticed by anybody, but it is a big change. Document
in the PendingReleaseNotes and bump up the librados minor version number
to 68.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #493 from dachary/wip-erasure-coding-doc
athanatos [Mon, 19 Aug 2013 17:28:48 +0000 (10:28 -0700)]
Merge pull request #493 from dachary/wip-erasure-coding-doc

rearrange erasure code documents

Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agomon: make MonMap error message about unspecified monitors less specific.
Greg Farnum [Mon, 19 Aug 2013 17:21:16 +0000 (10:21 -0700)]
mon: make MonMap error message about unspecified monitors less specific.

The error message helpfully references the -m and -c CLI options for
specifying monitors, but this code can be invoked from non-core librados
client applications so that's unfortunately not kosher. Remove the
reference.

Fixes #5979.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoauth-config-ref.rst: fix signature keys
Danny Al-Gaaf [Mon, 19 Aug 2013 08:33:37 +0000 (10:33 +0200)]
auth-config-ref.rst: fix signature keys

Fix names of cephx signature keys.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
12 years agoobjclass: move cls_log into class_api.cc
Sage Weil [Sat, 17 Aug 2013 21:30:37 +0000 (14:30 -0700)]
objclass: move cls_log into class_api.cc

Not sure why but this seems to resolve a linking problem when loading
classes:

2013-08-17 13:28:19.015776 7fb2bcffa700  0 _load_class could not open class /usr/lib/rados-classes/libcls_hello.so (dlopen failed): /usr/lib/rados-classes/libcls_hello.so: undefined symbol: cls_log
2013-08-17 13:28:19.015786 7fb2bcffa700 -1 osd.4 12 class hello open got (5) Input/output error

In any case, it's simpler.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc/dev/filestore-filesystem-compatibliity: remove outdated xattr notes
Sage Weil [Sat, 17 Aug 2013 18:04:47 +0000 (11:04 -0700)]
doc/dev/filestore-filesystem-compatibliity: remove outdated xattr notes

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #494 from kri5/wip-s3-compliance-doc
Sage Weil [Sat, 17 Aug 2013 18:00:59 +0000 (11:00 -0700)]
Merge pull request #494 from kri5/wip-s3-compliance-doc

doc: complete S3 features status from existing doc page

12 years agoMerge pull request #491 from kri5/wip-clang-compilation
Sage Weil [Sat, 17 Aug 2013 17:59:01 +0000 (10:59 -0700)]
Merge pull request #491 from kri5/wip-clang-compilation

Fix compilation -Wmismatched-tags warnings

Reviewed-by: Loic Dachary <loic@dachary.org>
12 years agodoc: Updated upgrade doc to include dumpling and incorporate ceph-deploy.
John Wilkins [Sat, 17 Aug 2013 17:35:32 +0000 (10:35 -0700)]
doc: Updated upgrade doc to include dumpling and incorporate ceph-deploy.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #479 from devoid/fix-5797
Sage Weil [Sat, 17 Aug 2013 17:09:01 +0000 (10:09 -0700)]
Merge pull request #479 from devoid/fix-5797

Document unstable nature of CephFS

12 years agoMakefile: move objclass/*.cc to libosd.la
Sage Weil [Sat, 17 Aug 2013 16:40:44 +0000 (09:40 -0700)]
Makefile: move objclass/*.cc to libosd.la

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc/changelog: add missing file
Sage Weil [Sat, 17 Aug 2013 15:38:55 +0000 (08:38 -0700)]
doc/changelog: add missing file

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoos/FileStore: initialize blk_size on _detect_fs()
Sage Weil [Sat, 17 Aug 2013 15:30:26 +0000 (08:30 -0700)]
os/FileStore: initialize blk_size on _detect_fs()

This was missed by a25d73effb38118602bc73da0aa258c639f69c2c.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agodoc/release-notes: v0.67.1
Sage Weil [Sat, 17 Aug 2013 15:20:00 +0000 (08:20 -0700)]
doc/release-notes: v0.67.1

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #505 from ceph/wip-post-file
Sage Weil [Sat, 17 Aug 2013 06:41:38 +0000 (23:41 -0700)]
Merge pull request #505 from ceph/wip-post-file

ceph-post-file: single command to upload a file to cephdrop

12 years agomds: create only one ESubtreeMap during fs creation
Sage Weil [Sat, 17 Aug 2013 05:08:00 +0000 (22:08 -0700)]
mds: create only one ESubtreeMap during fs creation

Previously we would create an empty ESubtreeMap when we opened the log
segment and then immediately journal a second one that created the root
and mdsdir.  More importantly, for the second ESubtreeMap, we would not
wait for it to commit before requesting the ACTIVE state, leading to
#4894.

Instead, break start_new_segment() into two steps: one that creates the
in-memory LogSegment tracking structure, and one that journals the
ESubtreeMap.  Open things early and write the (one) ESubtreeMap at the
end of boot_create(), and then wait for it.

Fixes: #4894
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
12 years agodoc: quickstart: be more explicit that node == mon node
Sage Weil [Sat, 17 Aug 2013 04:18:21 +0000 (21:18 -0700)]
doc: quickstart: be more explicit that node == mon node

This appears to be one source of confusion for new users that leads to
a failure to form an initial mon quorum.  See comments on

 http://tracker.ceph.com/issues/4924

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agorgw: drain requests before exiting
Yehuda Sadeh [Tue, 13 Aug 2013 20:16:07 +0000 (13:16 -0700)]
rgw: drain requests before exiting

Fixes: #5953
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoceph-post-file: single command to upload a file to cephdrop 505/head
Sage Weil [Sat, 17 Aug 2013 00:59:11 +0000 (17:59 -0700)]
ceph-post-file: single command to upload a file to cephdrop

Use sftp to upload to a directory that only this user and ceph devs can
access.

Distribute an ssh key to connect to the account.  This will let us revoke
the key in the future if we feel the need.  Also distribute a known_hosts
file so that users have some confidence that they are connecting to the
real ceph drop account and not some third party.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agodoc: Removed old mkcephfs references.
John Wilkins [Sat, 17 Aug 2013 00:31:43 +0000 (17:31 -0700)]
doc: Removed old mkcephfs references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed mkcephfs references.
John Wilkins [Sat, 17 Aug 2013 00:28:15 +0000 (17:28 -0700)]
doc: Removed mkcephfs references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated script for dumpling.
John Wilkins [Sat, 17 Aug 2013 00:27:53 +0000 (17:27 -0700)]
doc: Updated script for dumpling.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated APT script for dumpling.
John Wilkins [Sat, 17 Aug 2013 00:27:16 +0000 (17:27 -0700)]
doc: Updated APT script for dumpling.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed mkcephfs references. Did a bit of clean-up work.
John Wilkins [Sat, 17 Aug 2013 00:26:25 +0000 (17:26 -0700)]
doc: Removed mkcephfs references. Did a bit of clean-up work.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoMerge pull request #509 from dmick/wip-rest-conf
Dan Mick [Sat, 17 Aug 2013 00:05:52 +0000 (17:05 -0700)]
Merge pull request #509 from dmick/wip-rest-conf

config_opts: add two ceph-rest-api-only variables for convenience

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoReplicatedPG: add osd_recover_clone_overlap_limit to limit clones
Samuel Just [Thu, 15 Aug 2013 22:35:26 +0000 (15:35 -0700)]
ReplicatedPG: add osd_recover_clone_overlap_limit to limit clones

We don't want to clone_range from clones too many times.
For now, just skip the cloning if there are too many holes.

Fixes: #5985
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
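
A minimal sketch of the check described above, assuming the overlap is represented as a list of extents and that `limit` carries the value of osd_recover_clone_overlap_limit. `Extents` and `should_clone` are hypothetical names, and whether the real comparison is strict is not stated in the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (offset, length) extents that the object shares with a clone.
typedef std::vector<std::pair<uint64_t, uint64_t> > Extents;

// Decide whether recovery should clone_range from the clone.
// Many small extents mean many holes and many clone_range calls, so past
// the limit the sketch gives up on cloning and the data is copied instead.
inline bool should_clone(const Extents& overlap, std::size_t limit) {
  return !overlap.empty() && overlap.size() <= limit;
}
```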
12 years agoconfig_opts: add two ceph-rest-api-only variables for convenience 509/head
Dan Mick [Fri, 16 Aug 2013 20:15:34 +0000 (13:15 -0700)]
config_opts: add two ceph-rest-api-only variables for convenience

These aren't used by the C++ code at all, but in order for
rados_conf_get to find them, they need to be listed.  They're
consumed by ceph_rest_api.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoMerge remote-tracking branch 'upstream/wip-zfs'
Samuel Just [Fri, 16 Aug 2013 23:35:21 +0000 (16:35 -0700)]
Merge remote-tracking branch 'upstream/wip-zfs'

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #504 from ceph/wip-cls-hello
Sage Weil [Fri, 16 Aug 2013 18:07:02 +0000 (11:07 -0700)]
Merge pull request #504 from ceph/wip-cls-hello

cls/hello: hello, world rados class

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.com>
12 years agoMerge branch 'wip-5848-coll'
David Zafman [Fri, 16 Aug 2013 01:32:30 +0000 (18:32 -0700)]
Merge branch 'wip-5848-coll'

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoosd: Add perf tracking for all states in RecoveryState
David Zafman [Thu, 15 Aug 2013 19:28:06 +0000 (12:28 -0700)]
osd: Add perf tracking for all states in RecoveryState

Fixes: #5848
Signed-off-by: David Zafman <david.zafman@inktank.com>
12 years agocls/hello: hello, world rados class 504/head
Sage Weil [Fri, 16 Aug 2013 00:20:43 +0000 (17:20 -0700)]
cls/hello: hello, world rados class

Simple example of a rados class doing read, write, and read/modify/write
methods.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: enforce RD, WR flags for class methods
Sage Weil [Thu, 15 Aug 2013 23:19:21 +0000 (16:19 -0700)]
osd: enforce RD, WR flags for class methods

Class methods are marked with RD and WR to help the OSD decide when we need
to flush objects or require certain permissions.  Ensure that methods do
not step outside their advertised capabilities by keeping a counter of rd
and wr ops we perform in do_osd_ops() and making sure that class methods,
and any ops they indirectly call, do not break the rules.

Signed-off-by: Sage Weil <sage@inktank.com>
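
The counter-based enforcement could be sketched like this. The flag names, `OpCounters`, `check_method_caps`, and the -EIO return code are assumptions made for illustration, not the actual do_osd_ops() code.

```cpp
#include <cerrno>

// Hypothetical flag values for what a class method advertises.
enum { METHOD_RD = 0x1, METHOD_WR = 0x2 };

// Running totals of read and write ops performed so far in an op context.
struct OpCounters {
  int num_read = 0;
  int num_write = 0;
};

// Compare the counters from before and after a class method ran against the
// flags the method advertised. Returns 0 if the method stayed within its
// capabilities, or a negative errno (assumed -EIO here) if it did not.
inline int check_method_caps(unsigned flags, const OpCounters& before,
                             const OpCounters& after) {
  bool did_read = after.num_read > before.num_read;
  bool did_write = after.num_write > before.num_write;
  if (did_read && !(flags & METHOD_RD))
    return -EIO;
  if (did_write && !(flags & METHOD_WR))
    return -EIO;
  return 0;
}
```

Because the counters are updated wherever ops are executed, ops that a method triggers indirectly are caught the same way as ops it issues directly.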
12 years agocls_rbd: remove old assign_bid method
Sage Weil [Thu, 15 Aug 2013 22:22:41 +0000 (15:22 -0700)]
cls_rbd: remove old assign_bid method

This method is problematic because it both writes/mutates and returns data,
which means that an untimely client disconnect or peering event will result
in a success to the client with no payload.

It has not been used since v0.52 (18054ba46fe2779d8df8b1a0d69ec93ca6a66c34)
which is pre-bobtail; so this change breaks compatibility with pre-bobtail
librbd clients (at least for image creation).

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agolibrbd: remove mostly-useless assign_bid helper
Sage Weil [Thu, 15 Aug 2013 22:18:51 +0000 (15:18 -0700)]
librbd: remove mostly-useless assign_bid helper

Do it inline.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoosd: do not return data payload for successful writes
Sage Weil [Thu, 15 Aug 2013 22:06:38 +0000 (15:06 -0700)]
osd: do not return data payload for successful writes

We were somewhat inadvertently returning a data payload for write
operations.  This was a side-effect of the OpContext::ops field being a
reference to MOSDOp::ops: the return data would end up there, and then
the MOSDOpReply ctor would copy it.

Fix this by breaking the ref, and making the do_op() logic also claim
return result data for error values (so that errors can return data to the
caller).

Signed-off-by: Sage Weil <sage@inktank.com>
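
The aliasing problem can be reproduced in miniature. The types below are hypothetical stand-ins for MOSDOp, OpContext, and the reply constructor; they show only how a reference member leaks output data back into the request, while an owned copy does not.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct OSDOp { std::string outdata; };       // per-op result buffer
struct Request { std::vector<OSDOp> ops; };  // stand-in for MOSDOp

// Old shape: the context aliases the request's ops.
struct AliasedContext {
  std::vector<OSDOp>& ops;
  explicit AliasedContext(Request& r) : ops(r.ops) {}
};

// New shape: the context owns a copy, so results stay in the context.
struct OwningContext {
  std::vector<OSDOp> ops;
  explicit OwningContext(const Request& r) : ops(r.ops) {}
};

// Stand-in for the reply constructor copying data out of the request's ops.
inline std::string reply_payload(const Request& r) {
  std::string out;
  for (std::size_t i = 0; i < r.ops.size(); ++i)
    out += r.ops[i].outdata;
  return out;
}
```

With the aliased context, anything an op writes into `outdata` ends up in the reply even for a write; with the owning context the reply stays empty unless the caller explicitly claims the result data.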
12 years agocommon/Preforker: shut up warning
Sage Weil [Thu, 15 Aug 2013 21:35:28 +0000 (14:35 -0700)]
common/Preforker: shut up warning

common/Preforker.h: In member function 'void Preforker::daemonize()':
common/Preforker.h:97:40: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result]

Signed-off-by: Sage Weil <sage@inktank.com>
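
One general way to address a -Wunused-result warning on write() is to check the result and retry on short writes. The helper below is a sketch of that approach under the name `write_all`, which is hypothetical; the commit message does not say how the warning was actually silenced.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Write the whole buffer to fd, retrying on EINTR and on short writes.
// Returns 0 on success or a negative errno value on failure.
inline int write_all(int fd, const void* buf, size_t len) {
  const char* p = static_cast<const char*>(buf);
  while (len > 0) {
    ssize_t r = ::write(fd, p, len);
    if (r < 0) {
      if (errno == EINTR)
        continue;      // interrupted before writing anything, try again
      return -errno;   // real failure, report it to the caller
    }
    p += r;
    len -= static_cast<size_t>(r);
  }
  return 0;
}
```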
12 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Fri, 16 Aug 2013 00:21:00 +0000 (17:21 -0700)]
Merge remote-tracking branch 'gh/next'

12 years agoMerge pull request #506 from dmick/wip-admin-daemon
Sage Weil [Fri, 16 Aug 2013 00:14:23 +0000 (17:14 -0700)]
Merge pull request #506 from dmick/wip-admin-daemon

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoceph.in: --admin-daemon was not returning EINVAL on bad command 506/head
Dan Mick [Fri, 16 Aug 2013 00:10:56 +0000 (17:10 -0700)]
ceph.in: --admin-daemon was not returning EINVAL on bad command

Fix by restructuring code to hoist common code and have only one
place where admin_socket is actually called.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
12 years agoMerge pull request #507 from ceph/wip-4635.master
João Eduardo Luís [Thu, 15 Aug 2013 22:54:10 +0000 (15:54 -0700)]
Merge pull request #507 from ceph/wip-4635.master

Bunch of tidying up on monitor services & fix #4635

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoPendingReleaseNotes: reflect 'osd crush set' behavior change 507/head
Joao Eduardo Luis [Thu, 15 Aug 2013 22:46:30 +0000 (15:46 -0700)]
PendingReleaseNotes: reflect 'osd crush set' behavior change

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agovstart.sh: s/osd crush set/osd crush add/ as it's supposed to be
Joao Eduardo Luis [Thu, 15 Aug 2013 01:22:29 +0000 (18:22 -0700)]
vstart.sh: s/osd crush set/osd crush add/ as it's supposed to be

'osd crush set' should only be used to update already existing items on
the map whereas 'osd crush add' should be able to 'add and update' items.

Considering at that point we are effectively adding a new item to the
crush map, use 'add' instead of 'set'.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
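
The difference between the two commands can be modelled with a plain map. This is an illustrative sketch only: `CrushItems`, `crush_set`, and `crush_add` are hypothetical, the real commands also take a location, and the -ENOENT result for a missing item is an assumption.

```cpp
#include <cerrno>
#include <map>
#include <string>

typedef std::map<std::string, double> CrushItems;  // item name -> weight

// 'set' semantics: update an item that already exists, never create one.
inline int crush_set(CrushItems& m, const std::string& name, double weight) {
  CrushItems::iterator it = m.find(name);
  if (it == m.end())
    return -ENOENT;  // assumed error code for a missing item
  it->second = weight;
  return 0;
}

// 'add' semantics: create the item if needed, otherwise update it.
inline int crush_add(CrushItems& m, const std::string& name, double weight) {
  m[name] = weight;
  return 0;
}
```

A script such as vstart.sh that is bringing up brand-new OSDs needs the second behavior, since there is nothing in the map yet to update.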
12 years agomon: OSDMonitor: don't expose uncommitted state on 'osd crush add/set'
Joao Eduardo Luis [Thu, 15 Aug 2013 01:20:24 +0000 (18:20 -0700)]
mon: OSDMonitor: don't expose uncommitted state on 'osd crush add/set'

Fixes: #4635
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: OSDMonitor: document 'prepare_command' wrt expected behavior of no-ops
Joao Eduardo Luis [Wed, 14 Aug 2013 23:32:17 +0000 (16:32 -0700)]
mon: OSDMonitor: document 'prepare_command' wrt expected behavior of no-ops

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: OSDMonitor: don't expose uncommitted state on 'osd crush link'
Sage Weil [Wed, 14 Aug 2013 23:23:14 +0000 (16:23 -0700)]
mon: OSDMonitor: don't expose uncommitted state on 'osd crush link'

Fixes: #4635
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: clarify 'osd crush add' vs 'osd crush set'
Sage Weil [Wed, 14 Aug 2013 22:24:44 +0000 (15:24 -0700)]
mon: clarify 'osd crush add' vs 'osd crush set'

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon/MonCap: remove useless 'osd crush add' perm from profile bootstrap-osd
Sage Weil [Wed, 14 Aug 2013 22:22:07 +0000 (15:22 -0700)]
mon/MonCap: remove useless 'osd crush add' perm from profile bootstrap-osd

Bootstrap doesn't use or need this; the crush update happens when the osd
starts up (see init-ceph or upstart/ceph-osd.conf).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: AuthMonitor: fix some >80 columns debug strings
Joao Eduardo Luis [Tue, 6 Aug 2013 21:50:09 +0000 (14:50 -0700)]
mon: AuthMonitor: fix some >80 columns debug strings

Give AuthMonitor a new look.  She sure deserves it.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: AuthMonitor: fix whitespaces
Joao Eduardo Luis [Tue, 6 Aug 2013 21:48:29 +0000 (14:48 -0700)]
mon: AuthMonitor: fix whitespaces

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: AuthMonitor: remove dead code
Joao Eduardo Luis [Tue, 6 Aug 2013 21:47:57 +0000 (14:47 -0700)]
mon: AuthMonitor: remove dead code

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: use str_join instead of std::copy
Sage Weil [Thu, 15 Aug 2013 21:37:07 +0000 (14:37 -0700)]
mon: use str_join instead of std::copy

The std::copy method leaves a trailing separator.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agoconfig: fix stringification of config values
Sage Weil [Thu, 15 Aug 2013 21:36:57 +0000 (14:36 -0700)]
config: fix stringification of config values

The std::copy construct leaves a trailing separator character, which breaks
parsing for booleans (among other things) and probably mangles everything
else too.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
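
The trailing-separator problem is easy to reproduce. In the sketch below, `join_with_copy` shows the std::copy construct and `join_sketch` shows a join that emits separators only between elements; both names are hypothetical, and `join_sketch` is an illustration of the idea rather than the str_join helper added to the tree.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// std::ostream_iterator writes the delimiter after every element,
// including the last one, so the result ends with a separator.
inline std::string join_with_copy(const std::vector<std::string>& v,
                                  const std::string& sep) {
  std::ostringstream oss;
  std::copy(v.begin(), v.end(),
            std::ostream_iterator<std::string>(oss, sep.c_str()));
  return oss.str();
}

// Emit the separator only between elements.
inline std::string join_sketch(const std::vector<std::string>& v,
                               const std::string& sep) {
  std::string out;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i)
      out += sep;
    out += v[i];
  }
  return out;
}
```

A single-element value such as `true` comes out of the first function with a trailing space, which is enough to make a strict boolean parser reject it.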
12 years agocommon: add str_join helper
Sage Weil [Thu, 15 Aug 2013 21:36:49 +0000 (14:36 -0700)]
common: add str_join helper

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
12 years agomon/PGMap: fix typo
Sage Weil [Thu, 15 Aug 2013 21:11:23 +0000 (14:11 -0700)]
mon/PGMap: fix typo

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoRevert "config: fix stringification of config values"
Sage Weil [Thu, 15 Aug 2013 21:07:39 +0000 (14:07 -0700)]
Revert "config: fix stringification of config values"

This reverts commit fefe0c602f78e66d35fd5806da4c2e4e154a267c.

I have a cleaner cleanup.