git.apps.os.sepia.ceph.com Git - ceph.git/log
11 years ago v0.68 v0.68
Gary Lowell [Tue, 3 Sep 2013 23:10:31 +0000 (16:10 -0700)]
v0.68

11 years ago rgw: change watch init ordering, don't distribute if can't
Yehuda Sadeh [Thu, 29 Aug 2013 20:06:33 +0000 (13:06 -0700)]
rgw: change watch init ordering, don't distribute if can't

Backport: dumpling

Moving back the watch initialization after the zone init,
as the zone info holds the control pool name. Since zone
init might need to create a new system object (that needs
to distribute cache), don't try to distribute cache if
watch is not yet initialized.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago mon: fix uninitialized Op field
Roald J. van Loon [Sat, 31 Aug 2013 17:30:14 +0000 (10:30 -0700)]
mon: fix uninitialized Op field

- Uninitialized field in MonitorLevelDB::Op causes random build errors.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
11 years ago automake cleanup: uninitialized version_t
Roald J. van Loon [Fri, 30 Aug 2013 21:05:52 +0000 (23:05 +0200)]
automake cleanup: uninitialized version_t

This sometimes gives a completely random uint64_t value, because it is
potentially used uninitialized.

Signed-off-by: Roald J. van Loon <roaldvanloon@gmail.com>
11 years ago Merge pull request #530 from ceph/wip-monc-leak
João Eduardo Luís [Fri, 30 Aug 2013 17:36:07 +0000 (10:36 -0700)]
Merge pull request #530 from ceph/wip-monc-leak

mon/MonClient: release pending outgoing messages on shutdown

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago ceph-post-file: use mktemp instead of tempfile
Sage Weil [Fri, 30 Aug 2013 16:41:29 +0000 (09:41 -0700)]
ceph-post-file: use mktemp instead of tempfile

tempfile is a Debian thing, apparently; mktemp is present everywhere.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago rgw: Fix S3 auth when using response-* query string params
Sylvain Munaut [Thu, 29 Aug 2013 14:17:30 +0000 (16:17 +0200)]
rgw: Fix S3 auth when using response-* query string params

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
11 years ago ceph.spec.in: remove trailing paren in previous commit
Gary Lowell [Thu, 22 Aug 2013 20:29:32 +0000 (13:29 -0700)]
ceph.spec.in: remove trailing paren in previous commit

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
11 years ago ceph.spec.in: Don't invoke debug_package macro on centos.
Gary Lowell [Thu, 22 Aug 2013 18:07:16 +0000 (11:07 -0700)]
ceph.spec.in: Don't invoke debug_package macro on centos.

If the redhat-rpm-config package is installed, the debuginfo rpms will
be built by default. The build will fail when the package is installed
and the specfile also invokes the macro.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
11 years ago Merge pull request #548 from dmick/next
Sage Weil [Tue, 27 Aug 2013 21:02:26 +0000 (14:02 -0700)]
Merge pull request #548 from dmick/next

ceph.in: add to $PATH if needed regardless of LD_LIBRARY_PATH state

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ceph.in: add to $PATH if needed regardless of LD_LIBRARY_PATH state 548/head
Dan Mick [Tue, 27 Aug 2013 20:37:14 +0000 (13:37 -0700)]
ceph.in: add to $PATH if needed regardless of LD_LIBRARY_PATH state

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years ago osd: install admin socket commands after signals
Sage Weil [Sat, 24 Aug 2013 21:04:09 +0000 (14:04 -0700)]
osd: install admin socket commands after signals

This lets us tell by the presence of the admin socket commands whether
a signal will make us shut down cleanly.  See #5924.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years ago Merge pull request #531 from dmick/wip-6099
Sage Weil [Fri, 23 Aug 2013 22:30:41 +0000 (15:30 -0700)]
Merge pull request #531 from dmick/wip-6099

ceph_rest_api.py: create own default for log_file

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ceph_rest_api.py: create own default for log_file 531/head
Dan Mick [Fri, 23 Aug 2013 00:30:24 +0000 (17:30 -0700)]
ceph_rest_api.py: create own default for log_file

common/config thinks the default log_file for non-daemons should be "".
Override that so that the default is
    /var/log/ceph/{cluster}-{name}.{pid}.log
since ceph-rest-api is more of a daemon than a client.

Fixes: #6099
Backport: dumpling
Signed-off-by: Dan Mick <dan.mick@inktank.com>
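
The default path quoted in the commit message above can be read as a template with three placeholders. The following is a minimal sketch of that override, not the code from ceph_rest_api.py: the helper names `default_log_file` and `effective_log_file` are hypothetical, and only the template string comes from the commit message.

```python
import os

# Template quoted in the commit message; the placeholder names come from it.
LOG_FILE_TEMPLATE = "/var/log/ceph/{cluster}-{name}.{pid}.log"


def default_log_file(cluster, name, pid=None):
    """Expand the template for a cluster and entity name (hypothetical helper)."""
    if pid is None:
        pid = os.getpid()
    return LOG_FILE_TEMPLATE.format(cluster=cluster, name=name, pid=pid)


def effective_log_file(configured, cluster, name, pid=None):
    """Keep an explicit setting; replace the empty non-daemon default."""
    return configured if configured else default_log_file(cluster, name, pid)
```

With this sketch, an empty `log_file` falls back to the daemon-style path, while any explicitly configured value is left untouched.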
11 years ago Merge pull request #535 from ceph/wip-readdir-r-sucks
Yehuda Sadeh [Fri, 23 Aug 2013 19:00:30 +0000 (12:00 -0700)]
Merge pull request #535 from ceph/wip-readdir-r-sucks

Fix readdir_r invocation

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago os: make readdir_r buffers larger 535/head
Sage Weil [Fri, 23 Aug 2013 18:45:35 +0000 (11:45 -0700)]
os: make readdir_r buffers larger

PATH_MAX isn't quite big enough.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago os: fix readdir_r buffer size
Sage Weil [Fri, 23 Aug 2013 18:45:08 +0000 (11:45 -0700)]
os: fix readdir_r buffer size

The buffer needs to be big or else we'll walk all over the stack.

Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago mon/Paxos: fix another uncommitted value corner case
Sage Weil [Thu, 22 Aug 2013 22:54:48 +0000 (15:54 -0700)]
mon/Paxos: fix another uncommitted value corner case

It is possible that we begin the paxos recovery with an uncommitted
value for, say, commit 100.  During last/collect we discover 100 has been
committed already.  But also, another node provides an uncommitted value
for 101 with the same pn.  Currently, we refuse to learn it, because the
pn is not strictly greater than our current uncommitted pn... even though it is
the next last_committed+1 value that we need.

There are two possible fixes here:

 - make this a >= as we can accept newer values from the same pn.
 - discard our uncommitted value metadata when we commit the value.

Let's do both!

Fixes: #6090
Signed-off-by: Sage Weil <sage@inktank.com>
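
The two fixes described in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the mon/Paxos code: the class `PaxosState`, its field names, and the `strict` switch are hypothetical and exist only to contrast the old `>` comparison with the new `>=`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PaxosState:
    last_committed: int
    uncommitted_v: int = 0        # version of the pending value, 0 if none
    uncommitted_pn: int = 0       # proposal number it was accepted under
    uncommitted_value: Optional[bytes] = None

    def commit(self, version):
        """Second fix: drop uncommitted metadata once that value commits."""
        self.last_committed = version
        if self.uncommitted_v <= version:
            self.uncommitted_v = 0
            self.uncommitted_pn = 0
            self.uncommitted_value = None

    def learn_uncommitted(self, version, pn, value, strict=False):
        """Accept a peer's uncommitted value for last_committed + 1.

        strict=True models the old '>' test, strict=False the new '>='.
        """
        if version != self.last_committed + 1:
            return False
        ok = pn > self.uncommitted_pn if strict else pn >= self.uncommitted_pn
        if not ok:
            return False
        self.uncommitted_v = version
        self.uncommitted_pn = pn
        self.uncommitted_value = value
        return True
```

In the scenario from the message (pending value for 100, then 100 turns out to be committed, then a peer offers 101 with the same pn), the strict comparison refuses 101 when the stale metadata is still present, while either fix alone lets it through.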
11 years ago rgw: bucket meta remove don't overwrite entry point first
Yehuda Sadeh [Mon, 19 Aug 2013 23:56:27 +0000 (16:56 -0700)]
rgw: bucket meta remove don't overwrite entry point first

Fixes: #6056
When removing a bucket metadata entry we first unlink the bucket
and then we remove the bucket entrypoint object. Originally
when unlinking the bucket we first overwrote the bucket entrypoint
entry marking it as 'unlinked'. However, this is not really needed
as we're just about to remove it. The original version triggered
a bug, as we needed to propagate the new header version first (which
we didn't do, so the subsequent bucket removal failed).

Reviewed-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago ceph-disk: specify the filetype when mounting
Alfredo Deza [Fri, 23 Aug 2013 12:56:07 +0000 (08:56 -0400)]
ceph-disk: specify the filetype when mounting

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #532 from dmick/next
Sage Weil [Fri, 23 Aug 2013 04:34:57 +0000 (21:34 -0700)]
Merge pull request #532 from dmick/next

PGMonitor: pg dump_stuck should respect --format (plain works fine)

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago QA: Compile fsstress if missing on machine.
Sandon Van Ness [Fri, 23 Aug 2013 02:44:40 +0000 (19:44 -0700)]
QA: Compile fsstress if missing on machine.

Some distros lack ltp-kernel packages, and all we need is
fsstress. This modifies the shell script to download and compile
fsstress from source and copy it to the right location if it doesn't
already exist where it is expected. It is a very small, quick
compile, and currently only SLES and Debian do not have it already.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
11 years ago PGMonitor: pg dump_stuck should respect --format (plain works fine) 532/head
Dan Mick [Fri, 23 Aug 2013 01:53:13 +0000 (18:53 -0700)]
PGMonitor: pg dump_stuck should respect --format (plain works fine)

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years ago mon/MonClient: release pending outgoing messages on shutdown 530/head
Sage Weil [Fri, 23 Aug 2013 00:46:45 +0000 (17:46 -0700)]
mon/MonClient: release pending outgoing messages on shutdown

This fixes a small memory leak when we have messages queued for the mon
when we shut down.  It is harmless except for the valgrind leak check
noise that obscures real leaks.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago rgw: fix crash when creating new zone on init
Yehuda Sadeh [Thu, 22 Aug 2013 17:53:12 +0000 (10:53 -0700)]
rgw: fix crash when creating new zone on init

Moving the watch/notify init before the zone init,
as we might need to send a notification.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago enable mds rejoin with active inodes' old parent xattrs
Alexandre Oliva [Thu, 22 Aug 2013 06:40:22 +0000 (03:40 -0300)]
enable mds rejoin with active inodes' old parent xattrs

When the parent xattrs of active inodes that the mds attempts to open
during rejoin lack pool info (struct_v < 5), this field is filled
in with -1. That causes the mds to retry fetching a backtrace with a
pool number that matches the expected value. The retry fails, so the
err==-ENOENT branch is taken and pool 1 is retried, which succeeds but
again with pool -1. The mds therefore keeps bouncing between the two
retry cases forever.

This patch arranges for the mds to go along with pool -1 instead of
insisting that it be refetched, enabling it to complete recovery
instead of eating cpu, network bandwidth and metadata osd's resources
like there's no tomorrow, in what AFAICT is an infinite and very busy
loop.

This is not a new problem: I've had it even before upgrading from
Cuttlefish to Dumpling, I'd just never managed to track it down, and
force-unmounting the filesystem and then restarting the mds was an
easier (if inconvenient) work-around, particularly because it always
hit when the filesystem was under active, heavy-ish use (or there
wouldn't be much reason for caps recovery ;-)

There are two issues not addressed in this patch, however.  One is
that nothing seems to proactively update the parent xattr when it is
found to be outdated, so it remains out of date forever.  Not even
renaming top-level directories causes the xattrs to be recursively
rewritten.  AFAICT that's a bug.

The other is that inodes that don't have a parent xattr (created by
even older versions of ceph) are reported as non-existing in the mds
rejoin message, because the absence of the parent xattr is signaled as
a missing inode ("failed to reconnect caps for missing inodes").  I
suppose this may cause more serious recovery problems.

I suppose a global pass over the filesystem tree updating parent
xattrs that are out-of-date would be desirable, if we find any parent
xattrs still lacking current information; it might make sense to
activate it as a background thread from the backtrace decoding
function, when it finds a parent xattr that's too out-of-date, or as a
separate client (ceph-fsck?).

Backport: dumpling, cuttlefish
Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Reviewed-by: Zheng, Yan <zheng.z.yan@intel.com>
11 years ago ceph-monstore-tool: shut up coverity
Sage Weil [Wed, 21 Aug 2013 05:44:43 +0000 (22:44 -0700)]
ceph-monstore-tool: shut up coverity

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago store: fix issues reported by coverity
Yan, Zheng [Wed, 21 Aug 2013 05:26:50 +0000 (13:26 +0800)]
store: fix issues reported by coverity

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago objecter: fix keys of dump_linger_ops
Josh Durgin [Wed, 21 Aug 2013 22:56:20 +0000 (15:56 -0700)]
objecter: fix keys of dump_linger_ops

The registering flag no longer exists, and registered was using the
wrong property due to a copy-paste error.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
11 years ago objecter: resend unfinished lingers when osdmap is no longer paused
Josh Durgin [Wed, 21 Aug 2013 21:28:49 +0000 (14:28 -0700)]
objecter: resend unfinished lingers when osdmap is no longer paused

Plain Ops that haven't finished yet need to be resent if the osdmap
transitions from full or paused to unpaused.  If these Ops are
triggered by LingerOps, they will be cancelled instead (since
should_resend = false), but the LingerOps that triggered them will not
be resent.

Fix this by checking the registered flag for all linger ops, and
resending any of them that aren't paused anymore.

Fixes: #6070
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage.weil@inktank.com>
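
The selection rule in the fix above (check the registered flag on every linger op and resend those that are no longer paused) can be sketched as a simple filter. This is an illustration only: `LingerOp` here is a hypothetical stand-in with two boolean fields, not the Objecter structure, and the `paused` field is an assumed summary of the osdmap state for that op.

```python
from dataclasses import dataclass


@dataclass
class LingerOp:
    linger_id: int
    registered: bool   # True once the registration has completed
    paused: bool       # True while the osdmap still blocks this op (assumed field)


def lingers_to_resend(linger_ops):
    """Pick the linger ops that never finished registering and can go out now."""
    return [op.linger_id for op in linger_ops
            if not op.registered and not op.paused]
```

Ops that already registered are left alone, and ops that are still paused wait for a later map change.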
11 years ago rgw: change cache / watch-notify init sequence
Yehuda Sadeh [Mon, 19 Aug 2013 15:40:16 +0000 (08:40 -0700)]
rgw: change cache / watch-notify init sequence

Fixes: #6046
We were initializing the watch-notify (through the cache
init) before reading the zone info which was much too
early, as we didn't have the control pool name yet. Now
simplifying init/cleanup a bit, cache doesn't call watch/notify
init and cleanup directly, but rather states its need
through a virtual callback.

Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge remote-tracking branch 'gh/wip-6004' into next
Sage Weil [Tue, 20 Aug 2013 23:57:46 +0000 (16:57 -0700)]
Merge remote-tracking branch 'gh/wip-6004' into next

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years ago .gitignore: ignore test-driver
Sage Weil [Fri, 9 Aug 2013 19:49:57 +0000 (12:49 -0700)]
.gitignore: ignore test-driver

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago fuse: fix warning when compiled against old fuse versions
Sage Weil [Fri, 9 Aug 2013 19:42:49 +0000 (12:42 -0700)]
fuse: fix warning when compiled against old fuse versions

client/fuse_ll.cc: In function 'void invalidate_cb(void*, vinodeno_t, int64_t, int64_t)':
warning: client/fuse_ll.cc:540: unused variable 'fino'

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago json_spirit: remove unused typedef
Sage Weil [Fri, 9 Aug 2013 19:40:34 +0000 (12:40 -0700)]
json_spirit: remove unused typedef

In file included from json_spirit/json_spirit_writer.cpp:7:0:
json_spirit/json_spirit_writer_template.h: In function 'String_type json_spirit::non_printable_to_string(unsigned int)':
json_spirit/json_spirit_writer_template.h:37:50: warning: typedef 'Char_type' locally defined but not used [-Wunused-local-typedefs]
         typedef typename String_type::value_type Char_type;

(Also, ha ha, this file uses \r\n.)

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago gtest: add build-aux/test-driver to .gitignore
Sage Weil [Fri, 9 Aug 2013 19:31:41 +0000 (12:31 -0700)]
gtest: add build-aux/test-driver to .gitignore

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #517 from dmick/wip-6049
Dan Mick [Tue, 20 Aug 2013 19:18:43 +0000 (12:18 -0700)]
Merge pull request #517 from dmick/wip-6049

mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon/Paxos: always refresh after any store_state
Sage Weil [Tue, 20 Aug 2013 18:27:23 +0000 (11:27 -0700)]
mon/Paxos: always refresh after any store_state

If we store any new state, we need to refresh the services, even if we
are still in the midst of Paxos recovery.  This is because the
subscription path will share any committed state even when paxos is
still recovering.  This prevents a race like:

 - we have maps 10..20
 - we drop out of quorum
 - we are elected leader, paxos recovery starts
 - we get one LAST with committed states that trim maps 10..15
 - we get a subscribe for map 10..20
   - we crash because 10 is no longer on disk because the PaxosService
     is out of sync with the on-disk state.

Fixes: #6045
Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
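
The race listed in the commit message above comes down to an in-memory view that lags behind the on-disk state. A toy model of it follows; `Store`, `Service`, and their methods are hypothetical names chosen for the sketch, and a `KeyError` stands in for the crash.

```python
class Store:
    """On-disk state: map epoch -> payload."""

    def __init__(self, first, last):
        self.maps = {e: "map-%d" % e for e in range(first, last + 1)}

    def trim_through(self, epoch):
        """Remove every epoch up to and including `epoch`."""
        for e in [e for e in self.maps if e <= epoch]:
            del self.maps[e]


class Service:
    """In-memory view of the available epochs (stand-in for a PaxosService)."""

    def __init__(self, store):
        self.store = store
        self.refresh()

    def refresh(self):
        """Re-read the first and last epoch from the store."""
        self.first = min(self.store.maps)
        self.last = max(self.store.maps)

    def handle_subscribe(self, start):
        """Share every map from `start` (clamped to the known first epoch)."""
        begin = max(start, self.first)
        return [self.store.maps[e] for e in range(begin, self.last + 1)]
```

If the store trims epochs 10..15 and the service is not refreshed, a subscribe for 10 still tries to read epoch 10 and fails. Refreshing right after the store changes makes the same request start at 16.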
11 years ago mon/Paxos: return whether store_state stored anything
Sage Weil [Tue, 20 Aug 2013 18:27:09 +0000 (11:27 -0700)]
mon/Paxos: return whether store_state stored anything

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon/Paxos: cleanup: use do_refresh from handle_commit
Sage Weil [Tue, 20 Aug 2013 18:26:57 +0000 (11:26 -0700)]
mon/Paxos: cleanup: use do_refresh from handle_commit

This avoids duplicated code by using the helper created exactly for this
purpose.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago pybind: fix Rados.conf_parse_env test
Sage Weil [Tue, 20 Aug 2013 18:23:46 +0000 (11:23 -0700)]
pybind: fix Rados.conf_parse_env test

This happens after we connect, which means we get ENOSYS always.
Instead, parse_env inside the normal setup method, which has the added
benefit of being able to debug these tests.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous) 517/head
Dan Mick [Tue, 20 Aug 2013 18:10:42 +0000 (11:10 -0700)]
mon/PGMap: OSD byte counts 4x too large (conversion to bytes overzealous)

Fixes: #6049
Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years ago PG: remove old log when we upgrade log version
Samuel Just [Tue, 20 Aug 2013 00:23:44 +0000 (17:23 -0700)]
PG: remove old log when we upgrade log version

Otherwise the log_oid will be non-empty and the next
boot will cause us to try to upgrade again.

Fixes: #6057
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago PGLog: add a config to disable PGLog::check()
Samuel Just [Mon, 19 Aug 2013 07:02:24 +0000 (00:02 -0700)]
PGLog: add a config to disable PGLog::check()

This is a debug check which may be causing excessive
cpu usage.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ceph: parse CEPH_ARGS environment variable
Sage Weil [Mon, 19 Aug 2013 19:48:50 +0000 (12:48 -0700)]
ceph: parse CEPH_ARGS environment variable

Fixes: #6052
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
11 years ago rados pybind: add conf_parse_env()
Sage Weil [Mon, 19 Aug 2013 19:48:40 +0000 (12:48 -0700)]
rados pybind: add conf_parse_env()

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
11 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Mon, 19 Aug 2013 19:41:54 +0000 (12:41 -0700)]
Merge remote-tracking branch 'gh/next'

11 years ago doc/release-notes: v0.61.8
Sage Weil [Mon, 19 Aug 2013 19:41:26 +0000 (12:41 -0700)]
doc/release-notes: v0.61.8

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #513 from dalgaaf/fix/wip-da-documentation
Sage Weil [Mon, 19 Aug 2013 19:32:30 +0000 (12:32 -0700)]
Merge pull request #513 from dalgaaf/fix/wip-da-documentation

Fix documentation issues

11 years ago filestore-config-ref.rst: mark some filestore keys as deprecated 513/head
Danny Al-Gaaf [Mon, 19 Aug 2013 18:56:48 +0000 (20:56 +0200)]
filestore-config-ref.rst: mark some filestore keys as deprecated

Marked the following keys as deprecated since v0.65:
- filestore flusher
- filestore flusher max fds
- filestore sync flush

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
11 years ago Merge pull request #512 from ceph/wip-5988
Sage Weil [Mon, 19 Aug 2013 18:16:57 +0000 (11:16 -0700)]
Merge pull request #512 from ceph/wip-5988

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge branch 'wip-erasure-coded-doc'
Samuel Just [Mon, 19 Aug 2013 18:02:45 +0000 (11:02 -0700)]
Merge branch 'wip-erasure-coded-doc'

11 years ago librados: synchronous commands should return on commit instead of ack 512/head
Greg Farnum [Mon, 19 Aug 2013 17:29:49 +0000 (10:29 -0700)]
librados: synchronous commands should return on commit instead of ack

This is unlikely to be noticed by anybody, but it is a big change. Document
in the PendingReleaseNotes and bump up the librados minor version number
to 68.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago Merge pull request #493 from dachary/wip-erasure-coding-doc
athanatos [Mon, 19 Aug 2013 17:28:48 +0000 (10:28 -0700)]
Merge pull request #493 from dachary/wip-erasure-coding-doc

rearrange erasure code documents

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years ago mon: make MonMap error message about unspecified monitors less specific.
Greg Farnum [Mon, 19 Aug 2013 17:21:16 +0000 (10:21 -0700)]
mon: make MonMap error message about unspecified monitors less specific.

The error message helpfully references the -m and -c CLI options for
specifying monitors, but this code can be invoked from non-core librados
client applications so that's unfortunately not kosher. Remove the
reference.

Fixes #5979.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago auth-config-ref.rst: fix signature keys
Danny Al-Gaaf [Mon, 19 Aug 2013 08:33:37 +0000 (10:33 +0200)]
auth-config-ref.rst: fix signature keys

Fix names of cephx signature keys.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
11 years ago objclass: move cls_log into class_api.cc
Sage Weil [Sat, 17 Aug 2013 21:30:37 +0000 (14:30 -0700)]
objclass: move cls_log into class_api.cc

Not sure why but this seems to resolve a linking problem when loading
classes:

2013-08-17 13:28:19.015776 7fb2bcffa700  0 _load_class could not open class /usr/lib/rados-classes/libcls_hello.so (dlopen failed): /usr/lib/rados-classes/libcls_hello.so: undefined symbol: cls_log
2013-08-17 13:28:19.015786 7fb2bcffa700 -1 osd.4 12 class hello open got (5) Input/output error

In any case, it's simpler.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago doc/dev/filestore-filesystem-compatibliity: remove outdated xattr notes
Sage Weil [Sat, 17 Aug 2013 18:04:47 +0000 (11:04 -0700)]
doc/dev/filestore-filesystem-compatibliity: remove outdated xattr notes

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #494 from kri5/wip-s3-compliance-doc
Sage Weil [Sat, 17 Aug 2013 18:00:59 +0000 (11:00 -0700)]
Merge pull request #494 from kri5/wip-s3-compliance-doc

doc: complete S3 features status from existing doc page

11 years ago Merge pull request #491 from kri5/wip-clang-compilation
Sage Weil [Sat, 17 Aug 2013 17:59:01 +0000 (10:59 -0700)]
Merge pull request #491 from kri5/wip-clang-compilation

Fix compilation -Wmismatched-tags warnings

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years ago doc: Updated upgrade doc to include dumpling and incorporate ceph-deploy.
John Wilkins [Sat, 17 Aug 2013 17:35:32 +0000 (10:35 -0700)]
doc: Updated upgrade doc to include dumpling and incorporate ceph-deploy.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago Merge pull request #479 from devoid/fix-5797
Sage Weil [Sat, 17 Aug 2013 17:09:01 +0000 (10:09 -0700)]
Merge pull request #479 from devoid/fix-5797

Document unstable nature of CephFS

11 years ago Makefile: move objclass/*.cc to libosd.la
Sage Weil [Sat, 17 Aug 2013 16:40:44 +0000 (09:40 -0700)]
Makefile: move objclass/*.cc to libosd.la

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago doc/changelog: add missing file
Sage Weil [Sat, 17 Aug 2013 15:38:55 +0000 (08:38 -0700)]
doc/changelog: add missing file

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago os/FileStore: initialize blk_size on _detect_fs()
Sage Weil [Sat, 17 Aug 2013 15:30:26 +0000 (08:30 -0700)]
os/FileStore: initialize blk_size on _detect_fs()

This was missed by a25d73effb38118602bc73da0aa258c639f69c2c.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago doc/release-notes: v0.67.1
Sage Weil [Sat, 17 Aug 2013 15:20:00 +0000 (08:20 -0700)]
doc/release-notes: v0.67.1

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #505 from ceph/wip-post-file
Sage Weil [Sat, 17 Aug 2013 06:41:38 +0000 (23:41 -0700)]
Merge pull request #505 from ceph/wip-post-file

ceph-post-file: single command to upload a file to cephdrop

11 years ago mds: create only one ESubtreeMap during fs creation
Sage Weil [Sat, 17 Aug 2013 05:08:00 +0000 (22:08 -0700)]
mds: create only one ESubtreeMap during fs creation

Previously we would create an empty ESubtreeMap when we opened the log
segment and then immediately journal a second one that created the root
and mdsdir.  More importantly, for the second ESubtreeMap, we would not
wait for it to commit before requesting the ACTIVE state, leading to
#4894.

Instead, break start_new_segment() into two steps: one that creates the
in-memory LogSegment tracking structure, and one that journals the
ESubtreeMap.  Open things early and write the (one) ESubtreeMap at the
end of boot_create().. and then wait for it.

Fixes: #4894
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
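
The two-step split described in the commit message above can be sketched as follows. This is a toy model under stated assumptions: `MDLog` here is a hypothetical class with a list standing in for the journal, and the method names `prepare_new_segment`, `journal_subtree_map`, and `flush` are chosen for illustration rather than taken from the mds source.

```python
class MDLog:
    """Toy journal: one list of segments, one list of journaled events."""

    def __init__(self):
        self.segments = []    # in-memory LogSegment stand-ins
        self.journal = []     # events submitted to the journal
        self.committed = 0    # number of journal events made durable

    def prepare_new_segment(self):
        """Step one: create only the in-memory tracking structure."""
        self.segments.append({"num_events": 0})

    def journal_subtree_map(self, subtrees):
        """Step two: journal a single ESubtreeMap for the open segment."""
        self.journal.append(("ESubtreeMap", tuple(subtrees)))
        self.segments[-1]["num_events"] += 1

    def flush(self):
        """Model waiting for everything submitted so far to commit."""
        self.committed = len(self.journal)


def boot_create(log):
    log.prepare_new_segment()            # open things early, nothing journaled yet
    subtrees = ["root", "mdsdir"]        # created while the segment is open
    log.journal_subtree_map(subtrees)    # the one ESubtreeMap, written at the end
    log.flush()                          # wait for it before going active
    return "ACTIVE" if log.committed == len(log.journal) else "CREATING"
```

The point of the split is that the journal holds exactly one ESubtreeMap, and the state change is requested only after that entry is durable.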
11 years ago doc: quickstart: be more explicit that node == mon node
Sage Weil [Sat, 17 Aug 2013 04:18:21 +0000 (21:18 -0700)]
doc: quickstart: be more explicit that node == mon node

This appears to be one source of confusion for new users that leads to
a failure to form an initial mon quorum.  See comments on

 http://tracker.ceph.com/issues/4924

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago rgw: drain requests before exiting
Yehuda Sadeh [Tue, 13 Aug 2013 20:16:07 +0000 (13:16 -0700)]
rgw: drain requests before exiting

Fixes: #5953
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ceph-post-file: single command to upload a file to cephdrop 505/head
Sage Weil [Sat, 17 Aug 2013 00:59:11 +0000 (17:59 -0700)]
ceph-post-file: single command to upload a file to cephdrop

Use sftp to upload to a directory that only this user and ceph devs can
access.

Distribute an ssh key to connect to the account.  This will let us revoke
the key in the future if we feel the need.  Also distribute a known_hosts
file so that users have some confidence that they are connecting to the
real ceph drop account and not some third party.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
11 years ago doc: Removed old mkcephfs references.
John Wilkins [Sat, 17 Aug 2013 00:31:43 +0000 (17:31 -0700)]
doc: Removed old mkcephfs references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Removed mkcephfs references.
John Wilkins [Sat, 17 Aug 2013 00:28:15 +0000 (17:28 -0700)]
doc: Removed mkcephfs references.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Updated script for dumpling.
John Wilkins [Sat, 17 Aug 2013 00:27:53 +0000 (17:27 -0700)]
doc: Updated script for dumpling.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Updated APT script for dumpling.
John Wilkins [Sat, 17 Aug 2013 00:27:16 +0000 (17:27 -0700)]
doc: Updated APT script for dumpling.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Removed mkcephfs references. Did a bit of clean-up work.
John Wilkins [Sat, 17 Aug 2013 00:26:25 +0000 (17:26 -0700)]
doc: Removed mkcephfs references. Did a bit of clean-up work.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago Merge pull request #509 from dmick/wip-rest-conf
Dan Mick [Sat, 17 Aug 2013 00:05:52 +0000 (17:05 -0700)]
Merge pull request #509 from dmick/wip-rest-conf

config_opts: add two ceph-rest-api-only variables for convenience

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ReplicatedPG: add osd_recover_clone_overlap_limit to limit clones
Samuel Just [Thu, 15 Aug 2013 22:35:26 +0000 (15:35 -0700)]
ReplicatedPG: add osd_recover_clone_overlap_limit to limit clones

We don't want to clone_range from clones too many times.
For now, just skip the cloning if there are too many holes.

Fixes: #5985
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
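
The limit described in the commit message above amounts to a threshold check before cloning. The sketch below assumes that "too many holes" can be approximated by counting the separate extents in the clone overlap; that reading, the function name `usable_clone_overlap`, and the (offset, length) representation are assumptions made for illustration.

```python
def usable_clone_overlap(overlap, limit):
    """Return the extents to clone from, or [] when cloning is skipped.

    `overlap` is a list of (offset, length) extents shared with a clone;
    `limit` plays the role of osd_recover_clone_overlap_limit.
    """
    if len(overlap) > limit:
        return []          # too fragmented: recover the data without cloning
    return list(overlap)
```

A fragmented overlap is then recovered as plain data instead of through many small clone_range operations.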
11 years ago config_opts: add two ceph-rest-api-only variables for convenience 509/head
Dan Mick [Fri, 16 Aug 2013 20:15:34 +0000 (13:15 -0700)]
config_opts: add two ceph-rest-api-only variables for convenience

These aren't used by the C++ code at all, but in order for
rados_conf_get to find them, they need to be listed.  They're
consumed by ceph_rest_api.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years ago Merge remote-tracking branch 'upstream/wip-zfs'
Samuel Just [Fri, 16 Aug 2013 23:35:21 +0000 (16:35 -0700)]
Merge remote-tracking branch 'upstream/wip-zfs'

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #504 from ceph/wip-cls-hello
Sage Weil [Fri, 16 Aug 2013 18:07:02 +0000 (11:07 -0700)]
Merge pull request #504 from ceph/wip-cls-hello

cls/hello: hello, world rados class

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.com>
11 years ago osdc/ObjectCacher: do not merge rx buffers
Sage Weil [Fri, 16 Aug 2013 04:48:06 +0000 (21:48 -0700)]
osdc/ObjectCacher: do not merge rx buffers

We do not try to merge rx buffers currently.  Make that explicit and
documented in the code that it is not supported.  (Otherwise the
last_read_tid values will get lost and read results won't get applied
to the cache properly.)

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago osdc/ObjectCacher: match reads with their original rx buffers
Sage Weil [Fri, 16 Aug 2013 04:47:18 +0000 (21:47 -0700)]
osdc/ObjectCacher: match reads with their original rx buffers

Consider a sequence like:

 1- start read on 100~200
       100~200 state rx
 2- truncate to 200
       100~100 state rx
 3- start read on 200~200
       100~100 state rx
       200~200 state rx
 4- get 100~200 read result

Currently this makes us crash on

osdc/ObjectCacher.cc: 738: FAILED assert(bh->length() <= start+(loff_t)length-opos)

when processing the second 200~200 bufferhead (it is too big).  The
larger issue, though, is that we should not be looking at this data at
all; it has been truncated away.

Fix this by marking each rx buffer with the read request that is sent to
fill it, and only fill it from that read request.  Then the first reply
will fill the first 100~100 extent but not touch the other extent; the
second read will do that.

Signed-off-by: Sage Weil <sage@inktank.com>
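
The fix described in the commit message above (tag each rx buffer with the read that will fill it, and fill it only from that read) can be sketched with a small model. `BufferHead` and `apply_read_reply` here are hypothetical Python stand-ins, not the ObjectCacher classes; only the `last_read_tid` idea and the offset~length sequence come from the commit messages.

```python
from dataclasses import dataclass


@dataclass
class BufferHead:
    start: int
    length: int
    last_read_tid: int      # read request expected to fill this buffer
    state: str = "rx"
    data: bytes = b""

    @property
    def end(self):
        return self.start + self.length


def apply_read_reply(buffers, tid, start, payload):
    """Fill only the rx buffers that were issued for read `tid`."""
    filled = []
    for bh in buffers:
        if bh.state != "rx" or bh.last_read_tid != tid:
            continue                     # belongs to another read, leave it alone
        lo = max(bh.start, start)
        hi = min(bh.end, start + len(payload))
        if lo != bh.start or hi != bh.end:
            continue                     # reply does not cover this buffer
        bh.data = payload[lo - start:hi - start]
        bh.state = "clean"
        filled.append(bh)
    return filled
```

Replaying the sequence from the message: after the truncate, the buffers are 100~100 (read 1) and 200~200 (read 2). The reply to read 1 carries 200 bytes starting at 100, but it fills only the first buffer; the second waits for its own reply.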
11 years ago Merge branch 'wip-5848-coll'
David Zafman [Fri, 16 Aug 2013 01:32:30 +0000 (18:32 -0700)]
Merge branch 'wip-5848-coll'

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago osd: Add perf tracking for all states in RecoveryState
David Zafman [Thu, 15 Aug 2013 19:28:06 +0000 (12:28 -0700)]
osd: Add perf tracking for all states in RecoveryState

Fixes: #5848
Signed-off-by: David Zafman <david.zafman@inktank.com>
11 years ago cls/hello: hello, world rados class 504/head
Sage Weil [Fri, 16 Aug 2013 00:20:43 +0000 (17:20 -0700)]
cls/hello: hello, world rados class

Simple example of a rados class doing read, write, and read/modify/write
methods.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago osd: enforce RD, WR flags for class methods
Sage Weil [Thu, 15 Aug 2013 23:19:21 +0000 (16:19 -0700)]
osd: enforce RD, WR flags for class methods

Class methods are marked with RD and WR to help the OSD decide when we need
to flush objects or require certain permissions.  Ensure that methods do
not step outside their advertised capabilities by keeping a counter of rd
and wr ops we perform in do_osd_ops() and making sure that class methods,
and any ops they indirectly call, do not break the rules.

Signed-off-by: Sage Weil <sage@inktank.com>
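
The counter-based check described in the commit message above can be sketched as follows. Everything here is a simplified assumption: the flag values, `OpContext`, `MethodFlagViolation`, and `call_class_method` are hypothetical, and the sketch only detects a violation after the method has run, without modelling how the OSD rejects or undoes the operation.

```python
CLS_METHOD_RD = 0x1   # illustrative values, not taken from the Ceph headers
CLS_METHOD_WR = 0x2


class MethodFlagViolation(Exception):
    """Raised when a method does more than its flags advertise."""


class OpContext:
    def __init__(self):
        self.num_read = 0     # reads performed so far
        self.num_write = 0    # writes performed so far
        self.data = {}

    def read(self, key):
        self.num_read += 1
        return self.data.get(key)

    def write(self, key, value):
        self.num_write += 1
        self.data[key] = value


def call_class_method(ctx, flags, method):
    """Run `method` and compare the counters against its advertised flags."""
    prev_rd, prev_wr = ctx.num_read, ctx.num_write
    result = method(ctx)
    if ctx.num_read > prev_rd and not (flags & CLS_METHOD_RD):
        raise MethodFlagViolation("read performed by a method not marked RD")
    if ctx.num_write > prev_wr and not (flags & CLS_METHOD_WR):
        raise MethodFlagViolation("write performed by a method not marked WR")
    return result
```

Because the counters live in the shared context, ops that a method triggers indirectly are counted the same way as its direct ones.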
11 years ago cls_rbd: remove old assign_bid method
Sage Weil [Thu, 15 Aug 2013 22:22:41 +0000 (15:22 -0700)]
cls_rbd: remove old assign_bid method

This method is problematic because it both writes/mutates and returns data,
which means that an untimely client disconnect or peering event will result
in a success to the client with no payload.

It has not been used since v0.52 (18054ba46fe2779d8df8b1a0d69ec93ca6a66c34)
which is pre-bobtail; so this change breaks compatibility with pre-bobtail
librbd clients (at least for image creation).

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago librbd: remove mostly-useless assign_bid helper
Sage Weil [Thu, 15 Aug 2013 22:18:51 +0000 (15:18 -0700)]
librbd: remove mostly-useless assign_bid helper

Do it inline.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago osd: do not return data payload for successful writes
Sage Weil [Thu, 15 Aug 2013 22:06:38 +0000 (15:06 -0700)]
osd: do not return data payload for successful writes

We were somewhat inadvertently returning a data payload for write
operations.  This was a side-effect of the OpContext::ops field being a
reference to MOSDOp::ops: the return data would end up there, and then
the MOSDOpReply ctor would copy it.

Fix this by breaking the ref, and making the do_op() logic also claim
return result data for error values (so that errors can return data to the
caller).
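
The aliasing effect can be illustrated with a toy model. Nothing below
is the OSD code: Op, execute(), and reply_payloads() are hypothetical
names, -22 is just a stand-in error code, and the rule "claim output for
reads and for errors" is an assumption made for the sketch.

```python
import copy
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    name: str
    is_write: bool
    outdata: bytes = b""


def execute(ops: List[Op], fail: bool = False) -> int:
    """Toy executor: every op leaves some output behind."""
    for op in ops:
        op.outdata = b"out:" + op.name.encode()
    return -22 if fail else 0


def reply_payloads(request_ops: List[Op], alias_request: bool,
                   fail: bool = False) -> List[bytes]:
    """Return the payloads a reply would copy from the request's ops."""
    # alias_request=True mimics the old behaviour: the execution context
    # works directly on the request's op list.
    ctx_ops = request_ops if alias_request else copy.deepcopy(request_ops)
    result = execute(ctx_ops, fail)
    if not alias_request:
        # With the reference broken, output must be claimed explicitly.
        for req, done in zip(request_ops, ctx_ops):
            if result < 0 or not req.is_write:
                req.outdata = done.outdata
    return [op.outdata for op in request_ops]
```

With aliasing, a successful write leaks its output into the reply;
without it, the reply carries data only for reads and for errors.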

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago common/Preforker: shut up warning
Sage Weil [Thu, 15 Aug 2013 21:35:28 +0000 (14:35 -0700)]
common/Preforker: shut up warning

common/Preforker.h: In member function 'void Preforker::daemonize()':
common/Preforker.h:97:40: warning: ignoring return value of 'ssize_t write(int, const void*, size_t)', declared with attribute warn_unused_result [-Wunused-result]

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge remote-tracking branch 'gh/next'
Sage Weil [Fri, 16 Aug 2013 00:21:00 +0000 (17:21 -0700)]
Merge remote-tracking branch 'gh/next'

11 years ago Merge pull request #506 from dmick/wip-admin-daemon
Sage Weil [Fri, 16 Aug 2013 00:14:23 +0000 (17:14 -0700)]
Merge pull request #506 from dmick/wip-admin-daemon

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ceph.in: --admin-daemon was not returning EINVAL on bad command (506/head)
Dan Mick [Fri, 16 Aug 2013 00:10:56 +0000 (17:10 -0700)]
ceph.in: --admin-daemon was not returning EINVAL on bad command

Fix by restructuring code to hoist common code and have only one
place where admin_socket is actually called.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
11 years ago Merge pull request #507 from ceph/wip-4635.master
João Eduardo Luís [Thu, 15 Aug 2013 22:54:10 +0000 (15:54 -0700)]
Merge pull request #507 from ceph/wip-4635.master

Bunch of tidying up on monitor services & fix #4635

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago PendingReleaseNotes: reflect 'osd crush set' behavior change (507/head)
Joao Eduardo Luis [Thu, 15 Aug 2013 22:46:30 +0000 (15:46 -0700)]
PendingReleaseNotes: reflect 'osd crush set' behavior change

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago vstart.sh: s/osd crush set/osd crush add/ as it's supposed to be
Joao Eduardo Luis [Thu, 15 Aug 2013 01:22:29 +0000 (18:22 -0700)]
vstart.sh: s/osd crush set/osd crush add/ as it's supposed to be

'osd crush set' should only be used to update already existing items on
the map whereas 'osd crush add' should be able to 'add and update' items.

Considering at that point we are effectively adding a new item to the
crush map, use 'add' instead of 'set'.
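
The difference in semantics can be sketched with a toy model in which
the crush map is a plain dict from item name to weight. The function
names and the dict representation are assumptions for illustration; the
real commands also take location arguments that are omitted here.

```python
def crush_set(crush_map, name, weight):
    """Update an item that already exists; refuse to create a new one."""
    if name not in crush_map:
        raise KeyError(f"{name} does not exist; use add instead")
    crush_map[name] = weight


def crush_add(crush_map, name, weight):
    """Add a new item, or update it if it is already present."""
    crush_map[name] = weight
```

On a freshly created map, crush_set() on a new item fails while
crush_add() succeeds, which is why 'add' is the right call at that point.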

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon: OSDMonitor: don't expose uncommitted state on 'osd crush add/set'
Joao Eduardo Luis [Thu, 15 Aug 2013 01:20:24 +0000 (18:20 -0700)]
mon: OSDMonitor: don't expose uncommitted state on 'osd crush add/set'

Fixes: #4635
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon: OSDMonitor: document 'prepare_command' wrt expected behavior of no-ops
Joao Eduardo Luis [Wed, 14 Aug 2013 23:32:17 +0000 (16:32 -0700)]
mon: OSDMonitor: document 'prepare_command' wrt expected behavior of no-ops

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon: OSDMonitor: don't expose uncommitted state on 'osd crush link'
Sage Weil [Wed, 14 Aug 2013 23:23:14 +0000 (16:23 -0700)]
mon: OSDMonitor: don't expose uncommitted state on 'osd crush link'

Fixes: #4635
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>