Ilya Dryomov [Tue, 27 May 2014 14:35:36 +0000 (18:35 +0400)]
qa: catch up with xfstests changes
Back in 2013 xfstests were rearranged, which also changed the way
./check parses test lists. Catch up with those changes. Note that
tests can no longer be listed in ranges; we only accept individual
tests and test groups (e.g. -g quick).
Ilya Dryomov [Fri, 30 May 2014 09:37:04 +0000 (13:37 +0400)]
qa: cp run_xfstests.sh run_xfstests-obsolete.sh
run_xfstests.sh is going to be updated in the next commit to be able to
drive newer xfstests. Among other things, the new xfstests proper
doesn't support listing tests in ranges, which is what the qemu wrapper
(run_xfstests_qemu.sh) relies on. So keep a copy of the old
run_xfstests.sh around until the qemu vm image is regenerated and the
up-to-date exclusion list for that kernel is shaken out.
Ilya Dryomov [Mon, 12 May 2014 08:30:45 +0000 (12:30 +0400)]
mon: set MMonGetVersionReply tid
Currently we don't set MMonGetVersionReply tid even if the original
MMonGetVersion message had a non-zero tid. This is bad for the kernel
client, which has the infrastructure in place that relies on tids to
look up message buffers and contexts. To kick off transitioning away
from the workaround, set MMonGetVersionReply tid to the tid of the
original MMonGetVersion message.
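A minimal sketch of why echoing the tid matters, assuming a client that keys its pending requests by tid. The class and field names below are hypothetical stand-ins, not the real MMonGetVersion / MMonGetVersionReply definitions:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the request and reply messages.
@dataclass
class GetVersion:
    what: str
    tid: int = 0


@dataclass
class GetVersionReply:
    version: int
    tid: int = 0


def handle_get_version(req: GetVersion, version: int) -> GetVersionReply:
    """Build a reply that carries the tid of the original request."""
    reply = GetVersionReply(version=version)
    reply.tid = req.tid  # echo the request's tid back to the sender
    return reply


# Client side (assumed): pending requests are looked up by tid.
pending = {7: "osdmap version request"}
reply = handle_get_version(GetVersion(what="osdmap", tid=7), version=42)
context = pending.pop(reply.tid)
```

With the tid left at zero, the lookup in `pending` would have nothing to match against, which is the situation the commit message describes for the kernel client.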
Aristoteles Neto [Tue, 20 May 2014 22:20:55 +0000 (10:20 +1200)]
Update manual-deployment.rst
- When creating the OSD data, specify osd-uuid so that it matches when the osd is first created.
- Modify caps when adding osd auth to match what ceph-deploy does.
~~~~
./include/atomic.h: In member function 'size_t ceph::atomic_t::inc()':
./include/atomic.h:42:36: error: 'AO_fetch_and_add1' was not declared in this scope
return AO_fetch_and_add1(&val) + 1;
^
./include/atomic.h: In member function 'size_t ceph::atomic_t::dec()':
./include/atomic.h:45:42: error: 'AO_fetch_and_sub1_write' was not declared in this scope
return AO_fetch_and_sub1_write(&val) - 1;
^
./include/atomic.h: In member function 'void ceph::atomic_t::add(size_t)':
./include/atomic.h:48:36: error: 'AO_fetch_and_add' was not declared in this scope
AO_fetch_and_add(&val, add_me);
^
./include/atomic.h: In member function 'void ceph::atomic_t::sub(int)':
./include/atomic.h:52:48: error: 'AO_fetch_and_add_write' was not declared in this scope
AO_fetch_and_add_write(&val, (AO_t)negsub);
^
./include/atomic.h: In member function 'size_t ceph::atomic_t::dec()':
./include/atomic.h:46:5: warning: control reaches end of non-void function [-Wreturn-type]
}
^
make[5]: *** [cls/user/cls_user_client.o] Error 1
~~~~
John Spray [Tue, 20 May 2014 15:25:19 +0000 (16:25 +0100)]
mon: Fix default replicated pool ruleset choice
Specifically, in the case where the configured
default ruleset is CEPH_DEFAULT_CRUSH_REPLICATED_RULESET,
instead of assuming ruleset 0 exists, choose the lowest
numbered ruleset.
In the case where an explicit ruleset is passed to
OSDMonitor::prepare_pool_crush_ruleset, verify
that it really exists.
The idea is to eliminate cases where a pool could
exist with its crush ruleset set to something
other than a valid ruleset ID.
Fixes: #8373
Signed-off-by: John Spray <john.spray@inktank.com>
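The selection rule described above can be sketched as follows. This is an illustration only: `choose_ruleset` is a hypothetical helper, and `requested=None` stands in for the "configured default" case rather than the real CEPH_DEFAULT_CRUSH_REPLICATED_RULESET constant:

```python
from typing import Iterable, Optional


def choose_ruleset(existing: Iterable[int], requested: Optional[int] = None) -> int:
    """Return a ruleset ID that is known to exist.

    requested=None models the default case: pick the lowest-numbered
    ruleset instead of assuming that ruleset 0 exists.
    """
    ids = sorted(set(existing))
    if not ids:
        raise ValueError("no crush rulesets defined")
    if requested is None:
        return ids[0]
    if requested not in ids:
        # An explicit ruleset must really exist.
        raise ValueError(f"ruleset {requested} does not exist")
    return requested
```

For example, with rulesets 1, 2 and 3 defined, the default case yields 1, while an explicit request for ruleset 0 is rejected instead of producing a pool that points at a missing ruleset.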
Sage Weil [Tue, 20 May 2014 22:07:07 +0000 (15:07 -0700)]
mds: use mds_stamp for mksnap
Use the server timestamp for the snapshot timestamp. This could arguably
be the client timestamp, but I think snapshot creation times are a bit
more important to have accurate timestamps on, and this should not be
something that existing client apps will strongly depend on.
Sage Weil [Tue, 20 May 2014 22:04:03 +0000 (15:04 -0700)]
mds: reset mds_stamp for readdir, rename, link
These ops do complicated work prior to starting the real operation, like
fetching missing directories, opening remote dirfrags, or creating
snaprealms. Reset the mds timestamp after this slow work has completed.
Sage Weil [Tue, 20 May 2014 21:59:35 +0000 (14:59 -0700)]
mds: use client-provided time stamp for user-visible file metadata
Use the op_stamp from the MDRequest, populated by the MClientRequest when
possible, for setting timestamps on user-visible metadata (like ctime,
mtime).
Sage Weil [Tue, 20 May 2014 21:55:05 +0000 (14:55 -0700)]
mds: do rstat timestamps (rctime, fragstat mtime) in terms of op stamp
Use the op (client) timestamp for the recursive stats, for sanity's sake.
Note that since this is monotonically increasing, the danger here is
that we lose track of nested changes due to skewed client clocks.
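A small illustration of the danger noted above, assuming the recursive ctime is advanced with a simple maximum. `update_rctime` is a hypothetical helper, not the MDS code:

```python
def update_rctime(rctime: float, op_stamp: float) -> float:
    """The recursive ctime only ever moves forward."""
    return max(rctime, op_stamp)


rctime = 0.0

# Client A, accurate clock, changes something at t=1000.
rctime = update_rctime(rctime, 1000.0)
after_a = rctime

# Client B makes a later change, but its clock runs 100 seconds behind.
rctime = update_rctime(rctime, 900.0)

# True only if B's change moved rctime; here it did not.
change_visible = rctime > after_a
```

In this model, anything that watches rctime to detect nested changes would miss client B's update.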
Sage Weil [Tue, 20 May 2014 21:52:59 +0000 (14:52 -0700)]
mds: make sure mds_stamp is set when we journal
This is a catch-all that we are carrying over from before. It may not
be strictly necessary, but I'm not inclined to check the code for
Mutation users who didn't call acquire_locks().
Kevin Dalley [Mon, 19 May 2014 22:03:35 +0000 (15:03 -0700)]
doc: quick-ceph-deploy cleanup
Improve documentation in quick-ceph-deploy.rst
Use admin-node consistently.
ceph should be installed on admin-node for the following reasons:
"ceph-deploy admin admin-node" assumes that /etc/ceph exists.
"ceph health" requires the use of ceph
Samuel Just [Fri, 16 May 2014 23:56:33 +0000 (16:56 -0700)]
ReplicatedPG::start_flush: fix clone deletion case
dsnapc.snaps will be non-empty most of the time if there
have been snaps before prev_snapc. What we really want to
know is whether there are any snaps between oi.snaps.back()
and prev_snapc.
Fixes: 8334
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
Kevin Dalley [Mon, 19 May 2014 20:38:31 +0000 (13:38 -0700)]
doc: Clean up pre-flight documentation
Mention recent Ceph releases.
Move important message about sudo and ceph-deploy closer to the use of
ceph-deploy.
Mention files created by ceph-deploy command
Separate apt-get from yum command
Yan, Zheng [Wed, 14 May 2014 06:32:34 +0000 (14:32 +0800)]
mds: fix remote auth pin race
When removing auth unpinned objects from mdr->remote_auth_pins,
Server::handle_slave_auth_pin() checks object's authority to decide
if the object was auth pinned by a given MDS. This method isn't
reliable because when object isn't auth pinned, its authority may
change.
The fix is to remember from which MDS an object was auth pinned.
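A rough sketch of the idea, assuming the remote auth pins are kept as a mapping from object to the MDS rank that pinned it. The function name and the shape of the ack are assumptions for illustration, not the actual Server code:

```python
def handle_auth_pin_ack(remote_auth_pins: dict, from_mds: int, acked: set) -> None:
    """Update the pin map after an ack from `from_mds`.

    remote_auth_pins maps object -> rank of the MDS that auth pinned it.
    Objects pinned via `from_mds` but missing from `acked` are dropped.
    The decision uses the recorded rank, not the object's current
    authority, which may have changed while the object was unpinned.
    """
    for obj, mds in list(remote_auth_pins.items()):
        if mds == from_mds and obj not in acked:
            del remote_auth_pins[obj]
    for obj in acked:
        remote_auth_pins[obj] = from_mds


pins = {"inode_a": 1, "dentry_b": 2}
handle_auth_pin_ack(pins, from_mds=1, acked={"inode_c"})
```

After the call, `inode_a` is gone because MDS 1 no longer reports it as pinned, `dentry_b` is untouched because it was pinned by MDS 2, and `inode_c` is recorded against MDS 1.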
Yan, Zheng [Thu, 8 May 2014 07:14:43 +0000 (15:14 +0800)]
mds: skip journaling slave rename when possible
A rename operation can affect three dentries and two inodes. An MDS
that receives a rename slave request, but is not the authority for any
of these dentries/inodes and has no auth subtree under them, can skip
journaling the slave rename.
Yan, Zheng [Thu, 8 May 2014 05:55:25 +0000 (13:55 +0800)]
mds: include all of directory inode's replicas in rmdir witnesses
If an MDS crashes after journaling a rmdir operation, but before sending
MDentryUnlink messages, the surviving MDS may have incorrect linkage for
the removed directory. Later, when the crashed MDS recovers, the
incorrect linkage can cause the surviving MDS to crash.
The fix is to include all of the directory inode's replicas in the rmdir
witness list. When receiving a rmdir slave request, an MDS that has no
auth subtree in the directory only needs to update its cache and send a
reply (it doesn't need to journal the slave request).
Ilya Dryomov [Fri, 16 May 2014 15:03:13 +0000 (19:03 +0400)]
OSDMonitor: set next commit in mon primary-affinity reply
Commit 8c5c55c8b47e ("mon: set next commit in mon command replies")
fixed MMonCommand replies to include the right version, but the
primary-affinity handler was authored before that. Fix it.
Dmitry Smirnov [Fri, 16 May 2014 10:26:38 +0000 (20:26 +1000)]
sample.ceph.conf: minor update
* Moved filestore settings above [osd.*] declarations; otherwise
(if uncommented) those settings might be applied only to the last
OSD, which is not very obvious.
* A few options added.
Greg Farnum [Thu, 15 May 2014 23:50:43 +0000 (16:50 -0700)]
OSD: fix an osdmap_subscribe interface misuse
When calling osdmap_subscribe, you have to pass an epoch newer than the
current map's. _maybe_boot() was not doing this correctly -- we would
fail a check for being *in* the monitor's existing map range, and then
pass along the map prior to the monitor's range. But if we were exactly
one behind, that value would be our current epoch, and the request would
get dropped. So instead, make sure we are not *in contact* with the monitor's
existing map range.
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
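The off-by-one described above can be modelled in a few lines. This is a simplified reconstruction from the commit message, not the actual _maybe_boot() code; all function names are hypothetical, and `oldest` stands for the first epoch in the monitor's map range:

```python
def subscribe_accepted(current_epoch: int, requested_epoch: int) -> bool:
    """Model of the interface rule: a request must be newer than the current map."""
    return requested_epoch > current_epoch


def pick_epoch_old(current: int, oldest: int) -> int:
    # Old check: are we *in* the monitor's range?
    # If not, ask for the map just prior to that range.
    return current + 1 if current >= oldest else oldest - 1


def pick_epoch_new(current: int, oldest: int) -> int:
    # New check: are we contiguous with the monitor's range?
    return current + 1 if current + 1 >= oldest else oldest - 1


# Exactly one epoch behind the monitor's oldest map.
current, oldest = 9, 10
old_ok = subscribe_accepted(current, pick_epoch_old(current, oldest))
new_ok = subscribe_accepted(current, pick_epoch_new(current, oldest))
```

In this model the old check requests epoch 9, which equals the current epoch and is dropped, while the new check requests epoch 10. When the OSD is further behind (say epoch 5), both variants request epoch 9 and the request goes through.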
John Spray [Tue, 13 May 2014 16:32:03 +0000 (17:32 +0100)]
doc: update instructions for RPM distros
Fix RPM building instructions: this has been broken since
libs3 was included inline in the ceph repo as a submodule.
"rpmbuild -tb" was concatenating the ceph.spec and
libs3.spec files, resulting in something that didn't work.
Also, the instructions suggested downloading a .tar.gz file
whereas the specfile requires a .tar.bz2 file.
Also, add a convenient yum command line for getting the compile
dependencies on Fedora 20.
Signed-off-by: John Spray <john.spray@inktank.com>