Samuel Just [Tue, 29 Mar 2011 18:28:19 +0000 (11:28 -0700)]
MDS: change messenger name for replay mdses
This will make read operations from standby mdses distinguishable from
those from normal mdses by changing the node name in the messenger.
Previously, the replay node would have the same name as the node it's
following.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Fri, 1 Apr 2011 22:57:46 +0000 (15:57 -0700)]
journaler: fix partial tail entry correction
If we encounter a partial tail entry, we drop it by moving the write_pos
(end of journal) back to read_pos. We also need to reset the read
state (read_buf, requested/received_pos) so that subsequent replay attempts
won't be horribly confused.
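A minimal sketch of the correction described above, not the actual Journaler code. The struct and function names are hypothetical, and resetting requested_pos/received_pos to read_pos is an assumption about what "reset the read state" means here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for the journaler's position state.
struct JournalState {
  uint64_t read_pos = 0;       // next byte to be replayed
  uint64_t write_pos = 0;      // end of journal
  uint64_t requested_pos = 0;  // how far reads have been requested
  uint64_t received_pos = 0;   // how far reads have completed
  std::string read_buf;        // bytes fetched but not yet consumed
};

// Drop a partial tail entry: pull the journal end back to read_pos and
// discard any read state that refers to the dropped bytes.
inline void drop_partial_tail(JournalState &j) {
  assert(j.read_pos <= j.write_pos);
  j.write_pos = j.read_pos;
  j.read_buf.clear();
  // Assumption: both read cursors restart from read_pos.
  j.requested_pos = j.read_pos;
  j.received_pos = j.read_pos;
}
```

Without the last three assignments, a later replay attempt would still see buffered bytes and positions that lie beyond the new end of the journal.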
Sage Weil [Tue, 22 Mar 2011 21:47:15 +0000 (14:47 -0700)]
mds: fix bounds on import
The add_ambiguous_import() call was clobbering the bounds field for
EImportStart::replay(), screwing up the subtree auth adjustment. Make the
argument const.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Thu, 17 Mar 2011 22:09:15 +0000 (15:09 -0700)]
mds: close exported dirfrag
We have to keep export bounds open for auth subtrees. After we export a
subtree, though, there are two opportunities to drop empty dirfrags from
our cache:
- The children of the exported subtree may now be trimmable, if they are
also non-auth and empty.
- The exported subtree may be trimmable if it is empty and the parent is
also non-auth. This may be true for ancestors further up the hierarchy
as well.
This helps ensure that when we get to rejoin, the only non-auth subtrees we
have are there because they are non-empty or because they are bounds on our
own subtrees.
Sage Weil [Fri, 1 Apr 2011 18:00:44 +0000 (11:00 -0700)]
mds: fix discover_path
If we have the base dirfrag, do not request it. Otherwise we can get a
reply that contains only that (partial progress), and we will then fail
to wake up our dentry waiter.
When initializing the config_options array, complain if the size of the
option field we're trying to initialize doesn't match the size of our
type. This will prevent careless type annotations from overwriting
neighboring option fields.
Also create a header called "static assert" which implements a
compile-time assert.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
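As an illustration of the size check, here is a sketch that uses the C++11 static_assert keyword rather than the custom header this commit adds. The type tags, the opt_storage mapping, demo_config_t, and the CHECK_OPTION_SIZE macro are all hypothetical names invented for the example.

```cpp
// Hypothetical option type tags; the real config code has its own set.
enum opt_type_t { OPT_INT, OPT_LONGLONG, OPT_DOUBLE, OPT_BOOL };

// Map each tag to the C++ type it is supposed to describe.
template <opt_type_t T> struct opt_storage;
template <> struct opt_storage<OPT_INT>      { typedef int type; };
template <> struct opt_storage<OPT_LONGLONG> { typedef long long type; };
template <> struct opt_storage<OPT_DOUBLE>   { typedef double type; };
template <> struct opt_storage<OPT_BOOL>     { typedef bool type; };

// Hypothetical config struct with two option fields.
struct demo_config_t {
  int max_open_files;
  long long journal_size;
};

// Refuse to compile if the annotated type's size differs from the field's
// size, so a wrong annotation cannot write past the field it describes.
#define CHECK_OPTION_SIZE(field, tag)                              \
  static_assert(sizeof(demo_config_t::field) ==                    \
                    sizeof(opt_storage<tag>::type),                \
                "option type annotation does not match field: " #field)

CHECK_OPTION_SIZE(max_open_files, OPT_INT);
CHECK_OPTION_SIZE(journal_size, OPT_LONGLONG);
```

Annotating max_open_files as OPT_LONGLONG would stop the build on platforms where int and long long differ in size, instead of silently overwriting the neighboring field at startup.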
Sage Weil [Thu, 31 Mar 2011 23:42:43 +0000 (16:42 -0700)]
mds: allow explicit finisher context for path_traverse
Previously we could only path_traverse and retry a request or message.
This just allows an explicit context to be used as well. It's the caller's
job to clean it up if we return <= 0.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Josh Durgin [Wed, 30 Mar 2011 23:41:00 +0000 (16:41 -0700)]
librbd: fix snapshot handling
To ensure consistency, always set the snap context when the header is
updated. If snapid is set, we update librados' snapid when refreshing
the header as well. Also use CEPH_NOSNAP instead of 0 as the default
snapid to prevent confusion. These changes fix snapshot creation
and removal, and prevent writing to a snapshot.
Rollback is fixed by using selfmanaged_snap_rollback.
Josh Durgin [Wed, 30 Mar 2011 22:00:55 +0000 (15:00 -0700)]
librados: add selfmanaged_snap_rollback
This was removed in 2cb86f713df38ebee6aa10a81157f99264a59a70, but is
required for selfmanaged snaps because their snapids aren't in the
pool's snap list, which is how regular rollback finds them.
Samuel Just [Wed, 30 Mar 2011 20:14:55 +0000 (13:14 -0700)]
mkcephfs: copy to daemon nodes for each daemon
The tmp directory is removed after each daemon. Previously, this would
break if two daemons were on the same node. Now, the files will be
copied for each daemon.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Sage Weil [Wed, 30 Mar 2011 23:46:04 +0000 (16:46 -0700)]
journaler: don't block when we adjust back write_pos
is_readable() may need to adjust the write_pos backward, but will return
false. If we are at the end we still need to wake up any waiters so they
know about it.
md_config_t::parse_argv: fold md_config_t::parse_argv_part2 into
parse_argv. Fix brokenness introduced by the std::string switchover.
OPTION macro: move single-character options out of the OPTION macro and
into config.cc
Fix ceph_argparse_witharg / ceph_argparse_flag uses to include a
trailing (char*)NULL, to ensure that we terminate with a pointer rather
than a 32-bit int.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
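The terminator matters because a variadic callee reads each argument back as a pointer. The sketch below shows the pattern with a hypothetical helper, matches_any_flag. It is not the real ceph_argparse_flag signature.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if arg equals any name in a
// NULL-terminated variadic list of C strings.
inline bool matches_any_flag(const char *arg, ...) {
  assert(arg != NULL);
  va_list ap;
  va_start(ap, arg);
  bool found = false;
  // Every variadic argument is fetched as a pointer, so the terminator
  // must be passed as a pointer too. A bare NULL or 0 may be pushed as a
  // 32-bit int, leaving the upper half of a 64-bit pointer undefined.
  while (const char *name = va_arg(ap, const char *)) {
    if (std::strcmp(arg, name) == 0) {
      found = true;
      break;
    }
  }
  va_end(ap);
  return found;
}
```

A call then looks like matches_any_flag(arg, "-f", "--foreground", (char*)NULL), with the cast written out at every call site.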
Use std::string to represent md_config_t strings. This makes memory
management a lot easier and should fix some leaks. "No value" is now
represented by an empty string, whereas before some places were using
empty strings and some were using NULL.
config.cc: Fix a minor decode bug.
In pid_file.cc, copy the pid_file using snprintf, since strncpy
does not always NUL-terminate.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
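A small sketch of the snprintf copy, assuming a fixed-size destination buffer. The helper name copy_path is hypothetical and is not taken from pid_file.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copy src into a fixed-size buffer, truncating if
// necessary, and always leave the result terminated.
inline void copy_path(char *dst, size_t dst_len, const char *src) {
  assert(dst != NULL && src != NULL && dst_len > 0);
  // snprintf writes at most dst_len - 1 characters followed by '\0'.
  // strncpy(dst, src, dst_len) would leave dst unterminated whenever
  // src has dst_len or more characters.
  std::snprintf(dst, dst_len, "%s", src);
}
```

The copy may still truncate a long path, but the buffer can always be used safely as a C string afterwards.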
Move parsing into config.cc, since there was already parsing code there.
Move metavariable escaping out of ConfUtils; having this in ConfUtils
makes it impossible to de-globalize g_conf.
Create a nicer API for pulling stuff out of the configuration file.
Since the value we pull is determined by the config structure in effect
at the time, it should be an instance method of md_config_t.
Remove some dead code. Add some comments.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Tommi Virtanen [Tue, 29 Mar 2011 16:21:09 +0000 (09:21 -0700)]
common: Make armor.h safe to use from C.
mount.ceph needs to base64-decode the secrets, so we can get rid of
the kernel-side base64 decode, but it doesn't need all of common lib.
And it is written in C.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Tommi Virtanen [Tue, 29 Mar 2011 00:32:24 +0000 (17:32 -0700)]
mount.ceph: Modprobe ceph before trying the mount.
This will be needed for the next few commits, where we try to load the
keys into the kernel; without ceph.ko loaded, the key type will not be
recognized.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Tue, 29 Mar 2011 18:58:13 +0000 (11:58 -0700)]
cmon: add --inject-monmap option
This lets you manually inject a monmap into a down monitor. This is useful
in cases where you need to change the monmap but aren't able to get a
quorum with the old map.