Tommi Virtanen [Fri, 11 Mar 2011 18:28:58 +0000 (10:28 -0800)]
auth: Allow using NSS as crypto library.
Added new configure flag --with-nss that enables this. NSS is also
automatically used if it is available and CryptoPP is not; use
--without-nss to explicitly forbid this.
No change on rgw crypto yet; rgw won't build without CryptoPP for now.
NSS initialization is in a static constructor for now. All it does is
set some values on in-memory data structures, so as long as no (other)
static constructor tries to use it, everything should just work. While
this could be moved to common_init, there are several other context
initialization steps with NSS, and a later refactoring to share the
results of these can just include NSS init as its first operation, at
practically no cost.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Sage Weil [Fri, 11 Mar 2011 20:45:16 +0000 (12:45 -0800)]
osd: wait for handle_osd_map transaction ondisk without doing a full sync
Doing a full sync (forcing a btrfs transaction etc) was just wrong here.
All we (might) care about is whether our ObjectStore::Transaction is
stable (in journal or fs) or not.
We are still waiting for those operations to flush to the fs (to be
readable). That may not be necessary either, but shouldn't have a big
performance impact.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 11 Mar 2011 20:30:35 +0000 (12:30 -0800)]
osd: avoid setting up_thru on new PGs
This accepts the possibility of peering blockage if the OSDs in the
first interval (after pg creation) go down and stay down, in exchange for
avoiding two osdmap updates for every pg creation. I think this is
reasonable given that:
- If the pg did go active, then it did go RW and assuming as much changed
nothing.
- If the pg did not go active, then it is empty, and there is no data
lost.
To do this:
- When peering during the first ever interval, don't bother setting
up_thru.
- Always mark that interval maybe_went_rw, even though up_thru isn't set
as such.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Tommi Virtanen [Fri, 11 Mar 2011 00:37:11 +0000 (16:37 -0800)]
clitest: Fix tests after osdmaptool --clobber bugfix.
Commit 5c8146b55dbd60bdfa47b53b93f2769f7d0524dc fixed clobber;
adjust clitests to match. Reordered test logic so that the "fsid
does not change" check covers a run without --clobber.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Greg Farnum [Thu, 10 Mar 2011 22:06:16 +0000 (14:06 -0800)]
rados: Add "stat" option, and fix "put" to work on larger block sizes.
We didn't have a stat option, now we do.
Previously, "put" allocated its read buffer on the stack. That meant
the max block size was a little under 8MB, or you got a segfault! Now
it's on the heap, and you can set it as you like.
Sage Weil [Thu, 10 Mar 2011 23:12:22 +0000 (15:12 -0800)]
mkcephfs: modularize
The goal is to support the old "ssh to everything" mode and also a
piecewise mode that lets the administrator do each step and handle
data copying and remote execution themselves.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Greg Farnum [Tue, 8 Mar 2011 20:54:23 +0000 (12:54 -0800)]
librbd: cast offset values to uint64_t for unsigned comparison warning.
It seems that size_t, off_t, and le64 have different signed/unsigned
properties on 32 and 64-bit Linux platforms, so just cast
them both (since offsets obviously can't be negative).
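A sketch of the kind of comparison this affects, under stated
assumptions: read_fits is a hypothetical helper, and std::int64_t
stands in for a signed offset type such as off_t. Casting both sides to
uint64_t gives the comparison one well-defined unsigned type on both 32
and 64-bit builds; the explicit negative check makes the "offsets can't
be negative" assumption visible instead of relying on it silently.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bounds check, not the actual librbd code.
bool read_fits(std::int64_t offset, std::size_t len,
               std::uint64_t image_size) {
    if (offset < 0) {
        return false;  // a negative offset is never valid
    }
    // Cast both operands so the comparisons below are unsigned
    // 64-bit on every platform, avoiding signed/unsigned warnings.
    const std::uint64_t off = static_cast<std::uint64_t>(offset);
    const std::uint64_t n = static_cast<std::uint64_t>(len);
    // Written as a subtraction to avoid overflow in off + n.
    return off <= image_size && n <= image_size - off;
}
```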
Greg Farnum [Tue, 8 Mar 2011 17:36:52 +0000 (09:36 -0800)]
uclient: Clear the CEPH_CAP_FILE_BUFFER ref on _flush, if safe.
Previously we just returned if safe, but leaving the CEPH_CAP_FILE_BUFFER
ref around breaks _fsync horribly. The root cause of this is
update_inode_file_bits calling objectcacher->truncate_set without
clearing the BUFFER ref, but the mechanics of clearing it there are
complicated, and I don't believe there are any issues with keeping
around the extra reference, as long as it's cleared when necessary.
Sage Weil [Tue, 8 Mar 2011 00:25:30 +0000 (16:25 -0800)]
mds: use projected subtree in rename anchor check
We want to (try to) reanchor the directory on rename when our _projected_
subtree is not a leaf. If we use the normal get_subtree_root() call,
we get NULL if we are unlinked, which makes is_leaf_subtree() crash.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Mon, 7 Mar 2011 19:32:20 +0000 (11:32 -0800)]
osd: include all stray peers in might_have_unfound
We should always consider any OSD that has a copy of the PG as a possible
location for missing objects. There are cases where might_have_unfound is
not complete. For example,
- objects on [1,2]
- 2 marked down/out
- objects on [1,3]
- recovery completes, last_epoch_clean is set.
- 2 comes back online
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
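The scenario above can be sketched as a set union. This is an
illustration only, not the OSD's real data structures: the function
name and the use of plain int OSD ids in std::set are assumptions. In
the example, intervals since last_epoch_clean only mention OSDs 1 and
3, so OSD 2 is found only because it is a stray holding a copy of the
PG.

```cpp
#include <set>

// Hypothetical sketch: candidate locations for unfound objects are
// the OSDs seen in recent intervals plus every stray peer that still
// has a copy of the PG, excluding ourselves.
std::set<int> build_might_have_unfound(const std::set<int>& interval_osds,
                                       const std::set<int>& stray_osds,
                                       int self) {
    std::set<int> result(interval_osds);
    result.insert(stray_osds.begin(), stray_osds.end());
    result.erase(self);  // we do not need to query ourselves
    return result;
}
```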
Sage Weil [Fri, 4 Mar 2011 21:59:24 +0000 (13:59 -0800)]
osd: include all up peers in might_have_unfound when desperate
If our might_have_unfound calculation was off (it currently can be, see
#865) we could prematurely give up. Try any up OSD at this stage just to
be sure.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil [Fri, 4 Mar 2011 17:39:59 +0000 (09:39 -0800)]
osd: recover_primary if recover_replicas starts no ops
recover_replicas may fail to start anything if we see an unexpected error.
In that case, try recover_primary immediately instead of waiting for the
PG to (hopefully) get requeued for recovery later.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
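A sketch of the fallback described above, with hypothetical names: the
two callbacks stand in for recover_replicas and recover_primary and are
assumed to return the number of recovery ops they started. The real OSD
functions take different arguments; only the control flow is shown.

```cpp
#include <functional>

// Hypothetical sketch of the control flow only.
int start_recovery_sketch(const std::function<int()>& recover_replicas,
                          const std::function<int()>& recover_primary) {
    int started = recover_replicas();
    if (started == 0) {
        // Nothing started (for example after an unexpected error):
        // try the primary right away rather than waiting for the PG
        // to be requeued for recovery.
        started = recover_primary();
    }
    return started;
}
```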
Sage Weil [Fri, 4 Mar 2011 17:38:47 +0000 (09:38 -0800)]
osd: discover more missing if unfound and do_recovery can't start anything
If we couldn't start any recovery ops and things are still
unfound, see if we can discover more missing object locations.
It may be that our initial locations were bad and we errored
out while trying to pull.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>