git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
15 years ago mds: only purge dentries with no extra refs (besides dirty)
Sage Weil [Fri, 4 Jun 2010 00:32:39 +0000 (17:32 -0700)]
mds: only purge dentries with no extra refs (besides dirty)

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: set straydn first to match inode on unlink
Sage Weil [Fri, 4 Jun 2010 00:30:41 +0000 (17:30 -0700)]
mds: set straydn first to match inode on unlink

15 years ago mds: don't export stray (~mdsfoo/stray), and ignore in balancer
Sage Weil [Fri, 4 Jun 2010 00:26:11 +0000 (17:26 -0700)]
mds: don't export stray (~mdsfoo/stray), and ignore in balancer

We _must_ keep mdsdir and stray on local mds for normal operations.

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: make discover work for multiversion inodes (e.g. dirs)
Sage Weil [Thu, 3 Jun 2010 23:55:14 +0000 (16:55 -0700)]
mds: make discover work for multiversion inodes (e.g. dirs)

If we don't have the specific snap, look up the head and see if it's
multiversion.

This doesn't give us a "range" lookup like we get with dentries because
the inode_map is a hash, not a map.  However, we shouldn't need it,
because we always have a specific snapped inode we're looking for (because
it is referred to by a dentry) or we are looking at a multiversion head.
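
A minimal sketch of the fallback described above, not the actual MDCache code. The type names, the SNAP_HEAD sentinel, and the "first <= snap" coverage check are assumptions made for illustration: try the exact (ino, snap) key in the hash first, then fall back to the head inode only if it is multiversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <unordered_map>

// Hypothetical stand-ins for the MDS types; not the real Ceph definitions.
using inodeno_t = std::uint64_t;
using snapid_t = std::uint64_t;
constexpr snapid_t SNAP_HEAD = std::numeric_limits<snapid_t>::max();

struct InodeKey {
  inodeno_t ino;
  snapid_t snap;
  bool operator==(const InodeKey& o) const {
    return ino == o.ino && snap == o.snap;
  }
};

struct InodeKeyHash {
  std::size_t operator()(const InodeKey& k) const {
    return std::hash<std::uint64_t>()(k.ino) ^
           (std::hash<std::uint64_t>()(k.snap) << 1);
  }
};

struct Inode {
  snapid_t first;     // oldest snapid this inode is valid for
  bool multiversion;  // true if the head also stands in for older snaps
};

// A hash keyed by exact (ino, snap): no range lookup is possible.
using InodeMap = std::unordered_map<InodeKey, Inode, InodeKeyHash>;

// Exact match first; otherwise use the head only if it is multiversion
// and (assumption) its range reaches back to the requested snap.
const Inode* lookup_for_discover(const InodeMap& m, inodeno_t ino,
                                 snapid_t snap) {
  auto it = m.find(InodeKey{ino, snap});
  if (it != m.end()) return &it->second;
  auto head = m.find(InodeKey{ino, SNAP_HEAD});
  if (head != m.end() && head->second.multiversion &&
      head->second.first <= snap)
    return &head->second;
  return nullptr;
}
```

Two hash probes replace the range lookup that an ordered map would have offered, which matches the reasoning in the message that a range lookup should not be needed.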

15 years ago mds: fix CDir::take_sub_waiting vs dnwaiter pin
Sage Weil [Thu, 3 Jun 2010 23:19:36 +0000 (16:19 -0700)]
mds: fix CDir::take_sub_waiting vs dnwaiter pin

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: kill open_foreign_stray; but open remote mdsdirs instead
Sage Weil [Thu, 3 Jun 2010 23:09:23 +0000 (16:09 -0700)]
mds: kill open_foreign_stray; but open remote mdsdirs instead

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: fix cap clone logic to look at matching first, not last
Sage Weil [Thu, 3 Jun 2010 22:00:44 +0000 (15:00 -0700)]
mds: fix cap clone logic to look at matching first, not last

The cap->client_follows is set to follows+1 by flushsnap, since the real
follows value isn't convenient.  But it is enough to know that it is more
than the old version's follows, so do that.

15 years ago libatomic: fix assert.h compilation
Yehuda Sadeh [Thu, 3 Jun 2010 23:44:31 +0000 (16:44 -0700)]
libatomic: fix assert.h compilation

15 years ago osd: make sure we don't return EAGAIN to client
Greg Farnum [Thu, 3 Jun 2010 20:54:24 +0000 (13:54 -0700)]
osd: make sure we don't return EAGAIN to client

15 years ago mds: open past snap parents at end of rejoin phase
Sage Weil [Thu, 3 Jun 2010 21:14:04 +0000 (14:14 -0700)]
mds: open past snap parents at end of rejoin phase

We really need past parents open before we go active or else anything
that needs to build a snap context will fail.

15 years ago mdsmap: show individual mds states in summary
Sage Weil [Thu, 3 Jun 2010 20:48:10 +0000 (13:48 -0700)]
mdsmap: show individual mds states in summary

15 years ago osd: improve snap_trimmer debug output
Sage Weil [Thu, 3 Jun 2010 20:26:39 +0000 (13:26 -0700)]
osd: improve snap_trimmer debug output

15 years ago mds: another cap_exports message/mdcache encoding fix
Sage Weil [Thu, 3 Jun 2010 20:24:48 +0000 (13:24 -0700)]
mds: another cap_exports message/mdcache encoding fix

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: only adjust dn->first on lock msg if !multiversion
Sage Weil [Thu, 3 Jun 2010 20:08:16 +0000 (13:08 -0700)]
mds: only adjust dn->first on lock msg if !multiversion

The multiversion dn->first references a range of inode versions; don't
drag it forward.  Fixes 38cb2403c043e6676b563197d086edeb11b71ddf.

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: more fix cap_exports typing
Sage Weil [Thu, 3 Jun 2010 19:03:23 +0000 (12:03 -0700)]
mds: more fix cap_exports typing

15 years ago mds: fix scatter_nudge infinite loop
Sage Weil [Thu, 3 Jun 2010 18:59:50 +0000 (11:59 -0700)]
mds: fix scatter_nudge infinite loop

15 years ago mds: fix ESessions type
Sage Weil [Thu, 3 Jun 2010 18:08:00 +0000 (11:08 -0700)]
mds: fix ESessions type

15 years ago mds: drag in->first forward with straydn in handle_dentry_unlink
Sage Weil [Thu, 3 Jun 2010 18:04:05 +0000 (11:04 -0700)]
mds: drag in->first forward with straydn in handle_dentry_unlink

15 years ago mds: fix anchorclient dup lookups, again
Sage Weil [Thu, 3 Jun 2010 17:38:56 +0000 (10:38 -0700)]
mds: fix anchorclient dup lookups, again

15 years ago mds: only log successful requests as completed
Sage Weil [Thu, 3 Jun 2010 17:17:37 +0000 (10:17 -0700)]
mds: only log successful requests as completed

15 years ago mds: anchor dir on mksnap
Sage Weil [Thu, 3 Jun 2010 17:09:19 +0000 (10:09 -0700)]
mds: anchor dir on mksnap

15 years ago mkcephfs: error when creating journal file in a directory that differs from OSD...
CC Lien [Thu, 3 Jun 2010 16:45:10 +0000 (09:45 -0700)]
mkcephfs: error when creating journal file in a directory that differs from OSD data dir

mkcephfs creates osd data directory automatically, but it doesn't create a
directory for the osd journal file.

When you have a journal file in a directory that differs from the osd data
directory in your configuration, like:

       osd data = /osd/osd$id
       osd journal = /journal/osd$id

You will receive a "mount failed to open journal /journal/osd0/journal: No
such file or directory" error when running mkcephfs.

Signed-off-by: CC Lien <cc_lien@tcloudcomputing.com>
15 years ago mds: fix mismatched cap_exports type between msg and MDCache
Sage Weil [Thu, 3 Jun 2010 16:40:57 +0000 (09:40 -0700)]
mds: fix mismatched cap_exports type between msg and MDCache

The types need to match because they are encoded/decoded interchangeably.
See MMDSCacheRejoin::decode() and MDCache::rejoin_send_rejoins().

15 years ago mds: fix trim_unlinked iterator badness
Sage Weil [Thu, 3 Jun 2010 16:33:27 +0000 (09:33 -0700)]
mds: fix trim_unlinked iterator badness

We may remove the next inode in the map.  Queue up unlinked roots first,
which we know remove_inode_recursive() won't reach, and iterate over those.
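
The two-pass pattern described above can be sketched as follows. This is an illustrative model, not the MDCache code: the Node struct, the Cache map, and both function names are hypothetical. The point is that the first pass erases nothing, so the map iterator stays valid, and the second pass walks a private list that recursive removal never touches.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified cache entry; not the real CInode.
struct Node {
  bool unlinked_root;                   // true if this is the top of an unlinked tree
  std::vector<std::uint64_t> children;  // inode numbers directly beneath this one
};
using Cache = std::map<std::uint64_t, Node>;

// Erases ino and everything beneath it. This may erase any entry in the
// map, including the one a caller's iterator would visit next.
void remove_recursive(Cache& cache, std::uint64_t ino) {
  auto it = cache.find(ino);
  if (it == cache.end()) return;
  std::vector<std::uint64_t> kids = it->second.children;  // copy before erase
  cache.erase(it);
  for (std::uint64_t k : kids) remove_recursive(cache, k);
}

// Returns the number of unlinked roots that were trimmed.
std::size_t trim_unlinked(Cache& cache) {
  // Pass 1: collect the roots. Nothing is erased while iterating.
  std::vector<std::uint64_t> roots;
  for (const auto& kv : cache)
    if (kv.second.unlinked_root) roots.push_back(kv.first);

  // Pass 2: remove each subtree. Roots are never children of one another,
  // so recursion cannot invalidate the list being walked.
  for (std::uint64_t ino : roots) remove_recursive(cache, ino);
  return roots.size();
}
```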

15 years ago mds: define MDS_REF_SET in unstable
Sage Weil [Thu, 3 Jun 2010 16:28:15 +0000 (09:28 -0700)]
mds: define MDS_REF_SET in unstable

15 years ago mds: clear dirtyscattered in remove_inode()
Sage Weil [Thu, 3 Jun 2010 16:27:56 +0000 (09:27 -0700)]
mds: clear dirtyscattered in remove_inode()

15 years ago mds: allow dup lookups in anchorclient
Sage Weil [Thu, 3 Jun 2010 16:17:13 +0000 (09:17 -0700)]
mds: allow dup lookups in anchorclient

It's not practical for callers to avoid dups, particularly since they may
be unaware of each other.  And it's trivial to support it here.
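
One way to support duplicate lookups is to keep a list of waiters per inode, so that a second caller joins the request already in flight. The sketch below assumes that design; the class and method names are hypothetical and do not mirror the real AnchorClient interface.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sketch of a lookup client that tolerates duplicate requests.
class LookupClient {
 public:
  using Callback = std::function<void(std::uint64_t ino)>;

  // Returns true if a new request had to be issued, false if the caller
  // simply joined a lookup that was already pending for this inode.
  bool lookup(std::uint64_t ino, Callback cb) {
    bool first = pending_.find(ino) == pending_.end();
    pending_[ino].push_back(std::move(cb));
    if (first) ++requests_sent_;  // stands in for sending the real message
    return first;
  }

  // Called when the reply arrives: wake every waiter exactly once.
  void handle_reply(std::uint64_t ino) {
    auto it = pending_.find(ino);
    if (it == pending_.end()) return;  // unexpected or repeated reply
    std::vector<Callback> waiters = std::move(it->second);
    pending_.erase(it);
    for (auto& cb : waiters) cb(ino);
  }

  int requests_sent() const { return requests_sent_; }

 private:
  std::map<std::uint64_t, std::vector<Callback>> pending_;
  int requests_sent_ = 0;
};
```

Callers need no knowledge of each other: each one passes its own callback and all of them fire on the single reply.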

15 years ago assert: fix assert vs atomic_ops.h breakage
Sage Weil [Thu, 3 Jun 2010 16:01:58 +0000 (09:01 -0700)]
assert: fix assert vs atomic_ops.h breakage

This was causing us to use the system assert, not the ceph one.

15 years ago mds: ensure past snap parents get opened before doing file recovery
Sage Weil [Thu, 3 Jun 2010 15:19:24 +0000 (08:19 -0700)]
mds: ensure past snap parents get opened before doing file recovery

Otherwise we can fail to get_snaps() when we start the recovery:

#0  0x00007fa037625f55 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fa037628d90 in *__GI_abort () at abort.c:88
#2  0x00007fa03761f07a in *__GI___assert_fail (assertion=0x9f3d81 "oldparent", file=<value optimized out>, line=170, function=0x9f4680 "void SnapRealm::build_snap_set(std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> >&, snapid_t&, snapid_t&, snapid_t&, snapid_t, snapid_t)") at assert.c:78
#3  0x00000000008f7656 in SnapRealm::build_snap_set (this=0x222a300, s=..., max_seq=..., max_last_created=..., max_last_destroyed=..., first=..., last=...) at mds/snap.cc:170
#4  0x00000000008f7e8c in SnapRealm::check_cache (this=0x222a300) at mds/snap.cc:194
#5  0x00000000008f892a in SnapRealm::get_snaps (this=0x222a300) at mds/snap.cc:209
#6  0x00000000007f2c85 in MDCache::queue_file_recover (this=0x2202a00, in=0x7fa0340f5450) at mds/MDCache.cc:4398
#7  0x0000000000865011 in Locker::file_recover (this=0x21fe850, lock=0x7fa0340f59b0) at mds/Locker.cc:3437
#8  0x00000000007e5899 in MDCache::start_files_to_recover (this=0x2202a00, recover_q=..., check_q=...) at mds/MDCache.cc:4503
#9  0x00000000007e887e in MDCache::rejoin_gather_finish (this=0x2202a00) at mds/MDCache.cc:3904
#10 0x00000000007ed6cf in MDCache::handle_cache_rejoin_strong (this=0x2202a00, strong=0x7fa030025440) at mds/MDCache.cc:3618
#11 0x00000000007ed84a in MDCache::handle_cache_rejoin (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:3063
#12 0x00000000007fade6 in MDCache::dispatch (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:5668
#13 0x0000000000735313 in MDS::_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1390
#14 0x00000000007372a3 in MDS::ms_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1295
#15 0x0000000000728b97 in Messenger::ms_deliver_dispatch(Message*) ()
#16 0x0000000000716c5e in SimpleMessenger::dispatch_entry (this=0x2202350) at msg/SimpleMessenger.cc:332
#17 0x00000000007119c7 in SimpleMessenger::DispatchThread::entry (this=0x2202760) at msg/SimpleMessenger.h:494
#18 0x000000000071f4e7 in Thread::_entry_func (arg=0x2202760) at ./common/Thread.h:39
#19 0x00007fa03849673a in start_thread (arg=<value optimized out>) at pthread_create.c:300
#20 0x00007fa0376bf6dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago mds: relax lock state before encoding export (and lock state)
Sage Weil [Thu, 3 Jun 2010 15:04:33 +0000 (08:04 -0700)]
mds: relax lock state before encoding export (and lock state)

We can't fuss with lock state in the finish method because we already
encoded the old state to the new auth, and we are now just a replica.

We do still want to relax the lock state to be more replica friendly,
though, so do that in the encode_export_inode method.

15 years ago mds: do not bother tableserver until it is active
Sage Weil [Thu, 3 Jun 2010 06:07:42 +0000 (23:07 -0700)]
mds: do not bother tableserver until it is active

We resend these requests when the TS does go active, and if we send dups
things get all screwed up (see partial log below).

Should we worry about dup queries?

10.06.02_22:32:08.112834 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 69 148 bytes) v1 -- ?+0 0x7f88180e4580
10.06.02_22:32:08.116427 7f881dfdb910 mds1.tableserver(anchortable) handle_mds_recovery mds0
10.06.02_22:32:08.116449 7f881dfdb910 mds1.tableclient(anchortable) handle_mds_recovery mds0
10.06.02_22:32:08.116457 7f881dfdb910 mds1.tableclient(anchortable) resending 69
10.06.02_22:32:08.116470 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 69 148 bytes) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.116840 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 7 ==== mds_table_request(anchortable agree 69 tid 165) v1 ==== 16+0+0 (1328913316 0 0) 0x2362830
10.06.02_22:32:08.116861 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 165) v1
10.06.02_22:32:08.116872 7f881dfdb910 mds1.tableclient(anchortable) got agree on 69 atid 165
10.06.02_22:32:08.127662 7f881dfdb910 mds1.tableclient(anchortable) commit 165
10.06.02_22:32:08.127683 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable commit tid 165) v1 -- ?+0 0x7f8818114860
10.06.02_22:32:08.128244 7f881dfdb910 mds1.tableclient(anchortable) _prepare 70
10.06.02_22:32:08.128261 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 70 82 bytes) v1 -- ?+0 0x7f88180e4580
10.06.02_22:32:08.131873 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 8 ==== mds_table_request(anchortable agree 69 tid 165 148 bytes) v1 ==== 164+0+0 (4238497285 0 0) 0x2362310
10.06.02_22:32:08.131900 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 165 148 bytes) v1
10.06.02_22:32:08.131911 7f881dfdb910 mds1.tableclient(anchortable) stray agree on 69 tid 165, already committing, resending COMMIT
10.06.02_22:32:08.131923 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable commit tid 165) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.144147 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 10 ==== mds_table_request(anchortable ack tid 165) v1 ==== 16+0+0 (584840829 0 0) 0x246dd20
10.06.02_22:32:08.144179 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable ack tid 165) v1
10.06.02_22:32:08.144195 7f881dfdb910 mds1.tableclient(anchortable) got ack on tid 165, logging
10.06.02_22:32:08.144217 7f881dfdb910 mds1.log submit_entry 5515297~17 : ETableClient anchortable ack tid 165
10.06.02_22:32:08.152419 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 11 ==== mds_table_request(anchortable agree 69 tid 166 148 bytes) v1 ==== 164+0+0 (4238497285 0 0) 0x2362830
10.06.02_22:32:08.152448 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 166 148 bytes) v1
10.06.02_22:32:08.152460 7f881dfdb910 mds1.tableclient(anchortable) stray agree on 69 tid 166, sending ROLLBACK
10.06.02_22:32:08.152470 7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable rollback tid 166) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.172729 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 13 ==== mds_table_request(anchortable ack tid 165) v1 ==== 16+0+0 (584840829 0 0) 0x2362310
10.06.02_22:32:08.172770 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable ack tid 165) v1
10.06.02_22:32:08.172786 7f881dfdb910 mds1.tableclient(anchortable) got ack on tid 165, logging
10.06.02_22:32:08.172806 7f881dfdb910 mds1.log submit_entry 5515318~17 : ETableClient anchortable ack tid 165
10.06.02_22:32:08.174091 7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 14 ==== mds_table_request(anchortable agree 70 tid 168 82 bytes) v1 ==== 98+0+0 (1154743153 0 0) 0x246dd20
10.06.02_22:32:08.174119 7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 70 tid 168 82 bytes) v1
10.06.02_22:32:08.174131 7f881dfdb910 mds1.tableclient(anchortable) got agree on 70 atid 168
10.06.02_22:32:08.202508 7f881dfdb910 mds1.tableclient(anchortable) _logged_ack 165
10.06.02_22:32:08.202530 7f881dfdb910 mds1.tableclient(anchortable) _logged_ack 165
<crash>
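
The behavior this commit describes (hold requests back until the table server is active, then send them) might be sketched as below. The class and its members are hypothetical and much simpler than the real MDSTableClient; the sketch only shows that a request prepared before activation goes out once, at activation time, rather than once immediately and again on resend.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch; not the real MDSTableClient.
class TableClientSketch {
 public:
  // Queue the request; only send right away if the server is active.
  void prepare(std::uint64_t reqid, const std::string& payload) {
    pending_[reqid] = payload;
    if (server_active_) send(reqid);
  }

  // Called when the table server goes active: (re)send everything pending.
  void handle_server_active() {
    server_active_ = true;
    for (const auto& kv : pending_) send(kv.first);
  }

  const std::vector<std::uint64_t>& sent() const { return sent_; }

 private:
  // Stands in for putting a message on the wire.
  void send(std::uint64_t reqid) { sent_.push_back(reqid); }

  bool server_active_ = false;
  std::map<std::uint64_t, std::string> pending_;
  std::vector<std::uint64_t> sent_;
};
```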

15 years ago mds: do not reset filelock state when checking max_size during recovery
Sage Weil [Thu, 3 Jun 2010 05:14:54 +0000 (22:14 -0700)]
mds: do not reset filelock state when checking max_size during recovery

This was broken by d5574993 (probably, that commit fixed a similar
problem).  The rejoin_ack initializes replica state properly, so we can't
go changing it now.  I'm not sure why this was resetting the state to
LOCK, because that's clearly not allowed.

Print when check_max_size does a no-op so that this is a bit easier to see
next time.

15 years ago mds: lock->sync replica state is lock, not sync
Sage Weil [Thu, 3 Jun 2010 04:33:40 +0000 (21:33 -0700)]
mds: lock->sync replica state is lock, not sync

It's not readable yet.  And after the lock->sync gather completes we send
out a SYNC.

Fixes failed assertion like:

10.06.02_21:27:04.444202 7f17a25ac910 mds1.locker handle_file_lock a=sync on (ifile sync) from mds0 [inode 1 [...2,head] / rep@0.2 v7 snaprealm=0xe27400 f(v0 m10.06.02_21:26:13.366344 1=0+1) ds=1=0+1 rb=0 rf=0 rd=0 (iauth sync) (ilink sync) (idft sync) (isnap sync) (inest sync) (ifile sync) (ixattr sync) (iversion lock) | nref=1 0x7f179c006280]
mds/Locker.cc: In function 'void Locker::handle_file_lock(ScatterLock*, MLock*)':
mds/Locker.cc:3468: FAILED assert(lock->get_state() == 2 || lock->get_state() == 15 || lock->get_state() == 21)
 1: (Locker::handle_file_lock(ScatterLock*, MLock*)+0x1d8) [0x86d70a]
 2: (Locker::handle_lock(MLock*)+0x191) [0x86e30f]
 3: (Locker::dispatch(Message*)+0x41) [0x870f27]
 4: (MDS::_dispatch(Message*)+0x1a17) [0x7364cb]
 5: (MDS::ms_dispatch(Message*)+0x2f) [0x737961]
 6: (Messenger::ms_deliver_dispatch(Message*)+0x55) [0x72918d]
 7: (SimpleMessenger::dispatch_entry()+0x532) [0x71710a]
 8: (SimpleMessenger::DispatchThread::entry()+0x29) [0x711f25]
 9: (Thread::_entry_func(void*)+0x20) [0x7232f4]
 10: /lib/libpthread.so.0 [0x7f17a407073a]
 11: (clone()+0x6d) [0x7f17a329469d]

Signed-off-by: Sage Weil <sage@newdream.net>
15 years ago msg: add missing msg_types.cc
Sage Weil [Thu, 3 Jun 2010 02:37:44 +0000 (19:37 -0700)]
msg: add missing msg_types.cc

15 years ago mds: add export_dir command
Sage Weil [Wed, 2 Jun 2010 19:40:23 +0000 (12:40 -0700)]
mds: add export_dir command

15 years ago mds: add MDCache::cache_traverse()
Sage Weil [Wed, 2 Jun 2010 19:40:15 +0000 (12:40 -0700)]
mds: add MDCache::cache_traverse()

15 years ago initscript: unmount btrfs if we mounted it
Sage Weil [Wed, 2 Jun 2010 18:32:22 +0000 (11:32 -0700)]
initscript: unmount btrfs if we mounted it

15 years ago move addr parse() into entity_addr_t
Sage Weil [Wed, 2 Jun 2010 05:07:15 +0000 (22:07 -0700)]
move addr parse() into entity_addr_t

15 years ago tcp: parse ipv4 and ipv6 addresses
Sage Weil [Wed, 2 Jun 2010 05:02:03 +0000 (22:02 -0700)]
tcp: parse ipv4 and ipv6 addresses

15 years ago mon: fix unsynchronized clock logic;
Greg Farnum [Wed, 2 Jun 2010 17:12:39 +0000 (10:12 -0700)]
mon: fix unsynchronized clock logic;
change output for clarity

15 years ago mds: lookup exact snap dn on import
Sage Weil [Tue, 1 Jun 2010 23:34:16 +0000 (16:34 -0700)]
mds: lookup exact snap dn on import

15 years ago mds: update dn->first too when lock state adjusts inode->first
Sage Weil [Tue, 1 Jun 2010 23:33:53 +0000 (16:33 -0700)]
mds: update dn->first too when lock state adjusts inode->first

This keeps dn->first in sync with inode->first

15 years ago mds: don't change lock states on replicated inode
Sage Weil [Tue, 1 Jun 2010 22:23:46 +0000 (15:23 -0700)]
mds: don't change lock states on replicated inode

The reconnect will infer some client caps, which will affect what lock
states we want.  If we're not replicated, fine, just pick something good.
Otherwise, try_eval() and go through the proper channels.

This _might_ be the source of #165...

15 years ago mds: fix root null deref in recalc_auth_bits
Sage Weil [Tue, 1 Jun 2010 22:02:56 +0000 (15:02 -0700)]
mds: fix root null deref in recalc_auth_bits

Root may be null if we don't have any subtrees besides ~mds$id.

15 years ago mds: adjust subtree map when unlinking dirs
Sage Weil [Tue, 1 Jun 2010 21:14:23 +0000 (14:14 -0700)]
mds: adjust subtree map when unlinking dirs

Otherwise we get subtree bounds in the stray dir and get confused down
the line.

15 years ago mds: discover snapped paths on retried ops
Sage Weil [Tue, 1 Jun 2010 18:38:22 +0000 (11:38 -0700)]
mds: discover snapped paths on retried ops

This is intended to mitigate a livelock issue with traversing to snapped
metadata.  The client specifies all snap requests relative to a non-snap
inode.  The traversal through the snapped portion of the namespace will
normally happen on the auth node, but the actual target may be on another
node that does not have that portion of the namespace.  To avoid indefinite
request ping-pong, the mds will begin to discover and replicate the snapped
path components if the request has been retried.

This doesn't perform optimally, but it will at least work.

15 years ago mon: add wiggle room for clock synchronization check
Greg Farnum [Tue, 1 Jun 2010 18:39:58 +0000 (11:39 -0700)]
mon: add wiggle room for clock synchronization check

15 years ago mds: add case for CEPH_LOCK_DVERSION to LockType
Greg Farnum [Tue, 1 Jun 2010 17:30:05 +0000 (10:30 -0700)]
mds: add case for CEPH_LOCK_DVERSION to LockType

15 years ago xlist: add assert to catch invalid iterator usage
Greg Farnum [Sun, 30 May 2010 01:36:05 +0000 (18:36 -0700)]
xlist: add assert to catch invalid iterator usage

15 years ago ObjectCacher: do not try to deref an invalidated xlist::iterator
Greg Farnum [Sat, 29 May 2010 18:02:53 +0000 (11:02 -0700)]
ObjectCacher: do not try to deref an invalidated xlist::iterator

Fixes #159

15 years ago paxos: fix store_state fix
Sage Weil [Fri, 28 May 2010 20:21:19 +0000 (13:21 -0700)]
paxos: fix store_state fix

15 years ago msgr: print bind errors to stderr
Sage Weil [Fri, 28 May 2010 19:59:25 +0000 (12:59 -0700)]
msgr: print bind errors to stderr

15 years ago paxos: cleanup
Sage Weil [Fri, 28 May 2010 19:50:37 +0000 (12:50 -0700)]
paxos: cleanup

15 years ago paxos: only store committed values in store_state
Sage Weil [Fri, 28 May 2010 19:48:41 +0000 (12:48 -0700)]
paxos: only store committed values in store_state

The uncommitted value is handled specially by handle_last()

15 years ago initscript: fix typo with $lockfile stuff
Sage Weil [Fri, 28 May 2010 19:41:41 +0000 (12:41 -0700)]
initscript: fix typo with $lockfile stuff

15 years ago paxos: set last_committed in share_state()
Sage Weil [Fri, 28 May 2010 19:37:24 +0000 (12:37 -0700)]
paxos: set last_committed in share_state()

It wasn't getting set for LAST message, which broke recovery somewhat.

Broken by 8e76c5a1d827e01f77149245679bd00ba27120e0.

15 years ago mds: fix null dn deref during anchor_prepare
Sage Weil [Thu, 27 May 2010 23:32:43 +0000 (16:32 -0700)]
mds: fix null dn deref during anchor_prepare

15 years ago config: parse in $host from conf file
Sage Weil [Thu, 27 May 2010 21:59:12 +0000 (14:59 -0700)]
config: parse in $host from conf file

So you can do stuff like
log dir = /data/$host
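
As an illustration only, the substitution could look like the helper below. The function name expand_host is hypothetical and this is not the actual ceph config parser; it is a plain textual replace, so a longer variable name that merely starts with "$host" would also be rewritten.

```cpp
#include <string>

// Hypothetical helper: replace every "$host" in a config value with the
// given host name. Purely textual; it does not check what follows "$host".
std::string expand_host(std::string value, const std::string& host) {
  const std::string var = "$host";
  std::string::size_type pos = 0;
  while ((pos = value.find(var, pos)) != std::string::npos) {
    value.replace(pos, var.size(), host);
    pos += host.size();  // continue after the text just inserted
  }
  return value;
}
```

With a host name of node1, the value /data/$host from the example above becomes /data/node1.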

15 years ago osdmaptool: include raw, up, acting mappings
Sage Weil [Thu, 27 May 2010 20:47:00 +0000 (13:47 -0700)]
osdmaptool: include raw, up, acting mappings

15 years ago osdmap: assert maxrep >= minrep
Sage Weil [Thu, 27 May 2010 20:46:51 +0000 (13:46 -0700)]
osdmap: assert maxrep >= minrep

15 years ago mkcephfs: pass -c to cmon --mkfs
Sage Weil [Thu, 27 May 2010 19:59:41 +0000 (12:59 -0700)]
mkcephfs: pass -c to cmon --mkfs

15 years ago osd: warn, don't crash, on purged_snaps shrinkage
Sage Weil [Tue, 25 May 2010 22:23:19 +0000 (15:23 -0700)]
osd: warn, don't crash, on purged_snaps shrinkage

15 years ago initscript: incorporate Josef's fedora fixes
Sage Weil [Thu, 27 May 2010 21:58:56 +0000 (14:58 -0700)]
initscript: incorporate Josef's fedora fixes

Add 'status' command.
Add chkconfig line.
Do lockfile stuff only if /var/run/subsys exists.

Still specifying the runlevels, though.  The init script bails out (with
success code) if the ceph.conf is missing.

15 years ago ceph.spec: build-required libatomic_ops-devel, not libatomic_ops
Sage Weil [Thu, 27 May 2010 21:24:37 +0000 (14:24 -0700)]
ceph.spec: build-required libatomic_ops-devel, not libatomic_ops

And no perl-devel.

15 years ago sample.ceph.conf: include debug options, commented out
Sage Weil [Thu, 27 May 2010 04:47:35 +0000 (21:47 -0700)]
sample.ceph.conf: include debug options, commented out

15 years ago rados: you can now set the crush rule to use when creating a pool
Greg Farnum [Wed, 26 May 2010 23:58:31 +0000 (16:58 -0700)]
rados: you can now set the crush rule to use when creating a pool

15 years ago librados: add crush_rule parameter to create_pool functions
Greg Farnum [Wed, 26 May 2010 23:44:33 +0000 (16:44 -0700)]
librados: add crush_rule parameter to create_pool functions

15 years ago objecter: add optional crush_rule parameter; set in pool_op_submit as needed
Greg Farnum [Wed, 26 May 2010 23:44:17 +0000 (16:44 -0700)]
objecter: add optional crush_rule parameter; set in pool_op_submit as needed

15 years ago mon: add crush_rule data member to MPoolOp; use it in new pool creation on mon
Greg Farnum [Wed, 26 May 2010 23:12:02 +0000 (16:12 -0700)]
mon: add crush_rule data member to MPoolOp; use it in new pool creation on mon

15 years ago mds: LAZYIO is not liked, but it is allowed
Sage Weil [Wed, 26 May 2010 21:47:32 +0000 (14:47 -0700)]
mds: LAZYIO is not liked, but it is allowed

15 years ago client: update ioctl.h (lazyio, invalidate_range)
Sage Weil [Wed, 26 May 2010 21:31:03 +0000 (14:31 -0700)]
client: update ioctl.h (lazyio, invalidate_range)

15 years ago mds: include LAZYIO cap in sync->mix and mix->sync transitions
Sage Weil [Wed, 26 May 2010 21:30:49 +0000 (14:30 -0700)]
mds: include LAZYIO cap in sync->mix and mix->sync transitions

15 years ago mds: include LAZYIO in CEPH_CAP_ANY set
Sage Weil [Wed, 26 May 2010 20:49:39 +0000 (13:49 -0700)]
mds: include LAZYIO in CEPH_CAP_ANY set

15 years ago mon: warn to log, not just dout, on clock drift
Greg Farnum [Wed, 26 May 2010 21:35:05 +0000 (14:35 -0700)]
mon: warn to log, not just dout, on clock drift

15 years ago mon: detect and warn on clock synchronization problems;
Greg Farnum [Wed, 26 May 2010 20:54:53 +0000 (13:54 -0700)]
mon: detect and warn on clock synchronization problems;
change MMonPaxos::lease_expire to lease_timestamp

15 years ago ceph: add conversion to qemu coding style
Christian Brunner [Wed, 26 May 2010 20:38:14 +0000 (22:38 +0200)]
ceph: add conversion to qemu coding style

Hi Yehuda,

I've added a small hack to make push_to_qemu.pl convert tabs to spaces.

Christian

15 years ago paxos: use helper to store committed state; fix master mon catch up using stash
Sage Weil [Wed, 26 May 2010 17:59:21 +0000 (10:59 -0700)]
paxos: use helper to store committed state; fix master mon catch up using stash

The catch up logic in handle_last didn't handle the stashed state, so we
crashed and burned if it was the master that was behind and caught up.
Use a helper that does the work for handle_commit AND handle_last.

15 years ago cfuse: bail out on mount() errors
Sage Weil [Wed, 26 May 2010 17:01:49 +0000 (10:01 -0700)]
cfuse: bail out on mount() errors

15 years ago Merge branch 'lazyio' into unstable
Sage Weil [Tue, 25 May 2010 23:40:34 +0000 (16:40 -0700)]
Merge branch 'lazyio' into unstable

Conflicts:
src/mds/locks.c

15 years ago interval_set: fix union_of, intersection_of size accounting
Sage Weil [Tue, 25 May 2010 21:44:05 +0000 (14:44 -0700)]
interval_set: fix union_of, intersection_of size accounting

15 years ago init-ceph: use = not == for comparison operator
Sage Weil [Tue, 25 May 2010 20:47:14 +0000 (13:47 -0700)]
init-ceph: use = not == for comparison operator

15 years ago Merge branch 'mds_dentries' into unstable
Sage Weil [Tue, 25 May 2010 20:13:29 +0000 (13:13 -0700)]
Merge branch 'mds_dentries' into unstable

15 years ago mds: better debugging on rmdir
Sage Weil [Tue, 25 May 2010 20:01:48 +0000 (13:01 -0700)]
mds: better debugging on rmdir

15 years ago mds: fix scatterlock gather, writebehind
Sage Weil [Tue, 25 May 2010 20:01:37 +0000 (13:01 -0700)]
mds: fix scatterlock gather, writebehind

We stopped overloading the virtual is_updated() when we renamed to
is_dirty.

broken by 7f19ee1ac36095cd4d4c169858d93149f083318e

15 years ago mds: make export targets stay in mdsmap for a while
Sage Weil [Mon, 24 May 2010 23:49:05 +0000 (16:49 -0700)]
mds: make export targets stay in mdsmap for a while

This limits the mdsmap churn some.  Keep old targets around for at least
min-max iterations before removing them.

15 years ago mds: balancer cleanup
Sage Weil [Mon, 24 May 2010 23:08:50 +0000 (16:08 -0700)]
mds: balancer cleanup

15 years ago mds: warn on dn release that dne
Sage Weil [Mon, 24 May 2010 22:50:20 +0000 (15:50 -0700)]
mds: warn on dn release that dne

15 years ago rbd: modify rbd on-disk header
Yehuda Sadeh [Mon, 24 May 2010 23:11:29 +0000 (16:11 -0700)]
rbd: modify rbd on-disk header

15 years ago rbd: fix push_to_qemu.pl
Yehuda Sadeh [Mon, 24 May 2010 22:58:36 +0000 (15:58 -0700)]
rbd: fix push_to_qemu.pl

15 years ago mon: roll mkmonfs functionality into cmon --mkfs
Sage Weil [Mon, 24 May 2010 22:28:59 +0000 (15:28 -0700)]
mon: roll mkmonfs functionality into cmon --mkfs

15 years ago filestore: make mkfs() zap any file or dirs it finds
Sage Weil [Mon, 24 May 2010 22:24:16 +0000 (15:24 -0700)]
filestore: make mkfs() zap any file or dirs it finds

15 years ago rbd: modify header, add utility to ease sync with qemu tree
Yehuda Sadeh [Mon, 24 May 2010 21:06:43 +0000 (14:06 -0700)]
rbd: modify header, add utility to ease sync with qemu tree

15 years ago osd: keep recovery ops in sync with pull
Sage Weil [Mon, 24 May 2010 20:50:00 +0000 (13:50 -0700)]
osd: keep recovery ops in sync with pull

Call start_recovery_op from pull() instead of fixing every caller (some
were wrong).  This keeps the recovery state in sync with pulling state,
even when pull() has to pull something different (head, snapdir) first.

Fixes this crash:
osd/PG.cc: In function 'void PG::finish_recovery_op(const sobject_t&, bool)':
osd/PG.cc:1842: FAILED assert(recovering_oids.count(soid))
 1: (PG::finish_recovery_op(sobject_t const&, bool)+0x14e) [0x74caf6]
 2: (ReplicatedPG::sub_op_push(MOSDSubOp*)+0x1da8) [0x669292]
 3: (ReplicatedPG::do_sub_op(MOSDSubOp*)+0x109) [0x671a73]
 4: (OSD::dequeue_op(PG*)+0x23c) [0x6bda00]
 5: (OSD::OpWQ::_process(PG*)+0x21) [0x7387c9]
 6: (ThreadPool::WorkQueue<PG>::_void_process(void*)+0x28) [0x6f5e12]
 7: (ThreadPool::worker()+0x23a) [0x7f2404]
 8: (ThreadPool::WorkThread::entry()+0x19) [0x73b783]
 9: (Thread::_entry_func(void*)+0x20) [0x64f92a]
 10: /lib/libpthread.so.0 [0x7f7a12cf473a]
 11: (clone()+0x6d) [0x7f7a11f1e69d]

15 years ago mon: no need for 'whoami' file in store
Sage Weil [Mon, 17 May 2010 22:50:26 +0000 (15:50 -0700)]
mon: no need for 'whoami' file in store

The monitor rank is provided during startup.  No need to verify it against
the monitor store, especially since the stores are otherwise identical.

This makes it simpler to restore/duplicate/whatever a monitor.. just copy
the files.

15 years ago reword blacklisted output so it's clearly discussing MDSes and not OSDs
Greg Farnum [Sun, 23 May 2010 22:13:33 +0000 (15:13 -0700)]
reword blacklisted output so it's clearly discussing MDSes and not OSDs

15 years ago uclient: don't unlink null dentry when getting null linkage in mds reply
Sage Weil [Sat, 22 May 2010 16:56:27 +0000 (09:56 -0700)]
uclient: don't unlink null dentry when getting null linkage in mds reply

This broke semi-recently when the mds started returning null linkages (and
associated leases).

15 years ago mon: trim pgmap states even when we don't have a full quorum
Sage Weil [Fri, 21 May 2010 23:17:48 +0000 (16:17 -0700)]
mon: trim pgmap states even when we don't have a full quorum

15 years ago paxos: recover using stashed latest when state histories don't overlap
Sage Weil [Fri, 21 May 2010 23:17:34 +0000 (16:17 -0700)]
paxos: recover using stashed latest when state histories don't overlap

If we don't have incremental states to catch up, jump to the latest.
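
The decision described above can be sketched as a comparison of version ranges. Everything here is an assumption for illustration (the enum, the function name, and the parameter names); it is not the Paxos code from this commit. The idea is that incremental catch-up needs the peer's oldest retained version to be no more than one past our newest, and otherwise the stashed latest state is the only option.

```cpp
#include <cstdint>

using version_t = std::uint64_t;

enum class CatchUp { None, Incremental, StashedLatest };

// Hypothetical helper.
//   my_last    : newest committed version we hold
//   peer_first : oldest committed version the peer still holds
//   peer_last  : newest committed version the peer holds
CatchUp choose_catch_up(version_t my_last, version_t peer_first,
                        version_t peer_last) {
  if (peer_last <= my_last) return CatchUp::None;  // nothing to learn
  if (peer_first <= my_last + 1)
    return CatchUp::Incremental;  // histories overlap or are adjacent
  return CatchUp::StashedLatest;  // gap: jump straight to the latest state
}
```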

15 years ago mds: anchor multiversion inode before unlinking it
Sage Weil [Fri, 21 May 2010 21:55:38 +0000 (14:55 -0700)]
mds: anchor multiversion inode before unlinking it

If we are going to create a remote dentry linking to a multiversion inode
we're unlinking, make sure it's anchored!

This is a bit fugly because it mirrors the logic in journal_cow_dentry. No
obvious way to use a generic helper for that though.

15 years ago librados.h: add other TMAP definitions
Yehuda Sadeh [Fri, 21 May 2010 20:44:40 +0000 (13:44 -0700)]
librados.h: add other TMAP definitions

also add a comment in rados.h about the defines in librados.h