git.apps.os.sepia.ceph.com Git - ceph.git/log
Greg Farnum [Mon, 7 Jun 2010 21:57:47 +0000 (14:57 -0700)]
buffer: fix padding distances
Greg Farnum [Mon, 7 Jun 2010 19:01:34 +0000 (12:01 -0700)]
osd: init auid to CEPH_AUTH_UID_DEFAULT in case authorizer doesn't set it.
We should probably also require the authorizer to set it for us.
Sage Weil [Tue, 8 Jun 2010 20:38:15 +0000 (13:38 -0700)]
mds: scan stray dir, eval strays on mds startup
Sage Weil [Tue, 8 Jun 2010 16:42:40 +0000 (09:42 -0700)]
mon: make mon lease clock check protocol change backward compatible
Sage Weil [Tue, 8 Jun 2010 05:18:59 +0000 (22:18 -0700)]
qa: add untar_snap_rm.sh
Sage Weil [Tue, 8 Jun 2010 05:05:08 +0000 (22:05 -0700)]
osd: print rollback osd_op nicely
Sage Weil [Mon, 7 Jun 2010 23:03:37 +0000 (16:03 -0700)]
mds: wire Connection to Session when Session already exists on connect
Sage Weil [Mon, 7 Jun 2010 23:03:14 +0000 (16:03 -0700)]
mds: funnel mds->client messages through single Session* helper
Simplify callers where possible.
Sage Weil [Mon, 7 Jun 2010 22:42:19 +0000 (15:42 -0700)]
mon: simplify clock drift checks
Ignore lease sent vs lease_ack receive times bc multiple lease msgs may
be in flight and the ack may be from a previous one. This was causing
spurious
[WRN] : lease_ack from follower sent at time(10.06.07_15:07:11.441391), before lease extend was sent (10.06.07_15:07:11.826340)! Clocks not synchronized.
messages.
It is sufficient to just check for messages received from the future. To
avoid cruftiness trying to do that when the only stamp is the lease
timeout, add a sent_timestamp to the message and use that instead. This
simplifies things quite a bit, at the expense of not being backward
compatible.
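A minimal sketch of the "received from the future" check described above. The struct, function name, and the `allowed_drift` tolerance are illustrative assumptions, not the monitor's actual types or code:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-ins for illustration; these are not Ceph's real types.
using Clock = std::chrono::system_clock;
using TimePoint = Clock::time_point;

struct LeaseMessage {
  TimePoint sent_timestamp;  // stamped by the sender just before the message goes out
};

// Returns true if the message appears to have been sent in the future,
// i.e. the sender's clock is ahead of ours by more than allowed_drift.
// allowed_drift is an assumed tolerance parameter.
inline bool from_the_future(const LeaseMessage& m, TimePoint now,
                            std::chrono::milliseconds allowed_drift) {
  return m.sent_timestamp > now + allowed_drift;
}
```

Because only the sender's stamp and the local receive time are compared, it no longer matters which of several in-flight lease messages a given ack belongs to.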
Sage Weil [Mon, 7 Jun 2010 22:04:37 +0000 (15:04 -0700)]
monc: behave in ms_handle_reset if cur_mon is < 0
Sage Weil [Mon, 7 Jun 2010 22:03:28 +0000 (15:03 -0700)]
msgr: don't throttle.get 0
Sage Weil [Mon, 7 Jun 2010 22:00:23 +0000 (15:00 -0700)]
throttle: allow put(0)
Still returns a consistent value for the count.
Sage Weil [Mon, 7 Jun 2010 21:59:16 +0000 (14:59 -0700)]
msgr: don't throttle.put 0
Sage Weil [Mon, 7 Jun 2010 21:47:56 +0000 (14:47 -0700)]
Merge remote branch 'origin/msgr' into unstable
Sage Weil [Mon, 7 Jun 2010 19:05:55 +0000 (12:05 -0700)]
mds: use cap on head if there is none on the snapped inode
This is needed, in particular, when we're flushing snap data on an inode
that already got COWed.
Sage Weil [Mon, 7 Jun 2010 18:40:32 +0000 (11:40 -0700)]
osd: use low-level helper getting obc in sub_op_push
find_object_context does all sorts of stuff we don't need here: we know
which object the context is for. Just set it up.
Greg Farnum [Mon, 7 Jun 2010 12:55:05 +0000 (05:55 -0700)]
throttle: add asserts on max and change parameters where appropriate
Greg Farnum [Mon, 7 Jun 2010 12:54:47 +0000 (05:54 -0700)]
throttle: fix assert count to actually use count
Sage Weil [Mon, 7 Jun 2010 05:14:50 +0000 (22:14 -0700)]
crypto: don't clean up EVP table on every decrypt()
Don't think that's appropriate? And certainly doesn't happen for the
encrypt() case.
Sage Weil [Mon, 7 Jun 2010 05:14:21 +0000 (22:14 -0700)]
crypto: don't leak memory in CryptoAES::encrypt()
Sage Weil [Mon, 7 Jun 2010 05:04:29 +0000 (22:04 -0700)]
mon: don't leak MAuth
Greg Farnum [Sat, 5 Jun 2010 00:01:32 +0000 (17:01 -0700)]
throttle: use signed counters and assert that count never drops below 0
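The throttle commits in this log (signed counters, asserts, put(0) allowed) can be illustrated with a simplified sketch. `SimpleThrottle` and its method names are hypothetical, and the sketch has no locking or blocking, so it should not be read as the real Throttle class:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, single-threaded throttle used only to illustrate the idea.
class SimpleThrottle {
  int64_t count;  // signed, so an accounting bug shows up as a negative value
  int64_t max;

public:
  explicit SimpleThrottle(int64_t m) : count(0), max(m) { assert(m >= 0); }

  // Take c units if they fit under max; returns false otherwise.
  bool get_or_fail(int64_t c) {
    assert(c >= 0);
    if (count + c > max) return false;
    count += c;
    return true;
  }

  // Give back c units. put(0) is allowed and just reports the current count.
  int64_t put(int64_t c) {
    assert(c >= 0);
    assert(count >= c);  // the count must never drop below 0
    count -= c;
    return count;
  }
};
```

With a signed counter, returning more than was taken trips the assert immediately instead of wrapping around to a huge unsigned value.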
Greg Farnum [Fri, 4 Jun 2010 23:19:00 +0000 (16:19 -0700)]
msgr: Fix uses of get_[data, payload, middle] to use throttling-aware functions.
Greg Farnum [Fri, 4 Jun 2010 22:01:30 +0000 (15:01 -0700)]
msgr: put throttler usage on Message destruct
Greg Farnum [Fri, 4 Jun 2010 21:02:56 +0000 (14:02 -0700)]
osd: fix compile issues
Greg Farnum [Fri, 4 Jun 2010 21:02:40 +0000 (14:02 -0700)]
msgr: switch to get/set functions for Message:throttler
Greg Farnum [Fri, 4 Jun 2010 20:37:25 +0000 (13:37 -0700)]
osd: add osd_client_message_size_cap option to config; default 500MB
And change the name in cosd to be that
Sage Weil [Fri, 4 Jun 2010 23:31:42 +0000 (16:31 -0700)]
objectcacher: add verify_stats() debugging helper
Sage Weil [Fri, 4 Jun 2010 23:31:08 +0000 (16:31 -0700)]
objectcacher: fix stat accounting when resizing bufferheads
Must keep stats in mind when adjusting bufferheads!
Sage Weil [Fri, 4 Jun 2010 23:10:44 +0000 (16:10 -0700)]
objectcacher: cleanup formatting
Sage Weil [Fri, 4 Jun 2010 23:05:55 +0000 (16:05 -0700)]
objectcacher: fix use of invalid iterator in map_write()
The p points to bh, which is removed by merge_left. Move it back to final,
so we can advance to the new next a few lines down.
Sage Weil [Fri, 4 Jun 2010 21:26:05 +0000 (14:26 -0700)]
objectcacher: match states before merging in map_write
The caller is going to set us to dirty, so we don't care what state we
have, so long as the left and right bits we're merging match all is ok.
Yehuda Sadeh [Fri, 4 Jun 2010 23:20:35 +0000 (16:20 -0700)]
osd: fix rollback when head points at the rolled back snapshot
Greg Farnum [Fri, 4 Jun 2010 20:23:57 +0000 (13:23 -0700)]
msg: remove copy_payload and copy_data functions; change set to use throttler
Sage Weil [Fri, 4 Jun 2010 20:10:28 +0000 (13:10 -0700)]
Merge branch 'rbd' into unstable
Sage Weil [Fri, 4 Jun 2010 20:09:42 +0000 (13:09 -0700)]
osd: clean up rollback debug output
Sage Weil [Fri, 4 Jun 2010 19:54:25 +0000 (12:54 -0700)]
uclient: handle inode with no caps from mds
This happens when you readdir and some inodes are in a different snaprealm.
Greg Farnum [Fri, 4 Jun 2010 19:57:59 +0000 (12:57 -0700)]
osd: filter_xattrs on a rollback op
Greg Farnum [Fri, 4 Jun 2010 19:55:27 +0000 (12:55 -0700)]
osd: fix naughty iterator usage after invalidating it
Greg Farnum [Fri, 4 Jun 2010 19:19:06 +0000 (12:19 -0700)]
osd: _make_clone now properly duplicates xattrs
Greg Farnum [Fri, 4 Jun 2010 19:49:20 +0000 (12:49 -0700)]
osd: add filter_xattrs function to remove non-user xattrs from a map of them
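A sketch of what such a filter could look like. The signature, the string-to-string map type, and the idea of identifying user xattrs by a name prefix are assumptions for illustration; the commit itself does not show them:

```cpp
#include <cassert>
#include <map>
#include <string>

// Erase every attribute whose name does not start with user_prefix.
// The prefix is passed in because the real naming rule is not given here.
void filter_xattrs(std::map<std::string, std::string>& attrs,
                   const std::string& user_prefix) {
  for (auto it = attrs.begin(); it != attrs.end();) {
    if (it->first.compare(0, user_prefix.size(), user_prefix) != 0)
      it = attrs.erase(it);  // erase() returns the next valid iterator
    else
      ++it;
  }
}
```

Taking the iterator returned by `erase()` avoids touching an iterator after it has been invalidated.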
Greg Farnum [Fri, 4 Jun 2010 19:04:08 +0000 (12:04 -0700)]
progress
Sage Weil [Fri, 4 Jun 2010 18:07:09 +0000 (11:07 -0700)]
mds: fix straydn->first part deux
9ed0c30ecf6611193db52e1facc1f46b37f04bc4 forgot to remove the old code.
Greg Farnum [Fri, 4 Jun 2010 01:22:24 +0000 (18:22 -0700)]
debugging output
Greg Farnum [Fri, 4 Jun 2010 01:22:18 +0000 (18:22 -0700)]
rados: print out pool instead of object
Sage Weil [Fri, 4 Jun 2010 00:32:39 +0000 (17:32 -0700)]
mds: only purge dentries with no extra refs (besides dirty)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 4 Jun 2010 00:30:41 +0000 (17:30 -0700)]
mds: set straydn first to match inode on unlink
Sage Weil [Fri, 4 Jun 2010 00:26:11 +0000 (17:26 -0700)]
mds: don't export stray (~mdsfoo/stray), and ignore in balancer
We _must_ keep mdsdir and stray on local mds for normal operations.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 23:55:14 +0000 (16:55 -0700)]
mds: make discover work for multiversion inodes (e.g. dirs)
If we don't have the specific snap, look up the head and see if it's
multiversion.
This doesn't give us a "range" lookup like we get with dentries because
the inode_map is a hash, not a map. However, we shouldn't need it,
because we always have a specific snapped inode we're looking for (because
it is referred to by a dentry) or we are looking at a multiversion head.
Sage Weil [Thu, 3 Jun 2010 23:19:36 +0000 (16:19 -0700)]
mds: fix CDir::take_sub_waiting vs dnwaiter pin
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 23:09:23 +0000 (16:09 -0700)]
mds: kill open_foreign_stray; but open remote mdsdirs instead
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 22:00:44 +0000 (15:00 -0700)]
mds: fix cap clone logic to look at matching first, not last
The cap->client_follows is set to follows+1 by flushsnap, since the real
follows value isn't convenient. But it is enough to know that it is more
than the old version's follows, so do that.
Yehuda Sadeh [Thu, 3 Jun 2010 23:44:31 +0000 (16:44 -0700)]
libatomic: fix assert.h compilation
Greg Farnum [Thu, 3 Jun 2010 23:40:32 +0000 (16:40 -0700)]
msgr: add Throttle pointer to Policy
Greg Farnum [Thu, 3 Jun 2010 23:20:47 +0000 (16:20 -0700)]
Merge branch 'unstable' into msgr
Greg Farnum [Thu, 3 Jun 2010 20:54:24 +0000 (13:54 -0700)]
osd: make sure we don't return EAGAIN to client
Sage Weil [Thu, 3 Jun 2010 21:14:04 +0000 (14:14 -0700)]
mds: open past snap parents at end of rejoin phase
We really need past parents open before we go active or else anything
that needs to build a snap context will fail.
Sage Weil [Thu, 3 Jun 2010 20:48:10 +0000 (13:48 -0700)]
mdsmap: show individual mds states in summary
Sage Weil [Thu, 3 Jun 2010 20:26:39 +0000 (13:26 -0700)]
osd: improve snap_trimmer debug output
Sage Weil [Thu, 3 Jun 2010 20:24:48 +0000 (13:24 -0700)]
mds: another cap_exports message/mdcache encoding fix
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 20:08:16 +0000 (13:08 -0700)]
mds: only adjust dn->first on lock msg if !multiversion
The multiversion dn->first references a range of inode versions; don't
drag it forward. Fixes 38cb2403c043e6676b563197d086edeb11b71ddf.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 19:03:23 +0000 (12:03 -0700)]
mds: more fix cap_exports typing
Sage Weil [Thu, 3 Jun 2010 18:59:50 +0000 (11:59 -0700)]
mds: fix scatter_nudge infinite loop
Sage Weil [Thu, 3 Jun 2010 18:08:00 +0000 (11:08 -0700)]
mds: fix ESessions type
Sage Weil [Thu, 3 Jun 2010 18:04:05 +0000 (11:04 -0700)]
mds: drag in->first forward with straydn in handle_dentry_unlink
Sage Weil [Thu, 3 Jun 2010 17:38:56 +0000 (10:38 -0700)]
mds: fix anchorclient dup lookups, again
Sage Weil [Thu, 3 Jun 2010 17:17:37 +0000 (10:17 -0700)]
mds: only log successful requests as completed
Sage Weil [Thu, 3 Jun 2010 17:09:19 +0000 (10:09 -0700)]
mds: anchor dir on mksnap
CC Lien [Thu, 3 Jun 2010 16:45:10 +0000 (09:45 -0700)]
mkcephfs: error when creating journal file in a directory that differs from OSD data dir
mkcephfs creates osd data directory automatically, but it doesn't create a
directory for the osd journal file.
When you have a journal file in a directory that differs from the osd data
directory in your configuration, like:
osd data = /osd/osd$id
osd journal = /journal/osd$id
You will receive a "mount failed to open journal /journal/osd0/journal: No
such file or directory" error when doing mkcephfs
Signed-off-by: CC Lien <cc_lien@tcloudcomputing.com>
Sage Weil [Thu, 3 Jun 2010 16:40:57 +0000 (09:40 -0700)]
mds: fix mismatched cap_exports type between msg and MDCache
The types need to match because they are encoded/decoded interchangeably.
See MMDSCacheRejoin::decode() and MDCache::rejoin_send_rejoins().
Sage Weil [Thu, 3 Jun 2010 16:33:27 +0000 (09:33 -0700)]
mds: fix trim_unlinked iterator badness
We may remove the next inode in the map. Queue up unlinked roots first,
which we know remove_inode_recursive() won't reach, and iterate over those.
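The pattern described here (collect the roots first, then remove) can be sketched on a toy map. `Node`, `NodeMap`, and both functions are hypothetical simplifications, not the MDCache code:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-in for an inode: an "unlinked root" flag plus child ids.
struct Node {
  bool unlinked_root;
  std::vector<int> children;
};

using NodeMap = std::map<int, Node>;

// Removes a node and everything below it. This may erase arbitrary map
// entries, including whichever one a caller's iterator would visit next.
void remove_recursive(NodeMap& nodes, int id) {
  auto it = nodes.find(id);
  if (it == nodes.end()) return;
  std::vector<int> kids = it->second.children;  // copy before erasing
  nodes.erase(it);
  for (int k : kids) remove_recursive(nodes, k);
}

// Safe version: gather the roots first, then remove them. A root is never
// a child of another node, so removing one cannot erase another queued root.
void trim_unlinked(NodeMap& nodes) {
  std::vector<int> roots;
  for (const auto& p : nodes)
    if (p.second.unlinked_root) roots.push_back(p.first);
  for (int id : roots) remove_recursive(nodes, id);
}
```

Iterating over the separate `roots` vector means no live map iterator exists while entries are being erased.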
Sage Weil [Thu, 3 Jun 2010 16:28:15 +0000 (09:28 -0700)]
mds: define MDS_REF_SET in unstable
Sage Weil [Thu, 3 Jun 2010 16:27:56 +0000 (09:27 -0700)]
mds: clear dirtyscattered in remove_inode()
Sage Weil [Thu, 3 Jun 2010 16:17:13 +0000 (09:17 -0700)]
mds: allow dup lookups in anchorclient
It's not practical for callers to avoid dups, particularly since they may
be unaware of each other. And it's trivial to support it here.
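One way duplicate lookups can be supported is to keep a list of waiters per key and answer all of them from a single reply. This is an illustrative sketch with hypothetical names, not the anchorclient implementation:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical lookup client that tolerates duplicate lookups of one key.
class LookupClient {
  std::map<int, std::vector<std::function<void(int)>>> pending;
  int requests_sent = 0;

public:
  void lookup(int key, std::function<void(int)> on_finish) {
    auto& waiters = pending[key];
    if (waiters.empty()) ++requests_sent;  // only the first caller sends a request
    waiters.push_back(std::move(on_finish));
  }

  // Deliver the result to every caller that asked for this key.
  void handle_reply(int key, int result) {
    auto it = pending.find(key);
    if (it == pending.end()) return;
    auto waiters = std::move(it->second);
    pending.erase(it);
    for (auto& w : waiters) w(result);
  }

  int sent() const { return requests_sent; }
};
```

Callers do not need to know about each other; the second lookup for a key simply queues behind the first.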
Sage Weil [Thu, 3 Jun 2010 16:01:58 +0000 (09:01 -0700)]
assert: fix assert vs atomic_ops.h breakage
This was causing us to use the system assert, not the ceph one.
Sage Weil [Thu, 3 Jun 2010 15:19:24 +0000 (08:19 -0700)]
mds: ensure past snap parents get opened before doing file recovery
Otherwise we can fail to get_snaps() when we start the recovery:
#0 0x00007fa037625f55 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007fa037628d90 in *__GI_abort () at abort.c:88
#2 0x00007fa03761f07a in *__GI___assert_fail (assertion=0x9f3d81 "oldparent", file=<value optimized out>, line=170, function=0x9f4680 "void SnapRealm::build_snap_set(std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> >&, snapid_t&, snapid_t&, snapid_t&, snapid_t, snapid_t)") at assert.c:78
#3 0x00000000008f7656 in SnapRealm::build_snap_set (this=0x222a300, s=..., max_seq=..., max_last_created=..., max_last_destroyed=..., first=..., last=...) at mds/snap.cc:170
#4 0x00000000008f7e8c in SnapRealm::check_cache (this=0x222a300) at mds/snap.cc:194
#5 0x00000000008f892a in SnapRealm::get_snaps (this=0x222a300) at mds/snap.cc:209
#6 0x00000000007f2c85 in MDCache::queue_file_recover (this=0x2202a00, in=0x7fa0340f5450) at mds/MDCache.cc:4398
#7 0x0000000000865011 in Locker::file_recover (this=0x21fe850, lock=0x7fa0340f59b0) at mds/Locker.cc:3437
#8 0x00000000007e5899 in MDCache::start_files_to_recover (this=0x2202a00, recover_q=..., check_q=...) at mds/MDCache.cc:4503
#9 0x00000000007e887e in MDCache::rejoin_gather_finish (this=0x2202a00) at mds/MDCache.cc:3904
#10 0x00000000007ed6cf in MDCache::handle_cache_rejoin_strong (this=0x2202a00, strong=0x7fa030025440) at mds/MDCache.cc:3618
#11 0x00000000007ed84a in MDCache::handle_cache_rejoin (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:3063
#12 0x00000000007fade6 in MDCache::dispatch (this=0x2202a00, m=0x7fa030025440) at mds/MDCache.cc:5668
#13 0x0000000000735313 in MDS::_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1390
#14 0x00000000007372a3 in MDS::ms_dispatch (this=0x22014d0, m=0x7fa030025440) at mds/MDS.cc:1295
#15 0x0000000000728b97 in Messenger::ms_deliver_dispatch(Message*) ()
#16 0x0000000000716c5e in SimpleMessenger::dispatch_entry (this=0x2202350) at msg/SimpleMessenger.cc:332
#17 0x00000000007119c7 in SimpleMessenger::DispatchThread::entry (this=0x2202760) at msg/SimpleMessenger.h:494
#18 0x000000000071f4e7 in Thread::_entry_func (arg=0x2202760) at ./common/Thread.h:39
#19 0x00007fa03849673a in start_thread (arg=<value optimized out>) at pthread_create.c:300
#20 0x00007fa0376bf6dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 15:04:33 +0000 (08:04 -0700)]
mds: relax lock state before encoding export (and lock state)
We can't fuss with lock state in the finish method because we already
encoded the old state to the new auth, and we are now just a replica.
We do still want to relax the lock state to be more replica friendly,
though, so do that in the encode_export_inode method.
Sage Weil [Thu, 3 Jun 2010 06:07:42 +0000 (23:07 -0700)]
mds: do not bother tableserver until it is active
We resend these requests when the TS does go active, and if we send dups
things get all screwed up (see partial log below).
Should we worry about dup queries?
10.06.02_22:32:08.112834
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 69 148 bytes) v1 -- ?+0 0x7f88180e4580
10.06.02_22:32:08.116427
7f881dfdb910 mds1.tableserver(anchortable) handle_mds_recovery mds0
10.06.02_22:32:08.116449
7f881dfdb910 mds1.tableclient(anchortable) handle_mds_recovery mds0
10.06.02_22:32:08.116457
7f881dfdb910 mds1.tableclient(anchortable) resending 69
10.06.02_22:32:08.116470
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 69 148 bytes) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.116840
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 7 ==== mds_table_request(anchortable agree 69 tid 165) v1 ==== 16+0+0 (1328913316 0 0) 0x2362830
10.06.02_22:32:08.116861
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 165) v1
10.06.02_22:32:08.116872
7f881dfdb910 mds1.tableclient(anchortable) got agree on 69 atid 165
10.06.02_22:32:08.127662
7f881dfdb910 mds1.tableclient(anchortable) commit 165
10.06.02_22:32:08.127683
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable commit tid 165) v1 -- ?+0 0x7f8818114860
10.06.02_22:32:08.128244
7f881dfdb910 mds1.tableclient(anchortable) _prepare 70
10.06.02_22:32:08.128261
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable prepare 70 82 bytes) v1 -- ?+0 0x7f88180e4580
10.06.02_22:32:08.131873
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 8 ==== mds_table_request(anchortable agree 69 tid 165 148 bytes) v1 ==== 164+0+0 (4238497285 0 0) 0x2362310
10.06.02_22:32:08.131900
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 165 148 bytes) v1
10.06.02_22:32:08.131911
7f881dfdb910 mds1.tableclient(anchortable) stray agree on 69 tid 165, already committing, resending COMMIT
10.06.02_22:32:08.131923
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable commit tid 165) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.144147
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 10 ==== mds_table_request(anchortable ack tid 165) v1 ==== 16+0+0 (584840829 0 0) 0x246dd20
10.06.02_22:32:08.144179
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable ack tid 165) v1
10.06.02_22:32:08.144195
7f881dfdb910 mds1.tableclient(anchortable) got ack on tid 165, logging
10.06.02_22:32:08.144217
7f881dfdb910 mds1.log submit_entry 5515297 ~17 : ETableClient anchortable ack tid 165
10.06.02_22:32:08.152419
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 11 ==== mds_table_request(anchortable agree 69 tid 166 148 bytes) v1 ==== 164+0+0 (4238497285 0 0) 0x2362830
10.06.02_22:32:08.152448
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 69 tid 166 148 bytes) v1
10.06.02_22:32:08.152460
7f881dfdb910 mds1.tableclient(anchortable) stray agree on 69 tid 166, sending ROLLBACK
10.06.02_22:32:08.152470
7f881dfdb910 -- 10.3.64.22:6802/7866 --> mds0 10.3.64.22:6803/13552 -- mds_table_request(anchortable rollback tid 166) v1 -- ?+0 0x7f8818120cb0
10.06.02_22:32:08.172729
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 13 ==== mds_table_request(anchortable ack tid 165) v1 ==== 16+0+0 (584840829 0 0) 0x2362310
10.06.02_22:32:08.172770
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable ack tid 165) v1
10.06.02_22:32:08.172786
7f881dfdb910 mds1.tableclient(anchortable) got ack on tid 165, logging
10.06.02_22:32:08.172806
7f881dfdb910 mds1.log submit_entry 5515318 ~17 : ETableClient anchortable ack tid 165
10.06.02_22:32:08.174091
7f881dfdb910 -- 10.3.64.22:6802/7866 <== mds0 10.3.64.22:6803/13552 14 ==== mds_table_request(anchortable agree 70 tid 168 82 bytes) v1 ==== 98+0+0 (1154743153 0 0) 0x246dd20
10.06.02_22:32:08.174119
7f881dfdb910 mds1.tableclient(anchortable) handle_request mds_table_request(anchortable agree 70 tid 168 82 bytes) v1
10.06.02_22:32:08.174131
7f881dfdb910 mds1.tableclient(anchortable) got agree on 70 atid 168
10.06.02_22:32:08.202508
7f881dfdb910 mds1.tableclient(anchortable) _logged_ack 165
10.06.02_22:32:08.202530
7f881dfdb910 mds1.tableclient(anchortable) _logged_ack 165
<crash>
Sage Weil [Thu, 3 Jun 2010 05:14:54 +0000 (22:14 -0700)]
mds: do not reset filelock state when checking max_size during recovery
This was broken by d5574993 (probably, that commit fixed a similar
problem). The rejoin_ack initializes replica state properly, so we can't
go changing it now. I'm not sure why this was resetting the state to
LOCK, because that's clearly not allowed.
Print when check_max_size does a no-op so that this is a bit easier to see
next time.
Sage Weil [Thu, 3 Jun 2010 04:33:40 +0000 (21:33 -0700)]
mds: lock->sync replica state is lock, not sync
It's not readable yet. And after the lock->sync gather completes we send
out a SYNC.
Fixes failed assertion like:
10.06.02_21:27:04.444202
7f17a25ac910 mds1.locker handle_file_lock a=sync on (ifile sync) from mds0 [inode 1 [...2,head] / rep@0.2 v7 snaprealm=0xe27400 f(v0 m10.06.02_21:26:13.366344 1=0+1) ds=1=0+1 rb=0 rf=0 rd=0 (iauth sync) (ilink sync) (idft sync) (isnap sync) (inest sync) (ifile sync) (ixattr sync) (iversion lock) | nref=1 0x7f179c006280]
mds/Locker.cc: In function 'void Locker::handle_file_lock(ScatterLock*, MLock*)':
mds/Locker.cc:3468: FAILED assert(lock->get_state() == 2 || lock->get_state() == 15 || lock->get_state() == 21)
1: (Locker::handle_file_lock(ScatterLock*, MLock*)+0x1d8) [0x86d70a]
2: (Locker::handle_lock(MLock*)+0x191) [0x86e30f]
3: (Locker::dispatch(Message*)+0x41) [0x870f27]
4: (MDS::_dispatch(Message*)+0x1a17) [0x7364cb]
5: (MDS::ms_dispatch(Message*)+0x2f) [0x737961]
6: (Messenger::ms_deliver_dispatch(Message*)+0x55) [0x72918d]
7: (SimpleMessenger::dispatch_entry()+0x532) [0x71710a]
8: (SimpleMessenger::DispatchThread::entry()+0x29) [0x711f25]
9: (Thread::_entry_func(void*)+0x20) [0x7232f4]
10: /lib/libpthread.so.0 [0x7f17a407073a]
11: (clone()+0x6d) [0x7f17a329469d]
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 3 Jun 2010 02:37:44 +0000 (19:37 -0700)]
msg: add missing msg_types.cc
Sage Weil [Wed, 2 Jun 2010 19:40:23 +0000 (12:40 -0700)]
mds: add export_dir command
Sage Weil [Wed, 2 Jun 2010 19:40:15 +0000 (12:40 -0700)]
mds: add MDCache::cache_traverse()
Sage Weil [Wed, 2 Jun 2010 18:32:22 +0000 (11:32 -0700)]
initscript: unmount btrfs if we mounted it
Sage Weil [Wed, 2 Jun 2010 05:07:15 +0000 (22:07 -0700)]
move addr parse() into entity_addr_t
Sage Weil [Wed, 2 Jun 2010 05:02:03 +0000 (22:02 -0700)]
tcp: parse ipv4 and ipv6 addresses
Greg Farnum [Wed, 2 Jun 2010 17:12:39 +0000 (10:12 -0700)]
mon: fix unsynchronized clock logic;
change output for clarity
Sage Weil [Tue, 1 Jun 2010 23:34:16 +0000 (16:34 -0700)]
mds: lookup exact snap dn on import
Sage Weil [Tue, 1 Jun 2010 23:33:53 +0000 (16:33 -0700)]
mds: update dn->first too when lock state adjusts inode->first
This keeps dn->first in sync with inode->first
Sage Weil [Tue, 1 Jun 2010 22:23:46 +0000 (15:23 -0700)]
mds: don't change lock states on replicated inode
The reconnect will infer some client caps, which will affect what lock
states we want. If we're not replicated, fine, just pick something good.
Otherwise, try_eval() and go through the proper channels.
This _might_ be the source of #165...
Sage Weil [Tue, 1 Jun 2010 22:02:56 +0000 (15:02 -0700)]
mds: fix root null deref in recalc_auth_bits
Root may be null if we don't have any subtrees besides ~mds$id.
Sage Weil [Tue, 1 Jun 2010 21:14:23 +0000 (14:14 -0700)]
mds: adjust subtree map when unlinking dirs
Otherwise we get subtree bounds in the stray dir and get confused down
the line.
Sage Weil [Tue, 1 Jun 2010 18:38:22 +0000 (11:38 -0700)]
mds: discover snapped paths on retried ops
This is intended to mitigate a livelock issue with traversing to snapped
metadata. The client specifies all snap requests relative to a non-snap
inode. The traversal through the snapped portion of the namespace will
normally happen on the auth node, but the actual target may be on another
node that does not have that portion of the namespace. To avoid indefinite
request ping-pong, the mds will begin to discover and replicate the snapped
path components if the request has been retried.
This doesn't perform optimally, but it will at least work.
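The decision described above reduces to a small rule: forward on the first attempt, discover once the request has been retried. The types and the `retry_attempt` field below are hypothetical and only illustrate that rule:

```cpp
#include <cassert>

// What to do when a snapped path component is not available locally.
enum class TraverseAction { Forward, Discover };

// Hypothetical request record; retry_attempt counts earlier attempts.
struct Request {
  int retry_attempt;
};

// First attempt: forward toward the auth node as usual.
// Retried request: discover and replicate the snapped path component
// locally so the request stops bouncing between nodes.
inline TraverseAction on_missing_snapped_component(const Request& r) {
  return r.retry_attempt > 0 ? TraverseAction::Discover
                             : TraverseAction::Forward;
}
```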
Greg Farnum [Tue, 1 Jun 2010 18:39:58 +0000 (11:39 -0700)]
mon: add wiggle room for clock synchronization check
Greg Farnum [Tue, 1 Jun 2010 17:30:05 +0000 (10:30 -0700)]
mds: add case for CEPH_LOCK_DVERSION to LockType
Greg Farnum [Sun, 30 May 2010 01:36:05 +0000 (18:36 -0700)]
xlist: add assert to catch invalid iterator usage
Greg Farnum [Sat, 29 May 2010 18:02:53 +0000 (11:02 -0700)]
ObjectCacher: do not try to deref an invalidated xlist::iterator
Fixes #159
Sage Weil [Fri, 28 May 2010 20:21:19 +0000 (13:21 -0700)]
paxos: fix store_state fix
Sage Weil [Fri, 28 May 2010 19:59:25 +0000 (12:59 -0700)]
msgr: print bind errors to stderr
Yehuda Sadeh [Fri, 28 May 2010 19:56:53 +0000 (12:56 -0700)]
rbd: some fixes to conform with qemu code style