ceph.git
16 years agomessages/MClass[Ack]: Roll back some unification.
Greg Farnum [Thu, 25 Jun 2009 19:05:15 +0000 (12:05 -0700)]
messages/MClass[Ack]: Roll back some unification.
version_t last and PaxosServiceMessage::version shouldn't
be the same in these messages. Remove that and add a new
constructor that does set the version (but it's unneeded).

16 years agomon/objecter: The monitors and Objecter now use the version in messages.
Greg Farnum [Thu, 25 Jun 2009 19:03:56 +0000 (12:03 -0700)]
mon/objecter: The monitors and Objecter now use the version in messages.

16 years agoinitscripts: do mount/mkfs as root, otherwise as any user
Sage Weil [Thu, 25 Jun 2009 21:03:44 +0000 (14:03 -0700)]
initscripts: do mount/mkfs as root, otherwise as any user

We want cosd to run unprivileged if possible.

16 years agoosd: update primary's notion of peer last_update on activate
Sage Weil [Thu, 25 Jun 2009 20:01:16 +0000 (13:01 -0700)]
osd: update primary's notion of peer last_update on activate

We are pushing the peer the log to bring it up to date, so
update our peer_info[peer].last_update to match.  Otherwise,
we get confused if we get, say, stray content and peer() is
called later, and we have out of date peer stats.

16 years agoosd: force RMW ordering globally
Sage Weil [Thu, 25 Jun 2009 19:45:35 +0000 (12:45 -0700)]
osd: force RMW ordering globally

We can't mix RMW and DELAYED in the same PG without screwing
up the ordering of writes, the pg log, and so forth.

So force RMW throughout.  This won't affect the mds log
appends because the client is constant.  It will slow down
concurrent writes to the same object by multiple clients, but
we don't have many (any?) of those yet.

This needs a real solution... :/

16 years agoosd: fix TMAPUP bug
Sage Weil [Thu, 25 Jun 2009 19:44:04 +0000 (12:44 -0700)]
osd: fix TMAPUP bug

Trailing bit was put in the wrong place.

16 years agoosd: fix tmapup
Sage Weil [Thu, 25 Jun 2009 17:32:40 +0000 (10:32 -0700)]
osd: fix tmapup

Various problems with decoding and applying an update.

16 years agoinitscripts: allow 'user' option, defaults to current user
Sage Weil [Thu, 25 Jun 2009 17:22:33 +0000 (10:22 -0700)]
initscripts: allow 'user' option, defaults to current user

16 years agomds: fix CDir decoding
Sage Weil [Thu, 25 Jun 2009 17:21:51 +0000 (10:21 -0700)]
mds: fix CDir decoding

16 years agomds: rev format (for TMAP changes)
Sage Weil [Thu, 25 Jun 2009 04:22:44 +0000 (21:22 -0700)]
mds: rev format (for TMAP changes)

16 years agoosd: fix head_existed check
Sage Weil [Thu, 25 Jun 2009 04:16:28 +0000 (21:16 -0700)]
osd: fix head_existed check

ssc isn't always defined, as we pass here for !may_read() too.

16 years agoMerge branch 'mdsmap' into unstable
Sage Weil [Thu, 25 Jun 2009 04:00:58 +0000 (21:00 -0700)]
Merge branch 'mdsmap' into unstable

Conflicts:

src/mds/CDir.cc

16 years agotodo
Sage Weil [Thu, 25 Jun 2009 03:57:48 +0000 (20:57 -0700)]
todo

16 years agoosd: print lost objects
Sage Weil [Thu, 25 Jun 2009 03:48:39 +0000 (20:48 -0700)]
osd: print lost objects

We still need to figure out how to continue...

16 years agoosd: rebuild missing OI_ATTR from log entry when possible
Sage Weil [Thu, 25 Jun 2009 03:48:21 +0000 (20:48 -0700)]
osd: rebuild missing OI_ATTR from log entry when possible

16 years agoosd: fix proc_replica_log stop condition
Sage Weil [Wed, 24 Jun 2009 21:09:57 +0000 (14:09 -0700)]
osd: fix proc_replica_log stop condition

This fixes the condition from 4b5572a.

osd/PG.cc: In function 'void PG::activate(ObjectStore::Transaction&, std::map<int, MOSDPGInfo*, std::less<int>, std::allocator<std::pair<const int, MOSDPGInfo*> > >*)':
osd/PG.cc:1401: FAILED assert(log.backlog)
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a838b]
 2: ./cosd(_ZN2PG8activateERN11ObjectStore11TransactionEPSt3mapIiP10MOSDPGInfoSt4lessIiESaISt4pairIKiS5_EEE+0xbe8) [0x71ff88]
 3: ./cosd(_ZN2PG4peerERN11ObjectStore11TransactionERSt3mapIiS3_I4pg_tNS_5QueryESt4lessIS4_ESaISt4pairIKS4_S5_EEES6_IiESaIS8_IKiSC_EEEPS3_IiP10MOSDPGInfoSD_SaIS8_ISE_SK_EEE+0xfa0) [0x722852]
 4: ./cosd(_ZN3OSD16_process_pg_infoEjiRN2PG4InfoERNS0_3LogERNS0_7MissingEPSt3mapIiP10MOSDPGInfoSt4lessIiESaISt4pairIKiS9_EEERi+0x712) [0x6a7000]
 5: ./cosd(_ZN3OSD13handle_pg_logEP9MOSDPGLog+0x126) [0x6a7768]
 6: ./cosd(_ZN3OSD9_dispatchEP7Message+0x34a) [0x6abe6c]
 7: ./cosd(_ZN3OSD13dispatch_implEP7Message+0x408) [0x6ac9c4]
 8: ./cosd(_ZN10Dispatcher8dispatchEP7Message+0x63) [0x61a4af]
 9: ./cosd(_ZN9Messenger8dispatchEP7Message+0x56) [0x6298f8]
 10: ./cosd(_ZN15SimpleMessenger8Endpoint14dispatch_entryEv+0x5ae) [0x62340a]
 11: ./cosd(_ZN15SimpleMessenger8Endpoint14DispatchThread5entryEv+0x19) [0x62fc9d]
 12: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
 13: /lib/libpthread.so.0 [0x7fb482c933f7]

16 years agotodo
Sage Weil [Thu, 25 Jun 2009 03:49:41 +0000 (20:49 -0700)]
todo

16 years agoosd: rev ondisk format, protocols
Sage Weil [Thu, 25 Jun 2009 03:49:24 +0000 (20:49 -0700)]
osd: rev ondisk format, protocols

For monitor message changes AND osd snapset changes.

16 years agoosd: store snapset in _snapdir object if head dne
Sage Weil [Thu, 25 Jun 2009 03:43:01 +0000 (20:43 -0700)]
osd: store snapset in _snapdir object if head dne

If the _head doesn't logically exist, we can't keep it around just for
the SnapSet or else an 'ls' will have to stat in order to tell if the
head object logically exists and should be included.  That's no good,
so:

- put snapset in SS_ATTR on head if it exists
- otherwise, put it in SS_ATTR on a _snapdir object

16 years agoosd: zero out pg_pool_t in constructor
Sage Weil [Thu, 25 Jun 2009 03:13:45 +0000 (20:13 -0700)]
osd: zero out pg_pool_t in constructor

Most things were getting initialized, but not snap_seq.

16 years agomon: set snap epoch for poolsnap removal, too
Sage Weil [Thu, 25 Jun 2009 03:13:08 +0000 (20:13 -0700)]
mon: set snap epoch for poolsnap removal, too

16 years agoosd: fix MOSDBoot, MOSDGetMap initialization
Sage Weil [Thu, 25 Jun 2009 02:58:19 +0000 (19:58 -0700)]
osd: fix MOSDBoot, MOSDGetMap initialization

16 years agomon: cleanup
Sage Weil [Wed, 24 Jun 2009 20:15:14 +0000 (13:15 -0700)]
mon: cleanup

16 years agokclient: update with new monitor message formats
Sage Weil [Wed, 24 Jun 2009 20:14:46 +0000 (13:14 -0700)]
kclient: update with new monitor message formats

16 years agomon: change MMDSMap to send map we have, not map we want.
Sage Weil [Wed, 24 Jun 2009 20:03:31 +0000 (13:03 -0700)]
mon: change MMDSMap to send map we have, not map we want.

16 years agoosd: make object delete not remove _head if there are clones
Sage Weil [Wed, 24 Jun 2009 19:55:00 +0000 (12:55 -0700)]
osd: make object delete not remove _head if there are clones

Truncate and rmattrs instead, so we can keep the SnapSet.

Still need to make 'ls' work properly.

16 years agofilestore: rmattrs command
Sage Weil [Wed, 24 Jun 2009 19:54:20 +0000 (12:54 -0700)]
filestore: rmattrs command

Delete all object attrs

16 years agomessages: Added PaxosServiceMessage to repository so previous commits work.
Greg Farnum [Wed, 24 Jun 2009 20:06:33 +0000 (13:06 -0700)]
messages: Added PaxosServiceMessage to repository so previous commits work.

16 years agoMonitor/Message: All messages used by Paxos are now PaxosServiceMessages.
Greg Farnum [Wed, 24 Jun 2009 18:42:24 +0000 (11:42 -0700)]
Monitor/Message: All messages used by Paxos are now PaxosServiceMessages.

16 years agomon/msg: PThey mostly hold version_t's now. Unused, though.
Greg Farnum [Tue, 23 Jun 2009 21:03:34 +0000 (14:03 -0700)]
mon/msg: PThey mostly hold version_t's now. Unused, though.

16 years agoosd: adjust recovery op accounting; explicitly track set of recovering objects
Sage Weil [Wed, 24 Jun 2009 18:17:55 +0000 (11:17 -0700)]
osd: adjust recovery op accounting; explicitly track set of recovering objects

Use a single {start,finish}_recovery_op() func to start and stop
recovery ops, so that there is a single point for counter adjustments
to occur.  On reset, simply call into OSD multiple times.

Also maintain a set<sobject_t> in each PG and on the OSD to track
the set of objects that are recovering.  This can hopefully be
compiled out once all the bugs are identified.

We are chasing this:

osd/OSD.cc:3465: FAILED assert(recovery_ops_active >= 0)
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a769b]
 2: ./cosd(_ZN3OSD18finish_recovery_opEP2PGib+0x148) [0x696bce]
 3: ./cosd(_ZN12ReplicatedPG18finish_recovery_opEv+0x77) [0x6359c5]
 4: ./cosd(_ZN12ReplicatedPG17sub_op_push_replyEP14MOSDSubOpReply+0x540) [0x63628a]
 5: ./cosd(_ZN12ReplicatedPG15do_sub_op_replyEP14MOSDSubOpReply+0x64) [0x6407fe]
 6: ./cosd(_ZN3OSD10dequeue_opEP2PG+0x224) [0x6996ee]
 7: ./cosd(_ZN3OSD4OpWQ8_processEP2PG+0x21) [0x70d175]
 8: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6c9f78]
 9: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a825c]
 10: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70cb9f]
 11: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629d48]
 12: /lib/libpthread.so.0 [0x7f2f1e3f33f7]
 13: /lib/libc.so.6(clone+0x6d) [0x7f2f1d9c294d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
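
The accounting described in this commit can be sketched as below. This is a minimal illustration, not the Ceph code: the class name RecoveryTracker, the use of std::string in place of sobject_t, and the reset() helper are all assumptions made for the example. The point is that every counter change goes through one start and one finish function, and the set catches a finish that has no matching start.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the OSD-side bookkeeping; not the Ceph class.
class RecoveryTracker {
public:
  // Single entry point for starting a recovery op on an object.
  void start_recovery_op(const std::string& soid) {
    bool inserted = recovering_.insert(soid).second;
    assert(inserted && "object is already recovering");
    (void)inserted;
    ++ops_active_;
  }

  // Single exit point; the set detects a finish without a matching start.
  void finish_recovery_op(const std::string& soid) {
    std::size_t erased = recovering_.erase(soid);
    assert(erased == 1 && "finish without matching start");
    (void)erased;
    --ops_active_;
    assert(ops_active_ >= 0);
  }

  // On reset, finish every outstanding op through the same path.
  void reset() {
    while (!recovering_.empty()) {
      std::string soid = *recovering_.begin();  // copy before erasing
      finish_recovery_op(soid);
    }
  }

  int ops_active() const { return ops_active_; }
  bool is_recovering(const std::string& soid) const {
    return recovering_.count(soid) != 0;
  }

private:
  int ops_active_ = 0;
  std::set<std::string> recovering_;
};
```

With this shape, a double finish trips the "finish without matching start" assertion at the call that caused it, rather than surfacing later as a negative counter.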

16 years agoosd: abort generate_backlog if already canceled
Sage Weil [Wed, 24 Jun 2009 18:12:03 +0000 (11:12 -0700)]
osd: abort generate_backlog if already canceled

Bail out of generate_backlog if we've been canceled.  Fixes

osd/OSD.cc: In function 'void OSD::generate_backlog(PG*)':
osd/OSD.cc:3305: FAILED assert(!pg->is_active())
 1: ./cosd(_Z18__ceph_assert_failPKcS0_iS0_+0x3a) [0x7a833b]
 2: ./cosd(_ZN3OSD16generate_backlogEP2PG+0xb6) [0x69a1a6]
 3: ./cosd(_ZN3OSD9BacklogWQ8_processEP2PG+0x21) [0x70d92b]
 4: ./cosd(_ZN10ThreadPool9WorkQueueI2PGE13_void_processEPv+0x28) [0x6ca5f8]
 5: ./cosd(_ZN10ThreadPool6workerEv+0x280) [0x7a8efc]
 6: ./cosd(_ZN10ThreadPool10WorkThread5entryEv+0x19) [0x70d331]
 7: ./cosd(_ZN6Thread11_entry_funcEPv+0x20) [0x629e48]
 8: /lib/libpthread.so.0 [0x7f0a8feed3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f0a8f4bc94d]

16 years agoosd: fix merge_log when log and olog share bottom
Sage Weil [Wed, 24 Jun 2009 05:06:24 +0000 (22:06 -0700)]
osd: fix merge_log when log and olog share bottom

If log has 6'10 and olog has 7'10, on same object, merge_log
was failing to throw out log's 6'10 entry because the
last_kept iterator was still end().  Use a simple eversion_t
instead, and simplify existing (and otherwise correct)
log.bottom logic, but without the last_kept != end() guard
that threw us off.

09.06.23 16:52:56.032981 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log log(469'11020,476'11021] from osd0 into log(469'11020,469'11021]
09.06.23 16:52:56.033001 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log extending top to 476'11021
09.06.23 16:52:56.033033 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering]   ? 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033057 1145465168 osd4 485 pg[1.cd( v 469'11021/469'11021 (469'11020,469'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering] merge_log 476'11021 (0'0) m 10001641d24.00000000/head by mds0.16:33860 09.06.23 16:50:28.931949
09.06.23 16:52:56.033090 1145465168 osd4 485 pg[1.cd( v 476'11021/469'11021 (469'11020,476'11021] n=8151 ec=2 les=476 485/480) r=0 lcod 0'0 mlcod 469'11021 !hml crashed+peering m=1 l=1] merge_log result log(469'11020,476'11021] missing(1) changed=1
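
A rough sketch of why a plain eversion_t is easier to get right than an iterator here. The types and the two helper functions are simplified assumptions for illustration, not the PG::Log code: last_kept starts at the shared bottom as a value, so there is no end() state to guard against, and anything in the local log newer than last_kept is thrown out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <tuple>

// Simplified stand-in for an "epoch'version" pair such as 6'10.
struct eversion_t {
  std::uint32_t epoch;
  std::uint64_t version;
};
inline bool operator<(const eversion_t& a, const eversion_t& b) {
  return std::tie(a.epoch, a.version) < std::tie(b.epoch, b.version);
}
inline bool operator==(const eversion_t& a, const eversion_t& b) {
  return a.epoch == b.epoch && a.version == b.version;
}

struct Entry {
  eversion_t version;
  std::string oid;
};

// Newest version present in both logs, or the shared bottom if there is none.
// Because the result is a value, "nothing in common" needs no special case.
eversion_t find_last_kept(const std::list<Entry>& log,
                          const std::list<Entry>& olog,
                          eversion_t bottom) {
  eversion_t last_kept = bottom;
  for (const Entry& e : log)
    for (const Entry& o : olog)
      if (e.version == o.version && last_kept < e.version)
        last_kept = e.version;
  return last_kept;
}

// Drop every local entry newer than last_kept; returns how many were dropped.
std::size_t drop_divergent(std::list<Entry>& log, eversion_t last_kept) {
  std::size_t before = log.size();
  log.remove_if([&](const Entry& e) { return last_kept < e.version; });
  return before - log.size();
}
```

For the case in the commit message (log has 6'10, olog has 7'10, same bottom), find_last_kept returns the bottom and drop_divergent removes the 6'10 entry.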

16 years agofilestore: use readdir_r to avoid SIGBUS badness
Sage Weil [Wed, 24 Jun 2009 05:03:04 +0000 (22:03 -0700)]
filestore: use readdir_r to avoid SIGBUS badness

We need to use reentrant readdir, since multiple threads
will otherwise share the struct dirent and walk all over
each other.

16 years agomds: fix session purge bug
Sage Weil [Tue, 23 Jun 2009 22:48:37 +0000 (15:48 -0700)]
mds: fix session purge bug

mds/Server.cc: In function 'void Server::_finish_session_purge(Session*)':
mds/Server.cc:410: FAILED assert(session->is_stale_purging())
 1: ./cmds(_ZN6Server21_finish_session_purgeEP7Session+0x392) [0x49edf2]
 2: ./cmds(_ZN6Server18find_idle_sessionsEv+0xa18) [0x4a3188]
 3: ./cmds(_ZN3MDS4tickEv+0x220) [0x484f60]
 4: ./cmds(_ZN9SafeTimer12EventWrapper6finishEi+0x1c1) [0x63eb11]
 5: ./cmds(_ZN5Timer11timer_entryEv+0x6f6) [0x6412d6]
 6: ./cmds(_ZN5Timer11TimerThread5entryEv+0xd) [0x46d53d]
 7: ./cmds(_ZN6Thread11_entry_funcEPv+0xc) [0x480c9c]
 8: /lib/libpthread.so.0 [0x7f51a9f4c3f7]
 9: /lib/libc.so.6(clone+0x6d) [0x7f51a951b94d]

16 years agoosd: allow recovery of missing objects not in log
Sage Weil [Tue, 23 Jun 2009 21:54:08 +0000 (14:54 -0700)]
osd: allow recovery of missing objects not in log

This happens when a scrub/repair tells us to recover an item, but
it's older than log.bottom.

16 years agoosd: avoid using null ctx pointer
Sage Weil [Tue, 23 Jun 2009 04:37:12 +0000 (21:37 -0700)]
osd: avoid using null ctx pointer

Use localt instead, it's on the stack.

16 years agoosd: stop rewinding replica log when we reach log.bottom
Sage Weil [Tue, 23 Jun 2009 04:32:09 +0000 (21:32 -0700)]
osd: stop rewinding replica log when we reach log.bottom

We stop rewinding a replica log when we reach our own
log.bottom, because we don't know enough to do so in any
meaningful way, and because we can assume it is not
divergent at that point (barring any complete screwupedness).

Also, if we do change last_update, make sure last_complete is
rewound too.

16 years agomds: no fatal assert on ino allocation failures
Sage Weil [Tue, 23 Jun 2009 03:29:40 +0000 (20:29 -0700)]
mds: no fatal assert on ino allocation failures

We still log them LOG_ERR.  Client will be unhappy, but
that's their problem.

16 years agoosd: small cleanups
Sage Weil [Tue, 23 Jun 2009 03:25:15 +0000 (20:25 -0700)]
osd: small cleanups

16 years agomds: don't choke on bad parallel_fetch paths
Sage Weil [Mon, 22 Jun 2009 23:11:21 +0000 (16:11 -0700)]
mds: don't choke on bad parallel_fetch paths

e.g., bad reconnect path from client, like /blah/file_not_dir/blah.

16 years agorados: cleanup
Sage Weil [Tue, 23 Jun 2009 03:57:44 +0000 (20:57 -0700)]
rados: cleanup

16 years agokclient: make r_path[12] dup strings
Sage Weil [Tue, 23 Jun 2009 03:18:58 +0000 (20:18 -0700)]
kclient: make r_path[12] dup strings

The mds_request lifetime differs from the caller's stack, so we need to
duplicate these strings.  Fixes problems with request reply after MDS
recovery.
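
The lifetime problem can be illustrated in userspace C++ (the kernel client itself is C and would duplicate the strings with a kernel allocator). MdsRequest and make_request are hypothetical names for this sketch only; the request owns copies of both paths, so it stays valid after the caller's buffer is reused.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical request type; the real mds_request lives in the kernel client.
struct MdsRequest {
  std::string path1;  // owned copies, valid for the request's whole lifetime
  std::string path2;
};

// Copies both paths instead of keeping pointers into the caller's stack.
MdsRequest make_request(const char* path1, const char* path2) {
  MdsRequest req;
  req.path1 = path1 ? path1 : "";
  req.path2 = path2 ? path2 : "";
  return req;
}
```

A request built this way can be resent after MDS recovery without depending on whatever the original caller's stack frame now contains.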

16 years agokclient: clean up mds_request path generation
Sage Weil [Tue, 23 Jun 2009 03:18:00 +0000 (20:18 -0700)]
kclient: clean up mds_request path generation

16 years agotodo
Sage Weil [Mon, 22 Jun 2009 22:50:37 +0000 (15:50 -0700)]
todo

16 years agoMakefile: add missing kernel/ headers
Sage Weil [Mon, 22 Jun 2009 22:50:16 +0000 (15:50 -0700)]
Makefile: add missing kernel/ headers

16 years agokclient: import into fs/, not fs/staging/?
Sage Weil [Mon, 22 Jun 2009 18:03:47 +0000 (11:03 -0700)]
kclient: import into fs/, not fs/staging/?

16 years agorados/objecter: Changes to rados in/out, and various things work.
Greg Farnum [Mon, 22 Jun 2009 22:35:22 +0000 (15:35 -0700)]
rados/objecter: Changes to rados in/out, and various things work.

16 years agoObjecter/librados: Refactored and renamed for clarity.
Greg Farnum [Fri, 19 Jun 2009 20:13:45 +0000 (13:13 -0700)]
Objecter/librados: Refactored and renamed for clarity.

16 years agotodo
Sage Weil [Mon, 22 Jun 2009 17:05:19 +0000 (10:05 -0700)]
todo

16 years agokclient: clean up unaligned pointer accesses
Sage Weil [Sat, 20 Jun 2009 21:55:20 +0000 (14:55 -0700)]
kclient: clean up unaligned pointer accesses

Get rid of the likes of *(__le64*)foo.

Get rid of useless ceph_decode_##_le() macros; use ceph_decode_copy
instead.
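
For illustration, one way to read a little-endian 64-bit field without a cast like *(__le64*)foo is to assemble it byte by byte. This is a generic sketch and not the ceph_decode_copy macro; decode_le64 is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a little-endian 64-bit value one byte at a time, so the pointer
// may have any alignment and the host byte order does not matter.
inline std::uint64_t decode_le64(const unsigned char* p) {
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
    v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
  return v;
}
```

Calling decode_le64(buf + 1) on a buffer is fine even though buf + 1 is not 8-byte aligned, which is exactly the case the pointer cast gets wrong on strict-alignment architectures.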

16 years agocosd: conf updates
Sage Weil [Sat, 20 Jun 2009 21:01:10 +0000 (14:01 -0700)]
cosd: conf updates

16 years agomon: allow repair of entire osd
Sage Weil [Fri, 19 Jun 2009 20:28:44 +0000 (13:28 -0700)]
mon: allow repair of entire osd

16 years agomds: reduce default memory, journal footprint
Sage Weil [Fri, 19 Jun 2009 20:16:58 +0000 (13:16 -0700)]
mds: reduce default memory, journal footprint

16 years agoosd: do NOT include op vector when shipping raw transaction
Sage Weil [Sat, 20 Jun 2009 06:27:04 +0000 (23:27 -0700)]
osd: do NOT include op vector when shipping raw transaction

This just doubles up the data payload.  And makes the MOSDSubOp printout
look like garbage, since e.g. the setxattr names are taken from the
portion of the data payload encoding the transaction.

16 years agokclient: strip out kernel version compatibility cruft v0.9
Sage Weil [Thu, 18 Jun 2009 21:58:17 +0000 (14:58 -0700)]
kclient: strip out kernel version compatibility cruft

16 years agokclient: update script importer
Sage Weil [Fri, 19 Jun 2009 22:03:11 +0000 (15:03 -0700)]
kclient: update script importer

16 years agotodo
Sage Weil [Fri, 19 Jun 2009 21:56:32 +0000 (14:56 -0700)]
todo

16 years agoosd: on scrub repair, update replica pg stats as necessary
Sage Weil [Fri, 19 Jun 2009 19:46:21 +0000 (12:46 -0700)]
osd: on scrub repair, update replica pg stats as necessary

An MOSDPGInfo to an active replica is treated as a pg stat repair.  The
replica just saves it to disk.

16 years agoosd: pass updated stats to replica
Sage Weil [Fri, 19 Jun 2009 19:45:36 +0000 (12:45 -0700)]
osd: pass updated stats to replica

When we ship the raw transaction to the replica, we need to ship the
new pg_stat_t as well, since that isn't getting updated in parallel by
prepare_transaction().

16 years agouclient: close mds session close race
Sage Weil [Fri, 19 Jun 2009 18:40:09 +0000 (11:40 -0700)]
uclient: close mds session close race

If we get a mds push msg while closing the session, resend the close
request.

16 years agoobjecter: some list_objects cleanups
Sage Weil [Fri, 19 Jun 2009 17:06:32 +0000 (10:06 -0700)]
objecter: some list_objects cleanups

16 years agoosd: check that pg matches
Sage Weil [Fri, 19 Jun 2009 17:06:17 +0000 (10:06 -0700)]
osd: check that pg matches

Otherwise return an empty result.  May want to return an error here.. not
sure which tho.

16 years agotodo: bugs that have come up >2x now
Sage Weil [Fri, 19 Jun 2009 04:19:25 +0000 (21:19 -0700)]
todo: bugs that have come up >2x now

16 years agoosd: adjust debug levels a bit
Sage Weil [Fri, 19 Jun 2009 04:06:45 +0000 (21:06 -0700)]
osd: adjust debug levels a bit

Try to put iterative output at 20, other stuff at 10,
so that we can tolerate 10 on large data sets.

16 years agoosd: fix initialization of log.complete_to in PG::activate()
Sage Weil [Fri, 19 Jun 2009 04:05:59 +0000 (21:05 -0700)]
osd: fix initialization of log.complete_to in PG::activate()

The complete_to should point to the next object to get, which
should be just PAST info.last_complete.  That is because we
can trim the log up to and including last_complete (because
that entry is recovered), and we don't want to invalidate
the iterator.

That is
    while (log.complete_to->version <= info.last_complete)
      log.complete_to++;

and in sub_op_push,

    while (...) {
      ...
      if (info.last_complete < log.complete_to->version)
        info.last_complete = log.complete_to->version;
      log.complete_to++;
    }
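
A self-contained sketch of the same idea, assuming a std::list-backed log and plain integer versions (both simplifications; LogEntry, init_complete_to and trim are names made up for the example). The end() guard is an addition for safety in the sketch and is not in the snippet above.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

struct LogEntry {
  std::uint64_t version;
};

// Returns an iterator to the first entry just PAST last_complete,
// i.e. the next object to get.
std::list<LogEntry>::iterator init_complete_to(std::list<LogEntry>& log,
                                               std::uint64_t last_complete) {
  auto it = log.begin();
  while (it != log.end() && it->version <= last_complete)
    ++it;
  return it;
}

// Trims the log up to and including trim_to. std::list only invalidates
// iterators to erased elements, so an iterator past trim_to survives.
void trim(std::list<LogEntry>& log, std::uint64_t trim_to) {
  while (!log.empty() && log.front().version <= trim_to)
    log.pop_front();
}
```

Because complete_to already points past last_complete, trimming through last_complete never erases the element the iterator refers to.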

16 years agoosd: remove bad trim assertion: trim point may precede local log.bottom
Sage Weil [Fri, 19 Jun 2009 01:40:18 +0000 (18:40 -0700)]
osd: remove bad trim assertion: trim point may precede local log.bottom

16 years agoosd: remove bad assertion to allow trim before pg is clean
Sage Weil [Fri, 19 Jun 2009 01:39:58 +0000 (18:39 -0700)]
osd: remove bad assertion to allow trim before pg is clean

We may trim the log before recovery completes.

16 years agoObjecter: now has list instead of librados. Hurrah.
Greg Farnum [Fri, 19 Jun 2009 00:04:55 +0000 (17:04 -0700)]
Objecter: now has list instead of librados. Hurrah.

16 years agoObjecter: Now resubmits *Op as part of tick() if the response takes too long.
Greg Farnum [Wed, 17 Jun 2009 20:11:43 +0000 (13:11 -0700)]
Objecter: Now resubmits *Op as part of tick() if the response takes too long.
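
A minimal sketch of timeout-driven resubmission from a periodic tick. OpTracker, PendingOp and the integer clock are assumptions for the example and do not mirror the Objecter's actual types or timing policy.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct PendingOp {
  std::uint64_t last_sent;  // time of the most recent (re)send
  int resends;              // how many times tick() has resent it
};

// Hypothetical tracker: ops stay pending until a reply arrives, and tick()
// resends any op whose last send is older than the timeout.
class OpTracker {
public:
  explicit OpTracker(std::uint64_t timeout) : timeout_(timeout) {}

  void submit(std::uint64_t tid, std::uint64_t now) {
    ops_[tid] = PendingOp{now, 0};
  }
  void handle_reply(std::uint64_t tid) { ops_.erase(tid); }

  // Returns the tids that were resent on this tick.
  std::vector<std::uint64_t> tick(std::uint64_t now) {
    std::vector<std::uint64_t> resent;
    for (auto& kv : ops_) {
      PendingOp& op = kv.second;
      if (now - op.last_sent > timeout_) {
        op.last_sent = now;
        ++op.resends;
        resent.push_back(kv.first);
      }
    }
    return resent;
  }

private:
  std::uint64_t timeout_;
  std::map<std::uint64_t, PendingOp> ops_;
};
```

Resetting last_sent on each resend keeps a slow op from being resent on every tick.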

16 years agoosd: be a bit more verbose about peer_info
Sage Weil [Thu, 18 Jun 2009 23:39:19 +0000 (16:39 -0700)]
osd: be a bit more verbose about peer_info

Looking for residual bug where peer_info info is somehow missing
when activate() happens...

16 years agoosd: don't trim pg log if degraded
Sage Weil [Thu, 18 Jun 2009 23:38:39 +0000 (16:38 -0700)]
osd: don't trim pg log if degraded

Also be a bit more verbose about pg_trim_to changes.

16 years agoosd: we don't use MOSDPGInfo to signal replica uptodate anymore
Sage Weil [Thu, 18 Jun 2009 23:37:44 +0000 (16:37 -0700)]
osd: we don't use MOSDPGInfo to signal replica uptodate anymore

Clean out cruft from old replica-driven recovery.

16 years agoosd: make add_next_entry behave when we start at backlog split point
Sage Weil [Thu, 18 Jun 2009 23:37:03 +0000 (16:37 -0700)]
osd: make add_next_entry behave when we start at backlog split point

Weaken the assertions a bit and just adjust missing appropriately.
Things may not match up perfectly if the split point is a backlog
entry, so just make missing what it should be and worry less about
what it was.

Here is the specific crash:

09.06.18 16:29:15.085353 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 lcod 0'0 stray m=1] my log = log(0'0,5'4]+backlog
3'1 (0'0) m 200.00000000/head by mds0.1:1 09.06.18 16:20:07.524996 indexed
3'2 (0'0) m 2.00000000/head by mds0.1:5 09.06.18 16:20:07.527454 indexed
5'3 (3'1) m 200.00000000/head by mds0.1:23 09.06.18 16:20:25.128842 indexed
5'4 (5'3) m 200.00000000/head by mds0.1:35 09.06.18 16:20:48.623669 indexed

09.06.18 16:29:15.085393 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 lcod 0'0 stray m=1] osd2 log = log(8'68,9'69]+backlog
3'2 (0'0) b 2.00000000/head by mds0.1:5 09.06.18 16:20:07.527454
9'69 (8'68) m 200.00000000/head by mds0.1:1114 09.06.18 16:28:08.837907

09.06.18 16:29:15.085416 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 lcod 0'0 stray m=1] merge_log log(8'68,9'69]+backlog from osd2 into log(0'0,5'4]+backlog
09.06.18 16:29:15.085456 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 (log bound mismatch, actual=[3'2,9'69] len=2) lcod 0'0 stray m=1] merge_log split point is 3'2 (0'0) b 2.00000000/head by mds0.1:5 09.06.18 16:20:07.527454
09.06.18 16:29:15.085472 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 (log bound mismatch, actual=[3'2,9'69] len=2) lcod 0'0 stray m=1] merge_log merging 3'2 (0'0) b 2.00000000/head by mds0.1:5 09.06.18 16:20:07.527454
09.06.18 16:29:15.085493 1124096336 osd1 10 pg[1.8( v 5'4/3'2 (0'0,5'4] n=2 ec=2 les=10 10/3) r=1 (log bound mismatch, actual=[3'2,9'69] len=2) lcod 0'0 stray m=2] merge_log merging 9'69 (8'68) m 200.00000000/head by mds0.1:1114 09.06.18 16:28:08.837907
osd/PG.h: In function 'void PG::Missing::add_next_event(PG::Log::Entry&)':
osd/PG.h:494: FAILED assert(missing[e.soid].need == e.prior_version)

16 years agouclient: wait for mds sessions close on unmount
Sage Weil [Thu, 18 Jun 2009 23:35:18 +0000 (16:35 -0700)]
uclient: wait for mds sessions close on unmount

16 years agomds: only use send_message_client for caps, lease, and snap msgs
Sage Weil [Thu, 18 Jun 2009 23:35:00 +0000 (16:35 -0700)]
mds: only use send_message_client for caps, lease, and snap msgs

Otherwise we screw up the session seq count.

16 years agouclient: init, shutdown objecter
Sage Weil [Thu, 18 Jun 2009 22:51:50 +0000 (15:51 -0700)]
uclient: init, shutdown objecter

This fixes longstanding problems with csyn stalling.

16 years agoosd: consolidate trim logic in calc_trim_to()
Sage Weil [Thu, 18 Jun 2009 22:48:31 +0000 (15:48 -0700)]
osd: consolidate trim logic in calc_trim_to()

And call it from trim_peers(), so that we always apply the same
conditions on log trimming.

This ensures we don't trim the logs while degraded through one of
the other paths.
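
The consolidation can be pictured as one function that every trim path consults. The sketch below is an assumption about shape only: PgTrimState, its fields, and the exact conditions are hypothetical and simpler than whatever calc_trim_to() really checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the state a trim decision depends on.
struct PgTrimState {
  bool degraded;
  std::uint64_t log_bottom;
  std::uint64_t min_last_complete_ondisk;
};

// The single place that decides whether, and how far, to trim.
// Returns false when the log must be left alone.
bool calc_trim_to(const PgTrimState& pg, std::uint64_t* trim_to) {
  if (pg.degraded)
    return false;  // never trim while degraded
  if (pg.min_last_complete_ondisk <= pg.log_bottom)
    return false;  // nothing new to trim
  *trim_to = pg.min_last_complete_ondisk;
  return true;
}
```

Callers such as a trim_peers()-style path then share the same conditions instead of each re-deriving them.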

16 years agotodo
Sage Weil [Thu, 18 Jun 2009 22:22:15 +0000 (15:22 -0700)]
todo

16 years agokclient: fix whitespace
Sage Weil [Thu, 18 Jun 2009 21:29:12 +0000 (14:29 -0700)]
kclient: fix whitespace

16 years agokclient: include fs/staging patch in series
Sage Weil [Thu, 18 Jun 2009 21:28:35 +0000 (14:28 -0700)]
kclient: include fs/staging patch in series

16 years agocrush: fix coding style, whitespace
Sage Weil [Thu, 18 Jun 2009 21:23:43 +0000 (14:23 -0700)]
crush: fix coding style, whitespace

16 years agocrush: redefine hash using __u32, for consistency across 32/64 bit
Sage Weil [Thu, 18 Jun 2009 21:23:29 +0000 (14:23 -0700)]
crush: redefine hash using __u32, for consistency across 32/64 bit

I'm pretty sure this was giving inconsistent results across archs,
because bits would get shifted into the high 32 and then back again
on x86_64 but not x86_32.
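
The effect described here is easy to reproduce in isolation. The function below is not the crush hash; it is a made-up two-shift round trip, parameterized on the word type, that shows how bits shifted past bit 31 survive in a 64-bit word but are lost in a 32-bit one.

```cpp
#include <cassert>
#include <cstdint>

// Word is the integer type the arithmetic is carried out in.
template <typename Word>
std::uint32_t shift_round_trip(std::uint32_t x) {
  Word a = x;
  a = a << 8;  // in a 64-bit Word the top byte moves into bits 32..39
  a = a >> 8;  // and comes back; in a 32-bit Word it was already lost
  return static_cast<std::uint32_t>(a);
}
```

For the input 0xFF000001, the 32-bit instantiation yields 0x00000001 while the 64-bit one yields 0xFF000001, which is why fixing the type to __u32 makes the result the same on every architecture.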

16 years agomark v0.9
Sage Weil [Thu, 18 Jun 2009 20:31:04 +0000 (13:31 -0700)]
mark v0.9

16 years agoMakefile: add missing includes
Sage Weil [Thu, 18 Jun 2009 20:30:49 +0000 (13:30 -0700)]
Makefile: add missing includes

16 years agoMakefile: kill cls_trivialmap
Sage Weil [Thu, 18 Jun 2009 20:24:09 +0000 (13:24 -0700)]
Makefile: kill cls_trivialmap

16 years agokclient: avoid i_ino of 0 on 32-bit platforms
Sage Weil [Thu, 18 Jun 2009 20:10:34 +0000 (13:10 -0700)]
kclient: avoid i_ino of 0 on 32-bit platforms

This confuses ls.  How lame!

Reported-by: Jeremy Hanmer <jeremy@hq.newdream.net>
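
One way a 64-bit inode number can end up as 0 in a 32-bit i_ino, and a guard against it. The XOR fold and the substitute value 1 are both assumptions chosen for this sketch; fold_ino is not a kernel function.

```cpp
#include <cassert>
#include <cstdint>

// Folds a 64-bit inode number into 32 bits by XORing the two halves.
// The fold can produce 0, which is mapped to 1 here (illustrative choice).
inline std::uint32_t fold_ino(std::uint64_t ino64) {
  std::uint32_t ino = static_cast<std::uint32_t>(ino64 ^ (ino64 >> 32));
  return ino != 0 ? ino : 1u;
}
```

An input whose halves are equal, such as 0x0000000100000001, folds to 0 without the guard.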

16 years agoosd: trim pg logs on recovery completion
Sage Weil [Thu, 18 Jun 2009 18:40:55 +0000 (11:40 -0700)]
osd: trim pg logs on recovery completion

When replica finds itself fully up to date (last_complete ==
last_update) it tells the primary.  Primary checks the same.
If the primary finds the min_last_complete_ondisk has changed,
it sends out a trim command.

This will let us drop huge pg logs out of memory after a recovery
without waiting for IO and the usual piggybacked trimming logic
to kick in.
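
The exchange above might look roughly like this on the primary. TrimCoordinator and is_up_to_date are hypothetical names, versions are plain integers, and all peers are assumed known up front; the sketch only shows "recompute the minimum, send a trim when it changes".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// A replica is fully up to date when last_complete == last_update.
inline bool is_up_to_date(std::uint64_t last_complete,
                          std::uint64_t last_update) {
  return last_complete == last_update;
}

// Hypothetical primary-side bookkeeping of each OSD's last_complete_ondisk.
class TrimCoordinator {
public:
  explicit TrimCoordinator(const std::vector<int>& osds) {
    for (int osd : osds)
      peers_[osd] = 0;
  }

  // Records a report and returns true when the minimum across all OSDs
  // changed, i.e. when a trim command would be sent out.
  bool update(int osd, std::uint64_t last_complete_ondisk) {
    peers_[osd] = last_complete_ondisk;
    std::uint64_t m = peers_.begin()->second;
    for (const auto& kv : peers_)
      m = std::min(m, kv.second);
    if (m == min_)
      return false;
    min_ = m;
    return true;
  }

  std::uint64_t min_last_complete_ondisk() const { return min_; }

private:
  std::map<int, std::uint64_t> peers_;
  std::uint64_t min_ = 0;
};
```

A report from one OSD does not trigger a trim while another OSD still holds the minimum back.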

16 years agoosd: track last_complete_ondisk over pushes, too.
Sage Weil [Thu, 18 Jun 2009 18:19:47 +0000 (11:19 -0700)]
osd: track last_complete_ondisk over pushes, too.

16 years agoosd: revamp complete_thru code
Sage Weil [Thu, 18 Jun 2009 18:07:03 +0000 (11:07 -0700)]
osd: revamp complete_thru code

Use last_complete_ondisk terminology throughout.

16 years agoosd: some infrastructure for primary to trim replica logs
Sage Weil [Thu, 18 Jun 2009 16:37:03 +0000 (09:37 -0700)]
osd: some infrastructure for primary to trim replica logs

16 years agoosd: fix pg log trim on the non-primary
Sage Weil [Thu, 18 Jun 2009 03:48:50 +0000 (20:48 -0700)]
osd: fix pg log trim on the non-primary

16 years agorados: less chatty
Sage Weil [Wed, 17 Jun 2009 23:36:08 +0000 (16:36 -0700)]
rados: less chatty

16 years agorados: shutdown on exit
Sage Weil [Wed, 17 Jun 2009 23:35:24 +0000 (16:35 -0700)]
rados: shutdown on exit

16 years agolibrados: add shutdown to c++ interface
Sage Weil [Wed, 17 Jun 2009 23:35:19 +0000 (16:35 -0700)]
librados: add shutdown to c++ interface

16 years agokclient: initialize readdir next_offset on dir open
Sage Weil [Wed, 17 Jun 2009 23:15:15 +0000 (16:15 -0700)]
kclient: initialize readdir next_offset on dir open

Otherwise we don't compensate for . and .. properly.

16 years agokclient: update client for statfs changes
Sage Weil [Wed, 17 Jun 2009 23:08:21 +0000 (16:08 -0700)]
kclient: update client for statfs changes

16 years agoosd: add pg log sizes, bottoms to pg_stat_t et al
Sage Weil [Wed, 17 Jun 2009 23:00:14 +0000 (16:00 -0700)]
osd: add pg log sizes, bottoms to pg_stat_t et al

This will allow us to see the pg logging overhead, esp once pg
logs are kept for longer on disk.

16 years agokclient: clean out old debug cruft
Sage Weil [Wed, 17 Jun 2009 22:54:18 +0000 (15:54 -0700)]
kclient: clean out old debug cruft

16 years agorados: fix statfs definition
Sage Weil [Wed, 17 Jun 2009 22:46:49 +0000 (15:46 -0700)]
rados: fix statfs definition

Isolate posix lameness to uclient only.  Unify 'rados df' and
'rados dfpools'