git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
16 years agoosd: reply with EBLACKLISTED if sender is blacklisted
Sage Weil [Mon, 24 Nov 2008 20:06:08 +0000 (12:06 -0800)]
osd: reply with EBLACKLISTED if sender is blacklisted

Move reply_op_error helper into OSD.cc.

16 years agoosd: include a blacklist in the OSDMap
Sage Weil [Mon, 24 Nov 2008 19:49:10 +0000 (11:49 -0800)]
osd: include a blacklist in the OSDMap

16 years agoosd: comment
Sage Weil [Mon, 24 Nov 2008 19:39:51 +0000 (11:39 -0800)]
osd: comment

16 years agorev wire, disk formats
Sage Weil [Mon, 24 Nov 2008 19:39:43 +0000 (11:39 -0800)]
rev wire, disk formats

osdmap encoding changed.

16 years agoosd: ignore intervals prior to last_epoch_started in build_prior
Sage Weil [Mon, 24 Nov 2008 19:34:02 +0000 (11:34 -0800)]
osd: ignore intervals prior to last_epoch_started in build_prior

We may have raised last_epoch_started without trimming past_intervals.

16 years agoosd: simplify osdmap tracking of osd up/down epochs; fix pg build_prior logic
Sage Weil [Mon, 24 Nov 2008 19:30:17 +0000 (11:30 -0800)]
osd: simplify osdmap tracking of osd up/down epochs; fix pg build_prior logic

Use a single struct to track all of our osd up/down info.  Include
down_at, the epoch we last marked the osd down.

Fix PG::build_prior to require that the osd was clean through the _entire_
interval in question.

In monitor, adjust new clean interval forward to down_at-1 if the up_from
matches the interval we mounted.  That is, if the OSD shut down cleanly,
it obviously remained clean at least until we marked it down in the map.
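
The clean-interval bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the actual OSDMap code: the struct name, the clean_first/clean_last fields, and both helper functions are hypothetical; only down_at and up_from are named in the commit message.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical single record holding all up/down info for one osd.
struct osd_info_sketch {
  epoch_t up_from = 0;      // epoch at which the osd was last marked up
  epoch_t down_at = 0;      // epoch at which the osd was last marked down
  epoch_t clean_first = 0;  // start of the last known clean interval (assumed name)
  epoch_t clean_last = 0;   // end of the last known clean interval (assumed name)
};

// build_prior-style check: the osd must have been clean through the
// _entire_ interval [first, last], not just part of it.
inline bool clean_through(const osd_info_sketch& info, epoch_t first, epoch_t last) {
  return info.clean_first <= first && last <= info.clean_last;
}

// Monitor-side adjustment: if the reported clean interval starts at the
// up_from we recorded, the osd shut down cleanly, so it stayed clean at
// least until the epoch before we marked it down.
inline void note_clean_interval(osd_info_sketch& info, epoch_t first, epoch_t last) {
  assert(first <= last);
  if (first == info.up_from && info.down_at > 0 && last < info.down_at - 1)
    last = info.down_at - 1;
  info.clean_first = first;
  info.clean_last = last;
}
```

With up_from = 10 and down_at = 20, a reported interval of [10, 15] is stretched to [10, 19] in this sketch, so an interval ending at epoch 19 counts as fully clean while one ending at epoch 20 does not.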

16 years agomds: move filelock to lock state if we can't wrlock but lock is stable
Sage Weil [Mon, 24 Nov 2008 21:26:11 +0000 (13:26 -0800)]
mds: move filelock to lock state if we can't wrlock but lock is stable

For example, lock may be sync c=1 when we're trying to reset
max_size to 0.  We need to make sure the lock will change state
before we wait on WAIT_STABLE.

16 years agomds: rejournal using EUpdate instead of EOpen if no caps in check_inode_max_size
Sage Weil [Mon, 24 Nov 2008 21:23:53 +0000 (13:23 -0800)]
mds: rejournal using EUpdate instead of EOpen if no caps in check_inode_max_size

Using EOpen on non-open files is imprecise.  More importantly, if
it is a snapped inode, the EOpen replay code won't be able to
look up the ino and will throw an assertion.

So, use EUpdate instead to record the new info when necessary.

16 years agomds: use vector for EOpen
Sage Weil [Mon, 24 Nov 2008 21:21:15 +0000 (13:21 -0800)]
mds: use vector for EOpen

16 years agomake assertion output look more like gcc
Sage Weil [Mon, 24 Nov 2008 21:20:52 +0000 (13:20 -0800)]
make assertion output look more like gcc

16 years agomds: remove bad assertion
Sage Weil [Mon, 24 Nov 2008 20:08:49 +0000 (12:08 -0800)]
mds: remove bad assertion

remove_inode should do the asserting, here, as it may do some
cleanup.

16 years agoosd: use last_clean_interval in build_prior logic
Sage Weil [Mon, 24 Nov 2008 18:00:21 +0000 (10:00 -0800)]
osd: use last_clean_interval in build_prior logic

We now mark a PG crashed if any of the OSDs during a given interval
is neither still alive nor cleanly shut down during the interval.  If
neither of those two conditions is met, it may have crashed.

It isn't a perfect set of criteria, since last_clean_interval only
covers clean shutdowns of an OSD.  We could, for example, track another interval
generated via old osd_up_thru, but the conditions for that are different,
since that requires survival past the end of the interval, not a clean
shutdown during the interval.  This should capture the common case, though,
of a clean unmount.

16 years agoosd: track last_clean_interval in osdmap; simplify encoding/decoding a bit
Sage Weil [Mon, 24 Nov 2008 17:57:11 +0000 (09:57 -0800)]
osd: track last_clean_interval in osdmap; simplify encoding/decoding a bit

Break osdmap into "base" and "extended" portions, so that clients can
ignore the extended portions completely.

Track last_clean_interval in the osdmap so we know when the osd last
cleanly shut down.

Disk and wire format changes.

16 years agoosd: track last_clean_interval in superblock
Sage Weil [Mon, 24 Nov 2008 17:55:23 +0000 (09:55 -0800)]
osd: track last_clean_interval in superblock

16 years agoosd: add a few getattr assertions
Sage Weil [Mon, 24 Nov 2008 18:27:22 +0000 (10:27 -0800)]
osd: add a few getattr assertions

Ensure we got the snapset attr before we decode it.

16 years agoosd: log a delete only if the head object is deleted
Sage Weil [Mon, 24 Nov 2008 18:25:11 +0000 (10:25 -0800)]
osd: log a delete only if the head object is deleted

Whether the head logically exists is not relevant to our logging.
Do not look at snapset.head_exists.

16 years agoosd: remove bad assertion in pick_read_snap
Sage Weil [Mon, 24 Nov 2008 18:17:58 +0000 (10:17 -0800)]
osd: remove bad assertion in pick_read_snap

The clone oid.snap does not necessarily correspond to the newest
snap, since snaps may be deleted, or because the snap is
named based on the snap context seq and not the oldest snap it
contains.

16 years agomds: remove capless inodes from logsegment open_file lists after reconnect
Sage Weil [Sun, 23 Nov 2008 18:57:28 +0000 (10:57 -0800)]
mds: remove capless inodes from logsegment open_file lists after reconnect

Inodes get added during replay of EOpen events.  We remove capless inodes
after reconnect restores from clients.

We only want inodes with caps on those lists.  Add an assertion to
enforce that constraint.

Also, remove ourselves explicitly in remove_inode(), since that may happen
during replay when an inode is destroyed.

16 years agomds: fix up completed_request handling during journal replay
Sage Weil [Sat, 22 Nov 2008 16:57:19 +0000 (08:57 -0800)]
mds: fix up completed_request handling during journal replay

completed_requests is handled separately from the session table
itself, in that we may add completed requests to the table even when
we may have loaded newer info.  But the handling was a bit wrong.
We now make sure we only add completed requests if the session is already
open, and remove the unnecessary trim (if the sessionmap is newer, the
session is already closed, and thus we have no request info).

16 years agoosd: clean up info setattrs, such that write_info is called once per map update
Sage Weil [Sat, 22 Nov 2008 05:49:11 +0000 (21:49 -0800)]
osd: clean up info setattrs, such that write_info is called once per map update

Use pg dirty flags, and do the final write_info and/or write_log right
before we apply the advance/activate_map transaction.

PG.cc still calls the methods directly in methods that are not in the
advance/activate paths.
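
The dirty-flag approach can be illustrated with a small sketch. Everything here is hypothetical (the struct, its counters, and the method names are stand-ins, not the real PG interface); it only shows the idea of deferring writes until just before the transaction is applied.

```cpp
#include <cassert>

// Hypothetical stand-in for a PG that defers its info/log writes.
struct pg_sketch {
  bool dirty_info = false;
  bool dirty_log = false;
  int info_writes = 0;  // counts how often write_info would have run
  int log_writes = 0;   // counts how often write_log would have run

  // Callers only flag what changed; nothing is written yet.
  void update_info() { dirty_info = true; }
  void append_log() { dirty_log = true; }

  // Called once, right before the map-update transaction is applied.
  void flush_before_apply() {
    if (dirty_info) { ++info_writes; dirty_info = false; }
    if (dirty_log) { ++log_writes; dirty_log = false; }
    assert(!dirty_info && !dirty_log);
  }
};
```

However many times update_info() is called during one map update, the flush performs at most one info write, and a second flush with nothing dirty writes nothing.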

16 years agofilestore: clean up debug output
Sage Weil [Sat, 22 Nov 2008 04:46:10 +0000 (20:46 -0800)]
filestore: clean up debug output

10 ops
15 commits
20 waits

16 years agoosd: minor append_log cleanup
Sage Weil [Sat, 22 Nov 2008 04:36:38 +0000 (20:36 -0800)]
osd: minor append_log cleanup

16 years agofilestore: only commit if changes are pending
Sage Weil [Sat, 22 Nov 2008 04:31:53 +0000 (20:31 -0800)]
filestore: only commit if changes are pending

Avoid needlessly scribbling all over the superblocks.

16 years agofilestore: fix truncate argument, subsequent Transaction fuggering
Sage Weil [Sat, 22 Nov 2008 04:30:00 +0000 (20:30 -0800)]
filestore: fix truncate argument, subsequent Transaction fuggering

16 years agodstart: modprobe btrfs
Sage Weil [Sat, 22 Nov 2008 00:35:16 +0000 (16:35 -0800)]
dstart: modprobe btrfs

16 years agomds: assert ref count is 0 on inode deletion
Sage Weil [Fri, 21 Nov 2008 23:01:22 +0000 (15:01 -0800)]
mds: assert ref count is 0 on inode deletion

16 years agomds: mark dirty inode clean before dropping
Sage Weil [Fri, 21 Nov 2008 23:00:55 +0000 (15:00 -0800)]
mds: mark dirty inode clean before dropping

This has the effect of removing the inode from the dirty xlist.

16 years agomds: exclude BADREMOTEINO dentriess in readdir
Sage Weil [Fri, 21 Nov 2008 22:58:14 +0000 (14:58 -0800)]
mds: exclude BADREMOTEINO dentriess in readdir

16 years agomds: make open_remote_ino terminate if the anchortrace refers to a non-existent ino
Sage Weil [Fri, 21 Nov 2008 22:58:00 +0000 (14:58 -0800)]
mds: make open_remote_ino terminate if the anchortrace refers to a non-existent ino

Since anchor lookup is racy, we may have to do multiple lookups
(in the case of a concurrent anchor table update).  If we don't
find the ino in the anchor, remember the anchor version when we
try again.  If we fail again at the same point, and the anchor
has not changed, fail.

Create an open_remote_dentry helper that does this.  If we fail,
set the CDentry::STATE_BADREMOTEINO state bit.
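
The termination rule can be sketched as a retry loop that gives up once two consecutive attempts fail at the same point with an unchanged anchor version. The types and the open_with_retry name below are assumptions made for illustration; the real open_remote_ino is asynchronous and looks quite different.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical result of one pass over the anchor trace.
struct lookup_result {
  bool found;               // true if the ino was opened
  int dead_end;             // where in the trace the lookup stopped
  uint64_t anchor_version;  // anchor table version seen by this attempt
};

// Retry until success, or until a failure repeats at the same point with
// the same anchor version (no concurrent update can explain the miss).
inline bool open_with_retry(const std::function<lookup_result()>& lookup_once) {
  bool have_prev = false;
  int prev_dead_end = 0;
  uint64_t prev_version = 0;
  for (;;) {
    lookup_result r = lookup_once();
    if (r.found)
      return true;
    if (have_prev && r.dead_end == prev_dead_end &&
        r.anchor_version == prev_version)
      return false;  // caller would set the BADREMOTEINO state bit here
    have_prev = true;
    prev_dead_end = r.dead_end;
    prev_version = r.anchor_version;
  }
}
```

A lookup that always fails at the same point with the same version is attempted exactly twice in this sketch before giving up; a changing version keeps the loop going.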

16 years agomds: fix cdentry states
Sage Weil [Fri, 21 Nov 2008 22:55:59 +0000 (14:55 -0800)]
mds: fix cdentry states

Also add BADREMOTEINO, which we'll use shortly.

16 years agomds: add version to anchor; avoid looping in open_remote_ino
Sage Weil [Fri, 21 Nov 2008 22:34:40 +0000 (14:34 -0800)]
mds: add version to anchor; avoid looping in open_remote_ino

If we do not find the item referenced by the anchor trace, we
try again, but keep track of the anchor version we ended on.  If
we hit a dead end at the same point next time and the anchor
hasn't changed, we give up.

16 years agomon: send osdmap to _original_ PGStats source
Sage Weil [Fri, 21 Nov 2008 21:30:27 +0000 (13:30 -0800)]
mon: send osdmap to _original_ PGStats source

..not the most recent sender, which may be another monitor.

16 years agomsgr: fix replace_connection bucket list manipulation
Sage Weil [Fri, 21 Nov 2008 21:24:40 +0000 (13:24 -0800)]
msgr: fix replace_connection bucket list manipulation

16 years agokclient: increase max data and front sizes to 16MB
Yehuda Sadeh [Fri, 21 Nov 2008 21:15:35 +0000 (13:15 -0800)]
kclient: increase max data and front sizes to 16MB

16 years agokclient: preserve peer_name across connection replacement
Sage Weil [Fri, 21 Nov 2008 21:11:46 +0000 (13:11 -0800)]
kclient: preserve peer_name across connection replacement

16 years agoosd: factor out add_log_entry into a helper that adjusts pg info accordingly
Sage Weil [Fri, 21 Nov 2008 20:35:20 +0000 (12:35 -0800)]
osd: factor out add_log_entry into a helper that adjusts pg info accordingly

We want to keep log.top, info.last_update etc. in sync.

16 years agoosd: use old acting set when noting past intervals
Sage Weil [Fri, 21 Nov 2008 20:02:56 +0000 (12:02 -0800)]
osd: use old acting set when noting past intervals

16 years agodstart: only clean up old output on mkfs
Sage Weil [Fri, 21 Nov 2008 19:56:59 +0000 (11:56 -0800)]
dstart: only clean up old output on mkfs

16 years agoclient: ignore dropped messages
Sage Weil [Fri, 21 Nov 2008 19:39:37 +0000 (11:39 -0800)]
client: ignore dropped messages

16 years agopg: fix build_prior debug output
Sage Weil [Fri, 21 Nov 2008 19:21:59 +0000 (11:21 -0800)]
pg: fix build_prior debug output

16 years agoclient: fix ms_handle_failure
Sage Weil [Fri, 21 Nov 2008 19:21:46 +0000 (11:21 -0800)]
client: fix ms_handle_failure

16 years agoosd: only trim past intervals that _fully_ preceed last epoch started
Sage Weil [Fri, 21 Nov 2008 19:08:23 +0000 (11:08 -0800)]
osd: only trim past intervals that _fully_ preceed last epoch started

16 years agoosd: ignore pushes if stray
Sage Weil [Fri, 21 Nov 2008 18:52:26 +0000 (10:52 -0800)]
osd: ignore pushes if stray

We need to verify a push is coming from the current primary if we are
any non-primary, including a stray.  Fixes the bad assertion:

osd/ReplicatedPG.cc:2322: FAILED assert in 'void ReplicatedPG::sub_op_push(MOSDSubOp*)': log.complete_to != log.log.end()

16 years agomon: send osdmap updates if pg_stats indicates an old map for a long time
Sage Weil [Fri, 21 Nov 2008 18:41:50 +0000 (10:41 -0800)]
mon: send osdmap updates if pg_stats indicates an old map for a long time

16 years agoosd: fix read_superblock
Sage Weil [Fri, 21 Nov 2008 18:24:58 +0000 (10:24 -0800)]
osd: fix read_superblock

16 years agomds: mark dn clean before removing
Sage Weil [Fri, 21 Nov 2008 20:12:58 +0000 (12:12 -0800)]
mds: mark dn clean before removing

This ensures we clean up the xlist_dirty xlist::item.

16 years agokclient: debug info for connections and crc errors
Yehuda Sadeh [Fri, 21 Nov 2008 19:20:19 +0000 (11:20 -0800)]
kclient: debug info for connections and crc errors

16 years agoosd: fix read_superblock
Sage Weil [Fri, 21 Nov 2008 18:23:37 +0000 (10:23 -0800)]
osd: fix read_superblock

16 years agodstart: fix crush map typo
Sage Weil [Fri, 21 Nov 2008 17:59:27 +0000 (09:59 -0800)]
dstart: fix crush map typo

16 years agoosd: fix default crush map rules
Sage Weil [Fri, 21 Nov 2008 18:21:07 +0000 (10:21 -0800)]
osd: fix default crush map rules

16 years agocrush: introduce crush magic
Sage Weil [Fri, 21 Nov 2008 17:49:18 +0000 (09:49 -0800)]
crush: introduce crush magic

This is a disk format change.

16 years agocrush: make recurse_to_leaf slightly less fragile
Sage Weil [Fri, 21 Nov 2008 17:43:31 +0000 (09:43 -0800)]
crush: make recurse_to_leaf slightly less fragile

16 years agoosd: don't fail assertion on out empty ops list (i.e. no-op)
Sage Weil [Fri, 21 Nov 2008 01:01:18 +0000 (17:01 -0800)]
osd: don't fail assertion on out empty ops list (i.e. no-op)

16 years agoosd: include magic in osd volumes
Sage Weil [Fri, 21 Nov 2008 01:00:36 +0000 (17:00 -0800)]
osd: include magic in osd volumes

16 years agoosd: get rid of snaptrimming/snaptrimqueue pg states
Sage Weil [Fri, 21 Nov 2008 01:00:25 +0000 (17:00 -0800)]
osd: get rid of snaptrimming/snaptrimqueue pg states

These aren't helpful, since we only report pg states from the
primary osd, and snaptrimming occurs on replicas as well.

16 years agomon: include magic in mondata
Sage Weil [Fri, 21 Nov 2008 00:59:08 +0000 (16:59 -0800)]
mon: include magic in mondata

16 years agoosdmap: use chooseleaf in default crush map
Sage Weil [Fri, 21 Nov 2008 00:25:43 +0000 (16:25 -0800)]
osdmap: use chooseleaf in default crush map

16 years agoosd: fix transaction argument order
Sage Weil [Thu, 20 Nov 2008 23:08:24 +0000 (15:08 -0800)]
osd: fix transaction argument order

The order of evaluation is ambiguous, apparently!
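
For background: C++ leaves the order in which function arguments are evaluated unspecified, so two side-effecting expressions passed to one call may run in either order. The sketch below shows the usual remedy of evaluating into named locals first. The counter type and function are invented for illustration and are not taken from the OSD code.

```cpp
#include <cassert>
#include <utility>

// Hypothetical source of side effects: each call returns the next number.
struct counter_sketch {
  int n = 0;
  int next() { return n++; }
};

// Writing f(c.next(), c.next()) would leave it to the compiler which call
// runs first. Separate statements make the order explicit.
inline std::pair<int, int> take_two_in_order(counter_sketch& c) {
  int first = c.next();   // always evaluated first
  int second = c.next();  // always evaluated second
  assert(first + 1 == second);
  return std::make_pair(first, second);
}
```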

16 years agoOSD: pg ref count debugging
Sage Weil [Thu, 20 Nov 2008 22:47:30 +0000 (14:47 -0800)]
OSD: pg ref count debugging

16 years agoosd: use map_lock to avoid osdmap update race in _finish_recovery
Sage Weil [Thu, 20 Nov 2008 22:46:15 +0000 (14:46 -0800)]
osd: use map_lock to avoid osdmap update race in _finish_recovery

16 years agoosd: convert recovery to a work queue
Sage Weil [Thu, 20 Nov 2008 21:56:36 +0000 (13:56 -0800)]
osd: convert recovery to a work queue

16 years agowq: use a single lock
Sage Weil [Thu, 20 Nov 2008 21:56:25 +0000 (13:56 -0800)]
wq: use a single lock

16 years agoosd: clean up mkfs vs peek_super
Sage Weil [Thu, 20 Nov 2008 22:40:08 +0000 (14:40 -0800)]
osd: clean up mkfs vs peek_super

16 years agoosd: fix peek_whoami
Sage Weil [Thu, 20 Nov 2008 22:00:33 +0000 (14:00 -0800)]
osd: fix peek_whoami

16 years agoosd: make peek_whoami verify fsid
Sage Weil [Thu, 20 Nov 2008 21:54:44 +0000 (13:54 -0800)]
osd: make peek_whoami verify fsid

16 years agoosd: decode superblock properly
Sage Weil [Thu, 20 Nov 2008 21:54:34 +0000 (13:54 -0800)]
osd: decode superblock properly

16 years agovstart: 3 osds
Sage Weil [Thu, 20 Nov 2008 21:42:06 +0000 (13:42 -0800)]
vstart: 3 osds

16 years agocstring: pre-terminate even if content unspecified
Sage Weil [Thu, 20 Nov 2008 21:31:13 +0000 (13:31 -0800)]
cstring: pre-terminate even if content unspecified

16 years agoosd: feed type name into workqueue
Sage Weil [Wed, 19 Nov 2008 19:19:28 +0000 (11:19 -0800)]
osd: feed type name into workqueue

16 years agoosd: convert snap trimming to snap_trim_wq
Sage Weil [Wed, 19 Nov 2008 19:14:53 +0000 (11:14 -0800)]
osd: convert snap trimming to snap_trim_wq

16 years agoosd: basic scrub works
Sage Weil [Wed, 19 Nov 2008 19:04:37 +0000 (11:04 -0800)]
osd: basic scrub works

16 years agoosd: update stats when op is applied
Sage Weil [Wed, 19 Nov 2008 19:04:23 +0000 (11:04 -0800)]
osd: update stats when op is applied

16 years agoosd: add scrub wq
Sage Weil [Wed, 19 Nov 2008 00:46:52 +0000 (16:46 -0800)]
osd: add scrub wq

16 years agoos: use nstring instead of string for attrsets
Sage Weil [Thu, 20 Nov 2008 21:22:37 +0000 (13:22 -0800)]
os: use nstring instead of string for attrsets

16 years agoebofs: fix occasional bdev shutdown hang
Sage Weil [Thu, 20 Nov 2008 20:44:37 +0000 (12:44 -0800)]
ebofs: fix occasional bdev shutdown hang

16 years agoosd: adjust merge_log
Sage Weil [Thu, 20 Nov 2008 19:54:09 +0000 (11:54 -0800)]
osd: adjust merge_log

Object should only be marked missing if new entry is newer.  If
they are the same, it may or may not be missing (depending on
whether it was before merge_log).

16 years agomsgr: ref count message while they are owned by the messenger
Sage Weil [Thu, 20 Nov 2008 19:31:12 +0000 (11:31 -0800)]
msgr: ref count message while they are owned by the messenger

Users still assume they hold the only reference, at least until
they call send_message.

One caveat is that ms_handle_failure is passed a message with an
unknown number of refs.  The method should not try to free or
re-use the message.

16 years agomsgr: reference count messenger
Sage Weil [Thu, 20 Nov 2008 18:36:19 +0000 (10:36 -0800)]
msgr: reference count messenger

We want an explicit destroy() method, because the SimpleMessenger
needs to join the dispatch thread, and that can't happen just on
the last reference drop because that may happen in the dispatch
thread itself.

16 years agovstart: launch valgrind with --valgrind
Sage Weil [Thu, 20 Nov 2008 18:32:42 +0000 (10:32 -0800)]
vstart: launch valgrind with --valgrind

just for cosd atm

16 years agomsgr todo
Sage Weil [Thu, 20 Nov 2008 17:52:33 +0000 (09:52 -0800)]
msgr todo

16 years agoMakefile: adjust link order (libcommon first _and_ last)
Sage Weil [Thu, 20 Nov 2008 17:52:22 +0000 (09:52 -0800)]
Makefile: adjust link order (libcommon first _and_ last)

16 years agoebofs: add new objects to main collection
Sage Weil [Wed, 19 Nov 2008 19:00:25 +0000 (11:00 -0800)]
ebofs: add new objects to main collection

16 years agomds: remove session from xlist before deleting
Sage Weil [Thu, 20 Nov 2008 16:15:13 +0000 (08:15 -0800)]
mds: remove session from xlist before deleting

16 years agoclient: remove xlist items before deleting
Sage Weil [Thu, 20 Nov 2008 16:12:45 +0000 (08:12 -0800)]
client: remove xlist items before deleting

16 years agomds: remove Capability from session list before deleting
Sage Weil [Thu, 20 Nov 2008 16:09:27 +0000 (08:09 -0800)]
mds: remove Capability from session list before deleting

16 years agomds: pull scatterlock of xlist in destructor
Sage Weil [Thu, 20 Nov 2008 06:01:14 +0000 (22:01 -0800)]
mds: pull scatterlock of xlist in destructor

Really, this should happen sooner, but for now this is equivalent to the
old xlist::item destructor.

16 years agoos: clean up ObjectStore::Transaction interface
Sage Weil [Thu, 20 Nov 2008 05:50:39 +0000 (21:50 -0800)]
os: clean up ObjectStore::Transaction interface

Also, fix attrset * thing.. that doesn't look safe!

16 years agomsg: initialize footer
Sage Weil [Thu, 20 Nov 2008 05:48:13 +0000 (21:48 -0800)]
msg: initialize footer

16 years agolockdep: turn lockdep off during shutdown
Sage Weil [Thu, 20 Nov 2008 05:41:26 +0000 (21:41 -0800)]
lockdep: turn lockdep off during shutdown

We can't function after the static items in lockdep.cc destruct.  Disable
lockdep before that happens.

16 years agoosd: fix osd_reqid_t hash
Sage Weil [Thu, 20 Nov 2008 05:34:00 +0000 (21:34 -0800)]
osd: fix osd_reqid_t hash

blobhash is only safe on packed types.
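
The likely reason is padding: hashing the raw bytes of a struct also hashes any padding bytes, whose values are indeterminate, so two equal keys can hash differently. A field-wise hash avoids that. The struct layout below is an assumption chosen to force padding on common ABIs and is not the real osd_reqid_t; the combining constant is just a conventional mixing step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical key with padding between the fields on typical ABIs.
struct reqid_sketch {
  uint8_t type;
  uint64_t tid;
};

// Hash each field individually so padding bytes never contribute.
inline std::size_t hash_fields(const reqid_sketch& r) {
  std::size_t h = std::hash<uint8_t>()(r.type);
  h ^= std::hash<uint64_t>()(r.tid) + 0x9e3779b9 + (h << 6) + (h >> 2);
  return h;
}
```

The alternative is to declare the type packed so that no padding exists, which is what makes a byte-wise hash safe.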

16 years agofilestore: pad with zeroed buffer
Sage Weil [Thu, 20 Nov 2008 05:33:32 +0000 (21:33 -0800)]
filestore: pad with zeroed buffer

Shut up valgrind

16 years agoxlist: enforce removal from xlist
Sage Weil [Thu, 20 Nov 2008 03:43:58 +0000 (19:43 -0800)]
xlist: enforce removal from xlist

We want to ensure that removal takes place in the correct locking context,
not whatever context the ::item is destroyed in.
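
One way to enforce this, sketched under assumptions: the item destructor asserts that the item is already off its list instead of silently unlinking itself. The class below is a simplified stand-in written for illustration, not the real xlist.

```cpp
#include <cassert>

// Simplified intrusive list; the names mirror xlist but the code is hypothetical.
template <typename T>
class xlist_sketch {
public:
  struct item {
    T value;
    xlist_sketch* list = nullptr;
    item* prev = nullptr;
    item* next = nullptr;

    explicit item(T v) : value(v) {}
    // The owner must have removed us already, under the right lock.
    ~item() { assert(!list); }

    bool is_on_list() const { return list != nullptr; }
    void remove_myself() { if (list) list->remove(this); }
  };

  void push_back(item* i) {
    i->remove_myself();
    i->list = this;
    i->prev = tail;
    i->next = nullptr;
    if (tail) tail->next = i; else head = i;
    tail = i;
    ++len;
  }

  void remove(item* i) {
    assert(i->list == this);
    if (i->prev) i->prev->next = i->next; else head = i->next;
    if (i->next) i->next->prev = i->prev; else tail = i->prev;
    i->list = nullptr;
    i->prev = i->next = nullptr;
    --len;
  }

  int size() const { return len; }

private:
  item* head = nullptr;
  item* tail = nullptr;
  int len = 0;
};
```

Destroying an item that is still linked trips the assertion, which surfaces the missing explicit removal at the point where the locking context is wrong.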

16 years agomds: fix uninit value
Sage Weil [Thu, 20 Nov 2008 00:47:12 +0000 (16:47 -0800)]
mds: fix uninit value

16 years agoosd: adjust missing in merge_old_entry
Sage Weil [Thu, 20 Nov 2008 00:38:27 +0000 (16:38 -0800)]
osd: adjust missing in merge_old_entry

Our "old" entry may have been newer, and missing.. remove from missing, and re-add
"new" entry to ensure missing reflects the correct object version.

16 years agoosd: type cleanup
Sage Weil [Wed, 19 Nov 2008 18:30:10 +0000 (10:30 -0800)]
osd: type cleanup

16 years agoosd: more merge_log updates
Sage Weil [Thu, 20 Nov 2008 00:28:09 +0000 (16:28 -0800)]
osd: more merge_log updates

16 years agomonclient: dont free messenger until races there are fixed
Sage Weil [Thu, 20 Nov 2008 00:28:49 +0000 (16:28 -0800)]
monclient: dont free messenger until races there are fixed

16 years agocmonctl: fix busy loop
Sage Weil [Wed, 19 Nov 2008 21:12:57 +0000 (13:12 -0800)]
cmonctl: fix busy loop

16 years agomon: ignore 0-byte latest
Sage Weil [Wed, 19 Nov 2008 21:09:38 +0000 (13:09 -0800)]
mon: ignore 0-byte latest

16 years agoosd: merge_log fix when logs abut but do not overlap
Sage Weil [Wed, 19 Nov 2008 18:31:31 +0000 (10:31 -0800)]
osd: merge_log fix when logs abut but do not overlap