Sage Weil [Thu, 22 Dec 2011 23:25:00 +0000 (15:25 -0800)]
filestore: fix config observer
Actually, I don't think this was fully implemented to begin with, so it's
not a 'fix' per se. This lets you use injectargs to adjust the
filestore config options at runtime.
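
The observer idea can be sketched as follows. This is a toy model, not
Ceph's actual config interface: the Config and StoreObserver classes, the
inject() method, and the example_sync_interval key are all hypothetical
names chosen for illustration.

```python
class Config:
    """Toy config store that notifies observers of runtime changes."""

    def __init__(self, **values):
        self._values = dict(values)
        self._observers = []

    def add_observer(self, observer):
        self._observers.append(observer)

    def get(self, key):
        return self._values[key]

    def inject(self, key, value):
        # Stand-in for a runtime injection: update the value, then notify
        # only the observers that declared interest in this key.
        self._values[key] = value
        for obs in self._observers:
            if key in obs.tracked_keys():
                obs.handle_conf_change(self, {key})


class StoreObserver:
    """Hypothetical store that caches one option and refreshes it on change."""

    def __init__(self, conf):
        self.sync_interval = conf.get("example_sync_interval")

    def tracked_keys(self):
        return {"example_sync_interval"}

    def handle_conf_change(self, conf, changed):
        if "example_sync_interval" in changed:
            self.sync_interval = conf.get("example_sync_interval")


conf = Config(example_sync_interval=5.0, unrelated=1)
store = StoreObserver(conf)
conf.add_observer(store)
conf.inject("example_sync_interval", 0.5)
```

Without the callback, the cached sync_interval would keep its startup
value even after the option changed.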
Samuel Just [Thu, 22 Dec 2011 17:44:33 +0000 (09:44 -0800)]
ReplicatedPG: init backfill infos to last_backfill
We can scan starting from last_backfill to avoid rescanning portions
of the collection recovered by normal recovery. collection_list_partial
now includes begin if present. next will be <= the next object in the
collection. This way we can scan starting at last_backfill without
skipping last_backfill.
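
A minimal sketch of an inclusive partial listing, assuming the collection
is a sorted Python list. list_partial and its parameters are hypothetical
and do not mirror the real collection_list_partial signature. Here next
is exactly the first object not returned, which satisfies the "<= the
next object" condition above.

```python
from bisect import bisect_left


def list_partial(sorted_objects, begin, max_count):
    """Return (batch, next) for a sorted collection.

    The batch starts at `begin` itself when it is present (inclusive).
    `next` is the first object not returned, or None at the end.
    """
    start = bisect_left(sorted_objects, begin)
    end = min(start + max_count, len(sorted_objects))
    batch = sorted_objects[start:end]
    nxt = sorted_objects[end] if end < len(sorted_objects) else None
    return batch, nxt


objects = ["a", "c", "e", "g"]
# Resuming at "c" returns "c" again rather than skipping it.
batch, nxt = list_partial(objects, "c", 2)
```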
Samuel Just [Sat, 17 Dec 2011 02:04:32 +0000 (18:04 -0800)]
calc_acting: Prefer up[0] as primary if possible
Previously, we could get into a state where although up[0] has been
fully backfilled, acting[0] could be selected as a primary if it is able
to pull another peer into the acting set. This also collects the logic
of choosing the best info into a helper function.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Mon, 19 Dec 2011 22:50:17 +0000 (14:50 -0800)]
MOSDRepScrub,ReplicatedPG: Add scrub_to to MOSDRepScrub
When scrub_from is set, also set scrub_to to the primary's
last_update_applied (which will also be the official last_update before
finalizing scrub began). Instead of waiting for last_update_applied to
catch up to last_update, the replica will wait for last_update_applied
to catch up to active_rep_scrub->scrub_to. This
avoids a race where the replica scrub is requeued before all of the
currently queued sub-ops have been processed.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
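
One reading of the scrub_to race, sketched with versions modelled as
(epoch, version) tuples. The Replica class and its method names are
hypothetical. The assumption is that sub-ops still sitting in the queue
have not yet advanced the replica's own last_update, so comparing against
it can succeed too early.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    last_update: tuple          # newest version seen locally, (epoch, version)
    last_update_applied: tuple  # newest version applied to the store

    def ready_old(self):
        # Earlier condition: compare against the replica's own last_update.
        return self.last_update_applied >= self.last_update

    def ready_new(self, scrub_to):
        # New condition: compare against the version the primary sent.
        return self.last_update_applied >= scrub_to


# The primary is at (3, 10); the replica has only processed up to (3, 8).
scrub_to = (3, 10)
replica = Replica(last_update=(3, 8), last_update_applied=(3, 8))
early_old = replica.ready_old()          # True: scrub would start too soon
early_new = replica.ready_new(scrub_to)  # False: keeps waiting

# Once the queued sub-ops are applied, the new condition is satisfied.
replica.last_update = replica.last_update_applied = (3, 10)
```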
Sage Weil [Wed, 30 Nov 2011 22:13:14 +0000 (14:13 -0800)]
filejournal: uuid for fsid
Decode old header struct, but encode new class using more normal encoding
style. Embed in a bufferlist for later extensibility. Use the first
64 bits of the uuid for the per-entry magic, as before.
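
Deriving a 64-bit magic from a uuid can be sketched like this. The
entry_magic name and the little-endian byte order are assumptions made
for illustration; the journal code may read the bytes differently.

```python
import uuid


def entry_magic(fsid: uuid.UUID) -> int:
    # First 8 of the 16 UUID bytes, read as an unsigned 64-bit integer.
    # Little-endian is an assumption, not taken from the commit.
    return int.from_bytes(fsid.bytes[:8], "little")


fsid = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
magic = entry_magic(fsid)
```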
Kyle Marsh [Sat, 17 Dec 2011 00:05:46 +0000 (16:05 -0800)]
obsync: add authurl to CLI
S3 connections require the hostname and Swift connections require the
authurl. obsync treats these as equivalent internally, but separates them
on the command line interface for clarity.
ctx->at_version should match the head of the new log entries
during issue_repop. This could cause the scrub hang bug as
last_update would be less than last_update_applied.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just [Thu, 15 Dec 2011 19:16:26 +0000 (11:16 -0800)]
OSD: use disk_tp.pause() without osd_lock
Previously, we called disk_tp.pause_new(). This can cause a race
where snap_trimmer queues more transactions after we flush the
store. Calling disk_tp.pause() under the osd_lock causes a
deadlock with pg removal.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Holger Macht [Thu, 15 Dec 2011 16:51:04 +0000 (17:51 +0100)]
ceph.spec: Clean up and fix spec file and build for a couple of distributions
Clean up and fix the spec file. This includes cleaning up build and
installed system dependencies, LSB compliance fixes, splitting into
several sub-packages (lib*), and so on. It now builds for the
following distributions in the Open Build Service and should be
considered a starting point for further fixes:
- CentOS 6
- Fedora 15
- Red Hat Enterprise Linux 6
- openSUSE 11.4
- openSUSE 12.1
- openSUSE Factory
- SUSE Linux Enterprise 11 (SP1 and SP2)
Sage Weil [Thu, 15 Dec 2011 02:54:45 +0000 (18:54 -0800)]
osd: wait for src_oid if it is on other side of last_backfill from oid
If the target object is before last_backfill, then the backfill_target
will be asked to apply the operation. If one of the src objects is past
last_backfill, that will fail, so we need to wait until the src object
is no longer degraded.
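
A sketch of the check, following the commit title: wait whenever the
target and the source fall on opposite sides of last_backfill.
must_wait_for_src is a hypothetical helper, objects are plain sortable
strings, and treating last_backfill itself as already backfilled (<=) is
an assumption.

```python
def must_wait_for_src(oid, src_oid, last_backfill):
    # An object at or before last_backfill exists on the backfill target.
    oid_backfilled = oid <= last_backfill
    src_backfilled = src_oid <= last_backfill
    # Opposite sides: the backfill target cannot apply the op consistently.
    return oid_backfilled != src_backfilled


wait_mixed = must_wait_for_src("b", "x", "m")   # target before, src past
wait_both_before = must_wait_for_src("b", "c", "m")
wait_both_past = must_wait_for_src("x", "y", "m")
```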
Sage Weil [Thu, 15 Dec 2011 02:41:10 +0000 (18:41 -0800)]
osd: preserve write order when waiting on src_oids
We need to preserve the order of write operations on each object. If we
have a write on X that needs to read from Y, and Y is degraded, then we
need to wait for Y to repair. Doing so blindly will allow other writes
to X to proceed while our clone op is still waiting, violating the
ordering.
Fix this by adding blocked_by and blocking vars to the ObjectContext. If
we wait on a src_oid, the oid is "blocked" by that object, and any
subsequent writes should also wait on the same queue.
Use a helper to do the cleanup when we complete recovery, or when the
pg resets.
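
The blocked_by/blocking idea can be sketched as below. Only the two field
names come from the text above; the PG class, its methods, and the
waiters queue are hypothetical, and the cleanup helper replays waiters
directly where a fuller version would re-run the checks for each op.

```python
from collections import deque


class ObjectContext:
    def __init__(self, oid):
        self.oid = oid
        self.blocked_by = None  # context this object is waiting on
        self.blocking = set()   # oids of objects waiting on this one
        self.waiters = deque()  # queued (oid, op) pairs, FIFO


class PG:
    def __init__(self):
        self.contexts = {}
        self.degraded = set()
        self.applied = []

    def ctx(self, oid):
        return self.contexts.setdefault(oid, ObjectContext(oid))

    def write(self, oid, op, src_oid=None):
        obc = self.ctx(oid)
        if obc.blocked_by is not None:
            # An earlier write on oid is still waiting: queue behind it.
            obc.blocked_by.waiters.append((oid, op))
            return
        if src_oid is not None and src_oid in self.degraded:
            src = self.ctx(src_oid)
            obc.blocked_by = src
            src.blocking.add(oid)
            src.waiters.append((oid, op))
            return
        self.applied.append((oid, op))

    def finish_recovery(self, src_oid):
        # Cleanup helper: unblock dependents, then replay waiters in order.
        self.degraded.discard(src_oid)
        src = self.ctx(src_oid)
        for oid in src.blocking:
            self.ctx(oid).blocked_by = None
        src.blocking.clear()
        waiters, src.waiters = src.waiters, deque()
        for oid, op in waiters:
            self.applied.append((oid, op))


pg = PG()
pg.degraded.add("Y")
pg.write("X", "clone", src_oid="Y")  # waits on Y
pg.write("X", "w2")                  # must not overtake the clone
pg.write("Z", "w3")                  # unrelated object proceeds
pg.finish_recovery("Y")
```

The second write on X lands on the same queue as the clone, so the two
are applied in their original order once Y recovers.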
Sage Weil [Wed, 14 Dec 2011 01:43:34 +0000 (17:43 -0800)]
osd: track backfill target pg stats
Maintain backfill target pg stats as the sum over objects to
the left of last_backfill. Reflect this in the degraded stats we report
to the monitor.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
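
A rough sketch of the accounting, assuming objects are sortable strings
mapped to sizes in bytes. The function names and stat keys are
hypothetical, and counting every object past last_backfill as degraded is
an illustrative assumption.

```python
def backfill_target_stats(object_sizes, last_backfill):
    # Only objects at or before last_backfill exist on the backfill target.
    done = {o: s for o, s in object_sizes.items() if o <= last_backfill}
    return {"num_objects": len(done), "num_bytes": sum(done.values())}


def degraded_objects(object_sizes, last_backfill):
    # Objects the backfill target does not have yet.
    stats = backfill_target_stats(object_sizes, last_backfill)
    return len(object_sizes) - stats["num_objects"]


sizes = {"a": 10, "b": 20, "c": 30}
stats = backfill_target_stats(sizes, "b")
degraded = degraded_objects(sizes, "b")
```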
Samuel Just [Tue, 13 Dec 2011 21:02:57 +0000 (13:02 -0800)]
ReplicatedPG: calc_*_subsets must consider last_backfill
Objects yet to be backfilled do not show up in the missing set. Thus,
we cannot use an object past last_backfill to clone into the object we
are pushing/pulling.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
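
The restriction can be sketched as a filter over candidate clone sources.
usable_clone_sources is a hypothetical helper and objects are plain
sortable strings. A candidate must be absent from the missing set and
also at or before last_backfill, since objects past that point do not
exist on the peer even though the missing set does not list them.

```python
def usable_clone_sources(candidates, missing, last_backfill):
    return [
        c for c in candidates
        if c not in missing and c <= last_backfill
    ]


# "f" is missing; "q" is past last_backfill and so not yet on the peer.
sources = usable_clone_sources(["a", "f", "q"], {"f"}, "m")
```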