From: Sage Weil <sage@newdream.net>
Date: Wed, 15 Feb 2012 23:20:35 +0000 (-0800)
Subject: osd: do not always clear DEGRADED/set CLEAN on recovery finish
X-Git-Tag: v0.43~74
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=82eceb9a3bd978bb96df5803b6dc7935b88a56ee;p=ceph.git

osd: do not always clear DEGRADED/set CLEAN on recovery finish

Clean means we have exactly the desired number of replicas and recovery
is complete.  Degraded means we do not have enough replicas, either
because recovery is in progress or because the acting set is too small.

A consequence is that if we have a PG with len(up) == 1 but a pg_temp
mapping so that len(acting) == 2, it will be active and not clean.
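
The rule can be sketched in isolation as below.  This is only an
illustration of the two size checks, not the PG code itself: the STATE_*
constants and on_recovery_finish() are hypothetical stand-ins for the
PG_STATE_* bits and PG::finish_recovery(), and pool_size stands in for
whatever get_osdmap()->get_pg_size(info.pgid) returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the PG_STATE_* bits; the values are
// illustrative and do not match the real definitions.
constexpr std::uint32_t STATE_ACTIVE   = 1u << 0;
constexpr std::uint32_t STATE_CLEAN    = 1u << 1;
constexpr std::uint32_t STATE_DEGRADED = 1u << 2;

// Apply the recovery-finish rule to a state bitmask.
//   acting_size: number of OSDs in the acting set
//   pool_size:   desired number of replicas for the PG
std::uint32_t on_recovery_finish(std::uint32_t state,
                                 std::size_t acting_size,
                                 std::size_t pool_size)
{
  assert(pool_size > 0);

  // Enough replicas (possibly more than desired): no longer degraded.
  if (acting_size >= pool_size)
    state &= ~STATE_DEGRADED;

  // Exactly the desired number of replicas: clean.
  if (acting_size == pool_size)
    state |= STATE_CLEAN;

  return state;
}
```

Starting from STATE_ACTIVE | STATE_DEGRADED, an acting set equal to
pool_size ends up active and clean, a smaller one stays degraded, and a
larger one is neither degraded nor clean.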

Fixes: #2060
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
---

diff --git a/src/osd/PG.cc b/src/osd/PG.cc
index 8a42e5e6590..49d7a4181de 100644
--- a/src/osd/PG.cc
+++ b/src/osd/PG.cc
@@ -1541,9 +1541,15 @@ struct C_PG_FinishRecovery : public Context {
 void PG::finish_recovery(ObjectStore::Transaction& t, list<Context*>& tfin)
 {
   dout(10) << "finish_recovery" << dendl;
-  state_clear(PG_STATE_DEGRADED);
   state_clear(PG_STATE_BACKFILL);
-  state_set(PG_STATE_CLEAN);
+
+  // only clear DEGRADED (or mark CLEAN) if we have enough (or the
+  // desired number of) replicas.
+  if (acting.size() >= get_osdmap()->get_pg_size(info.pgid))
+    state_clear(PG_STATE_DEGRADED);
+  if (acting.size() == get_osdmap()->get_pg_size(info.pgid))
+    state_set(PG_STATE_CLEAN);
+
   assert(info.last_complete == info.last_update);
 
   // NOTE: this is actually a bit premature: we haven't purged the