From 79a9f299253e24d20547131b3c9c9e0667e3b869 Mon Sep 17 00:00:00 2001
From: Kefu Chai
Date: Wed, 31 Aug 2016 00:59:58 +0800
Subject: [PATCH] doc: document monitor store recovery in
 troubleshooting-mon.rst

Document the process to recover from LevelDB corruption of the monitor
store.

Fixes: http://tracker.ceph.com/issues/17179
Signed-off-by: Kefu Chai
---
 .../troubleshooting/troubleshooting-mon.rst | 77 +++++++++++++++++++
 src/tools/rebuild_mondb.cc                  |  5 +-
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/doc/rados/troubleshooting/troubleshooting-mon.rst b/doc/rados/troubleshooting/troubleshooting-mon.rst
index 4dcd5a429b734..a9feb6d2a1d7f 100644
--- a/doc/rados/troubleshooting/troubleshooting-mon.rst
+++ b/doc/rados/troubleshooting/troubleshooting-mon.rst
@@ -383,6 +383,81 @@ example::
 
   iptables -A INPUT -m multiport -p tcp -s {ip-address}/{netmask} --dports 6789,6800:7300 -j ACCEPT
 
+Monitor Store Failures
+======================
+
+Symptoms of store corruption
+----------------------------
+
+A Ceph monitor stores the `cluster map`_ in a key/value store such as
+LevelDB. If a monitor fails due to key/value store corruption, the following
+error messages might be found in the monitor log::
+
+  Corruption: error in middle of record
+
+or::
+
+  Corruption: 1 missing files; e.g.: /var/lib/ceph/mon/mon.0/store.db/1234567.ldb
+
+Recovery using healthy monitor(s)
+---------------------------------
+
+If there are any survivors, we can always `replace`_ the corrupted monitor
+with a new one. After booting up, the new joiner will sync up with a healthy
+peer, and once it is fully synchronized, it will be able to serve clients.
+
+Recovery using OSDs
+-------------------
+
+But what if all monitors fail at the same time? Since users are encouraged to
+deploy at least three monitors in a Ceph cluster, the chance of a simultaneous
+failure is small. But an unplanned power-down in a data center with improperly
+configured disk/fs settings could corrupt the underlying filesystem, and hence
+kill all the monitors. In this case, we can recover the monitor store with the
+information stored in OSDs::
+
+  ms=/tmp/mon-store
+  mkdir $ms
+  # collect the cluster map from OSDs
+  for host in $hosts; do
+    rsync -avz $ms user@host:$ms
+    rm -rf $ms
+    ssh user@host <<EOF
+    for osd in /var/lib/ceph/osd/ceph-*; do
+      ceph-objectstore-tool --data-path \$osd --op update-mon-db --mon-store-path $ms
+    done
+  EOF
+    rsync -avz user@host:$ms $ms
+  done
+  # rebuild the monitor store from the collected map; if the cluster does not
+  # use cephx authentication, the "--keyring" option can be dropped, i.e.
+  # "ceph-monstore-tool /tmp/mon-store rebuild"
+  ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /path/to/admin.keyring
+  # back up the corrupted store.db, just in case
+  mv /var/lib/ceph/mon/mon.0/store.db /var/lib/ceph/mon/mon.0/store.db.corrupted
+  mv /tmp/mon-store/store.db /var/lib/ceph/mon/mon.0/store.db
+
+In short, the script above
+
+#. collects the cluster map from all OSD hosts,
+#. rebuilds the monitor store from the collected map, and
+#. replaces the corrupted store of ``mon.0`` with the recovered copy.
+
+Known limitations
+~~~~~~~~~~~~~~~~~
+
+The following information is not recoverable using the steps above:
+
+- **some added keyrings**: all the OSD keyrings added using the ``ceph auth
+  add`` command are recovered from the OSDs' copies, and the ``client.admin``
+  keyring is imported using ``ceph-monstore-tool``, but the MDS keyrings and
+  other keyrings will be missing in the recovered monitor store, so you might
+  need to re-add them manually.
+
+- **pg settings**: the ``full ratio`` and ``nearfull ratio`` settings
+  configured using ``ceph pg set_full_ratio`` and ``ceph pg set_nearfull_ratio``
+  will be lost.
+
+- **MDS Maps**: the MDS maps are lost.
+
+.. _cluster map: ../../architecture#cluster-map
+.. _replace: ../operations/add-or-rm-mons
+
diff --git a/src/tools/rebuild_mondb.cc b/src/tools/rebuild_mondb.cc
     t->erase(prefix, e);
     t->erase(prefix, ms.combine_strings("full", e));
     ntrimmed++;
-- 
2.39.5
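
To tell which monitors are actually affected before attempting either
recovery path, the corruption messages above can simply be searched for in
each monitor's log. A minimal sketch, assuming the default log location under
``/var/log/ceph`` (file names vary with the cluster name and monitor ids)::

  # scan every monitor log on this host for LevelDB corruption messages
  for log in /var/log/ceph/ceph-mon.*.log; do
      echo "== $log =="
      grep 'Corruption:' "$log" | tail -n 5
  done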
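
The `replace`_ procedure referenced above amounts to giving the corrupted
monitor an empty store and letting it full-sync from the surviving quorum. A
sketch of that flow, assuming the failed monitor is ``mon.0``, its store
lives under ``/var/lib/ceph/mon/mon.0`` as in the examples above, and systemd
manages the daemons (unit names and data paths vary across deployments)::

  # stop the corrupted monitor and set its broken store aside
  systemctl stop ceph-mon@0
  mv /var/lib/ceph/mon/mon.0 /var/lib/ceph/mon/mon.0.corrupted
  # fetch the current monmap and the mon. secret from the surviving quorum
  ceph mon getmap -o /tmp/monmap
  ceph auth get mon. -o /tmp/mon.keyring
  # recreate an empty store; the monitor will sync from a healthy peer
  # once it boots
  ceph-mon -i 0 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
  systemctl start ceph-mon@0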
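
After the OSD-based rebuild, the recovered store still has to be handed back
to the monitor and verified. A short follow-up sketch, again assuming
``mon.0``, systemd, and daemons running as the ``ceph`` user::

  # the rebuilt store was created by root; give it back to the ceph user
  chown -R ceph:ceph /var/lib/ceph/mon/mon.0/store.db
  # restart the monitor and confirm that it forms a quorum again
  systemctl start ceph-mon@0
  ceph daemon mon.0 mon_status
  ceph -s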
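
The items listed under "Known limitations" can be re-applied by hand once the
cluster is back. A sketch, assuming a single MDS named ``a`` whose keyring
still exists on the MDS host, and the default full/nearfull values (adjust
the capabilities and ratios to whatever was configured before the failure)::

  # re-add the lost MDS keyring from the copy kept on the MDS host
  ceph auth add mds.a mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
      -i /var/lib/ceph/mds/ceph-a/keyring
  # re-apply the full/nearfull ratios that were lost with the old store
  ceph pg set_full_ratio 0.95
  ceph pg set_nearfull_ratio 0.85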