From: Min Chen <minchen@ubuntukylin.com>
Date: Wed, 4 Feb 2015 08:12:01 +0000 (+0800)
Subject: rbd-recover-tool: add useful information of this tool
X-Git-Tag: v0.93~98^2~1
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=a7a6fe4fa00c95e46003d35400ac773d341bad5e;p=ceph.git

rbd-recover-tool: add useful information of this tool

include README, FAQ, TODO

Signed-off-by: Min Chen <minchen@ubuntukylin.com>
---

diff --git a/src/rbd_recover_tool/FAQ b/src/rbd_recover_tool/FAQ
new file mode 100644
index 00000000000..b94b37ea2d3
--- /dev/null
+++ b/src/rbd_recover_tool/FAQ
@@ -0,0 +1,16 @@
+# author: min chen(minchen@ubuntukylin.com) 2014 2015
+
+1. error "get_image_metadata_v2: no meta_header_seq input"
+cause:
+    the offline database is stale
+solution:
+    refresh the database: ./admin_job database
+
+2. Error initializing leveldb: IO error: lock /var/lib/ceph/osd/ceph-0/current/omap/LOCK: Resource temporarily unavailable
+   ERROR: error flushing journal /var/lib/ceph/osd/ceph-0/journal for object store /var/lib/ceph/osd/ceph-0: (1) Operation not permitted
+cause:
+    ./admin_job database was interrupted, but the command had already been sent to each osd node, so a process is still reading leveldb and holding its LOCK.
+    if ./admin_job database is run again, all commands are sent to the osd nodes again while the previous process is still locking leveldb, so all the new
+    commands fail.
+solution:
+    wait until all the previous commands have finished.
diff --git a/src/rbd_recover_tool/README b/src/rbd_recover_tool/README
new file mode 100644
index 00000000000..2e45ad2bcdb
--- /dev/null
+++ b/src/rbd_recover_tool/README
@@ -0,0 +1,97 @@
+# author: min chen(minchen@ubuntukylin.com) 2014 2015
+
+------------- ceph rbd recovery tool -------------
+
+  ceph rbd recover tool is used for recovering a ceph rbd image when all ceph services are killed.
+it is based on ceph-0.80.x (Firefly and newer).
+  currently, the ceph services (ceph-mon, ceph-osd) may become unavailable because of bugs or something else,
+especially on a large scale ceph cluster, so that the ceph cluster can not supply service
+and rbd images can not be accessed. In this case, a tool to recover rbd images is necessary.
+  ceph rbd recover tool is used for exactly this: it collects all objects of an image from the distributed
+osd nodes with the latest pg epoch, and splices the objects by offset into a complete image. To make sure
+the object data is complete, this tool flushes the osd journal on each osd node before recovering.
+  but, there are some limitations:
+-it needs an ssh service and an unobstructed network
+-osd data must be accessible on a local disk
+-clone images are not supported, while snapshots are supported
+-only replicated pools are supported
+
+before you run this tool, you should make sure that:
+1). all ceph processes (ceph-osd, ceph-mon, ceph-mds) are shut down
+2). the ssh daemon is running & the network is ok (ssh to each node without a password)
+3). ceph-kvstore-tool is installed (for ubuntu: apt-get install ceph-test)
+4). the osd disks are not crashed and their data can be accessed on the local filesystem
+
+-architecture:
+
+                      +---- osd.0
+                      |
+admin_node -----------+---- osd.1
+                      |
+                      +---- osd.2
+                      |
+                        ......
+
+-files:
+admin_node: {rbd-recover-tool  common_h  epoch_h  metadata_h  database_h}
+osd:        {osd_job  common_h  epoch_h  metadata_h}    #/var/rbd_tool/osd_job
+in this architecture, admin_node acts as client, osds act as server.
+so, they run different files:
+on admin_node run: rbd-recover-tool <operation> [<parameters>]
+on osd node run:   ./osd_job <operation>
+admin_node will copy the files osd_job, common_h, epoch_h, metadata_h to each remote osd node
+
+
+-config file
+before you run this tool, make sure to write the config files first
+osd_host_path: osd hostnames and osd data paths    #user input
+  osdhost0 /var/lib/ceph/osd/ceph-0
+  osdhost1 /var/lib/ceph/osd/ceph-1
+  ......
+mon_host: all mon node hostnames    #user input
+  monhost0
+  monhost1
+  ......
+mds_host: all mds node hostnames    #user input
+  mdshost0
+  mdshost1
+  ......
+then, the init_env_admin function will create the file osd_host
+osd_host: all osd node hostnames    #generated by admin_job, users can ignore it
+  osdhost0
+  osdhost1
+  ......
+
+
+-usage:
+rbd-recover-tool <operation>
+<operation> :
+database                                                                #generate the offline database: hobject path, node hostname, pg_epoch and image metadata
+list                                                                    #list all images from the offline database
+lookup <pool_id>/<image_name>[@[<snap_name>]]                           #look up image metadata in the offline database
+recover <pool_id>/<image_name>[@[<snap_name>]] [/path/to/store/image]   #recover image data according to the image metadata
+
+-steps:
+1. stop all ceph services: ceph-mon, ceph-osd, ceph-mds
+2. set up the config files: osd_host_path, mon_host, mds_host
+3. rbd-recover-tool database    # this can take a long time
+4. rbd-recover-tool list
+5. rbd-recover-tool recover <pool_id>/<image_name>[@[<snap_name>]] [/path/to/store/image]
+
+
+-debug & error check
+if an admin_node operation fails, you can check it on the osd node:
+cd /var/rbd_tool/osd_job
+./osd_job <operation>
+<operation> :
+do_image_id <image_header_hobject>           #get the image id of an image (format v2)
+do_image_id <image_head_hobject>             #get the image id of an image (format v1)
+do_image_metadata_v1 <image_head_hobject>    #get the image metadata of an image (format v1), maybe the pg epoch is not the latest
+do_image_metadata_v2 <image_header_hobject>  #get the image metadata of an image (format v2), maybe the pg epoch is not the latest
+do_image_list                                #get all images on this osd (image head hobjects)
+do_pg_epoch                                  #get all pg epochs and store them in /var/rbd_tool/single_node/node_pg_epoch
+do_omap_list                                 #list all omap headers and omap entries on this osd
+
+
+-FAQ
+the file FAQ lists some common confusing cases encountered while testing
diff --git a/src/rbd_recover_tool/TODO b/src/rbd_recover_tool/TODO
new file mode 100644
index 00000000000..c36d4c94737
--- /dev/null
+++ b/src/rbd_recover_tool/TODO
@@ -0,0 +1,2 @@
+
+1. support clone image
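
As an illustration of the workflow in the README above, a minimal end-to-end session could look like the sketch below. The hostnames, osd data paths, pool id (2), image name (vm-disk1) and output directory are hypothetical placeholders; the commands themselves are the ones documented in the -config file, -usage and -steps sections.

    # on the admin_node: write the three input config files (plain text, one entry per line)
    cat > osd_host_path <<EOF
    osdhost0 /var/lib/ceph/osd/ceph-0
    osdhost1 /var/lib/ceph/osd/ceph-1
    EOF
    printf 'monhost0\n' > mon_host
    printf 'mdshost0\n' > mds_host

    # with all ceph daemons (ceph-mon, ceph-osd, ceph-mds) stopped on every node:
    ./rbd-recover-tool database                          # build the offline database; can take a long time
    ./rbd-recover-tool list                              # list <pool_id>/<image_name> entries found in the database
    ./rbd-recover-tool lookup 2/vm-disk1                 # inspect the image metadata
    ./rbd-recover-tool recover 2/vm-disk1 /mnt/recover   # splice the objects into an image under /mnt/recover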
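
And a corresponding sketch of the per-osd check described under -debug & error check, limited to the operations that take no arguments (the hobject-based operations need a real hobject path from the osd's data directory):

    # on one osd node, after the admin_node has copied osd_job, common_h, epoch_h, metadata_h over
    cd /var/rbd_tool/osd_job
    ./osd_job do_pg_epoch      # collect pg epochs into /var/rbd_tool/single_node/node_pg_epoch
    ./osd_job do_image_list    # print the image head hobjects present on this osd
    ./osd_job do_omap_list     # dump omap headers and entries from this osd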