From: Noah Watkins
Date: Sun, 24 Mar 2013 18:22:11 +0000 (-0700)
Subject: doc: update Hadoop docs with plugin download
X-Git-Tag: v0.67-rc1~81^2
X-Git-Url: http://git.apps.os.sepia.ceph.com/?a=commitdiff_plain;h=6ef48c7ec2a54cf68023851e573fd1e7c19b4d8e;p=ceph.git

doc: update Hadoop docs with plugin download

Signed-off-by: Noah Watkins
---

diff --git a/doc/cephfs/hadoop.rst b/doc/cephfs/hadoop.rst
index 8c29376269ec8..fc3f3000fb0d7 100644
--- a/doc/cephfs/hadoop.rst
+++ b/doc/cephfs/hadoop.rst
@@ -2,18 +2,17 @@
 Using Hadoop with CephFS
 ========================
 
-The Ceph file system can be used in place of HDFS in a Hadoop installation
-by using the Ceph file system client Java package, and requires no changes to
-the Hadoop code base.
-
-The Apache Hadoop project is a framework for building data-intensive
-applications. Applications built for the Hadoop framework include MapReduce,
-HBase, Hive, Mahout, and many others. Data management in Hadoop is handled by
-a distributed file system, and the default file system supported by Hadoop is
-the Hadoop Distributed File System (HDFS). However, Hadoop is not restricted
-to using HDFS, and any alternative file system can be used with Hadoop by
-plugging in a different implementation of the Hadoop virtual file system
-layer.
+The Ceph file system can be used as a drop-in replacement for the Hadoop File
+System (HDFS). This page describes the installation and configuration process
+of using Ceph with Hadoop.
+
+Dependencies
+============
+
+* CephFS Java Interface
+* Hadoop CephFS Plugin
+
+.. important:: Currently requires Hadoop 1.1.X stable series
 
 Installation
 ============
@@ -23,8 +22,6 @@ Ceph installation is required. The details of setting up a Ceph cluster and
 the file system are beyond the scope of this document. Please refer to the
 Ceph documentation for installing Ceph.
 
-.. important:: The master branch is currently required for compatibility.
-
 The remaining two requirements are a Hadoop installation, and the Ceph file
 system Java packages, including the Java CephFS Hadoop plugin. The high-level
 steps are two add the dependencies to the Hadoop installation ``CLASSPATH``,
@@ -33,8 +30,7 @@ and configure Hadoop to use the Ceph file system.
 CephFS Java Packages
 --------------------
 
-* CephFS Java package is located
-* CephFS Hadoop plugin is located
+* CephFS Hadoop plugin (`hadoop-cephfs.jar `_)
 
 Adding these dependencies to a Hadoop installation will depend on your
 particular deployment. In general the dependencies must be present on each
@@ -57,7 +53,7 @@ These options are intended to be set in the Hadoop configuration file
 |Property             |Value                     |Notes                       |
 |                     |                          |                            |
 +=====================+==========================+============================+
-|fs.default.name      |Ceph URI                  |ceph://[monaddr:port][/root]|
+|fs.default.name      |Ceph URI                  |ceph://[monaddr:port]/      |
 |                     |                          |                            |
 |                     |                          |                            |
 +---------------------+--------------------------+----------------------------+