From ae472dbf8726b2e1f8da6c755f73915809b66168 Mon Sep 17 00:00:00 2001 From: sageweil Date: Fri, 9 Feb 2007 19:13:52 +0000 Subject: [PATCH] some web page updates git-svn-id: https://ceph.svn.sf.net/svnroot/ceph@1092 29311d96-e01e-0410-9327-a35deaab8ce9 --- trunk/web/index.body | 30 ++++++++++++++++++------------ trunk/web/overview.body | 12 ++++++------ 2 files changed, 24 insertions(+), 18 deletions(-) diff --git a/trunk/web/index.body b/trunk/web/index.body index ce727d6bde7d5..9e2c79f8f4f19 100644 --- a/trunk/web/index.body +++ b/trunk/web/index.body @@ -5,27 +5,24 @@ Ceph is a distributed network file system designed to provide excellent performance, reliability, and scalability. Ceph fills two significant gaps in the array of currently available file systems:
    -
  1. Petabyte-scale storage -- Ceph is built from the ground up to seamlessly and gracefully scale from gigabytes to petabytes and beyond. Scalability is considered in terms of workload as well as total storage. Ceph gracefully handles workloads in which tens thousands of clients or more simultaneously access the same file, or write to the same directory--usage scenarios that bring existing enterprise storage systems to their knees. -
  2. Robust, open-source distributed storage -- Ceph is released under the terms of the LGPL, which means it is free software (as in speech). Ceph will provide a variety of key features that are sorely lacking from existing open-source file systems, including snapshots, seamless scalability (the ability to simply add disks to expand volumes), and intelligent load balancing. +
  3. Petabyte-scale storage -- Ceph is built from the ground up to seamlessly and gracefully scale from gigabytes to petabytes and beyond. Scalability is considered in terms of workload as well as total storage. Ceph is designed to gracefully handle workloads in which tens of thousands of clients or more simultaneously access the same file, or write to the same directory--usage scenarios that bring typical enterprise storage systems to their knees. +
  4. Robust, open-source distributed storage -- Ceph is released under the terms of the LGPL, which means it is free software (as in speech). Ceph will provide a variety of key features that are sorely lacking from existing open-source file systems, including seamless scalability (the ability to simply add disks to expand volumes), intelligent load balancing, and snapshot functionality.
Here are some of the key features that make Ceph different from existing file systems that you may have worked with:
    -
  1. Seamless scaling -- A Ceph filesystem can be seamlessly expanded by simply adding storage nodes (OSDs). However, unlike most existing file systems, Ceph proactively migrates data onto new devices in order to maintain a balanced distribution of data that effectively utilizes all available resources (disk bandwidth and spindles) and avoids data hot spots (e.g., active data residing primarly on old disks while newer disks sit empty and idle). +
  2. Seamless scaling -- A Ceph filesystem can be seamlessly expanded by simply adding storage nodes (OSDs). However, unlike most existing file systems, Ceph proactively migrates data onto new devices in order to maintain a balanced distribution of data. This effectively utilizes all available resources (disk bandwidth and spindles) and avoids data hot spots (e.g., active data residing primarily on old disks while newer disks sit empty and idle); see the placement sketch after this list.
  3. Strong reliability and fast recovery -- All data in Ceph is replicated across multiple OSDs. If any OSD fails, data is automatically re-replicated to other devices. However, unlike typical RAID systems, the replicas for data on each disk are spread out among a large number of other disks, and when a disk fails, the replacement replicas are also distributed across many disks. This allows recovery to proceed in parallel (with dozens of disks copying to dozens of other disks), removing the need for explicit "spare" disks (which are effectively wasted until they are needed) and preventing a single disk from becoming a "RAID rebuild" bottleneck.
  4. Adaptive MDS -- The Ceph metadata server (MDS) is designed to dynamically adapt its behavior to the current workload. If thousands of clients suddenly access a single file or directory, that metadata is dynamically replicated across multiple servers to distribute the workload. Similarly, as the size and popularity of the file system hierarchy changes over time, that hierarchy is dynamically redistributed among available metadata servers in order to balance load and most effectively use server resources. (In contrast, current file systems force system administrators to carve their data set into static "volumes" and assign volumes to servers. Volume sizes and workloads inevitably shift over time, forcing administrators to constantly shuffle data between servers or manually allocate new resources where they are currently needed.)
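To make the rebalancing and declustering claims in the list above concrete, here is a minimal sketch. It is not Ceph's actual placement function (that is CRUSH, described in the SC'06 paper mentioned below); it uses a simple rendezvous-style hash that has the same flavor: adding OSDs remaps only a fraction of the objects, and the replicas of any one OSD's objects are scattered across nearly all of its peers.

```python
# Minimal sketch of pseudo-random, declustered replica placement.
# NOT Ceph's real CRUSH algorithm -- just a rendezvous-style hash
# illustrating the two properties claimed above.
import hashlib

def place(obj: str, osds: list[int], replicas: int = 2) -> list[int]:
    """Rank OSDs by a per-object hash; the top `replicas` hold the object."""
    def score(osd: int) -> str:
        return hashlib.sha1(f"{obj}:{osd}".encode()).hexdigest()
    return sorted(osds, key=score)[:replicas]

objects = [f"obj{i}" for i in range(10000)]
before = {o: place(o, list(range(10))) for o in objects}  # 10 OSDs
after = {o: place(o, list(range(12))) for o in objects}   # 2 OSDs added

# Seamless scaling: only a fraction of placements change; the rest of
# the data stays put -- no global reshuffle, no allocation tables.
moved = sum(before[o] != after[o] for o in objects)
print(f"placements changed by adding 2 OSDs: {moved / len(objects):.1%}")

# Fast recovery: the other replicas of OSD 0's objects are spread across
# almost every other OSD, so re-replication can proceed in parallel.
peers = {osd for o in objects if 0 in after[o] for osd in after[o]} - {0}
print(f"OSD 0 shares replicas with {len(peers)} of the other 11 OSDs")
```

With two replicas, roughly a third of placements involve one of the two new OSDs and migrate, while the rest are untouched; and because OSD 0's replica partners span essentially the whole cluster, a failed disk is rebuilt by many disks at once rather than through a single "spare".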
For more information about the underlying architecture of Ceph, please see the Overview. This project is based on a substantial body of research conducted by the Storage Systems Research Center at the University of California, Santa Cruz over the past few years, research that has resulted in a number of publications. + - - -

Ceph is currently in the prototype stage, and is under very active development. The file system is mountable and more or less usable, but a variety of subsystems are not yet fully functional (most notably including MDS failure recovery, reliable failure monitoring, and flexible snapshots). +

Current Status

+
+ Ceph is currently in the prototype stage and is under very active development. The file system is mountable and more or less usable, but a variety of subsystems are not yet fully functional (most notably MDS failure recovery and reliable failure monitoring). Other key features are planned but not yet implemented, including snapshots.

The Ceph project is actively seeking participants. If you are interested in using Ceph, or contributing to its development, please join our mailing list and drop us a line.

@@ -33,14 +30,23 @@

News

-

Upcoming Publications (10/25/2006)

+ +

LSW and FAST

+
+ I am very excited to be attending the Linux Storage and Filesystem Workshop at FAST '07. It's a pretty small workshop that will have a lot of key people working with file systems and storage in the Linux kernel. I'll also be at FAST in San Jose for the rest of the week (Feb 13-16), and hope to generate some more interest in the project. +

-- Sage Weil (2/9/2007) +

+ +

Upcoming Publications

A paper describing the Ceph filesystem will be presented at OSDI '06 (7th USENIX Symposium on Operating Systems Design and Implementation) in Seattle on November 8th. The following week a paper describing CRUSH (the special-purpose mapping function used to distribute data) will be presented at SC'06, the International Conference for High Performance Computing in Tampa on November 16th. We hope to see you there! +

-- Sage Weil (10/25/2006)

-

Moved to SourceForge (10/2/2006)

+

Moved to SourceForge

After a few too many months of summer distractions, I've finally moved the Ceph CVS code base over from UCSC to Subversion on SourceForge, and created a Ceph home page. This is largely in preparation for upcoming paper publications which will hopefully increase Ceph's exposure and attract some interest to the project. Yay! +

-- Sage Weil (10/2/2006)

diff --git a/trunk/web/overview.body b/trunk/web/overview.body index 2d032585b7a98..9bc31fa616bbc 100644 --- a/trunk/web/overview.body +++ b/trunk/web/overview.body @@ -2,19 +2,19 @@

Ceph Overview -- What is it?

- Ceph is a scalable distributed network file system that provides both excellent performance and reliability. Like other network file systems like NFS and CIFS, clients require only a network connection to mount and use the file system. Unlike NFS and CIFS, however, Ceph clients can communicate directly with storage nodes (which we call OSDs) instead of a single "server" (something that limits the scalability of NFS and CIFS installations). In that sense, Ceph resembles "cluster" file systems based on SANs (storage area networks) and FC (fibre-channel) or iSCSI. The main difference is that FC and iSCSI is a block-level protocols to communicate with dumb, passive disks; Ceph OSDs are intelligent storage nodes, and unlike FC, all communication is over TCP. + Ceph is a scalable distributed network file system that provides both excellent performance and reliability. As with network file protocols such as NFS and CIFS, clients require only a network connection to mount and use the file system. Unlike NFS and CIFS, however, Ceph clients can communicate directly with storage nodes (which we call OSDs) instead of a single "server" (something that limits the scalability of installations using NFS and CIFS). In that sense, Ceph resembles "cluster" file systems based on SANs (storage area networks) and FC (fibre-channel) or iSCSI. The main difference is that FC and iSCSI are block-level protocols that communicate with dumb, passive disks; Ceph OSDs are intelligent storage nodes, and all communication is over TCP and commodity IP networks.

- Ceph's intelligent storage nodes (basically, storage servers running software to serve "objects" instead of files) facilitate improved scalability and parallelism. NFS servers (i.e. NAS devices) and cluster file systems funnel all I/O through a single (or limited set of) servers, limiting scalability. Ceph clients interact with one of a limited set of (perhaps dozens or hundreds of) metadata servers (MDSs) for high-level operations like open(), but communicate directly with storage (termed OSDs) for I/O, of which there may be thousands. + Ceph's intelligent storage nodes (basically, storage servers running software to serve "objects" instead of files) facilitate improved scalability and parallelism. NFS servers (i.e., NAS devices) and cluster file systems funnel all I/O through a single server (or a limited set of servers), limiting scalability. Ceph clients interact with a set of (perhaps dozens or hundreds of) metadata servers (MDSs) for high-level operations like open() and rename(), but communicate directly with storage nodes (OSDs), of which there may be thousands, for I/O.
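This split between the metadata path and the data path can be sketched in a few lines of code. The stub classes and method names below are invented for illustration (this is not Ceph's real client interface): the client performs open() against an MDS, then computes object placement itself and reads each object directly from the OSD that stores it.

```python
# Hypothetical sketch of the client-side flow described above; the stub
# classes and names are invented, not Ceph's actual client API.
from dataclasses import dataclass

STRIPE = 4 * 2**20  # assumed 4 MB object size; the real stripe policy may differ

@dataclass
class Inode:
    ino: int
    size: int

class StubMDS:
    """Stands in for a metadata server: resolves the path, never sees file data."""
    def open(self, path: str) -> Inode:
        return Inode(ino=42, size=9 * 2**20)   # pretend the file is 9 MB

class StubOSDCluster:
    """Stands in for thousands of OSDs; placement is computed client-side."""
    def locate(self, ino: int, idx: int) -> int:
        return hash((ino, idx)) % 1000          # which OSD holds object idx
    def read(self, osd: int, ino: int, idx: int, n: int) -> bytes:
        return bytes(n)                         # fake object contents

def read_file(mds: StubMDS, osds: StubOSDCluster, path: str) -> bytes:
    inode = mds.open(path)      # one metadata round trip to an MDS
    chunks = []
    for idx in range((inode.size + STRIPE - 1) // STRIPE):
        n = min(STRIPE, inode.size - idx * STRIPE)
        # data I/O goes straight to the OSD, bypassing the MDS entirely
        chunks.append(osds.read(osds.locate(inode.ino, idx), inode.ino, idx, n))
    return b"".join(chunks)

print(len(read_file(StubMDS(), StubOSDCluster(), "/some/file")))  # 9437184
```

The point of the design is visible in read_file(): the MDS appears only on the first line, so adding OSDs scales data bandwidth without adding load on the metadata servers.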

- There are a handful of new file systems and enterprise storage products adopting a similar object- or brick-based architecture, including Lustre (also open-source, but with restricted access to development) and the Panasas file system (a commercial storage product). Ceph is different: + There are a handful of new file systems and enterprise storage products adopting a similar object- or brick-based architecture, including Lustre (also open-source, but with restricted access to source code) and the Panasas file system (a commercial storage product). Ceph is different:

  • Open source, open development. We're hosted on SourceForge, and are actively looking for interested users and developers. -
  • Scalability. Ceph sheds legacy file system design principles like explicit allocation tables that are still found in almost all other file systems (including Lustre and the Panasas file system) that ultimately limit scalability. +
  • Scalability. Ceph sheds legacy file system design principles like explicit allocation tables that are still found in almost all other file systems (including Lustre and the Panasas file system) and that ultimately limit scalability.
  • Commodity hardware. Ceph is designed to run on commodity hardware running Linux (or any other POSIX-ish Unix variant). (Lustre relies on a SAN or other shared storage failover to make storage nodes reliable, while Panasas is based on custom hardware using integrated UPSs.)
- In additional to promising greater scalability than existing solutions, Ceph also promises to fill the huge gap between open-source filesystems and commercial enterprise systems. If you want network-attached storage without shelling out the big bucks, your are usually stuck with NFS and a direct-attached RAID. Technologies like ATA-over-ethernet and iSCSI help scale raw volume sizes, but the lack of "cluster-aware" open-source file systems still limit one to a single NFS "server" that limits scalability. + In addition to promising greater scalability than existing solutions, Ceph also promises to fill the huge gap between open-source filesystems and commercial enterprise systems. If you want network-attached storage without shelling out the big bucks, you are usually stuck with NFS and a direct-attached RAID. Technologies like ATA-over-ethernet and iSCSI help scale raw volume sizes, but the relative lack of "cluster-aware" open-source file systems (particularly those with snapshot-like functionality) still limits one to a single NFS "server", which constrains scalability.

-Ceph fills this gap by providing a scalable, reliable file system that can seamlessly grom from gigabytes to petabytes. Moreover, Ceph will provide efficient snapshots, which almost no freely available file system (besides ZFS on Solaris) provides, despite snapshots having become almost ubiquitous in enterprise systems. +Ceph fills this gap by providing a scalable, reliable file system that can seamlessly grow from gigabytes to petabytes. Moreover, Ceph will eventually provide efficient snapshots, which almost no freely available file system (besides ZFS on Solaris) provides, despite snapshots having become almost ubiquitous in enterprise systems.

Ceph Architecture

-- 2.39.5