"Filesystem" is not a word (although fairly common in use).
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
.. _Ceph Storage Cluster APIs: ../rados/api/
-Ceph Filesystem APIs
-====================
+Ceph File System APIs
+=====================
See `libcephfs (javadoc)`_.
The Ceph Storage Cluster receives data from :term:`Ceph Clients`--whether it
comes through a :term:`Ceph Block Device`, :term:`Ceph Object Storage`, the
-:term:`Ceph Filesystem` or a custom implementation you create using
+:term:`Ceph File System` or a custom implementation you create using
``librados``--and it stores the data as objects. Each object corresponds to a
file in a filesystem, which is stored on an :term:`Object Storage Device`. Ceph
OSD Daemons handle the read/write operations on the storage disks.
volume'. Ceph's striping offers the throughput of RAID 0 striping, the
reliability of n-way RAID mirroring and faster recovery.
-Ceph provides three types of clients: Ceph Block Device, Ceph Filesystem, and
+Ceph provides three types of clients: Ceph Block Device, Ceph File System, and
Ceph Object Storage. A Ceph Client converts its data from the representation
format it provides to its users (a block device image, RESTful objects, CephFS
filesystem directories) into objects for storage in the Ceph Storage Cluster.
.. tip:: The objects Ceph stores in the Ceph Storage Cluster are not striped.
- Ceph Object Storage, Ceph Block Device, and the Ceph Filesystem stripe their
+ Ceph Object Storage, Ceph Block Device, and the Ceph File System stripe their
data over multiple Ceph Storage Cluster objects. Ceph Clients that write
directly to the Ceph Storage Cluster via ``librados`` must perform the
striping (and parallel I/O) for themselves to obtain these benefits.
provides RESTful APIs with interfaces that are compatible with Amazon S3
and OpenStack Swift.
-- **Filesystem**: The :term:`Ceph Filesystem` (CephFS) service provides
+- **File System**: The :term:`Ceph File System` (CephFS) service provides
a POSIX compliant filesystem usable with ``mount`` or as
a filesystem in user space (FUSE).
Device kernel object(s). This is done with the command-line tool ``rbd``.
-.. index:: CephFS; Ceph Filesystem; libcephfs; MDS; metadata server; ceph-mds
+.. index:: CephFS; Ceph File System; libcephfs; MDS; metadata server; ceph-mds
.. _arch-cephfs:
-Ceph Filesystem
----------------
+Ceph File System
+----------------
-The Ceph Filesystem (CephFS) provides a POSIX-compliant filesystem as a
+The Ceph File System (CephFS) provides a POSIX-compliant filesystem as a
service that is layered on top of the object-based Ceph Storage Cluster.
CephFS files get mapped to objects that Ceph stores in the Ceph Storage
Cluster. Ceph Clients mount a CephFS filesystem as a kernel object or as
+---------------+ +---------------+ +---------------+
-The Ceph Filesystem service includes the Ceph Metadata Server (MDS) deployed
+The Ceph File System service includes the Ceph Metadata Server (MDS) deployed
with the Ceph Storage cluster. The purpose of the MDS is to store all the
filesystem metadata (directories, file ownership, access modes, etc) in
high-availability Ceph Metadata Servers where the metadata resides in memory.
The reason for the MDS (a daemon called ``ceph-mds``) is that simple filesystem
operations like listing a directory or changing a directory (``ls``, ``cd``)
would tax the Ceph OSD Daemons unnecessarily. So separating the metadata from
-the data means that the Ceph Filesystem can provide high performance services
+the data means that the Ceph File System can provide high performance services
without taxing the Ceph Storage Cluster.
CephFS separates the metadata from the data, storing the metadata in the MDS,
This subcommand is used to zap lvs, partitions or raw devices that have been used
by ceph OSDs so that they may be reused. If given a path to a logical
-volume it must be in the format of vg/lv. Any filesystems present
+volume it must be in the format of vg/lv. Any file systems present
on the given lv or partition will be removed and all data will be purged.
.. note:: The lv or partition will be kept intact.
systemctl enable ceph-volume@lvm-0-0A3E1ED2-DA8A-4F0E-AA95-61DEC71768D6
The enabled unit is a :term:`systemd oneshot` service, meant to start at boot
-after the local filesystem is ready to be used.
+after the local file system is ready to be used.
Failure and Retries
CephFS Administrative commands
==============================
-Filesystems
------------
+File Systems
+------------
-These commands operate on the CephFS filesystems in your Ceph cluster.
-Note that by default only one filesystem is permitted: to enable
-creation of multiple filesystems use ``ceph fs flag set enable_multiple true``.
+These commands operate on the CephFS file systems in your Ceph cluster.
+Note that by default only one file system is permitted: to enable
+creation of multiple file systems use ``ceph fs flag set enable_multiple true``.
::
- fs new <filesystem name> <metadata pool name> <data pool name>
+ fs new <file system name> <metadata pool name> <data pool name>
This command creates a new file system. The file system name and metadata pool
name are self-explanatory. The specified data pool is the default data pool and
::
- fs rm <filesystem name> [--yes-i-really-mean-it]
+ fs rm <file system name> [--yes-i-really-mean-it]
Destroy a CephFS file system. This wipes information about the state of the
file system from the FSMap. The metadata pool and data pools are untouched and
::
- fs get <filesystem name>
+ fs get <file system name>
Get information about the named file system, including settings and ranks. This
is a subset of the same information from the ``fs dump`` command.
::
- fs set <filesystem name> <var> <val>
+ fs set <file system name> <var> <val>
Change a setting on a file system. These settings are specific to the named
file system and do not affect other file systems.
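For example, to change the maximum allowed file size on a file system named ``cephfs`` (the name and value here are purely illustrative)::
    ceph fs set cephfs max_file_size 1099511627776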
::
- fs add_data_pool <filesystem name> <pool name/id>
+ fs add_data_pool <file system name> <pool name/id>
Add a data pool to the file system. This pool can be used for file layouts
as an alternate location to store file data.
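A sketch of such an invocation, assuming a pool named ``cephfs_data_ssd`` has already been created::
    ceph fs add_data_pool cephfs cephfs_data_ssd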
::
- fs rm_data_pool <filesystem name> <pool name/id>
+ fs rm_data_pool <file system name> <pool name/id>
This command removes the specified pool from the list of data pools for the
file system. If any files have layouts for the removed data pool, the file
These commands are not required in normal operation, and exist
for use in exceptional circumstances. Incorrect use of these
commands may cause serious problems, such as an inaccessible
-filesystem.
+file system.
::
::
- fs reset <filesystem name>
+ fs reset <file system name>
This command resets the file system state to defaults, except for the name and
pools. Non-zero ranks are saved in the stopped set.
-Application best practices for distributed filesystems
-======================================================
+Application best practices for distributed file systems
+=======================================================
CephFS is POSIX compatible, and therefore should work with any existing
-applications that expect a POSIX filesystem. However, because it is a
-network filesystem (unlike e.g. XFS) and it is highly consistent (unlike
+applications that expect a POSIX file system. However, because it is a
+network file system (unlike e.g. XFS) and it is highly consistent (unlike
e.g. NFS), there are some consequences that application authors may
benefit from knowing about.
-The following sections describe some areas where distributed filesystems
+The following sections describe some areas where distributed file systems
may have noticeably different performance behaviours compared with
-local filesystems.
+local file systems.
ls -l
----------
Hard links have an intrinsic cost in terms of the internal housekeeping
-that a filesystem has to do to keep two references to the same data. In
+that a file system has to do to keep two references to the same data. In
CephFS there is a particular performance cost, because with normal files
the inode is embedded in the directory (i.e. there is no extra fetch of
the inode after looking up the path).
number of files and then expect equivalent performance when you move
to a much larger number of files.
-Do you need a filesystem?
--------------------------
+Do you need a file system?
+--------------------------
Remember that Ceph also includes an object storage interface. If your
application needs to store huge flat collections of files where you just
read and write whole files at once, then you might well be better off
using the :ref:`Object Gateway <object-gateway>`
-
-
-
Some features in CephFS are still experimental. See
:doc:`/cephfs/experimental-features` for guidance on these.
-For the best chance of a happy healthy filesystem, use a **single active MDS**
+For the best chance of a happy, healthy file system, use a **single active MDS**
and **do not use snapshots**. Both of these are the default.
Note that creating multiple MDS daemons is fine, as these will simply be
When a client wants to operate on an inode, it will query the MDS in various
ways, which will then grant the client a set of **capabilities**. These
grant the client permissions to operate on the inode in various ways. One
-of the major differences from other network filesystems (e.g NFS or SMB) is
+of the major differences from other network file systems (e.g. NFS or SMB) is
that the capabilities granted are quite granular, and it's possible that
multiple clients can hold different capabilities on the same inodes.
-------
If a CephFS journal has become damaged, expert intervention may be required
-to restore the filesystem to a working state.
+to restore the file system to a working state.
The ``cephfs-journal-tool`` utility provides functionality to aid experts in
examining, modifying, and extracting data from journals.
.. warning::
This tool is **dangerous** because it directly modifies internal
- data structures of the filesystem. Make backups, be careful, and
+ data structures of the file system. Make backups, be careful, and
seek expert advice. If you are unsure, do not run this tool.
Syntax
* ``get`` read the events from the log
* ``splice`` erase events or regions in the journal
-* ``apply`` extract filesystem metadata from events and attempt to apply it to the metadata store.
+* ``apply`` extract file system metadata from events and attempt to apply it to the metadata store.
Filtering:
CephFS Shell
=============
-The File System (FS) shell includes various shell-like commands that directly interact with the :term:`Ceph Filesystem`.
+The File System (FS) shell includes various shell-like commands that directly interact with the :term:`Ceph File System`.
Usage :
put
---
-Copy a file/directory to Ceph Filesystem from Local Filesystem.
+Copy a file/directory from the local file system to the Ceph File System.
Usage :
get
---
-Copy a file from Ceph Filesystem to Local Filesystem.
+Copy a file from the Ceph File System to the local file system.
Usage :
get [options] <source_path> [target_path]
-* source_path - remote file/directory path which is to be copied to local filesystem.
+* source_path - remote file/directory path which is to be copied to the local file system.
* if `.` copies all the file/directories in the remote working directory.
* target_path - local directory path where the files/directories are to be copied to.
locate
------
-Find an item in Filesystem
+Find an item in the file system
Usage:
CephFS Client Capabilities
================================
-Use Ceph authentication capabilities to restrict your filesystem clients
+Use Ceph authentication capabilities to restrict your file system clients
to the lowest possible level of authority needed.
.. note::
To grant rw access to the specified directory only, we mention the specified
directory while creating key for a client using the following syntax. ::
- ceph fs authorize *filesystem_name* client.*client_name* /*specified_directory* rw
+ ceph fs authorize *file_system_name* client.*client_name* /*specified_directory* rw
-For example, to restrict client ``foo`` to writing only in the ``bar`` directory of filesystem ``cephfs``, use ::
+For example, to restrict client ``foo`` to writing only in the ``bar`` directory of file system ``cephfs``, use ::
ceph fs authorize cephfs client.foo / r /bar rw
ceph fs authorize cephfs client.foo /bar rw
Note that if a client's read access is restricted to a path, they will only
-be able to mount the filesystem when specifying a readable path in the
+be able to mount the file system when specifying a readable path in the
mount command (see below).
-Supplying ``all`` or ``*`` as the filesystem name will grant access to every
+Supplying ``all`` or ``*`` as the file system name will grant access to every
file system. Note that it is usually necessary to quote ``*`` to protect it from
the shell.
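For example, granting a (hypothetical) backup client read/write access to every file system might look like::
    ceph fs authorize all client.backup / rw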
will be calculated from the quota on that sub-directory, rather than reporting
the overall amount of space used on the cluster.
-If you would like the client to report the overall usage of the filesystem,
+If you would like the client to report the overall usage of the file system,
and not just the quota usage on the sub-directory mounted, then set the
following config option on the client:
client quota df = false
If quotas are not enabled, or no quota is set on the sub-directory mounted,
-then the overall usage of the filesystem will be reported irrespective of
+then the overall usage of the file system will be reported irrespective of
the value of this setting.
Layout and Quota restriction (the 'p' flag)
these fields (such as openc operations with layouts).
For example, in the following snippet client.0 can modify layouts and quotas
-on the filesystem cephfs_a, but client.1 cannot.
+on the file system cephfs_a, but client.1 cannot.
::
appear after it (all flags except 'rw' must be specified in alphabetical order).
For example, in the following snippet client.0 can create or delete snapshots
-in the ``bar`` directory of filesystem ``cephfs_a``.
+in the ``bar`` directory of file system ``cephfs_a``.
::
-========================
-Create a Ceph filesystem
-========================
+=========================
+Create a Ceph file system
+=========================
Creating pools
==============
-A Ceph filesystem requires at least two RADOS pools, one for data and one for metadata.
+A Ceph file system requires at least two RADOS pools, one for data and one for metadata.
When configuring these pools, you might consider:
- Using a higher replication level for the metadata pool, as any data loss in
- this pool can render the whole filesystem inaccessible.
+ this pool can render the whole file system inaccessible.
- Using lower-latency storage such as SSDs for the metadata pool, as this will
- directly affect the observed latency of filesystem operations on clients.
+ directly affect the observed latency of file system operations on clients.
- The data pool used to create the file system is the "default" data pool and
the location for storing all inode backtrace information, used for hard link
management and disaster recovery. For this reason, all inodes created in
hierarchy of directories and files (see also :ref:`file-layouts`).
Refer to :doc:`/rados/operations/pools` to learn more about managing pools. For
-example, to create two pools with default settings for use with a filesystem, you
+example, to create two pools with default settings for use with a file system, you
might run the following commands:
.. code:: bash
used in practice for large clusters.
-Creating a filesystem
-=====================
+Creating a file system
+======================
-Once the pools are created, you may enable the filesystem using the ``fs new`` command:
+Once the pools are created, you may enable the file system using the ``fs new`` command:
.. code:: bash
$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
-Once a filesystem has been created, your MDS(s) will be able to enter
+Once a file system has been created, your MDS(s) will be able to enter
an *active* state. For example, in a single MDS system:
.. code:: bash
$ ceph mds stat
cephfs-1/1/1 up {0=a=up:active}
-Once the filesystem is created and the MDS is active, you are ready to mount
-the filesystem. If you have created more than one filesystem, you will
+Once the file system is created and the MDS is active, you are ready to mount
+the file system. If you have created more than one file system, you will
choose which to use when mounting.
- `Mount CephFS`_
.. _Mount CephFS: ../../cephfs/kernel
.. _Mount CephFS as FUSE: ../../cephfs/fuse
-If you have created more than one filesystem, and a client does not
-specify a filesystem when mounting, you can control which filesystem
+If you have created more than one file system, and a client does not
+specify a file system when mounting, you can control which file system
they will see by using the `ceph fs set-default` command.
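For example (assuming a file system named ``cephfs`` exists)::
    ceph fs set-default cephfs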
Using Erasure Coded pools with CephFS
The tools mentioned here can easily cause damage as well as fixing it.
It is essential to understand exactly what has gone wrong with your
- filesystem before attempting to repair it.
+ file system before attempting to repair it.
If you do not have access to professional support for your cluster,
consult the ceph-users mailing list or the #ceph IRC channel.
MDS map reset
-------------
-Once the in-RADOS state of the filesystem (i.e. contents of the metadata pool)
+Once the in-RADOS state of the file system (i.e. contents of the metadata pool)
is somewhat recovered, it may be necessary to update the MDS map to reflect
the contents of the metadata pool. Use the following command to reset the MDS
map to a single MDS:
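A sketch of such an invocation, reusing the ``fs reset`` syntax described elsewhere in these documents (the file system name is illustrative)::
    ceph fs reset cephfs --yes-i-really-mean-it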
There has not been extensive testing of this procedure. It should be
undertaken with great care.
-If an existing filesystem is damaged and inoperative, it is possible to create
-a fresh metadata pool and attempt to reconstruct the filesystem metadata
+If an existing file system is damaged and inoperative, it is possible to create
+a fresh metadata pool and attempt to reconstruct the file system metadata
into this new pool, leaving the old metadata in place. This could be used to
make a safer attempt at recovery since the existing metadata pool would not be
overwritten.
::
- cephfs-data-scan scan_extents --alternate-pool recovery --filesystem <original filesystem name> <original data pool name>
- cephfs-data-scan scan_inodes --alternate-pool recovery --filesystem <original filesystem name> --force-corrupt --force-init <original data pool name>
+ cephfs-data-scan scan_extents --alternate-pool recovery --filesystem <original file system name> <original data pool name>
+ cephfs-data-scan scan_inodes --alternate-pool recovery --filesystem <original file system name> --force-corrupt --force-init <original data pool name>
cephfs-data-scan scan_links --filesystem recovery-fs
-If the damaged filesystem contains dirty journal data, it may be recovered next
+If the damaged file system contains dirty journal data, it may be recovered next
with:
::
Metadata damage and repair
--------------------------
-If a filesystem has inconsistent or missing metadata, it is considered
+If a file system has inconsistent or missing metadata, it is considered
*damaged*. You may find out about damage from a health message, or in some
unfortunate cases from an assertion in a running MDS daemon.
layer (e.g. multiple disk failures that lose all copies of a PG), or from
software bugs.
-CephFS includes some tools that may be able to recover a damaged filesystem,
+CephFS includes some tools that may be able to recover a damaged file system,
but to use them safely requires a solid understanding of CephFS internals.
The documentation for these potentially dangerous operations is on a
separate page: :ref:`disaster-recovery-experts`.
Data pool damage (files affected by lost data PGs)
--------------------------------------------------
-If a PG is lost in a *data* pool, then the filesystem will continue
+If a PG is lost in a *data* pool, then the file system will continue
to operate normally, but some parts of some files will simply
be missing (reads will return zeros).
per line.
Note that this command acts as a normal CephFS client to find all the
-files in the filesystem and read their layouts, so the MDS must be
+files in the file system and read their layouts, so the MDS must be
up and running.
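For illustration only, an invocation of the kind described above might use the ``cephfs-data-scan pg_files`` subcommand (assuming that subcommand is available in your release; the path and PG IDs are placeholders)::
    cephfs-data-scan pg_files /home/bob 0.754 0.abc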
-===============================
-Ceph filesystem client eviction
-===============================
+================================
+Ceph file system client eviction
+================================
-When a filesystem client is unresponsive or otherwise misbehaving, it
-may be necessary to forcibly terminate its access to the filesystem. This
+When a file system client is unresponsive or otherwise misbehaving, it
+may be necessary to forcibly terminate its access to the file system. This
process is called *eviction*.
Evicting a CephFS client prevents it from communicating further with MDS
-daemons and OSD daemons. If a client was doing buffered IO to the filesystem,
+daemons and OSD daemons. If a client was doing buffered IO to the file system,
any un-flushed data will be lost.
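For reference, manually evicting a single client might look like this (the MDS rank and client ID are placeholders)::
    ceph tell mds.0 client evict id=4305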
Clients may either be evicted automatically (if they fail to communicate
every expansion of testing has generally revealed new issues. If you do enable
snapshots and experience failure, manual intervention will be needed.
-Snapshots are known not to work properly with multiple filesystems (below) in
+Snapshots are known not to work properly with multiple file systems (below) in
some cases. Specifically, if you share a pool for multiple FSes and delete
a snapshot in one FS, expect to lose snapshotted file data in any other FS using
snapshots. See the :doc:`/dev/cephfs-snapshots` page for more information.
Snapshotting was blocked off with the ``allow_new_snaps`` flag prior to Mimic.
-Multiple filesystems within a Ceph cluster
-------------------------------------------
+Multiple file systems within a Ceph cluster
+-------------------------------------------
Code was merged prior to the Jewel release which enables administrators
-to create multiple independent CephFS filesystems within a single Ceph cluster.
-These independent filesystems have their own set of active MDSes, cluster maps,
+to create multiple independent CephFS file systems within a single Ceph cluster.
+These independent file systems have their own set of active MDSes, cluster maps,
and data. But the feature required extensive changes to data structures which
are not yet fully qualified, and has security implications which are not all
apparent nor resolved.
There are no known bugs, but any failures which do result from having multiple
-active filesystems in your cluster will require manual intervention and, so far,
+active file systems in your cluster will require manual intervention and, so far,
will not have been experienced by anybody else -- knowledgeable help will be
extremely limited. You also probably do not have the security or isolation
guarantees you want or think you have upon doing so.
-Note that snapshots and multiple filesystems are *not* tested in combination
+Note that snapshots and multiple file systems are *not* tested in combination
and may not work together; see above.
-Multiple filesystems were available starting in the Jewel release candidates
+Multiple file systems were available starting in the Jewel release candidates
but must be turned on via the ``enable_multiple`` flag until declared stable.
LazyIO
-----------------------
Directory fragmentation was considered experimental prior to the *Luminous*
-(12.2.x). It is now enabled by default on new filesystems. To enable directory
-fragmentation on filesystems created with older versions of Ceph, set
-the ``allow_dirfrags`` flag on the filesystem:
+(12.2.x). It is now enabled by default on new file systems. To enable directory
+fragmentation on file systems created with older versions of Ceph, set
+the ``allow_dirfrags`` flag on the file system:
::
- ceph fs set <filesystem name> allow_dirfrags 1
+ ceph fs set <file system name> allow_dirfrags 1
Multiple active metadata servers
--------------------------------
Prior to the *Luminous* (12.2.x) release, running multiple active metadata
-servers within a single filesystem was considered experimental. Creating
+servers within a single file system was considered experimental. Creating
multiple active metadata servers is now permitted by default on new
-filesystems.
+file systems.
-Filesystems created with older versions of Ceph still require explicitly
+File systems created with older versions of Ceph still require explicitly
enabling multiple active metadata servers as follows:
::
- ceph fs set <filesystem name> allow_multimds 1
+ ceph fs set <file system name> allow_multimds 1
Note that the default size of the active mds cluster (``max_mds``) is
still set to 1 initially.
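To actually run a second active daemon once this is permitted, raise ``max_mds`` (the file system name here is illustrative)::
    ceph fs set cephfs max_mds 2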
To mount CephFS in your file systems table as a kernel driver, add the
following to ``/etc/fstab``::
- {ipaddress}:{port}:/ {mount}/{mountpoint} {filesystem-name} [name=username,secret=secretkey|secretfile=/path/to/secretfile],[{mount.options}]
+ {ipaddress}:{port}:/ {mount}/{mountpoint} {file-system-name} [name=username,secret=secretkey|secretfile=/path/to/secretfile],[{mount.options}]
For example::
FUSE
====
-To mount CephFS in your file systems table as a filesystem in user space, add the
+To mount CephFS in your file systems table as a file system in user space, add the
following to ``/etc/fstab``::
#DEVICE PATH TYPE OPTIONS
-Handling a full Ceph filesystem
-===============================
+Handling a full Ceph file system
+================================
When a RADOS cluster reaches its ``mon_osd_full_ratio`` (default
95%) capacity, it is marked with the OSD full flag. This flag causes
most normal RADOS clients to pause all operations until it is resolved
(for example by adding more capacity to the cluster).
-The filesystem has some special handling of the full flag, explained below.
+The file system has some special handling of the full flag, explained below.
Hammer and later
----------------
-Since the hammer release, a full filesystem will lead to ENOSPC
+Since the Hammer release, a full file system will lead to ENOSPC
results from:
* Data writes on the client
may be discarded after an ``fclose`` if no space is available to persist it.
.. warning::
- If an application appears to be misbehaving on a full filesystem,
+ If an application appears to be misbehaving on a full file system,
check that it is performing ``fsync()`` calls as necessary to ensure
data is on disk before proceeding.
* If a client had pending writes to a file, then it was not possible
for the client to release the file to the MDS for deletion: this could
- lead to difficulty clearing space on a full filesystem
+ lead to difficulty clearing space on a full file system
* If clients continued to create a large number of empty files, the
resulting metadata writes from the MDS could lead to total exhaustion
of space on the OSDs such that no further deletions could be performed.
sudo mkdir /home/username/cephfs
sudo ceph-fuse -m 192.168.0.1:6789 /home/username/cephfs
-If you have more than one filesystem, specify which one to mount using
+If you have more than one file system, specify which one to mount using
the ``--client_mds_namespace`` command line argument, or add a
``client_mds_namespace`` setting to your ``ceph.conf``.
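For example, assuming ceph-fuse accepts config options on its command line in the usual ``--option=value`` form (the monitor address, file system name, and mount point are placeholders)::
    sudo ceph-fuse -m 192.168.0.1:6789 --client_mds_namespace=mycephfs2 /home/username/cephfs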
=====================
The Ceph monitor daemons will generate health messages in response
-to certain states of the filesystem map structure (and the enclosed MDS maps).
+to certain states of the file system map structure (and the enclosed MDS maps).
Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
-.. _ceph-filesystem:
+.. _ceph-file-system:
=================
- Ceph Filesystem
+ Ceph File System
=================
-The Ceph Filesystem (CephFS) is a POSIX-compliant filesystem that uses
-a Ceph Storage Cluster to store its data. The Ceph filesystem uses the same Ceph
+The Ceph File System (CephFS) is a POSIX-compliant file system that uses
+a Ceph Storage Cluster to store its data. The Ceph file system uses the same Ceph
Storage Cluster system as Ceph Block Devices, Ceph Object Storage with its S3
and Swift APIs, or native bindings (librados).
Using CephFS
============
-Using the Ceph Filesystem requires at least one :term:`Ceph Metadata Server` in
+Using the Ceph File System requires at least one :term:`Ceph Metadata Server` in
your Ceph Storage Cluster.
<style type="text/css">div.body h3{margin:5px 0px 0px 0px;}</style>
<table cellpadding="10"><colgroup><col width="33%"><col width="33%"><col width="33%"></colgroup><tbody valign="top"><tr><td><h3>Step 1: Metadata Server</h3>
-To run the Ceph Filesystem, you must have a running Ceph Storage Cluster with at
+To run the Ceph File System, you must have a running Ceph Storage Cluster with at
least one :term:`Ceph Metadata Server` running.
</td><td><h3>Step 2: Mount CephFS</h3>
Once you have a healthy Ceph Storage Cluster with at least
-one Ceph Metadata Server, you may create and mount your Ceph Filesystem.
+one Ceph Metadata Server, you may create and mount your Ceph File System.
Ensure that your client has network connectivity and the proper
authentication keyring.
cephfs-journal-tool <cephfs-journal-tool>
File layouts <file-layouts>
Client eviction <eviction>
- Handling full filesystems <full>
+ Handling full file systems <full>
Health messages <health-messages>
Troubleshooting <troubleshooting>
Disaster recovery <disaster-recovery>
Client authentication <client-auth>
- Upgrading old filesystems <upgrading>
+ Upgrading old file systems <upgrading>
Configuring directory fragmentation <dirfrags>
Configuring multiple active MDS daemons <multimds>
Export over NFS <nfs>
See `Quotas`_ for more information.
-Multiple filesystems within a Ceph cluster
-------------------------------------------
+Multiple file systems within a Ceph cluster
+-------------------------------------------
The feature was introduced by the Jewel release. Linux kernel clients >= 4.7
can support it.
sudo mount -t ceph 192.168.0.1:6789:/ /mnt/mycephfs -o name=admin,secretfile=/etc/ceph/admin.secret
-If you have more than one filesystem, specify which one to mount using
+If you have more than one file system, specify which one to mount using
the ``mds_namespace`` option, e.g. ``-o mds_namespace=myfs``.
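A fuller illustrative mount line, combining the authentication options shown above with ``mds_namespace``::
    sudo mount -t ceph 192.168.0.1:6789:/ /mnt/mycephfs -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=myfs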
See `User Management`_ for details on cephx.
*Also known as: multi-mds, active-active MDS*
-Each CephFS filesystem is configured for a single active MDS daemon
+Each CephFS file system is configured for a single active MDS daemon
by default. To scale metadata performance for large scale systems, you
may enable multiple active MDS daemons, which will share the metadata
workload with one another.
Increasing the MDS active cluster size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Each CephFS filesystem has a *max_mds* setting, which controls how many ranks
-will be created. The actual number of ranks in the filesystem will only be
+Each CephFS file system has a *max_mds* setting, which controls how many ranks
+will be created. The actual number of ranks in the file system will only be
increased if a spare daemon is available to take on the new rank. For example,
if there is only one MDS daemon running, and max_mds is set to two, no second
rank will be created. (Note that such a configuration is not Highly Available
Requirements
============
-- Ceph filesystem (preferably latest stable luminous or higher versions)
+- Ceph file system (preferably latest stable luminous or higher versions)
- In the NFS server host machine, 'libcephfs2' (preferably latest stable
luminous or higher), 'nfs-ganesha' and 'nfs-ganesha-ceph' packages (latest
ganesha v2.5 stable or higher versions)
Current limitations
===================
-- Per running ganesha daemon, FSAL_CEPH can only export one Ceph filesystem
- although multiple directories in a Ceph filesystem may be exported.
+- Per running ganesha daemon, FSAL_CEPH can only export one Ceph file system
+ although multiple directories in a Ceph file system may be exported.
POSIX is somewhat vague about the state of an inode after fsync reports
an error. In general, CephFS uses the standard error-reporting
mechanisms in the client's kernel, and therefore follows the same
-conventions as other filesystems.
+conventions as other file systems.
In modern Linux kernels (v4.17 or later), writeback errors are reported
once to every file description that is open at the time of the error. In
.. _mds-scrub:
-=====================
-Ceph Filesystem Scrub
-=====================
+======================
+Ceph File System Scrub
+======================
-CephFS provides the cluster admin (operator) to check consistency of a filesystem
+CephFS allows the cluster admin (operator) to check the consistency of a file system
via a set of scrub commands. Scrub can be classified into two parts:
-#. Forward Scrub: In which the scrub operation starts at the root of the filesystem
+#. Forward Scrub: In which the scrub operation starts at the root of the file system
(or a sub directory) and looks at everything that can be touched in the hierarchy
to ensure consistency.
#. Backward Scrub: In which the scrub operation looks at every RADOS object in the
- filesystem pools and maps it back to the filesystem hierarchy.
+ file system pools and maps it back to the file system hierarchy.
This document details commands to initiate and control forward scrub (referred to as
scrub hereafter).
-Initiate Filesystem Scrub
-=========================
+Initiate File System Scrub
+==========================
To start a scrub operation for a directory tree use the following command
further in this document).
A custom tag can also be specified when initiating the scrub operation. Custom tags get
-persisted in the metadata object for every inode in the filesystem tree that is being
+persisted in the metadata object for every inode in the file system tree that is being
scrubbed.
::
}
-Monitor (ongoing) Filesystem Scrubs
-===================================
+Monitor (ongoing) File System Scrubs
+====================================
Status of ongoing scrubs can be monitored using the `scrub status` command. This command
lists ongoing scrubs (identified by the tag) along with the path and options used to
[...]
-Control (ongoing) Filesystem Scrubs
-===================================
+Control (ongoing) File System Scrubs
+====================================
- Pause: Pausing ongoing scrub operations results in no new or pending inodes being
scrubbed after in-flight RADOS ops (for the inodes that are currently being scrubbed)
Terminology
-----------
-A Ceph cluster may have zero or more CephFS *filesystems*. CephFS
-filesystems have a human readable name (set in ``fs new``)
-and an integer ID. The ID is called the filesystem cluster ID,
+A Ceph cluster may have zero or more CephFS *file systems*. CephFS
+file systems have a human readable name (set in ``fs new``)
+and an integer ID. The ID is called the file system cluster ID,
or *FSCID*.
-Each CephFS filesystem has a number of *ranks*, one by default,
+Each CephFS file system has a number of *ranks*, one by default,
which start at zero. A rank may be thought of as a metadata shard.
-Controlling the number of ranks in a filesystem is described
+Controlling the number of ranks in a file system is described
in :doc:`/cephfs/multimds`
Each CephFS ceph-mds process (a *daemon*) initially starts up
or a name.
Where a rank is used, this may optionally be qualified with
-a leading filesystem name or ID. If a daemon is a standby (i.e.
+a leading file system name or ID. If a daemon is a standby (i.e.
it is not currently assigned a rank), then it may only be
referred to by GID or name.
For example, if we had an MDS daemon which was called 'myhost',
-had GID 5446, and was assigned rank 0 in the filesystem 'myfs'
+had GID 5446, and was assigned rank 0 in the file system 'myfs'
which had FSCID 3, then any of the following would be suitable
forms of the 'fail' command:
ceph mds fail myhost # Daemon name
ceph mds fail 0 # Unqualified rank
ceph mds fail 3:0 # FSCID and rank
- ceph mds fail myfs:0 # Filesystem name and rank
+ ceph mds fail myfs:0 # File system name and rank
Managing failover
-----------------
disconnected from the system. At this point, the kernel client is in
a bind: it cannot safely write back dirty data, and many applications
do not handle IO errors correctly on close().
-At the moment, the kernel client will remount the FS, but outstanding filesystem
+At the moment, the kernel client will remount the FS, but outstanding file system
IO may or may not be satisfied. In these cases, you may need to reboot your
client system.
ceph fs set <fs_name> max_mds <old_max_mds>
-Upgrading pre-Firefly filesystems past Jewel
-============================================
+Upgrading pre-Firefly file systems past Jewel
+=============================================
.. tip::
- This advice only applies to users with filesystems
+ This advice only applies to users with file systems
created using versions of Ceph older than *Firefly* (0.80).
- Users creating new filesystems may disregard this advice.
+ Users creating new file systems may disregard this advice.
Pre-firefly versions of Ceph used a now-deprecated format
for storing CephFS directory objects, called TMAPs. Support
This only needs to be run once, and it is not necessary to
stop any other services while it runs. The command may take some
time to execute, as it iterates over all objects in your metadata
-pool. It is safe to continue using your filesystem as normal while
+pool. It is safe to continue using your file system as normal while
it executes. If the command aborts for any reason, it is safe
to simply run it again.
-If you are upgrading a pre-Firefly CephFS filesystem to a newer Ceph version
+If you are upgrading a pre-Firefly CephFS file system to a newer Ceph version
than Jewel, you must first upgrade to Jewel and run the ``tmap_upgrade``
command before completing your upgrade to the latest version.
ceph_set_uuid(cmount, nodeid);
/*
- * Now mount up the filesystem and do normal open/lock operations to
+ * Now mount up the file system and do normal open/lock operations to
* satisfy reclaim requests.
*/
ceph_mount(cmount, rootpath);
-----------
Generally, snapshots do what they sound like: they create an immutable view
-of the filesystem at the point in time they're taken. There are some headline
+of the file system at the point in time they're taken. There are some headline
features that make CephFS snapshots different from what you might expect:
* Arbitrary subtrees. Snapshots are created within any directory you choose,
- and cover all data in the filesystem under that directory.
+ and cover all data in the file system under that directory.
* Asynchronous. If you create a snapshot, buffered data is flushed out lazily,
including from other clients. As a result, "creating" the snapshot is
very fast.
directory and contains sequence counters, timestamps, the list of associated
snapshot IDs, and `past_parent_snaps`.
* SnapServer: SnapServer manages snapshot ID allocation, snapshot deletion and
- tracks list of effective snapshots in the filesystem. A filesystem only has
+ tracks the list of effective snapshots in the file system. A file system only has
one instance of snapserver.
* SnapClient: SnapClient is used to communicate with snapserver, each MDS rank
has its own snapclient instance. SnapClient also caches effective snapshots
Creating a snapshot
-------------------
-CephFS snapshot feature is enabled by default on new filesystem. To enable it
-on existing filesystems, use command below.
+The CephFS snapshot feature is enabled by default on new file systems. To enable it
+on existing file systems, use the command below.
.. code::
Hard links
----------
An inode with multiple hard links is moved to a dummy global SnapRealm. The
-dummy SnapRealm covers all snapshots in the filesystem. The inode's data
+dummy SnapRealm covers all snapshots in the file system. The inode's data
will be preserved for any new snapshot. These preserved data will cover
snapshots on any linkage of the inode.
Multi-FS
---------
-Snapshots and multiple filesystems don't interact well. Specifically, each
-MDS cluster allocates `snapids` independently; if you have multiple filesystems
+Snapshots and multiple file systems don't interact well. Specifically, each
+MDS cluster allocates `snapids` independently; if you have multiple file systems
sharing a single pool (via namespaces), their snapshots *will* collide and
deleting one will result in missing file data for others. (This may even be
invisible, not throwing errors to the user.) If each FS gets its own
uncompleted work in the filestore by delaying threads calling
queue_transactions more and more based on how many ops and bytes are
currently queued. The throttle is taken in queue_transactions and
-released when the op is applied to the filesystem. This period
+released when the op is applied to the file system. This period
includes time spent in the journal queue, time spent writing to the
journal, time spent in the actual op queue, time spent waiting for the
wbthrottle to open up (thus, the wbthrottle can push back indirectly
on the queue_transactions caller), and time spent actually applying
-the op to the filesystem. A BackoffThrottle is used to gradually
+the op to the file system. A BackoffThrottle is used to gradually
delay the queueing thread after each throttle becomes more than
filestore_queue_low_threshhold full (a ratio of
filestore_queue_max_(bytes|ops)). The throttles will block once the
SnapMapper
----------
*SnapMapper* is implemented on top of map_cacher<string, bufferlist>,
-which provides an interface over a backing store such as the filesystem
+which provides an interface over a backing store such as the file system
with async transactions. While transactions are incomplete, the map_cacher
instance buffers unstable keys allowing consistent access without having
to flush the filestore. *SnapMapper* provides two mappings:
conjunction with ``librbd``, a hypervisor such as QEMU or Xen, and a
hypervisor abstraction layer such as ``libvirt``.
- Ceph Filesystem
+ Ceph File System
CephFS
Ceph FS
The POSIX filesystem components of Ceph. Refer to
- :ref:`CephFS Architecture <arch-cephfs>` and :ref:`ceph-filesystem` for
+ :ref:`CephFS Architecture <arch-cephfs>` and :ref:`ceph-file-system` for
more details.
Cloud Platforms
Ceph Client
The collection of Ceph components which can access a Ceph Storage
Cluster. These include the Ceph Object Gateway, the Ceph Block Device,
- the Ceph Filesystem, and their corresponding libraries, kernel modules,
+ the Ceph File System, and their corresponding libraries, kernel modules,
and FUSEs.
Ceph Kernel Modules
.. raw:: html
- </td><td><h3>Ceph Filesystem</h3>
+ </td><td><h3>Ceph File System</h3>
- POSIX-compliant semantics
- Separates metadata from data
</td><td>
-See `Ceph Filesystem`_ for additional details.
+See `Ceph File System`_ for additional details.
.. raw:: html
.. _Ceph Object Store: radosgw
.. _Ceph Block Device: rbd
-.. _Ceph Filesystem: cephfs
+.. _Ceph File System: cephfs
.. _Getting Started: start
.. _Architecture: architecture
- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
and stands for File System ID from the days when the Ceph Storage Cluster was
- principally for the Ceph Filesystem. Ceph now supports native interfaces,
+ principally for the Ceph File System. Ceph now supports native interfaces,
block devices, and object storage gateway interfaces too, so ``fsid`` is a
bit of a misnomer.
Then make sure you do not have a keyring set in ceph.conf in the global section; move it to the client section; or add a keyring setting specific to this mds daemon. And verify that you see the same key in the mds data directory and ``ceph auth get mds.{id}`` output.
-#. Now you are ready to `create a Ceph filesystem`_.
+#. Now you are ready to `create a Ceph file system`_.
Summary
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
-.. _create a Ceph filesystem: ../../cephfs/createfs
+.. _create a Ceph file system: ../../cephfs/createfs
* Some cache and log (ZIL) can be attached.
Please note that this is different from the Ceph journals. Cache and log are
- totally transparent for Ceph, and help the filesystem to keep the system
+ totally transparent for Ceph, and help the file system to keep the system
consistent and help performance.
Assuming that ada2 is an SSD::
- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
and stands for File System ID from the days when the Ceph Storage Cluster was
- principally for the Ceph Filesystem. Ceph now supports native interfaces,
+ principally for the Ceph File System. Ceph now supports native interfaces,
block devices, and object storage gateway interfaces too, so ``fsid`` is a
bit of a misnomer.
Then make sure you do not have a keyring set in ceph.conf in the global section; move it to the client section; or add a keyring setting specific to this mds daemon. And verify that you see the same key in the mds data directory and ``ceph auth get mds.{id}`` output.
-#. Now you are ready to `create a Ceph filesystem`_.
+#. Now you are ready to `create a Ceph file system`_.
Summary
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
-.. _create a Ceph filesystem: ../../cephfs/createfs
+.. _create a Ceph file system: ../../cephfs/createfs
ceph-authtool -C -n client.foo --gen-key keyring --mode 0644
To associate some capabilities with the key (namely, the ability to
-mount a Ceph filesystem)::
+mount a Ceph file system)::
ceph-authtool -n client.foo --cap mds 'allow' --cap osd 'allow rw pool=data' --cap mon 'allow r' keyring
**zap**
Zaps the given logical volume or partition. If given a path to a logical
-volume it must be in the format of vg/lv. Any filesystems present
+volume it must be in the format of vg/lv. Any file systems present
on the given lv or partition will be removed and all data will be purged.
However, the lv or partition will be kept intact.
fs
--
-Manage cephfs filesystems. It uses some additional subcommands.
+Manage cephfs file systems. It uses some additional subcommands.
-Subcommand ``ls`` to list filesystems
+Subcommand ``ls`` to list file systems
Usage::
ceph fs ls
-Subcommand ``new`` to make a new filesystem using named pools <metadata> and <data>
+Subcommand ``new`` to make a new file system using named pools <metadata> and <data>
Usage::
ceph fs reset <fs_name> {--yes-i-really-mean-it}
-Subcommand ``rm`` to disable the named filesystem
+Subcommand ``rm`` to disable the named file system
Usage::
**rbd-fuse** is a FUSE (File system in USErspace) client for RADOS
block device (rbd) images. Given a pool containing rbd images,
-it will mount a userspace filesystem allowing access to those images
+it will mount a userspace file system allowing access to those images
as regular files at **mountpoint**.
The file system can be unmounted with::
string. This is useful for situations where an image must
be open from more than one client at once, like during
live migration of a virtual machine, or for use underneath
- a clustered filesystem.
+ a clustered file system.
.. option:: --format format
rbd map foopool/bar2 --id admin --keyring /etc/ceph/ceph.client.admin.keyring
rbd map foopool/bar2 --id admin --keyring /etc/ceph/ceph.client.admin.keyring --options lock_on_read,queue_depth=1024
-If the images had XFS filesystems on them, the corresponding ``/etc/fstab``
+If the images had XFS file systems on them, the corresponding ``/etc/fstab``
entries might look like this::
/dev/rbd/foopool/bar1 /mnt/bar1 xfs noauto 0 0
* **RBD mirroring**: Enable and configure RBD mirroring to a remote Ceph server.
Lists all active sync daemons and their status, pools and RBD images including
their synchronization state.
-* **CephFS**: List all active filesystem clients and associated pools,
+* **CephFS**: List all active file system clients and associated pools,
including their usage statistics.
* **Object Gateway**: List all active object gateways and their performance
counters. Display and manage (add/edit/delete) object gateway users and their
details (e.g. quotas) as well as the users' buckets and their details (e.g.
owner, quotas). See :ref:`dashboard-enabling-object-gateway` for configuration
instructions.
-* **NFS**: Manage NFS exports of CephFS filesystems and RGW S3 buckets via NFS
+* **NFS**: Manage NFS exports of CephFS file systems and RGW S3 buckets via NFS
Ganesha. See :ref:`dashboard-nfs-ganesha-management` for details on how to
enable this functionality.
* **Ceph Manager Modules**: Enable and disable all Ceph Manager modules, change
Always use this specially constructed librados instance instead of
constructing one by hand.
-Similarly, if you are using libcephfs to access the filesystem, then
+Similarly, if you are using libcephfs to access the file system, then
use the libcephfs ``create_with_rados`` to construct it from the
``MgrModule.rados`` librados instance, and thereby inherit the correct context.
Query the status of a particular service instance (mon, osd, mds, rgw). For OSDs
-the id is the numeric OSD ID, for MDS services it is the filesystem name::
+the id is the numeric OSD ID; for MDS services it is the file system name::
ceph orchestrator service-instance status <type> <instance-name> [--refresh]
The ``name`` parameter is an identifier of the group of instances:
-* a CephFS filesystem for a group of MDS daemons,
+* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs
Sizing: the ``size`` parameter gives the number of daemons in the cluster
-(e.g. the number of MDS daemons for a particular CephFS filesystem).
+(e.g. the number of MDS daemons for a particular CephFS file system).
Creating/growing/shrinking/removing services::
bluestore_block_wal_size = 0
Otherwise, the current implementation will setup symbol file to kernel
-filesystem location and uses kernel driver to issue DB/WAL IO.
+file system location and uses the kernel driver to issue DB/WAL IO.
- :term:`Ceph Manager` (``ceph-mgr``)
- :term:`Ceph OSD Daemon` (``ceph-osd``)
-Ceph Storage Clusters that support the :term:`Ceph Filesystem` run at
+Ceph Storage Clusters that support the :term:`Ceph File System` run at
least one :term:`Ceph Metadata Server` (``ceph-mds``). Clusters that
support :term:`Ceph Object Storage` run Ceph Gateway daemons
(``radosgw``).
``client``
:Description: Settings under ``client`` affect all Ceph Clients
- (e.g., mounted Ceph Filesystems, mounted Ceph Block Devices,
+ (e.g., mounted Ceph File Systems, mounted Ceph Block Devices,
etc.) as well as Rados Gateway (RGW) daemons.
:Example: ``objecter_inflight_ops = 512``
Extended Attributes (XATTRs) are an important aspect in your configuration.
Some file systems have limits on the number of bytes stored in XATTRS.
-Additionally, in some cases, the filesystem may not be as fast as an alternative
+Additionally, in some cases, the file system may not be as fast as an alternative
method of storing XATTRs. The following settings may help improve performance
-by using a method of storing XATTRs that is extrinsic to the underlying filesystem.
+by using a method of storing XATTRs that is extrinsic to the underlying file system.
Ceph XATTRs are stored as ``inline xattr``, using the XATTRs provided
by the underlying file system, if it does not impose a size limit. If
``filestore max inline xattr size``
-:Description: The maximum size of an XATTR stored in the filesystem (i.e., XFS,
+:Description: The maximum size of an XATTR stored in the file system (i.e., XFS,
btrfs, ext4, etc.) per object. Should not be larger than the
- filesystem can handle. Default value of 0 means to use the value
- specific to the underlying filesystem.
+ file system can handle. Default value of 0 means to use the value
+ specific to the underlying file system.
:Type: Unsigned 32-bit Integer
:Required: No
:Default: ``0``
``filestore max inline xattr size xfs``
-:Description: The maximum size of an XATTR stored in the XFS filesystem.
+:Description: The maximum size of an XATTR stored in the XFS file system.
Only used if ``filestore max inline xattr size`` == 0.
:Type: Unsigned 32-bit Integer
:Required: No
``filestore max inline xattr size btrfs``
-:Description: The maximum size of an XATTR stored in the btrfs filesystem.
+:Description: The maximum size of an XATTR stored in the btrfs file system.
Only used if ``filestore max inline xattr size`` == 0.
:Type: Unsigned 32-bit Integer
:Required: No
``filestore max inline xattr size other``
-:Description: The maximum size of an XATTR stored in other filesystems.
+:Description: The maximum size of an XATTR stored in other file systems.
Only used if ``filestore max inline xattr size`` == 0.
:Type: Unsigned 32-bit Integer
:Required: No
``filestore max inline xattrs``
-:Description: The maximum number of XATTRs stored in the filesystem per object.
+:Description: The maximum number of XATTRs stored in the file system per object.
Default value of 0 means to use the value specific to the
- underlying filesystem.
+ underlying file system.
:Type: 32-bit Integer
:Required: No
:Default: ``0``
``filestore max inline xattrs xfs``
-:Description: The maximum number of XATTRs stored in the XFS filesystem per object.
+:Description: The maximum number of XATTRs stored in the XFS file system per object.
Only used if ``filestore max inline xattrs`` == 0.
:Type: 32-bit Integer
:Required: No
``filestore max inline xattrs btrfs``
-:Description: The maximum number of XATTRs stored in the btrfs filesystem per object.
+:Description: The maximum number of XATTRs stored in the btrfs file system per object.
Only used if ``filestore max inline xattrs`` == 0.
:Type: 32-bit Integer
:Required: No
``filestore max inline xattrs other``
-:Description: The maximum number of XATTRs stored in other filesystems per object.
+:Description: The maximum number of XATTRs stored in other file systems per object.
Only used if ``filestore max inline xattrs`` == 0.
:Type: 32-bit Integer
:Required: No
=========================
Periodically, the filestore needs to quiesce writes and synchronize the
-filesystem, which creates a consistent commit point. It can then free journal
+file system, which creates a consistent commit point. It can then free journal
entries up to the commit point. Synchronizing more frequently tends to reduce
the time required to perform synchronization, and reduces the amount of data
that needs to remain in the journal. Less frequent synchronization allows the
-backing filesystem to coalesce small writes and metadata updates more
+backing file system to coalesce small writes and metadata updates more
optimally--potentially resulting in more efficient synchronization.
``filestore fsync flushes journal data``
-:Description: Flush journal data during filesystem synchronization.
+:Description: Flush journal data during file system synchronization.
:Type: Boolean
:Required: No
:Default: ``false``
``filestore op threads``
-:Description: The number of filesystem operation threads that execute in parallel.
+:Description: The number of file system operation threads that execute in parallel.
:Type: Integer
:Required: No
:Default: ``2``
``filestore op thread timeout``
-:Description: The timeout for a filesystem operation thread (in seconds).
+:Description: The timeout for a file system operation thread (in seconds).
:Type: Integer
:Required: No
:Default: ``60``
``fsid``
-:Description: The filesystem ID. One per cluster.
+:Description: The file system ID. One per cluster.
:Type: UUID
:Required: No.
:Default: N/A. Usually generated by deployment tools.
- **Speed:** The journal enables the Ceph OSD Daemon to commit small writes
quickly. Ceph writes small, random i/o to the journal sequentially, which
- tends to speed up bursty workloads by allowing the backing filesystem more
+ tends to speed up bursty workloads by allowing the backing file system more
time to coalesce writes. The Ceph OSD Daemon's journal, however, can lead
to spiky performance with short spurts of high-speed writes followed by
- periods without any write progress as the filesystem catches up to the
+ periods without any write progress as the file system catches up to the
journal.
-- **Consistency:** Ceph OSD Daemons require a filesystem interface that
+- **Consistency:** Ceph OSD Daemons require a file system interface that
guarantees atomic compound operations. Ceph OSD Daemons write a description
- of the operation to the journal and apply the operation to the filesystem.
+ of the operation to the journal and apply the operation to the file system.
This enables atomic updates to an object (for example, placement group
metadata). Every few seconds--between ``filestore max sync interval`` and
``filestore min sync interval``--the Ceph OSD Daemon stops writes and
- synchronizes the journal with the filesystem, allowing Ceph OSD Daemons to
+ synchronizes the journal with the file system, allowing Ceph OSD Daemons to
trim operations from the journal and reuse the space. On failure, Ceph
OSD Daemons replay the journal starting after the last synchronization
operation.
The Ceph configuration file consists of at least:
-- Its own filesystem ID (``fsid``)
+- Its own file system ID (``fsid``)
-- The initial monitor(s) hostname(s), and
-- The initial monitor(s) and IP address(es).
+- The hostname(s) of the initial monitor(s), and
+- The IP address(es) of the initial monitor(s).
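Put together, a minimal ``ceph.conf`` might look like the sketch below. The
``fsid``, hostname, and address are placeholders for illustration, not values
to copy::

    [global]
    fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
    mon initial members = node1
    mon host = 192.168.0.1
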
thousands of storage nodes. A minimal system will have at least one
Ceph Monitor and two Ceph OSD Daemons for data replication.
-The Ceph Filesystem, Ceph Object Storage and Ceph Block Devices read data from
+The Ceph File System, Ceph Object Storage and Ceph Block Devices read data from
and write data to the Ceph Storage Cluster.
.. raw:: html
</td><td><h3>APIs</h3>
Most Ceph deployments use `Ceph Block Devices`_, `Ceph Object Storage`_ and/or the
-`Ceph Filesystem`_. You may also develop applications that talk directly to
+`Ceph File System`_. You may also develop applications that talk directly to
the Ceph Storage Cluster.
.. toctree::
</td></tr></tbody></table>
.. _Ceph Block Devices: ../rbd/
-.. _Ceph Filesystem: ../cephfs/
+.. _Ceph File System: ../cephfs/
.. _Ceph Object Storage: ../radosgw/
.. _Deployment: ../rados/deployment/
wherever possible.
.. note:: A Ceph Storage Cluster user is not the same as a Ceph Object Storage
- user or a Ceph Filesystem user. The Ceph Object Gateway uses a Ceph Storage
+ user or a Ceph File System user. The Ceph Object Gateway uses a Ceph Storage
Cluster user to communicate between the gateway daemon and the storage
cluster, but the gateway has its own user management functionality for end
- users. The Ceph Filesystem uses POSIX semantics. The user space associated
- with the Ceph Filesystem is not the same as a Ceph Storage Cluster user.
+ users. The Ceph File System uses POSIX semantics. The user space associated
+ with the Ceph File System is not the same as a Ceph Storage Cluster user.
But what if all monitors fail at the same time? Since users are encouraged to
deploy at least three (and preferably five) monitors in a Ceph cluster, the chance of simultaneous
failure is rare. But unplanned power-downs in a data center with improperly
-configured disk/fs settings could fail the underlying filesystem, and hence
+configured disk/fs settings could cause the underlying file system to fail, and hence
kill all the monitors. In this case, we can recover the monitor store with the
information stored in OSDs.::
Display Freespace
-----------------
-Filesystem issues may arise. To display your filesystem's free space, execute
+File system issues may arise. To display your file system's free space, execute
``df``. ::
df -h
Ceph acknowledges writes *after* journaling, so fast SSDs are an
attractive option to accelerate the response time--particularly when
-using the ``XFS`` or ``ext4`` filesystems. By contrast, the ``btrfs``
-filesystem can write and journal simultaneously. (Note, however, that
+using the ``XFS`` or ``ext4`` file systems. By contrast, the ``btrfs``
+file system can write and journal simultaneously. (Note, however, that
we recommend against using ``btrfs`` for production deployments.)
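If you do place journals on an SSD, the journal location and size are set per
OSD in ``ceph.conf``. The partition labels below are hypothetical and only
sketch the idea; adjust them to your own layout::

    [osd]
    # point each OSD's journal at a dedicated SSD partition
    osd journal = /dev/disk/by-partlabel/ceph-journal-$id
    # 0 means use the entire block device as the journal
    osd journal size = 0
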
.. note:: Partitioning a drive does not change its total throughput or
Currently, we recommend deploying clusters with XFS.
-We recommend against using btrfs or ext4. The btrfs filesystem has
-many attractive features, but bugs in the filesystem may lead to
+We recommend against using btrfs or ext4. The btrfs file system has
+many attractive features, but bugs in the file system may lead to
performance issues and spurious ENOSPC errors. We do not recommend
ext4 because xattr size limitations break our support for long object
names (needed for RGW).
an HTTP server for interacting with a Ceph Storage Cluster. Since it
provides interfaces compatible with OpenStack Swift and Amazon S3, the Ceph
Object Gateway has its own user management. Ceph Object Gateway can store data
-in the same Ceph Storage Cluster used to store data from Ceph Filesystem clients
+in the same Ceph Storage Cluster used to store data from Ceph File System clients
or Ceph Block Device clients. The S3 and Swift APIs share a common namespace, so
you may write data with one API and retrieve it with the other.
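As a small, hedged illustration of that shared namespace, assume a bucket
``mybucket`` already exists and that both ``s3cmd`` and the ``swift``
command-line client are configured against the same gateway and user; an object
uploaded through the S3 API can then be fetched through the Swift API::

    s3cmd put hello.txt s3://mybucket/hello.txt
    swift download mybucket hello.txt
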
cloud-based computing systems like `OpenStack`_ and `CloudStack`_ that rely on
libvirt and QEMU to integrate with Ceph block devices. You can use the same cluster
to operate the :ref:`Ceph RADOS Gateway <object-gateway>`, the
-:ref:`CephFS filesystem <ceph-filesystem>`, and Ceph block devices simultaneously.
+:ref:`Ceph File System <ceph-file-system>`, and Ceph block devices simultaneously.
.. important:: To use Ceph Block Devices, you must have access to a running
Ceph cluster.
.. important:: If you set rbd_cache=true, you must set cache=writeback
or risk data loss. Without cache=writeback, QEMU will not send
flush requests to librbd. If QEMU exits uncleanly in this
- configuration, filesystems on top of rbd can be corrupted.
+ configuration, file systems on top of rbd can be corrupted.
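For example, when launching a guest with QEMU directly, the cache mode is part
of the ``-drive`` option. The pool and image names below are assumptions for
illustration only::

    qemu -m 1024 -drive format=raw,file=rbd:rbd/myimage:rbd_cache=true,cache=writeback
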
.. _RBD caching: ../rbd-config-ref/#rbd-cache-config-settings
You may use Ceph Block Device images with Kubernetes v1.13 and later through
`ceph-csi`_, which dynamically provisions RBD images to back Kubernetes
`volumes`_ and maps these RBD images as block devices (optionally mounting
-a filesystem contained within the image) on worker nodes running
+a file system contained within the image) on worker nodes running
`pods`_ that reference an RBD-backed volume. Ceph stripes block device images as
objects across the cluster, which means that large Ceph Block Device images have
better performance than a standalone server!
A `PersistentVolumeClaim` is a request for abstract storage resources by a user.
The `PersistentVolumeClaim` would then be associated to a `Pod` resource to
provision a `PersistentVolume`, which would be backed by a Ceph block image.
-An optional `volumeMode` can be included to select between a mounted filesystem
+An optional `volumeMode` can be included to select between a mounted file system
(default) or raw block device-based volume.
Using `ceph-csi`, specifying `Filesystem` for `volumeMode` can support both
EOF
$ kubectl apply -f raw-block-pod.yaml
-To create a filesystem-based `PersistentVolumeClaim` that utilizes the
+To create a file-system-based `PersistentVolumeClaim` that utilizes the
`ceph-csi`-based `StorageClass` created above, the following YAML can be used to
-request a mounted filesystem (backed by an RBD image) from the `csi-rbd-sc`
+request a mounted file system (backed by an RBD image) from the `csi-rbd-sc`
`StorageClass`::
$ cat <<EOF > pvc.yaml
$ kubectl apply -f pvc.yaml
-The following demonstrates and example of binding the above
-`PersistentVolumeClaim` to a `Pod` resource as a mounted filesystem::
+The following example demonstrates binding the above
+`PersistentVolumeClaim` to a `Pod` resource as a mounted file system::
$ cat <<EOF > pod.yaml
---
Cinder services.
- **Guest Disks**: Guest disks are guest operating system disks. By default,
- when you boot a virtual machine, its disk appears as a file on the filesystem
+ when you boot a virtual machine, its disk appears as a file on the file system
of the hypervisor (usually under ``/var/lib/nova/instances/<uuid>/``). Prior
to OpenStack Havana, the only way to boot a VM in Ceph was to use the
boot-from-volume functionality of Cinder. However, now it is possible to boot
-.. important:: To use use RBD snapshots, you must have a running Ceph cluster.
+.. important:: To use RBD snapshots, you must have a running Ceph cluster.
-.. note:: Because RBD does not know about the filesystem, snapshots are
+.. note:: Because RBD does not know about the file system, snapshots are
`crash-consistent` if they are not coordinated with the mounting
computer. So, we recommend you stop `I/O` before taking a snapshot of
- an image. If the image contains a filesystem, the filesystem must be
+ an image. If the image contains a file system, the file system must be
in a consistent state before taking a snapshot or you may have to run
`fsck`. To stop `I/O` you can use the `fsfreeze` command. See
`fsfreeze(8)` man page for more details.
For virtual machines, `qemu-guest-agent` can be used to automatically
- freeze filesystems when creating a snapshot.
+ freeze file systems when creating a snapshot.
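A minimal sketch of coordinating this by hand, assuming an image ``rbd/myimage``
that is mapped and mounted at ``/mnt/myfs`` (both names are placeholders)::

    sudo fsfreeze --freeze /mnt/myfs      # quiesce the file system
    rbd snap create rbd/myimage@snap1     # take the snapshot
    sudo fsfreeze --unfreeze /mnt/myfs    # resume I/O
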
.. ditaa:: +------------+ +-------------+
| {s} | | {s} c999 |
- **Ceph Object Storage:** The Ceph Object Storage documentation resides under
the ``doc/radosgw`` directory.
-- **Ceph Filesystem:** The Ceph Filesystem documentation resides under the
+- **Ceph File System:** The Ceph File System documentation resides under the
``doc/cephfs`` directory.
- **Installation (Quick):** Quick start documentation resides under the
this path to an SSD or to an SSD partition so that it is not merely a file on
the same disk as the object data.
-One way Ceph accelerates CephFS filesystem performance is to segregate the
+One way Ceph accelerates CephFS file system performance is to segregate the
storage of CephFS metadata from the storage of the CephFS file contents. Ceph
provides a default ``metadata`` pool for CephFS metadata. You will never have to
create a pool for CephFS metadata, but you can create a CRUSH map hierarchy for
</td><td><h3>Step 3: Ceph Client(s)</h3>
Most Ceph users don't store objects directly in the Ceph Storage Cluster. They typically use at least one of
-Ceph Block Devices, the Ceph Filesystem, and Ceph Object Storage.
+Ceph Block Devices, the Ceph File System, and Ceph Object Storage.
.. toctree::
Whether you want to provide :term:`Ceph Object Storage` and/or
:term:`Ceph Block Device` services to :term:`Cloud Platforms`, deploy
-a :term:`Ceph Filesystem` or use Ceph for another purpose, all
+a :term:`Ceph File System` or use Ceph for another purpose, all
:term:`Ceph Storage Cluster` deployments begin with setting up each
:term:`Ceph Node`, your network, and the Ceph Storage Cluster. A Ceph
Storage Cluster requires at least one Ceph Monitor, Ceph Manager, and
Ceph OSD (Object Storage Daemon). The Ceph Metadata Server is also
-required when running Ceph Filesystem clients.
+required when running Ceph File System clients.
.. ditaa:: +---------------+ +------------+ +------------+ +---------------+
| OSDs | | Monitors | | Managers | | MDSs |
and high availability.
- **MDSs**: A :term:`Ceph Metadata Server` (MDS, ``ceph-mds``) stores
- metadata on behalf of the :term:`Ceph Filesystem` (i.e., Ceph Block
+ metadata on behalf of the :term:`Ceph File System` (i.e., Ceph Block
Devices and Ceph Object Storage do not use MDS). Ceph Metadata
Servers allow POSIX file system users to execute basic commands (like
``ls``, ``find``, etc.) without placing an enormous burden on the
A StorageClass named ``ceph-rbd`` of type ``ceph.com/rbd`` will be created with ``ceph-rbd-provisioner`` Pods. These
-will allow a RBD to be automatically provisioned upon creation of a PVC. RBDs will also be formatted when mapped for the first
+will allow an RBD to be automatically provisioned upon creation of a PVC. RBDs will also be formatted when mapped for the first
-time. All RBDs will use the ext4 filesystem. ``ceph.com/rbd`` does not support the ``fsType`` option.
+time. All RBDs will use the ext4 file system. ``ceph.com/rbd`` does not support the ``fsType`` option.
-By default, RBDs will use image format 2 and layering. You can overwrite the following storageclass' defaults in your values file::
+By default, RBDs will use image format 2 and layering. You can override the following ``storageclass`` defaults in your values file::
storageclass:
===================
You have already created an MDS (`Storage Cluster Quick Start`_) but it will not
-become active until you create some pools and a filesystem. See :doc:`/cephfs/createfs`.
+become active until you create some pools and a file system. See :doc:`/cephfs/createfs`.
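As a sketch of what that page walks through, the pool names and placement-group
counts below are illustrative only::

    ceph osd pool create cephfs_data 64
    ceph osd pool create cephfs_metadata 64
    ceph fs new cephfs cephfs_metadata cephfs_data
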
::
cat ceph.client.admin.keyring
-#. Copy the key of the user who will be using the mounted CephFS filesystem.
+#. Copy the key of the user who will be using the mounted CephFS file system.
It should look something like this::
[client.admin]
sudo mount -t ceph 192.168.0.1:6789:/ /mnt/mycephfs -o name=admin,secretfile=admin.secret
-.. note:: Mount the CephFS filesystem on the admin node,
+.. note:: Mount the CephFS file system on the admin node,
not the server node. See `FAQ`_ for details.