- reorganized cephadm into a top-level item with a series of sub-items.
- condensed the 'install' page so that it doesn't create a zillion items
in the toctree on the left
- started updating the cephadm/install sequence (incomplete)
Signed-off-by: Sage Weil <sage@redhat.com>
+++ /dev/null
-.. _cephadm-administration:
-
-======================
-cephadm Administration
-======================
-
-
-SSH Configuration
-=================
-
-Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
-with those hosts in a secure way.
-
-
-Default behavior
-----------------
-
-Cephadm normally stores an SSH key in the monitor that is used to
-connect to remote hosts. When the cluster is bootstrapped, this SSH
-key is generated automatically. Normally, no additional configuration
-is necessary.
-
-A *new* SSH key can be generated with::
-
- ceph cephadm generate-key
-
-The public portion of the SSH key can be retrieved with::
-
- ceph cephadm get-pub-key
-
-The currently stored SSH key can be deleted with::
-
- ceph cephadm clear-key
-
-You can make use of an existing key by directly importing it with::
-
- ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
- ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>
-
-You will then need to restart the mgr daemon to reload the configuration with::
-
- ceph mgr fail
-
-
-Customizing the SSH configuration
----------------------------------
-
-Normally cephadm generates an appropriate ``ssh_config`` file that is
-used for connecting to remote hosts. This configuration looks
-something like this::
-
- Host *
- User root
- StrictHostKeyChecking no
- UserKnownHostsFile /dev/null
-
-There are two ways to customize this configuration for your environment:
-
-#. You can import a customized configuration file that will be stored
- by the monitor with::
-
- ceph cephadm set-ssh-config -i <ssh_config_file>
-
- To remove a customized ssh config and revert back to the default behavior::
-
- ceph cephadm clear-ssh-config
-
-#. You can configure a file location for the ssh configuration file with::
-
- ceph config set mgr mgr/cephadm/ssh_config_file <path>
-
- This approach is *not recommended*, however, as the path name must be
- visible to *any* mgr daemon, and cephadm runs all daemons as
- containers. That means that the file either need to be placed
- inside a customized container image for your deployment, or
- manually distributed to the mgr data directory
- (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
- ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
-
-
-Data location
-=============
-
-Cephadm daemon data and logs in slightly different locations than older
-versions of ceph:
-
-* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
- that by default cephadm logs via stderr and the container runtime,
- so these logs are normally not present.
-* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
- (besides logs).
-* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
- an individual daemon.
-* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
- the cluster.
-* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
- data directories for stateful daemons (e.g., monitor, prometheus)
- that have been removed by cephadm.
-
-
-Health checks
-=============
-
-CEPHADM_PAUSED
---------------
-
-Cephadm background work has been paused with ``ceph orch pause``. Cephadm
-will continue to perform passive monitoring activities (like checking
-host and daemon status), but it will not make any changes (like deploying
-or removing daemons).
-
-You can resume cephadm work with::
-
- ceph orch resume
-
-CEPHADM_STRAY_HOST
-------------------
-
-One or more hosts have running Ceph daemons but are not registered as
-hosts managed by *cephadm*. This means that those services cannot
-currently be managed by cephadm (e.g., restarted, upgraded, included
-in `ceph orch ps`).
-
-You can manage the host(s) with::
-
- ceph orch host add *<hostname>*
-
-Note that you may need to configure SSH access to the remote host
-before this will work.
-
-Alternatively, you can manually connect to the host and ensure that
-services on that host are removed and/or migrated to a host that is
-managed by *cephadm*.
-
-You can also disable this warning entirely with::
-
- ceph config set mgr mgr/cephadm/warn_on_stray_hosts false
-
-CEPHADM_STRAY_DAEMON
---------------------
-
-One or more Ceph daemons are running but not are not managed by
-*cephadm*, perhaps because they were deploy using a different tool, or
-were started manually. This means that those services cannot
-currently be managed by cephadm (e.g., restarted, upgraded, included
-in `ceph orch ps`).
-
-**FIXME:** We need to implement and document an adopt procedure here.
-
-You can also disable this warning entirely with::
-
- ceph config set mgr mgr/cephadm/warn_on_stray_daemons false
-
-CEPHADM_HOST_CHECK_FAILED
--------------------------
-
-One or more hosts have failed the basic cephadm host check, which verifies
-that (1) the host is reachable and cephadm can be executed there, and (2)
-that the host satisfies basic prerequisites, like a working container
-runtime (podman or docker) and working time synchronization.
-If this test fails, cephadm will no be able to manage services on that host.
-
-You can manually run this check with::
-
- ceph cephadm check-host *<hostname>*
-
-You can remove a broken host from management with::
-
- ceph orch host rm *<hostname>*
-
-You can disable this health warning with::
-
- ceph config set mgr mgr/cephadm/warn_on_failed_host_check false
-
-
-Converting an existing cluster to cephadm
-=========================================
-
-Cephadm allows you to (pretty) easily convert an existing Ceph cluster that
-has been deployed with ceph-deploy, ceph-ansible, DeepSea, or similar tools.
-
-Limitations
------------
-
-* Cephadm only works with BlueStore OSDs. If there are FileStore OSDs
- in your cluster you cannot manage them.
-
-Adoption Process
-----------------
-
-#. Get the ``cephadm`` command line too on each host. You can do this with curl or by installing the package. The simplest approach is::
-
- [each host] # curl --silent --remote-name --location https://github.com/ceph/ceph/raw/master/src/cephadm/cephadm
- [each host] # chmod +x cephadm
-
-#. Prepare each host for use by ``cephadm``::
-
- [each host] # ./cephadm prepare-host
-
-#. List all Ceph daemons on the current host::
-
- # ./cephadm ls
-
- You should see that all existing daemons have a type of ``legacy``
- in the resulting output.
-
-#. Determine which Ceph version you will use. You can use any Octopus
- release or later. For example, ``docker.io/ceph/ceph:v15.2.0``. The default
- will be the latest stable release, but if you are upgrading from an earlier
- release at the same time be sure to refer to the upgrade notes for any
- special steps to take while upgrading.
-
- The image is passed to cephadm with::
-
- # ./cephadm --image $IMAGE <rest of command goes here>
-
-#. Adopt each monitor::
-
- # ./cephadm adopt --style legacy --name mon.<hostname>
-
-#. Adopt each manager::
-
- # ./cephadm adopt --style legacy --name mgr.<hostname>
-
-#. Enable cephadm::
-
- # ceph mgr module enable cephadm
- # ceph orch set backend cephadm
-
-#. Generate an SSH key::
-
- # ceph cephadm generate-key
- # ceph cephadm get-pub-key
-
-#. Install the SSH key on each host to be managed::
-
- # echo <ssh key here> | sudo tee /root/.ssh/authorized_keys
-
- Note that ``/root/.ssh/authorized_keys`` should have mode ``0600`` and
- ``/root/.ssh`` should have mode ``0700``.
-
-#. Tell cephadm which hosts to manage::
-
- # ceph orch host add <hostname> [ip-address]
-
- This will perform a ``cephadm check-host`` on each host before
- adding it to ensure it is working. The IP address argument is only
- required if DNS doesn't allow you to connect to each host by it's
- short name.
-
-#. Verify that the monitor and manager daemons are visible::
-
- # ceph orch ps
-
-#. Adopt all remainingg daemons::
-
- # ./cephadm adopt --style legacy --name <osd.0>
- # ./cephadm adopt --style legacy --name <osd.1>
- # ./cephadm adopt --style legacy --name <mds.foo>
-
- Repeat for each host and daemon.
-
-#. Check the ``ceph health detail`` output for cephadm warnings about
- stray cluster daemons or hosts that are not yet managed.
-
-Troubleshooting
-===============
-
-Sometimes there is a need to investigate why a cephadm command failed or why
-a specific service no longer runs properly.
-
-As cephadm deploys daemons as containers, troubleshooting daemons is slightly
-different. Here are a few tools and commands to help investigating issues.
-
-Gathering log files
--------------------
-
-Use journalctl to gather the log files of all daemons:
-
-.. note:: By default cephadm now stores logs in journald. This means
- that you will no longer find daemon logs in ``/var/log/ceph/``.
-
-To read the log file of one specific daemon, run::
-
- cephadm logs --name <name-of-daemon>
-
-Note: this only works when run on the same host where the daemon is running. To
-get logs of a daemon running on a different host, give the ``--fsid`` option::
-
- cephadm logs --fsid <fsid> --name <name-of-daemon>
-
-Where the ``<fsid>`` corresponds to the cluster id printed by ``ceph status``.
-
-To fetch all log files of all daemons on a given host, run::
-
- for name in $(cephadm ls | jq -r '.[].name') ; do
- cephadm logs --fsid <fsid> --name "$name" > $name;
- done
-
-Collecting systemd status
--------------------------
-
-To print the state of a systemd unit, run::
-
- systemctl status "ceph-$(cephadm shell ceph fsid)@<service name>.service";
-
-
-To fetch all state of all daemons of a given host, run::
-
- fsid="$(cephadm shell ceph fsid)"
- for name in $(cephadm ls | jq -r '.[].name') ; do
- systemctl status "ceph-$fsid@$name.service" > $name;
- done
-
-
-List all downloaded container images
-------------------------------------
-
-To list all container images that are downloaded on a host:
-
-.. note:: ``Image`` might also be called `ImageID`
-
-::
-
- podman ps -a --format json | jq '.[].Image'
- "docker.io/library/centos:8"
- "registry.opensuse.org/opensuse/leap:15.2"
-
-
-Manually running containers
----------------------------
-
-cephadm writes small wrappers that run a containers. Refer to
-``/var/lib/ceph/<cluster-fsid>/<service-name>/unit.run`` for the container execution command.
-to execute a container.
--- /dev/null
+Converting an existing cluster to cephadm
+=========================================
+
+Cephadm allows you to convert an existing Ceph cluster that
+has been deployed with ceph-deploy, ceph-ansible, DeepSea, or similar tools.
+
+Limitations
+-----------
+
+* Cephadm only works with BlueStore OSDs. If there are FileStore OSDs
+  in your cluster, you cannot manage them with cephadm.
+
+Adoption Process
+----------------
+
+#. Get the ``cephadm`` command-line tool on each host. You can do this with curl or by installing the package. The simplest approach is::
+
+ [each host] # curl --silent --remote-name --location https://github.com/ceph/ceph/raw/master/src/cephadm/cephadm
+ [each host] # chmod +x cephadm
+
+#. Prepare each host for use by ``cephadm``::
+
+ [each host] # ./cephadm prepare-host
+
+#. List all Ceph daemons on the current host::
+
+ # ./cephadm ls
+
+ You should see that all existing daemons have a type of ``legacy``
+ in the resulting output.
+
+#. Determine which Ceph version you will use. You can use any Octopus
+ release or later. For example, ``docker.io/ceph/ceph:v15.2.0``. The default
+ will be the latest stable release, but if you are upgrading from an earlier
+ release at the same time be sure to refer to the upgrade notes for any
+ special steps to take while upgrading.
+
+ The image is passed to cephadm with::
+
+ # ./cephadm --image $IMAGE <rest of command goes here>
+
+#. Adopt each monitor::
+
+ # ./cephadm adopt --style legacy --name mon.<hostname>
+
+#. Adopt each manager::
+
+ # ./cephadm adopt --style legacy --name mgr.<hostname>
+
+#. Enable cephadm::
+
+ # ceph mgr module enable cephadm
+ # ceph orch set backend cephadm
+
+#. Generate an SSH key::
+
+ # ceph cephadm generate-key
+ # ceph cephadm get-pub-key
+
+#. Install the SSH key on each host to be managed::
+
+ # echo <ssh key here> | sudo tee /root/.ssh/authorized_keys
+
+ Note that ``/root/.ssh/authorized_keys`` should have mode ``0600`` and
+ ``/root/.ssh`` should have mode ``0700``.
+
+#. Tell cephadm which hosts to manage::
+
+ # ceph orch host add <hostname> [ip-address]
+
+ This will perform a ``cephadm check-host`` on each host before
+ adding it to ensure it is working. The IP address argument is only
+   required if DNS doesn't allow you to connect to each host by its
+ short name.
+
+#. Verify that the monitor and manager daemons are visible::
+
+ # ceph orch ps
+
+#. Adopt all remaining daemons::
+
+ # ./cephadm adopt --style legacy --name <osd.0>
+ # ./cephadm adopt --style legacy --name <osd.1>
+ # ./cephadm adopt --style legacy --name <mds.foo>
+
+ Repeat for each host and daemon.
+
+#. Check the ``ceph health detail`` output for cephadm warnings about
+ stray cluster daemons or hosts that are not yet managed.
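+
+   For example, one way to confirm that nothing was missed is to re-run
+   ``cephadm ls`` on each host (no daemons should still be reported as
+   ``legacy``) and then review the cluster health::
+
+     # ./cephadm ls
+     # ceph health detail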
-.. _cephadm-bootstrap:
+.. _cephadm:
-========================
- Installation (cephadm)
-========================
+=======
+Cephadm
+=======
-.. note:: The *cephadm* bootstrap feature is first introduced in Octopus, and is not yet recommended for production deployments.
+Cephadm deploys and manages a Ceph cluster by connecting to hosts from the
+manager daemon via SSH to add, remove, or update Ceph daemon containers. It
+does not rely on external configuration or orchestration tools like Ansible,
+Rook, or Salt.
-cephadm manages nodes in a cluster by establishing an SSH connection
-and issues explicit management commands. It does not rely on
-separate systems such as Rook or Ansible.
+Cephadm starts by bootstrapping a tiny Ceph cluster on a single node
+(one monitor and one manager) and then uses the orchestration
+interface (the so-called "day 2" commands) to expand the cluster to include
+all hosts and to provision all Ceph daemons and services, either via the Ceph
+command-line interface (CLI) or the dashboard (GUI).
-A new Ceph cluster is deployed by bootstrapping a cluster on a single
-node, and then adding additional nodes and daemons via the CLI or GUI
-dashboard.
-
-The following example installs a basic three-node cluster. Each
-node will be identified by its prompt. For example, "[monitor 1]"
-identifies the first monitor, "[monitor 2]" identifies the second
-monitor, and "[monitor 3]" identifies the third monitor. This
-information is provided in order to make clear which commands
-should be issued on which systems.
-
-"[any node]" identifies any Ceph node, and in the context
-of this installation guide means that the associated command
-can be run on any node.
-
-Requirements
-============
-
-- Podman or Docker
-- LVM2
-
-.. highlight:: console
-
-Get cephadm
-===========
-
-The ``cephadm`` utility is used to bootstrap a new Ceph Cluster.
-
-Use curl to fetch the standalone script::
-
- [monitor 1] # curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
- [monitor 1] # chmod +x cephadm
-
-You can also get the utility by installing a package provided by
-your Linux distribution::
-
- [monitor 1] # apt install -y cephadm # or
- [monitor 1] # dnf install -y cephadm # or
- [monitor 1] # yum install -y cephadm # or
- [monitor 1] # zypper install -y cephadm
-
-
-Bootstrap a new cluster
-=======================
-
-To create a new cluster, you need to know which *IP address* to use
-for the cluster's first monitor. This is normally just the IP for the
-first cluster node. If there are multiple networks and interfaces, be
-sure to choose one that will be accessible by any hosts accessing the
-Ceph cluster.
-
-To bootstrap the cluster run the following command::
-
- [node 1] $ sudo ./cephadm bootstrap --mon-ip *<mon-ip>*
-
-This command does a few things:
-
-* Creates a monitor and manager daemon for the new cluster on the
- local host. A minimal configuration file needed to communicate with
- the new cluster is written to ``ceph.conf`` in the local directory.
-* A copy of the ``client.admin`` administrative (privileged!) secret
- key is written to ``ceph.client.admin.keyring`` in the local directory.
-* Generates a new SSH key, and adds the public key to the local root user's
- ``/root/.ssh/authorized_keys`` file. A copy of the public key is written
- to ``ceph.pub`` in the current directory.
-
-Interacting with the cluster
-============================
-
-To interact with your cluster, start up a container that has all of
-the Ceph packages installed::
-
- [any node] $ sudo ./cephadm shell --config ceph.conf --keyring ceph.client.admin.keyring
-
-The ``--config`` and ``--keyring`` arguments will bind those local
-files to the default locations in ``/etc/ceph`` inside the container
-to allow the ``ceph`` CLI utility to work without additional
-arguments. Inside the container, you can check the cluster status with::
-
- [ceph: root@monitor_1_hostname /]# ceph status
-
-In order to interact with the Ceph cluster outside of a container
-(that is, from the command line), install the Ceph
-client packages and install the configuration and privileged
-administrator key in a global location::
-
- [any node] $ sudo apt install -y ceph-common # or,
- [any node] $ sudo dnf install -y ceph-common # or,
- [any node] $ sudo yum install -y ceph-common
-
- [any node] $ sudo install -m 0644 ceph.conf /etc/ceph/ceph.conf
- [any node] $ sudo install -m 0600 ceph.keyring /etc/ceph/ceph.keyring
-
-Watching cephadm log messages
-=============================
-
-Cephadm logs to the ``cephadm`` cluster log channel, which means you can monitor progress in realtime with::
-
- # ceph -W cephadm
-
-By default it will show info-level events and above. To see
-debug-level messages too::
-
- # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
- # ceph -W cephadm --watch-debug
-
-Be careful: the debug messages are very verbose!
-
-You can see recent events with::
-
- # ceph log last cephadm
-
-These events are also logged to the ``ceph.cephadm.log`` file on
-monitor hosts and/or to the monitor-daemon stderr.
-
-Adding hosts to the cluster
-===========================
-
-For each new host you'd like to add to the cluster, you need to do two things:
-
-#. Install the cluster's public SSH key in the new host's root user's
- ``authorized_keys`` file. This is easy with the ``ssh-copy-id`` script::
-
- [monitor 1] # ssh-copy-id -f -i ceph.pub root@*newhost*
-
-#. Tell Ceph that the new node is part of the cluster::
-
- # ceph orch host add *newhost*
-
-Deploying additional monitors
-=============================
-
-Normally a Ceph cluster has three or five monitor daemons spread
-across different hosts. As a rule of thumb, you should deploy five
-monitors if there are five or more nodes in your cluster.
-
-.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
-
-If all of your monitors will exist on the same IP subnet, cephadm can
-automatically scale the number of monitors. This subnet should be
-specified in `CIDR`_ format (e.g., ``10.1.2.0/24``). (If you do not
-specify a subnet, you will need to manually specify an IP or subnet
-when creating each monitor.)::
-
- # ceph config set mon public_network *<mon-cidr-network>*
-
-For example::
-
- # ceph config set mon public_network 10.1.2.0/24
-
-There are several ways to add additional monitors:
-
-* You can simply tell cephadm how many monitors you want, and it will pick the
- hosts (randomly)::
-
- # ceph orch apply mon *<number-of-monitors>*
-
- For example, if you have 5 or more hosts added to the cluster,::
-
- # ceph orch apply mon 5
-
-* You can explicitly specify which hosts to deploy on. Be sure to include
- the first monitor host in this list.::
-
- # ceph orch apply mon *<host1,host2,host3,...>*
-
- For example,::
-
- # ceph orch apply mon host1,host2,host3
-
-* You can control which hosts the monitors run on by adding the ``mon`` label
- to the appropriate hosts::
-
- # ceph orch host label add *<hostname>* mon
-
- To view the current hosts and labels,::
-
- # ceph orch host ls
-
- For example::
-
- # ceph orch host label add host1 mon
- # ceph orch host label add host2 mon
- # ceph orch host label add host3 mon
- # ceph orch host ls
- HOST ADDR LABELS STATUS
- host1 mon
- host2 mon
- host3 mon
- host4
- host5
-
- Then tell cephadm to deploy monitors based on the label::
-
- # ceph orch apply mon label:mon
-
-* You can explicitly specify the IP address or CIDR for each monitor
- and control where it is placed. This is the only supported method
- if you did not specify the CIDR monitor network above.
-
- To deploy additional monitors,::
-
- # ceph orch daemon add mon *<host1:ip-or-network1> [<host1:ip-or-network-2>...]*
-
- For example, to deploy a second monitor on ``newhost1`` using an IP
- address ``10.1.2.123`` and a third monitor on ``newhost2`` in
- network ``10.1.2.0/24``,::
-
- # ceph orch daemon add mon newhost1:10.1.2.123
- # ceph orch daemon add mon newhost2:10.1.2.0/24
-
-Deploying OSDs
-==============
-
-To add OSDs to the cluster, you have two options:
-
-#. You need to know the device name for the block device (hard disk or
-SSD) that will be used. Then,::
-
- # ceph orch osd create *<host>*:*<path-to-device>*
-
- For example, to deploy an OSD on host *newhost*'s SSD,::
-
- # ceph orch osd create newhost:/dev/disk/by-id/ata-WDC_WDS200T2B0A-00SM50_182294800028
-
-
-#. You need to describe your disk setup by it's properties (Drive Groups)
-
- Link to DriveGroup docs.::
-
- # ceph orch osd create -i my_drivegroups.yml
-
-
-.. _drivegroups: drivegroups::
-
-Deploying manager daemons
-=========================
-
-It is a good idea to have at least one backup manager daemon. To
-deploy one or more new manager daemons,::
-
- # ceph orch apply mgr *<new-num-mgrs>* [*<host1>* ...]
-
-Deploying MDSs
-==============
-
-One or more MDS daemons is required to use the CephFS file system.
-These are created automatically if the newer ``ceph fs volume``
-interface is used to create a new file system. For more information,
-see :ref:`fs-volumes-and-subvolumes`.
-
-To deploy metadata servers,::
-
- # ceph orch apply mds *<fs-name>* *<num-daemons>* [*<host1>* ...]
-
-Deploying RGWs
-==============
-
-Cephadm deploys radosgw as a collection of daemons that manage a
-particular *realm* and *zone*. (For more information about realms and
-zones, see :ref:`multisite`.) To deploy a set of radosgw daemons for
-a particular realm and zone,::
-
- # ceph orch apply rgw *<realm-name>* *<zone-name>* *<num-daemons>* [*<host1>* ...]
-
-Note that with cephadm, radosgw daemons are configured via the monitor
-configuration database instead of via a `ceph.conf` or the command line. If
-that confiruation isn't already in place (usually in the
-``client.rgw.<realmname>.<zonename>`` section), then the radosgw
-daemons will start up with default settings (e.g., binding to port
-80).
-
-
-Further Reading
-===============
+Cephadm is new in the Octopus v15.2.0 release and does not support older
+versions of Ceph.
.. toctree::
:maxdepth: 2
- Cephadm administration <administration>
+ install
+ adoption
+ Cephadm operations <operations>
Cephadm monitoring <monitoring>
Cephadm CLI <../mgr/orchestrator>
DriveGroups <drivegroups>
+ troubleshooting
OS recommendations <../start/os-recommendations>
-
--- /dev/null
+============================
+Deploying a new Ceph cluster
+============================
+
+Cephadm can create a new Ceph cluster by "bootstrapping" on a single
+host, expanding the cluster to encompass any additional
+hosts, and deploying the needed services.
+
+The following instructions install a basic multi-node cluster. Commands
+may be prefixed by the host that they need to be run on. For example,
+``host1`` identifies the first host, ``host2`` identifies the second
+host, and so on. This information is provided in order to make clear
+which commands should be issued on which systems. If there is no
+explicit prefix, then the command can be run anywhere the ``ceph``
+command is available.
+
+.. highlight:: console
+
+
+Requirements
+============
+
+- Systemd
+- Podman or Docker for running containers.
+- Time synchronization (such as chrony or NTP)
+- LVM2 for provisioning storage devices
+
+Any modern Linux distribution should be sufficient. Dependencies
+are installed automatically by the bootstrap process below.
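+
+If you would like to spot-check these prerequisites yourself before
+bootstrapping, the following commands are one way to do so (this sketch
+assumes podman and chrony; substitute docker or another time
+synchronization daemon as appropriate)::
+
+  host1$ podman --version
+  host1$ timedatectl
+  host1$ sudo lvm version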
+
+
+Get cephadm
+===========
+
+The ``cephadm`` command is used (1) to bootstrap a new cluster, (2) to
+access a containerized shell with a working Ceph CLI, and (3) to work
+with containerized Ceph daemons when debugging issues.
+
+You can use ``curl`` to fetch the most recent version of the standalone script::
+
+ host1$ curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
+ host1$ chmod +x cephadm
+
+You may also be able to get cephadm by installing a package
+provided by your Linux distribution::
+
+ host1$ sudo apt install -y cephadm # or
+ host1$ sudo dnf install -y cephadm # or
+ host1$ sudo yum install -y cephadm # or
+ host1$ sudo zypper install -y cephadm
+
+
+
+Bootstrap a new cluster
+=======================
+
+You need to know which *IP address* to use for the cluster's first
+monitor. This is normally just the IP for the first cluster node. If
+there are multiple networks and interfaces, be sure to choose one that
+will be accessible by any hosts accessing the Ceph cluster.
+
+To bootstrap the cluster, run the following command::
+
+ host1$ sudo ./cephadm bootstrap --mon-ip *<mon-ip>*
+
+This command does a few things:
+
+* A monitor and manager daemon for the new cluster are created on the
+ local host. A minimal configuration file needed to communicate with
+ the new cluster is written to ``ceph.conf`` in the current directory.
+* A copy of the ``client.admin`` administrative (privileged!) secret
+ key is written to ``ceph.client.admin.keyring`` in the current directory.
+* A new SSH key is generated for the Ceph cluster and is added to the
+ root user's ``/root/.ssh/authorized_keys`` file. A copy of the
+ public key is written to ``ceph.pub`` in the current directory.
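+
+After the bootstrap completes, you should see these three files in the
+current directory (listing shown for illustration; the ``cephadm``
+script itself may also be present if you downloaded it here)::
+
+  host1$ ls
+  ceph.client.admin.keyring  ceph.conf  ceph.pub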
+
+.. tip::
+
+   If you run the bootstrap command from ``/etc/ceph``, the cluster's new
+   configuration and keyring files are written to a standard location. For
+   example,::
+
+ host1$ sudo mkdir -p /etc/ceph
+ host1$ cd /etc/ceph
+ host1$ sudo /path/to/cephadm bootstrap --mon-ip *<mon-ip>*
+
+
+Interacting with the cluster
+============================
+
+To interact with your cluster via the command-line interface, start up
+a container that has all of the Ceph packages (including the ``ceph``
+command) installed::
+
+ host1$ sudo ./cephadm shell --config ceph.conf --keyring ceph.client.admin.keyring
+
+Inside the container, you can check the cluster status with::
+
+ [ceph: root@host1 /]# ceph status
+
+In order to interact with the Ceph cluster outside of a container
+(that is, from the host's command line), install the Ceph
+client packages and install the configuration and privileged
+administrator key in a global location::
+
+ host1$ sudo apt install -y ceph-common # or,
+ host1$ sudo dnf install -y ceph-common # or,
+ host1$ sudo yum install -y ceph-common
+
+  host1$ sudo install -m 0644 ceph.conf /etc/ceph/ceph.conf
+  host1$ sudo install -m 0600 ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
+
+
+Adding hosts to the cluster
+===========================
+
+For each new host you'd like to add to the cluster, you need to do two things:
+
+#. Install the cluster's public SSH key in the new host's root user's
+ ``authorized_keys`` file::
+
+     host1$ sudo ssh-copy-id -f -i ceph.pub root@*<new-host>*
+
+   For example::
+
+     host1$ sudo ssh-copy-id -f -i ceph.pub root@host2
+     host1$ sudo ssh-copy-id -f -i ceph.pub root@host3
+
+#. Tell Ceph that the new node is part of the cluster::
+
+ # ceph orch host add *newhost*
+
+ For example::
+
+ # ceph orch host add host2
+ # ceph orch host add host3
+
+Deploying additional monitors
+=============================
+
+Normally a Ceph cluster has three or five monitor daemons spread
+across different hosts. As a rule of thumb, you should deploy five
+monitors if there are five or more nodes in your cluster.
+
+.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
+
+If all of your monitors will exist on the same IP subnet, cephadm can
+automatically scale the number of monitors. This subnet should be
+specified in `CIDR`_ format (e.g., ``10.1.2.0/24``). (If you do not
+specify a subnet, you will need to manually specify an IP or subnet
+when creating each monitor.)::
+
+ # ceph config set mon public_network *<mon-cidr-network>*
+
+For example::
+
+ # ceph config set mon public_network 10.1.2.0/24
+
+There are several ways to add additional monitors:
+
+* You can simply tell cephadm how many monitors you want, and it will pick the
+ hosts (randomly)::
+
+ # ceph orch apply mon *<number-of-monitors>*
+
+ For example, if you have 5 or more hosts added to the cluster,::
+
+ # ceph orch apply mon 5
+
+* You can explicitly specify which hosts to deploy on. Be sure to include
+ the first monitor host in this list.::
+
+ # ceph orch apply mon *<host1,host2,host3,...>*
+
+ For example,::
+
+ # ceph orch apply mon host1,host2,host3
+
+* You can control which hosts the monitors run on by adding the ``mon`` label
+ to the appropriate hosts::
+
+ # ceph orch host label add *<hostname>* mon
+
+ To view the current hosts and labels,::
+
+ # ceph orch host ls
+
+ For example::
+
+ # ceph orch host label add host1 mon
+ # ceph orch host label add host2 mon
+ # ceph orch host label add host3 mon
+ # ceph orch host ls
+ HOST ADDR LABELS STATUS
+ host1 mon
+ host2 mon
+ host3 mon
+ host4
+ host5
+
+ Then tell cephadm to deploy monitors based on the label::
+
+ # ceph orch apply mon label:mon
+
+* You can explicitly specify the IP address or CIDR for each monitor
+ and control where it is placed. This is the only supported method
+ if you did not specify the CIDR monitor network above.
+
+ To deploy additional monitors,::
+
+ # ceph orch daemon add mon *<host1:ip-or-network1> [<host1:ip-or-network-2>...]*
+
+ For example, to deploy a second monitor on ``newhost1`` using an IP
+ address ``10.1.2.123`` and a third monitor on ``newhost2`` in
+ network ``10.1.2.0/24``,::
+
+ # ceph orch daemon add mon newhost1:10.1.2.123
+ # ceph orch daemon add mon newhost2:10.1.2.0/24
+
+Deploying OSDs
+==============
+
+To add OSDs to the cluster, you have two options:
+
+#. You need to know the device name for the block device (hard disk or
+   SSD) that will be used. Then,::
+
+ # ceph orch osd create *<host>*:*<path-to-device>*
+
+ For example, to deploy an OSD on host *newhost*'s SSD,::
+
+ # ceph orch osd create newhost:/dev/disk/by-id/ata-WDC_WDS200T2B0A-00SM50_182294800028
+
+
+#. You need to describe your disk setup by its properties (Drive Groups)
+
+ Link to DriveGroup docs.::
+
+ # ceph orch osd create -i my_drivegroups.yml
+
+
+.. _drivegroups: drivegroups
+
+Deploying manager daemons
+=========================
+
+It is a good idea to have at least one backup manager daemon. To
+deploy one or more new manager daemons,::
+
+ # ceph orch apply mgr *<new-num-mgrs>* [*<host1>* ...]
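+
+For example, to run two manager daemons on ``host1`` and ``host2``
+(hostnames here are illustrative)::
+
+  # ceph orch apply mgr 2 host1 host2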
+
+Deploying MDSs
+==============
+
+One or more MDS daemons are required to use the CephFS file system.
+These are created automatically if the newer ``ceph fs volume``
+interface is used to create a new file system. For more information,
+see :ref:`fs-volumes-and-subvolumes`.
+
+To deploy metadata servers,::
+
+ # ceph orch apply mds *<fs-name>* *<num-daemons>* [*<host1>* ...]
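+
+For example, to run two MDS daemons for a file system named ``cephfs``
+(the file system name here is illustrative)::
+
+  # ceph orch apply mds cephfs 2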
+
+Deploying RGWs
+==============
+
+Cephadm deploys radosgw as a collection of daemons that manage a
+particular *realm* and *zone*. (For more information about realms and
+zones, see :ref:`multisite`.) To deploy a set of radosgw daemons for
+a particular realm and zone,::
+
+ # ceph orch apply rgw *<realm-name>* *<zone-name>* *<num-daemons>* [*<host1>* ...]
+
+Note that with cephadm, radosgw daemons are configured via the monitor
+configuration database instead of via ``ceph.conf`` or the command line. If
+that configuration isn't already in place (usually in the
+``client.rgw.<realmname>.<zonename>`` section), then the radosgw
+daemons will start up with default settings (e.g., binding to port
+80).
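+
+For example, to run two radosgw daemons for a realm named ``myorg`` and
+a zone named ``us-east-1`` (both names here are illustrative)::
+
+  # ceph orch apply rgw myorg us-east-1 2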
--- /dev/null
+==================
+Cephadm Operations
+==================
+
+Watching cephadm log messages
+=============================
+
+Cephadm logs to the ``cephadm`` cluster log channel, which means you can monitor progress in realtime with::
+
+ # ceph -W cephadm
+
+By default it will show info-level events and above. To see
+debug-level messages too::
+
+ # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
+ # ceph -W cephadm --watch-debug
+
+Be careful: the debug messages are very verbose!
+
+You can see recent events with::
+
+ # ceph log last cephadm
+
+These events are also logged to the ``ceph.cephadm.log`` file on
+monitor hosts and/or to the monitor-daemon stderr.
+
+
+Ceph daemon logs
+================
+
+Logging to stdout
+-----------------
+
+Traditionally, Ceph daemons have logged to ``/var/log/ceph``. With
+cephadm, by default, daemons instead log to stderr and the logs are
+captured by the container runtime environment. For most systems, by
+default, these logs are sent to journald and accessible via
+``journalctl``.
+
+For example, to view the logs for the daemon ``mon.foo`` for a cluster
+with id ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
+something like::
+
+ journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo
+
+This works well for normal operations when logging levels are low.
+
+To disable logging to stderr::
+
+ ceph config set global log_to_stderr false
+ ceph config set global mon_cluster_log_to_stderr false
+
+Logging to files
+----------------
+
+You can also configure Ceph daemons to log to files instead of stderr,
+just like they have in the past. When logging to files, Ceph logs appear
+in ``/var/log/ceph/<cluster-fsid>``.
+
+To enable logging to files::
+
+ ceph config set global log_to_file true
+ ceph config set global mon_cluster_log_to_file true
+
+You probably want to disable logging to stderr (see above) or else everything
+will be logged twice!::
+
+ ceph config set global log_to_stderr false
+ ceph config set global mon_cluster_log_to_stderr false
+
+By default, cephadm sets up log rotation on each host to rotate these
+files. You can configure the logging retention schedule by modifying
+``/etc/logrotate.d/ceph.<cluster-fsid>``.
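+
+The exact policy that cephadm installs may differ between releases. As
+an illustrative sketch only, a logrotate policy for these files that
+keeps seven compressed daily rotations might look something like::
+
+  /var/log/ceph/<cluster-fsid>/*.log {
+      daily
+      rotate 7
+      compress
+      missingok
+      notifempty
+  }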
+
+
+Data location
+=============
+
+Cephadm stores daemon data and logs in slightly different locations than
+older versions of Ceph:
+
+* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
+ that by default cephadm logs via stderr and the container runtime,
+ so these logs are normally not present.
+* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
+ (besides logs).
+* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
+ an individual daemon.
+* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
+ the cluster.
+* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
+ data directories for stateful daemons (e.g., monitor, prometheus)
+ that have been removed by cephadm.
+
+Disk usage
+----------
+
+Because a few Ceph daemons may store a significant amount of data in
+``/var/lib/ceph`` (notably, the monitors and prometheus), you may want
+to move this directory to its own disk, partition, or logical volume so
+that you do not fill up the root file system.
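+
+As an illustrative sketch only (assuming a volume group named ``vg0``
+with free space; adjust names and sizes for your environment, and do
+this before any daemons are placed on the host)::
+
+  lvcreate -n ceph-data -L 100G vg0
+  mkfs.xfs /dev/vg0/ceph-data
+  mkdir -p /var/lib/ceph
+  mount /dev/vg0/ceph-data /var/lib/ceph
+
+Add a matching entry to ``/etc/fstab`` so that the mount persists
+across reboots.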
+
+
+
+SSH Configuration
+=================
+
+Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
+with those hosts in a secure way.
+
+
+Default behavior
+----------------
+
+Cephadm normally stores an SSH key in the monitor that is used to
+connect to remote hosts. When the cluster is bootstrapped, this SSH
+key is generated automatically. Normally, no additional configuration
+is necessary.
+
+A *new* SSH key can be generated with::
+
+ ceph cephadm generate-key
+
+The public portion of the SSH key can be retrieved with::
+
+ ceph cephadm get-pub-key
+
+The currently stored SSH key can be deleted with::
+
+ ceph cephadm clear-key
+
+You can make use of an existing key by directly importing it with::
+
+ ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
+ ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>
+
+You will then need to restart the mgr daemon to reload the configuration with::
+
+ ceph mgr fail
+
+
+Customizing the SSH configuration
+---------------------------------
+
+Normally cephadm generates an appropriate ``ssh_config`` file that is
+used for connecting to remote hosts. This configuration looks
+something like this::
+
+ Host *
+ User root
+ StrictHostKeyChecking no
+ UserKnownHostsFile /dev/null
+
+There are two ways to customize this configuration for your environment:
+
+#. You can import a customized configuration file that will be stored
+ by the monitor with::
+
+ ceph cephadm set-ssh-config -i <ssh_config_file>
+
+ To remove a customized ssh config and revert back to the default behavior::
+
+ ceph cephadm clear-ssh-config
+
+#. You can configure a file location for the ssh configuration file with::
+
+ ceph config set mgr mgr/cephadm/ssh_config_file <path>
+
+ This approach is *not recommended*, however, as the path name must be
+ visible to *any* mgr daemon, and cephadm runs all daemons as
+   containers. That means that the file either needs to be placed
+ inside a customized container image for your deployment, or
+ manually distributed to the mgr data directory
+ (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
+ ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
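+
+With either approach, the customized file itself uses ordinary
+``ssh_config`` syntax. For example, a configuration that enables strict
+host key checking against a pre-distributed known-hosts file might look
+like this (the known-hosts path is illustrative, and, as with the config
+file path above, it must be visible from inside the mgr container)::
+
+  Host *
+    User root
+    StrictHostKeyChecking yes
+    UserKnownHostsFile /etc/ceph/known_hosts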
+
+
+Health checks
+=============
+
+CEPHADM_PAUSED
+--------------
+
+Cephadm background work has been paused with ``ceph orch pause``. Cephadm
+will continue to perform passive monitoring activities (like checking
+host and daemon status), but it will not make any changes (like deploying
+or removing daemons).
+
+You can resume cephadm work with::
+
+ ceph orch resume
+
+CEPHADM_STRAY_HOST
+------------------
+
+One or more hosts have running Ceph daemons but are not registered as
+hosts managed by *cephadm*. This means that those services cannot
+currently be managed by cephadm (e.g., restarted, upgraded, included
+in `ceph orch ps`).
+
+You can manage the host(s) with::
+
+ ceph orch host add *<hostname>*
+
+Note that you may need to configure SSH access to the remote host
+before this will work.
+
+Alternatively, you can manually connect to the host and ensure that
+services on that host are removed and/or migrated to a host that is
+managed by *cephadm*.
+
+You can also disable this warning entirely with::
+
+ ceph config set mgr mgr/cephadm/warn_on_stray_hosts false
+
+CEPHADM_STRAY_DAEMON
+--------------------
+
+One or more Ceph daemons are running but are not managed by
+*cephadm*, perhaps because they were deployed using a different tool, or
+were started manually. This means that those services cannot
+currently be managed by cephadm (e.g., restarted, upgraded, included
+in `ceph orch ps`).
+
+**FIXME:** We need to implement and document an adopt procedure here.
+
+You can also disable this warning entirely with::
+
+ ceph config set mgr mgr/cephadm/warn_on_stray_daemons false
+
+CEPHADM_HOST_CHECK_FAILED
+-------------------------
+
+One or more hosts have failed the basic cephadm host check, which verifies
+that (1) the host is reachable and cephadm can be executed there, and (2)
+that the host satisfies basic prerequisites, like a working container
+runtime (podman or docker) and working time synchronization.
+If this test fails, cephadm will not be able to manage services on that host.
+
+You can manually run this check with::
+
+ ceph cephadm check-host *<hostname>*
+
+You can remove a broken host from management with::
+
+ ceph orch host rm *<hostname>*
+
+You can disable this health warning with::
+
+ ceph config set mgr mgr/cephadm/warn_on_failed_host_check false
--- /dev/null
+
+Troubleshooting
+===============
+
+Sometimes there is a need to investigate why a cephadm command failed or why
+a specific service no longer runs properly.
+
+As cephadm deploys daemons as containers, troubleshooting daemons is slightly
+different. Here are a few tools and commands to help investigate issues.
+
+Gathering log files
+-------------------
+
+Use journalctl to gather the log files of all daemons:
+
+.. note:: By default cephadm now stores logs in journald. This means
+ that you will no longer find daemon logs in ``/var/log/ceph/``.
+
+To read the log file of one specific daemon, run::
+
+ cephadm logs --name <name-of-daemon>
+
+Note: this only works when run on the same host where the daemon is running. To
+get logs of a daemon running on a different host, give the ``--fsid`` option::
+
+ cephadm logs --fsid <fsid> --name <name-of-daemon>
+
+Where the ``<fsid>`` corresponds to the cluster id printed by ``ceph status``.
+
+To fetch all log files of all daemons on a given host, run::
+
+ for name in $(cephadm ls | jq -r '.[].name') ; do
+ cephadm logs --fsid <fsid> --name "$name" > $name;
+ done
+
+Collecting systemd status
+-------------------------
+
+To print the state of a systemd unit, run::
+
+ systemctl status "ceph-$(cephadm shell ceph fsid)@<service name>.service";
+
+
+To fetch all state of all daemons of a given host, run::
+
+ fsid="$(cephadm shell ceph fsid)"
+ for name in $(cephadm ls | jq -r '.[].name') ; do
+ systemctl status "ceph-$fsid@$name.service" > $name;
+ done
+
+
+List all downloaded container images
+------------------------------------
+
+To list all container images that are downloaded on a host:
+
+.. note:: ``Image`` might also be called `ImageID`
+
+::
+
+ podman ps -a --format json | jq '.[].Image'
+ "docker.io/library/centos:8"
+ "registry.opensuse.org/opensuse/leap:15.2"
+
+
+Manually running containers
+---------------------------
+
+Cephadm writes small wrappers that run a container. Refer to
+``/var/lib/ceph/<cluster-fsid>/<service-name>/unit.run`` for the
+container execution command.
:hidden:
start/intro
- cephadm/index
install/index
+ cephadm/index
rados/index
cephfs/index
rbd/index
Installing Ceph
===============
-There are various options for installing Ceph. Review the documention for each method before choosing the one that best serves your needs.
+There are several different ways to install Ceph. Please choose the
+method that best suits your needs.
+
-We recommend the following installation methods:
+Recommended methods
+~~~~~~~~~~~~~~~~~~~
- * cephadm
- * Rook
+:ref:`Cephadm <cephadm>` installs and manages a Ceph cluster using containers and
+systemd, with tight integration with the CLI and dashboard GUI.
+
+* cephadm supports only Octopus and newer releases.
+* cephadm is fully integrated with the new orchestration API and
+  fully supports the new CLI and dashboard features to manage
+  cluster deployment.
+* cephadm requires container support (podman or docker) and
+  Python 3.
+
-We offer these other methods of installation in addition to the ones we recommend:
+`Rook <https://rook.io/>`_ deploys and manages Ceph clusters running
+in Kubernetes, while also enabling management of storage resources and
+provisioning via Kubernetes APIs. Rook is the recommended way to run Ceph in
+Kubernetes or to connect an existing Ceph storage cluster to Kubernetes.
- * ceph-ansible
- * ceph-deploy (no longer actively maintained)
- * Deepsea (Salt)
- * Juju
- * Manual installation (using packages)
- * Puppet
+
+* Rook supports only Nautilus and newer releases of Ceph.
+* Rook is the preferred method for running Ceph on Kubernetes, or for
+  connecting a Kubernetes cluster to an existing (external) Ceph
+  cluster.
+* Rook fully supports the new orchestrator API. New management features
+  in the CLI and dashboard are fully supported.
+
+Other methods
+~~~~~~~~~~~~~
-Recommended Methods of Ceph Installation
-========================================
+`ceph-ansible <https://docs.ceph.com/ceph-ansible/>`_ deploys and manages
+Ceph clusters using Ansible.
-cephadm
--------
+
+* ceph-ansible is widely deployed.
+* ceph-ansible is not integrated with the new orchestrator APIs,
+  introduced in Nautilus and Octopus, which means that newer
+  management features and dashboard integration are not available.
+
-Installs Ceph using containers and systemd.
-* :ref:`cephadm-bootstrap`
-
- * cephadm is supported only on Octopus and newer releases.
- * cephadm is fully integrated with the new orcehstration API and fully supports the new CLI and dashboard features to manage cluster deployment.
- * cephadm requires container support (podman or docker) and Python 3.
-
-Rook
-----
-
-Installs Ceph in Kubernetes.
-
-* `rook.io <https://rook.io/>`_
-
- * Rook supports only Nautilus and newer releases of Ceph.
- * Rook is the preferred deployment method for Ceph with Kubernetes.
- * Rook fully suports the new orchestrator API. New management features in the CLI and dashboard are fully supported.
-
-Other Methods of Ceph Installation
-==================================
-
-ceph-ansible
-------------
-
-Installs Ceph using Ansible.
-
-* `docs.ceph.com/ceph-ansible <https://docs.ceph.com/ceph-ansible/>`_
-
-ceph-deploy
------------
-
-Install ceph using ceph-deploy
-
-* :ref:`ceph-deploy-index`
+:ref:`ceph-deploy <ceph-deploy-index>` is a tool for quickly deploying simple clusters.
.. IMPORTANT::
-
ceph-deploy is no longer actively maintained. It is not tested on versions of Ceph newer than Nautilus. It does not support RHEL8, CentOS 8, or newer operating systems.
-.. toctree::
- :hidden:
-
- ceph-deploy/index
-
-
-DeepSea
--------
-
-Install Ceph using Salt
-
-* `github.com/SUSE/DeepSea <https://github.com/SUSE/DeepSea>`_
-
-Juju
-----
+`DeepSea <https://github.com/SUSE/DeepSea>`_ installs Ceph using Salt.
-Installs Ceph using Juju.
+`jaas.ai/ceph-mon <https://jaas.ai/ceph-mon>`_ installs Ceph using Juju.
-* `jaas.ai/ceph-mon <https://jaas.ai/ceph-mon>`_
+`github.com/openstack/puppet-ceph <https://github.com/openstack/puppet-ceph>`_ installs Ceph via Puppet.
+Ceph can also be :ref:`installed manually <install-manual>`.
-Manual
-------
-
-Manually install Ceph using packages.
-
-* :ref:`install-manual`
.. toctree::
:hidden:
-
- index_manual
-Puppet
-------
-
-Installs Ceph using Puppet
+ index_manual
+ ceph-deploy/index
-* `github.com/openstack/puppet-ceph <https://github.com/openstack/puppet-ceph>`_
* A new deployment tool called **cephadm** has been introduced that
integrates Ceph daemon deployment and management via containers
into the orchestration layer. For more information see
- :ref:`cephadm-bootstrap`.
+ :ref:`cephadm`.
* Health alerts can now be muted, either temporarily or permanently.
* A simple 'alerts' capability has been introduced to send email
health alerts for clusters deployed without the benefit of an