doc/security: document global_id reclaim CVE

author Sage Weil <sage@newdream.net>

Mon, 29 Mar 2021 17:39:13 +0000 (12:39 -0500)

committer Ilya Dryomov <idryomov@gmail.com>

Thu, 15 Apr 2021 12:03:30 +0000 (14:03 +0200)
diff --git a/doc/security/CVE-2021-20288.rst b/doc/security/CVE-2021-20288.rst

new file mode 100644 (file)

index 0000000..fa3b073
--- /dev/null
+++ b/doc/security/CVE-2021-20288.rst
@@ -0,0 +1,183 @@
+.. _CVE-2021-20288:
+
+CVE-2021-20288: Unauthorized global_id reuse in cephx
+=====================================================
+
+* `NIST information page <https://nvd.nist.gov/vuln/detail/CVE-2021-20288>`_
+
+Summary
+-------
+
+Ceph did not ensure that reconnecting or renewing clients presented
+an existing ticket when reclaiming their global_id value.
+An attacker who was able to authenticate could claim a global_id in
+use by a different client and potentially disrupt
+other cluster services.
+
+Background
+----------
+
+Each authenticated client or daemon in Ceph is assigned a numeric
+global_id identifier. That value is assumed to be unique across the
+cluster.  When clients reconnect to the monitor (e.g., due to a
+network disconnection) or renew their ticket, they are supposed to
+present their old ticket to prove prior possession of their global_id
+so that it can be reclaimed and thus remain constant over the lifetime
+of that client instance.
+
+Ceph was not correctly checking that the old ticket was valid, allowing
+an arbitrary global_id to be reclaimed, even if it was in use by another
+active client in the system.
+
+Attacker Requirements
+---------------------
+
+Any potential attacker must:
+
+* have a valid authentication key for the cluster
+* know or guess the global_id of another client
+* run a modified version of the Ceph client code to reclaim another client's global_id
+* construct appropriate client messages or requests to disrupt service or exploit
+  Ceph daemon assumptions about global_id uniqueness
+
+Impact
+------
+
+Confidentiality Impact
+______________________
+
+None
+
+Integrity Impact
+________________
+
+Partial.  An attacker could potentially exploit assumptions around
+global_id uniqueness to disrupt other clients' access or disrupt
+Ceph daemons.
+
+Availability Impact
+___________________
+
+High.  An attacker could potentially exploit assumptions around
+global_id uniqueness to disrupt other clients' access or disrupt
+Ceph daemons.
+
+Access Complexity
+_________________
+
+High.  The attacker must use modified client code in order to
+exploit specific assumptions in the behavior of other Ceph daemons.
+
+Authentication
+______________
+
+Yes.  The attacker must be authenticated and have access to the
+same services as the client it wishes to impersonate or disrupt.
+
+Gained Access
+_____________
+
+Partial.  An attacker can partially impersonate another client.
+
+Affected versions
+-----------------
+
+All prior versions of Ceph monitors fail to ensure that global_id reclaim
+attempts are authentic.
+
+In addition, all user-space daemons and clients starting from Luminous v12.2.0
+were failing to securely reclaim their global_id following commit a2eb6ae3fb57
+("mon/monclient: hunt for multiple monitor in parallel").
+
+All versions of the Linux kernel client properly authenticate.
+
+Fixed versions
+--------------
+
+* Pacific v16.2.1 (and later)
+* Octopus v15.2.11 (and later)
+* Nautilus v14.2.20 (and later)
+
+
+Fix details
+-----------
+
+#. Patched monitors now properly require that clients securely reclaim
+   their global_id when the ``auth_allow_insecure_global_id_reclaim``
+   option is ``false``.  Initially, by default, this option is set to
+   ``true`` so that existing clients can continue to function without
+   disruption until all clients have been upgraded.  When this option
+   is set to ``false``, an unpatched client will not be able to reconnect
+   to the cluster after an intermittent network disruption breaks
+   its connection to a monitor, or to renew its authentication
+   ticket when it times out (by default, after 72 hours).
+
+   Patched monitors raise the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED``
+   health alert if ``auth_allow_insecure_global_id_reclaim`` is enabled.
+   This health alert can be muted with::
+
+     ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w
+
+   Although it is not recommended, the alert can also be disabled with::
+
+     ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
+
+#. Patched monitors can disconnect new clients right after they have
+   authenticated (forcing them to reconnect and reclaim) in order to
+   determine whether they securely reclaim global_ids.  This allows
+   the cluster and users to discover quickly whether clients would be
+   affected by requiring secure global_id reclaim: most clients will
+   report an authentication error immediately.  This behavior can be
+   disabled by setting ``auth_expose_insecure_global_id_reclaim`` to
+   ``false``::
+
+     ceph config set mon auth_expose_insecure_global_id_reclaim false
+
+#. Patched monitors will raise the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM`` health
+   alert for any clients or daemons that are not securely reclaiming their
+   global_id.  These clients should be upgraded before disabling the
+   ``auth_allow_insecure_global_id_reclaim`` option to avoid disrupting
+   client access.
+
+   By default (if ``auth_expose_insecure_global_id_reclaim`` has not
+   been disabled), clients' failure to securely reclaim global_id will
+   immediately be exposed and raise this health alert.
+   However, if ``auth_expose_insecure_global_id_reclaim`` has been
+   disabled, this alert will not be triggered for a client until it is
+   forced to reconnect to a monitor (e.g., due to a network disruption)
+   or the client renews its authentication ticket (by default, after
+   72 hours).
+
+#. The default time-to-live (TTL) for authentication tickets has been increased
+   from 12 hours to 72 hours.  Because we previously were not ensuring that
+   a client's prior ticket was valid when reclaiming its global_id, a client
+   could tolerate a network outage that lasted longer than the ticket TTL and still
+   reclaim its global_id.  Once the cluster starts requiring secure global_id reclaim,
+   a client that is disconnected for longer than the TTL may fail to reclaim its global_id,
+   fail to reauthenticate, and be unable to continue communicating with the cluster
+   until it is restarted.  The default TTL was increased to minimize the impact of this
+   change on users.
+
+
+Recommendations
+---------------
+
+#. Users should upgrade to a patched version of Ceph at their earliest
+   convenience.
+
+#. Users should upgrade any unpatched clients at their earliest
+   convenience.  By default, these clients can be easily identified by
+   checking the ``ceph health detail`` output for the
+   ``AUTH_INSECURE_GLOBAL_ID_RECLAIM`` alert.
+
+#. If not all clients can be upgraded immediately, the health alerts can be
+   temporarily muted with::
+
+     ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w  # 1 week
+     ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w  # 1 week
+
+#. After all clients have been updated and the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM``
+   alert is no longer present, the cluster should be set to prevent insecure
+   global_id reclaim with::
+
+     ceph config set mon auth_allow_insecure_global_id_reclaim false
diff --git a/doc/security/cves.rst b/doc/security/cves.rst

index bc4a05cf78160f2909fc798998fd7032029b08b6..3c4e864b817ac7010ca2d28fa210cc2feaf8dcf9 100644 (file)
--- a/doc/security/cves.rst
+++ b/doc/security/cves.rst
@@ -5,6 +5,8 @@ Past vulnerabilities
  +------------+-------------------+-------------+--------------------------------------------+
  | Published  | CVE               | Severity    | Summary                                    |
  +------------+-------------------+-------------+--------------------------------------------+
+| 2021-04-14 | `CVE-2021-20288`_ | High        | Unauthorized global_id reuse in cephx      |
++------------+-------------------+-------------+--------------------------------------------+
  | 2020-12-18 | `CVE-2020-27781`_ | 7.1 High    | CephFS creds read/modified by Manila users |
  +------------+-------------------+-------------+--------------------------------------------+
  | 2021-01-08 | `CVE-2020-25678`_ | 4.9 Medium  | mgr module passwords in clear text         |
@@ -60,7 +62,13 @@ Past vulnerabilities
  | 2016-12-03 | `CVE-2015-5245`_  |             | RGW header injection                       |
  +------------+-------------------+-------------+--------------------------------------------+
  
+.. toctree::
+   :hidden:
+   :maxdepth: 0
+
+    CVE-2021-20288 <CVE-2021-20288.rst>
  
+.. _CVE-2021-20288: ../CVE-2021-20288
  .. _CVE-2020-27781: https://nvd.nist.gov/vuln/detail/CVE-2020-27781
  .. _CVE-2020-25678: https://nvd.nist.gov/vuln/detail/CVE-2020-25678
  .. _CVE-2020-25677: https://nvd.nist.gov/vuln/detail/CVE-2020-25677