From: David Zafman
Date: Fri, 7 Feb 2020 22:27:18 +0000 (-0800)
Subject: Merge branch 'nautilus' into wip-42120-nautilus
X-Git-Tag: v14.2.8~85^2
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=refs%2Fpull%2F30689%2Fhead;p=ceph.git

Merge branch 'nautilus' into wip-42120-nautilus
---

fe2035f3fed60f82349fbb2e436a7223f00980e6
diff --cc PendingReleaseNotes
index e32a134efca0,f70af61496b2..527b3444b2f6
--- a/PendingReleaseNotes
+++ b/PendingReleaseNotes
@@@ -1,139 -1,6 +1,6 @@@
 -14.2.7
 +14.2.8
  ------

- * Ceph will now issue a health warning if a RADOS pool as a ``pg_num``
-   value that is not a power of two.  This can be fixed by adjusting
-   the pool to a nearby power of two::
-
-     ceph osd pool set <pool-name> pg_num <new-pg-num>
-
-   Alternatively, the warning can be silenced with::
-
-     ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
-
- 14.2.4
- ------
-
- * In the Zabbix Mgr Module there was a typo in the key being send
-   to Zabbix for PGs in backfill_wait state. The key that was sent
-   was 'wait_backfill' and the correct name is 'backfill_wait'.
-   Update your Zabbix template accordingly so that it accepts the
-   new key being send to Zabbix.
-
- 14.2.3
- ------
-
- * Nautilus-based librbd clients can now open images on Jewel clusters.
-
- * The RGW "num_rados_handles" has been removed.
-   If you were using a value of "num_rados_handles" greater than 1
-   multiply your current "objecter_inflight_ops" and
-   "objecter_inflight_op_bytes" paramaeters by the old
-   "num_rados_handles" to get the same throttle behavior.
-
- * The ``bluestore_no_per_pool_stats_tolerance`` config option has been
-   replaced with ``bluestore_fsck_error_on_no_per_pool_stats``
-   (default: false). The overall default behavior has not changed:
-   fsck will warn but not fail on legacy stores, and repair will
-   convert to per-pool stats.
-
-
- 14.2.2
- ------
-
- * The no{up,down,in,out} related commands has been revamped.
-   There are now 2 ways to set the no{up,down,in,out} flags:
-   the old 'ceph osd [un]set <flag>' command, which sets cluster-wide flags;
-   and the new 'ceph osd [un]set-group <flags> <who>' command,
-   which sets flags in batch at the granularity of any crush node,
-   or device class.
-
- * RGW: radosgw-admin introduces two subcommands that allow the
-   managing of expire-stale objects that might be left behind after a
-   bucket reshard in earlier versions of RGW. One subcommand lists such
-   objects and the other deletes them. Read the troubleshooting section
-   of the dynamic resharding docs for details.
-
- 14.2.5
- ------
-
- * The telemetry module now has a 'device' channel, enabled by default, that
-   will report anonymized hard disk and SSD health metrics to telemetry.ceph.com
-   in order to build and improve device failure prediction algorithms.  Because
-   the content of telemetry reports has changed, you will need to either re-opt-in
-   with::
-
-     ceph telemetry on
-
-   You can view exactly what information will be reported first with::
-
-     ceph telemetry show
-     ceph telemetry show device   # specifically show the device channel
-
-   If you are not comfortable sharing device metrics, you can disable that
-   channel first before re-opting-in:
-
-     ceph config set mgr mgr/telemetry/channel_crash false
-     ceph telemetry on
-
- * The telemetry module now reports more information about CephFS file systems,
-   including:
-
-     - how many MDS daemons (in total and per file system)
-     - which features are (or have been) enabled
-     - how many data pools
-     - approximate file system age (year + month of creation)
-     - how many files, bytes, and snapshots
-     - how much metadata is being cached
-
-   We have also added:
-
-     - which Ceph release the monitors are running
-     - whether msgr v1 or v2 addresses are used for the monitors
-     - whether IPv4 or IPv6 addresses are used for the monitors
-     - whether RADOS cache tiering is enabled (and which mode)
-     - whether pools are replicated or erasure coded, and
-       which erasure code profile plugin and parameters are in use
-     - how many hosts are in the cluster, and how many hosts have each type of daemon
-     - whether a separate OSD cluster network is being used
-     - how many RBD pools and images are in the cluster, and how many pools have RBD mirroring enabled
-     - how many RGW daemons, zones, and zonegroups are present; which RGW frontends are in use
-     - aggregate stats about the CRUSH map, like which algorithms are used, how
-       big buckets are, how many rules are defined, and what tunables are in use
-
-   If you had telemetry enabled, you will need to re-opt-in with::
-
-     ceph telemetry on
-
-   You can view exactly what information will be reported first with::
-
-     ceph telemetry show         # see everything
-     ceph telemetry show basic   # basic cluster info (including all of the new info)
-
- * A health warning is now generated if the average osd heartbeat ping
-   time exceeds a configurable threshold for any of the intervals
-   computed.  The OSD computes 1 minute, 5 minute and 15 minute
-   intervals with average, minimum and maximum values.  New configuration
-   option ``mon_warn_on_slow_ping_ratio`` specifies a percentage of
-   ``osd_heartbeat_grace`` to determine the threshold.  A value of zero
-   disables the warning.  New configuration option
-   ``mon_warn_on_slow_ping_time`` specified in milliseconds over-rides the
-   computed value, causes a warning
-   when OSD heartbeat pings take longer than the specified amount.
-   New admin command ``ceph daemon mgr.# dump_osd_network [threshold]`` command will
-   list all connections with a ping time longer than the specified threshold or
-   value determined by the config options, for the average for any of the 3 intervals.
-   New admin command ``ceph daemon osd.# dump_osd_network [threshold]`` will
-   do the same but only including heartbeats initiated by the specified OSD.
-
- * New OSD daemon command dump_recovery_reservations which reveals the
-   recovery locks held (in_progress) and waiting in priority queues.
-
- * New OSD daemon command dump_scrub_reservations which reveals the
-   scrub reservations that are held for local (primary) and remote (replica) PGs.
-
- 14.2.6
- ------
-
  * The following OSD memory config options related to bluestore cache autotuning can
    now be configured during runtime:

@@@ -162,3 -26,10 +26,20 @@@ would not allow a pool to ever have completely balanced PGs.
    For example, if crush requires 1 replica on each of 3 racks, but there
    are fewer OSDs in 1 of the racks.  In those cases, the configuration
    value can be increased.
+
+ * RGW: a mismatch between the bucket notification documentation and the actual
+   message format was fixed. This means that any endpoints receiving bucket
+   notifications will now receive the same notifications inside a JSON array
+   named 'Records'. Note that this does not affect pulling bucket notifications
+   from a subscription in a 'pubsub' zone, as these are already wrapped inside
+   that array.
++
++* Ceph will now issue a health warning if a RADOS pool has a ``pg_num``
++  value that is not a power of two.  This can be fixed by adjusting
++  the pool to a nearby power of two::
++
++    ceph osd pool set <pool-name> pg_num <new-pg-num>
++
++  Alternatively, the warning can be silenced with::
++
++    ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
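
A note on the 'Records' change above: the release note only says that endpoints receiving bucket notifications now get them wrapped in a JSON array named 'Records'. A minimal Python sketch of unpacking that wrapper on the receiving side follows; only the top-level 'Records' key comes from the note, while the payload itself and the per-record fields (eventName, s3.bucket.name, s3.object.key) are illustrative assumptions and may not match what RGW actually sends::

    import json

    # Made-up example payload: only the top-level 'Records' wrapper is taken from
    # the release note; the inner fields are illustrative assumptions.
    payload = json.loads("""
    {
      "Records": [
        {
          "eventName": "ObjectCreated:Put",
          "s3": {"bucket": {"name": "mybucket"}, "object": {"key": "hello.txt"}}
        }
      ]
    }
    """)

    # Endpoints should now expect the array wrapper rather than a bare notification.
    for record in payload.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(record["eventName"], bucket, key)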
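
For the new ``pg_num`` warning, "a nearby power of two" just means the power of two closest to the pool's current ``pg_num``. A rough Python sketch of that rounding, assuming a made-up pool named ``mypool`` with a ``pg_num`` of 300 (neither value comes from this change)::

    def nearest_power_of_two(n: int) -> int:
        """Return the power of two closest to n (ties round up)."""
        if n <= 1:
            return 1
        lower = 1 << (n.bit_length() - 1)   # largest power of two <= n
        upper = lower << 1                  # smallest power of two > n
        return lower if (n - lower) < (upper - n) else upper

    current_pg_num = 300                    # example value that would trigger the warning
    target = nearest_power_of_two(current_pg_num)
    # Mirrors the command from the release note, filled in with the example values:
    print(f"ceph osd pool set mypool pg_num {target}")

Whether a real cluster should round up or down also depends on its size and workload, so treat this purely as an illustration of the warning's arithmetic.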