mds: take xlock in the order requests start locking

This avoids an assertion in MutationImpl::finish_locking().

Fix: https://tracker.ceph.com/issues/45261
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
(cherry picked from commit 2e11a35d5b06312e0b2d0aecd83e8eb882ddf719)

Merge pull request #34712 from ceph/wip-yuriw-clients-upgrades-luminous

qa/tests: removed 2-workload/devstack-tempest-gate.yaml tests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>

qa/tests: removed 2-workload/devstack-tempest-gate.yaml tests

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #34459 from badone/wip-44984-luminous

luminous: selinux: Allow ceph-mgr access to httpd dir

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

selinux: Allow ceph-mgr access to httpd dir

ceph-mgr loads modules which require read access and this causes a
denial on el7.

Fixes: https://tracker.ceph.com/issues/44216
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
(cherry picked from commit 35a7fc8249337c3c59f0c561632abf578f5d20fc)

Merge pull request #34159 from ceph/wip-yuriw-clients-upgrades-luminous-octopus

qa/tests: client-upgrade-luminous-octopus tests

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

qa/tests: use py3 version of rbd scripts

client.1 is upgraded to octopus, so grab the same version of the rbd
workunit and test tree that will run py3 there.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>

qa/tests: skip python-ceph during upgrade

Octopus is python3-only so there are no python 2 packages to install.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>

qa/tests: client-upgrade-luminous-octopus tests

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #34149 from yuriw/wip-yuriw-clients-upgrades-luminous-octopus

qa/tests: initial check in for client-upgrade-luminous-octopus

qa/tests: initial check in for client-upgrade-luminous-octopus

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #33019 from shyukri/wip-40315-luminous

luminous: tests: pybind/test_volume_client: print python version correctly

Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #33195 from tchaikov/wip-luminous-17730

luminous: tool: introduce repair command to ceph-kvstore-tool

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #33619 from jan--f/wip-44333-luminous

luminous: ceph-volume: strip _dmcrypt suffix in simple scan json output

Merge pull request #33307 from smithfarm/wip-43481-luminous

luminous: rgw: change the "rgw admin status" 'num_shards' output to signed int

Reviewed-by: J. Eric Ivancich <ivancich@redhat.com>

Merge pull request #32718 from trociny/wip-43626-luminous

luminous: rbd-mirror: fix 'rbd mirror status' asok command output

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Merge pull request #32955 from smithfarm/wip-43831-luminous

luminous: librbd: don't call refresh from mirror::GetInfoRequest state machine

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

ceph-volume: strip _dmcrypt suffix in simple scan json output

LUKS encrypted OSDs name their block* files with a _dmcrypt suffix.
activate fails on json files like this. Stripping this suffix in scan
fixes this.

Fixes: https://tracker.ceph.com/issues/43966
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit 2ddf76d118d77659c590ea076d34ce9a8e351a86)
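
A minimal sketch of the suffix stripping described above. The helper names and the sample keys are illustrative assumptions, not the actual ceph-volume code.

```python
# Illustrative only: these helpers are assumptions, not ceph-volume's API.
SUFFIX = "_dmcrypt"


def strip_dmcrypt_suffix(name):
    """Return name without a trailing _dmcrypt suffix, if it has one."""
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)]
    return name


def normalize_scan_keys(scanned):
    """Rename keys such as 'block_dmcrypt' to 'block' in a scan result dict."""
    return {strip_dmcrypt_suffix(key): value for key, value in scanned.items()}
```

With the keys normalized at scan time, a later consumer of the json file can look up `block` without knowing whether the OSD is encrypted.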

Merge pull request #33376 from badone/wip-luminous-upgrade-ceph-ansible-and-move-to-lvm

luminous: qa/ceph-ansible: Upgrade to stable-3.2.30 branch

luminous: qa/ceph-ansible: Upgrade to stable-3.2.30 branch

The move to LVM will allow this test to run on smithis once the
teuthology ceph_ansible task supports that.

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>

rgw: change the "rgw admin status" 'num_shards' output to signed int

Fixes: http://tracker.ceph.com/issues/37645
Signed-off-by: Mark Kogan <mkogan@redhat.com>
(cherry picked from commit 9bdc324cb6667244bd32ee09760f91819383b30d)

ceph-kvstore-tool: rename repair -> destructive-repair

This has been shown to corrupt otherwise healthy rocksdb databases. Rename
it to make clear that it is generally not safe to run and should only be
used as a last resort.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 8cc636951132c2ee93e32bfc999777e3da023dd7)

Conflicts:
PendingReleaseNotes: drop this change as "repair" command did
not exist in luminous before this change.
qa/workunits/cephtool/test_kvstore_tool.sh: drop this change,
as this test was not added before this change.
src/tools/ceph_kvstore_tool.cc: trivial resolution.

doc: introduce repair subcommand of ceph-kvstore-tool

Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
(cherry picked from commit 51b5ba1aa242772093174cc87a9861c9405c3b67)

tools/ceph_kvstore_tool: do not open rocksdb when repairing it

Before this change, the `need_open_db` parameter was passed to the
constructor of BlueStore as `min_alloc_size`, and rocksdb failed to
repair because Repairer::Run() also tries to acquire the db lock, which
it cannot do if the lock file is already held by BlueStore::_mount().

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 85c65a58cc454d9181ed64a4e5e4af0fea3812c6)

common, tool: update kvstore-tool to repair our key/value database

Fixes: http://tracker.ceph.com/issues/17730
Signed-off-by: liuchang0812 <liuchang0812@gmail.com>
(cherry picked from commit 4849ce3cc96eac9fee305927198a6c1b90892687)

Conflicts:
src/kv/LevelDBStore.cc
src/kv/RocksDBStore.cc
src/kv/RocksDBStore.h
src/os/bluestore/BlueStore.cc
src/tools/ceph_kvstore_tool.cc: resolve conflicts.

qa/tasks/cephfs/test_volume_client: print whether the test case runs under py2 or py3

Fixes: http://tracker.ceph.com/issues/40184
Signed-off-by: Lianne <liyan.wang@xtaotech.com>
(cherry picked from commit 7c7c7870d38902a0df83a0fdecaa56baad556d82)

12.2.13

Merge pull request #32950 from neha-ojha/wip-pcycle-luminous

luminous: qa: install build dependencies for cfuse_workunit_kernel_untar_build.yaml

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

librbd: don't call refresh from mirror::GetInfoRequest state machine

Fixes: https://tracker.ceph.com/issues/43589
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit da46798ab3f56a639cc7a0b885778e8f75505b53)

Conflicts:
src/librbd/api/Mirror.cc
- C_ImageGetInfo ctor takes only two arguments in nautilus
- nautilus does not have LambdaContext as a class; use FunctionContext
instead

(cherry picked from commit a1e0d623d5026baec9d1e6ed83201c3fb326fc10)

qa: install build dependencies for cfuse_workunit_kernel_untar_build.yaml

Fixes: https://tracker.ceph.com/issues/36076
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 38ef3da8d27e24576193cbf3f9238f2c5b586c09)

Merge pull request #32796 from jan--f/wip-43759-luminous

luminous: ceph-volume: assume msgrV1 for all branches containing mimic

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

ceph-volume: assume msgrV1 for all branches containing mimic

With nautilus and newer OSDs listen on v1 ports and v2 ports. Assume
that if mimic (or luminous) occur in the branch name, the OSDs are
running msgrv1 only.

Fixes: https://tracker.ceph.com/issues/42791
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
(cherry picked from commit b8754919df61b118200e210e0bfc8d6df0261dfd)
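
The branch-name heuristic above can be sketched as follows. The function name is hypothetical and the real check in the ceph-volume test setup may be written differently.

```python
# Hypothetical helper illustrating the heuristic; not the actual ceph-volume code.
MSGR_V1_ONLY_RELEASES = ("luminous", "mimic")


def assume_msgr_v1_only(branch_name):
    """True if the branch name mentions a release whose OSDs speak msgr v1 only."""
    return any(release in branch_name for release in MSGR_V1_ONLY_RELEASES)
```

A substring match is used because backport branches embed the release name, for example `wip-43759-luminous`.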

rbd-mirror: fix 'rbd mirror status' asok command output

This was broken by def50d04796, and implicitly fixed during
refactoring in the master (octopus) by adf1486e46c, hence it is a
direct commit to nautilus branch.

Fixes: https://tracker.ceph.com/issues/43429
Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 0a0fcc7da4573e8b3c82440226747a2cc377496b)

Conflicts:
src/tools/rbd_mirror/Mirror.cc (image_deleter section removed after luminous)

Merge pull request #32666 from dzafman/wip-41016-luminous

luminous: osd: Diagnostic logging for upmap cleaning

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #32523 from iliul/luminous

luminous: os/bluestore: fix assertion in StupidAllocator::get_fragmentation

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

osd: Diagnostic logging for upmap cleaning

Fixes: https://tracker.ceph.com/issues/41016
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit b8746e5e851f0f0d6415d0261fa401ffac51a902)

Merge pull request #32599 from trociny/wip-43499-luminous

luminous: rbd-mirror: make logrotate work

Reviewed-by: Jason Dillaman <dillaman@redhat.com>

Merge pull request #32586 from dzafman/wip-bal4-luminous

luminous: Change default upmap_max_deviation to 5

Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>

logrotate: also sighup rbd-mirror

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 86424fc3c895995d1d45f067c7852e6dce993027)

Conflicts:
src/cephadm/cephadm (does not exist)
src/logrotate.conf (no "pkill" fallback)

rbd-mirror: reopen all contexts logs on SIGHUP

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit 9ddf111506611a596c713ffe861a41aeda05e7a5)

Conflicts:
src/tools/rbd_mirror/Mirror.cc (std::lock_guard vs Mutex::Locker, ceph_abort_msgf does not exist)
src/tools/rbd_mirror/PoolReplayer.cc (std::lock_guard vs Mutex::Locker, PoolReplayer is not a template)

rbd-mirror: delay local/remote rados initialization until context created

We rely on the invariant that an initialized rados ref contains a
valid context.

Signed-off-by: Mykola Golub <mgolub@suse.com>
(cherry picked from commit f3b49191771d2a3a20a7f55a14e0a7482ee96172)

Conflicts:
src/tools/rbd_mirror/PoolReplayer.cc (trivial)

mgr: Change default upmap_max_deviation to 5

Fixes: https://tracker.ceph.com/issues/43312
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit b0a1b758d012dfea40db3feca1a841c96f79defe)

Conflicts:
src/pybind/mgr/balancer/module.py (default isn't in COMMANDS section)
qa/standalone/mgr/balancer.sh (setting upmap_max_deviations to 1 differ)
src/test/cli/osdmaptool/missing-argument.t (usage included here)

osdmaptool: Add --upmap-active to simulate active upmap balancing

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 621acf8ce7f48253e9d2189a9a2ee432fa1d3ba1)

Conflicts:
src/test/cli/osdmaptool/help.t (some options not present)
src/tools/osdmaptool.cc (ceph_assert is assert here)
src/test/cli/osdmaptool/missing-argument.t (usage included here)

doc: Add upmap options to osdmaptool man page and give example

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 184e9d1ae3b5bcc332d5fe3330d46a5cb8fcacd6)

tools: osdmaptool document non-upmap options that were missing

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit e42a6ccb1819be4988e3ed7bd78fcf513f8d1589)

Conflicts:
doc/man/8/osdmaptool.rst (missing other options not part of this)

os/bluestore: fix assertion in StupidAllocator::get_fragmentation

One might face an assertion (assert(intervals <= max_intervals))
in StupidAllocator::get_fragmentation method for clusters created
by early Luminous releases and before. The root cause is that block
volume size wasn't aligned with min_alloc_size and hence we missed
that last fraction interval during max_interval calculation.

Fixes: https://tracker.ceph.com/issues/43297
Note: This was a clean cherry-pick from master, but p2roundup was
introduced since mimic release, use P2ROUNDUP instead

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
(cherry picked from commit a60b2316ce0bed28c468043cff4cab5e61b1a694)
Signed-off-by: Lei Liu <liulei3@360.cn>
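
A small Python sketch of the arithmetic behind this fix, assuming the interval bound is derived from the device size divided by min_alloc_size. The function names and sizes are illustrative and are not taken from StupidAllocator.

```python
def p2roundup(value, align):
    """Round value up to the next multiple of align (align is a power of two)."""
    return (value + align - 1) & ~(align - 1)


def max_intervals(device_size, min_alloc_size):
    """Upper bound on allocation units, counting a trailing partial unit."""
    return p2roundup(device_size, min_alloc_size) // min_alloc_size


# A device that is 1.5 allocation units long (sizes are made up).
size = 0x18000    # 96 KiB
alloc = 0x10000   # 64 KiB
truncated = size // alloc               # drops the trailing partial unit
rounded = max_intervals(size, alloc)    # counts it
```

Plain integer division undercounts by one whenever the size is not aligned, which is the situation the assertion tripped on.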

Merge pull request #32349 from smithfarm/wip-39474-luminous

luminous: common/util: handle long lines in /proc/cpuinfo

Reviewed-by: Neha Ojha <nojha@redhat.com>

Merge pull request #31855 from smithfarm/wip-41730-luminous

luminous: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.count(fromshard))

Reviewed-by: David Zafman <dzafman@redhat.com>

Merge pull request #32194 from linuxbox2/luminous-lc-early

luminous: rgw: lc: continue past get_obj_state() failure

Reviewed-by: Daniel Gryniewicz <dang@redhat.com>

common/util: handle long lines in /proc/cpuinfo

Fixes: http://tracker.ceph.com/issues/38296
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b02e81935c877eff4929c8aad714b0015db45201)

Merge pull request #32267 from ideepika/wip-43325-luminous

luminous: doc: wrong datatype describing crush_rule

Reviewed-by: Kefu Chai <kchai@redhat.com>

Merge pull request #32227 from alimaredia/wip-s3-tests-branch-name-refactor-luminous

luminous: update s3-test download code for s3-test tasks

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Merge pull request #31860 from smithfarm/wip-43013-luminous

luminous: rgw: crypt: permit RGW-AUTO/default with SSE-S3 headers

Reviewed-by: Casey Bodley <cbodley@redhat.com>

Merge pull request #32034 from cbodley/wip-qa-rgw-swift-luminous

luminous: qa/rgw: add missing force-branch: ceph-luminous for swift tasks

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

Merge pull request #32215 from smithfarm/wip-43234-luminous

luminous: tests: radosgw-admin: remove dependency on bunch package

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

luminous: update s3-test download code for s3-test tasks

- Ensure the download code for all tasks running
s3-tests is consistent.
- Simplify download code to only use the config
variable 'force-branch' for the branch being
cloned.
- Make ceph-luminous the force-branch for all
suites using s3-tests.
- Add force-branch to suites running s3readwrite
& s3roundtrip tasks

Signed-off-by: Ali Maredia <amaredia@redhat.com>

osd/MissingLoc.cc: do not rely on missing_loc_sources only

In 624ade487ea4aeaf988cc1767e0b293f76addd5b, we relied on missing_loc_sources
to check for strays and remove an OSD from missing_loc. However, it is
possible that missing_loc_sources is empty while there are still OSDs
present in missing_loc. Since the aim is to just remove a stray OSD from
missing_loc, we do not need to rely on missing_loc_sources. We still
clean missing_loc_sources if any stray is present in it.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 5906a57320f04f57a38eef9588bd16ac3fd4e55d)

Conflicts:
src/osd/MissingLoc.cc
- file does not exist in luminous; made changes manually in src/osd/PG.cc
- adjust ldout for luminous
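
A sketch of the behaviour described above, modelling missing_loc as a dict of sets. The data layout and function name are assumptions for illustration; the actual change is C++ in src/osd/PG.cc.

```python
def remove_stray(missing_loc, missing_loc_sources, stray):
    """Drop a stray OSD from every object's location set.

    missing_loc: dict mapping object name -> set of OSD ids holding it.
    missing_loc_sources: set of OSD ids known as sources.
    The removal from missing_loc does not depend on missing_loc_sources,
    which may be empty while missing_loc still lists the stray.
    """
    for locations in missing_loc.values():
        locations.discard(stray)
    # Still clean the sources set if the stray happens to be in it.
    missing_loc_sources.discard(stray)
```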

Merge pull request #32135 from jecluis/wip-telemetry-luminous

luminous: telemetry module for mgr

Reviewed-by: Nathan Cutler <ncutler@suse.com>

doc/rados/operations: crush_rule is a name

like
```
ceph osd pool set <pool-name> crush_rule <rule-name>
```
where `<rule-name>` is a string instead of a number.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 3ed3de6c964ba998d5b18ceb997d1a6dffe355db)

osd/PG: fix _finish_recovery vs repair race

On detecting a corrupted object, primary may automatically
repair that object by leveraging the existing recovery procedure,
which turned out to be racy with a previous unfinished _finish_recovery
callback - the problem would then be that _finish_recovery might
continue to purge some strays that we still want to pull data from.

Fix by re-checking if there are any newly added missing objects when
executing _finish_recovery.

Note that before https://github.com/ceph/ceph/pull/29756 we might
instead have to call needs_recovery to catch the race condition
since we did not evict pg from clean state when triggering an auto-repair.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(manual backport of d96e53285b4e748eacda314bf0958b87cfa42130)

Conflicts:
src/osd/PG.cc
- adjusted if conditional for luminous
- did not add the comment nor state_clear(PG_STATE_REPAIR);. Those lines were
moved but don't exist in luminous.

osd/MissingLoc, PeeringState: remove osd from missing loc in purge_strays()

We should always try to keep osds in missing_loc consistent with peer_missing
and peer_info. When we remove an osd from peer_missing and peer_info, we
should also remove it from missing_loc during purging strays.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 624ade487ea4aeaf988cc1767e0b293f76addd5b)

Conflicts:
src/osd/MissingLoc.cc
src/osd/MissingLoc.h
src/osd/PeeringState.cc
- these files do not exist in luminous; made the changes manually to
src/osd/PG.cc and src/osd/PG.h
- ldout(cct, ...) -> ldout(pg->cct, ...)

PendingReleaseNotes: add telemetry mgr module

Signed-off-by: Joao Eduardo Luis <joao@suse.com>

mgr/telemetry: bump revision

We should have done this while cherry-picking from master, but we
didn't. And here we are now. It's simpler to apply this one-off patch
than going back to the cherry-picking maze to adjust this one thing.

Signed-off-by: Joao Eduardo Luis <joao@suse.com>

mgr/telemetry: add stats about crush map

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 713dbc9722888d3bf60d772dbca23e13b0cafc38)

Conflicts:
src/pybind/mgr/telemetry/module.py
  Missing context due to missing patches.
PendingReleaseNotes
  Dropped to prevent conflicts in the future

mgr/telemetry: add rgw metadata

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f62c6e8cba2e894f84ddabdea6db4ce56e02ea63)

Conflicts:
PendingReleaseNotes
Dropped to prevent conflicts in the future
src/pybind/mgr/telemetry/module.py
Context issues due to missing patches

mgr/telemetry: mds cache stats

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f4c736699478f608bba77770a85f96a7bf8d24e5)

Conflicts:
src/pybind/mgr/telemetry/module.py
  Due to missing context resulting from missing patches.
PendingReleaseNotes
  Dropped to prevent conflicts in the future

mgr/telemetry: add more pool metadata

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 87670fdc3c227387068d527b4659b50bc3bb64a3)

Conflicts:
src/pybind/mgr/telemetry/module.py
  Context issues
PendingReleaseNotes
  Dropped to prevent conflicts in the future

mgr/telemetry: remove crush rule name

This is a user-specified string and could contain identifying info.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 26b427356d920682b61cdf98fc2745e324c28baa)

Conflicts:
src/pybind/mgr/telemetry/module.py
Context issues

mgr/telemetry: include min_mon_release and msgr v1 vs v2 addr count

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 3453930d438dc3ba9ba5addca59aec6786293bd4)

Note:
    This commit was heavily modified. We wanted to provide the number of
    ipv4 and ipv6 monitors in the report, so we rewrote that part so we
    can report on it; but we had to drop everything else (msgr1 and
    msgr2), as well as 'min_mon_release'. Those do not exist in
    luminous. In the end, the commit message itself is misleading, but
    we are somehow (*shrug*) opting for leaving the commit as the original.
    Additionally, we removed PendingReleaseNotes changes to prevent
    conflicts in the future.

mgr/telemetry: add CephFS metadata

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7f6aad677b76847514f6f9b893827412dfb35a6b)

Conflicts:
PendingReleaseNotes
Dropped due to conflicts down the road
src/pybind/mgr/telemetry/module.py
Context issues

Merge pull request #31857 from smithfarm/wip-40947-luminous

luminous: osd: add hdd, ssd and hybrid variants for osd_snap_trim_sleep

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #31858 from smithfarm/wip-38205-luminous

luminous: osd: refuse to start if we're > N+2 from recorded require_osd_release

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Merge pull request #31992 from dzafman/wip-balancer3-luminous

luminous: mgr: Release GIL and Balancer fixes

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

qa: radosgw-admin: remove dependency on bunch package

Fixes: https://tracker.ceph.com/issues/43184
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 1bf21326aa7f8eaafd7049b44eb73aeb36bcc5d7)

rgw: lc: continue past get_obj_state() failure

The get_obj_state() failure in particular could indicate a race with
an object being deleted, so likely is non-fatal. By returning, lifecycle
processing for the current bi-shard would not resume until re-scheduled,
likely in 24 hours.

Fixes: https://tracker.ceph.com/issues/43269
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
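
The control-flow change can be sketched like this. The callables and the use of LookupError to stand in for a get_obj_state() failure are assumptions; the real code is C++ and works with return codes.

```python
def process_shard(objects, get_obj_state, expire):
    """Process every object in a shard, skipping those whose state lookup fails.

    Returns the objects that were skipped. Before the fix, the equivalent of
    the 'continue' below was a 'return', which abandoned the rest of the shard.
    """
    skipped = []
    for obj in objects:
        try:
            state = get_obj_state(obj)
        except LookupError:
            # Possibly a race with a concurrent delete: treat as non-fatal.
            skipped.append(obj)
            continue
        expire(obj, state)
    return skipped
```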

Merge pull request #31846 from smithfarm/wip-42988-luminous

luminous: tests: kernel.sh: update for read-only changes

Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

release note: Add pending release notes for already merged code

Follow on to https://github.com/ceph/ceph/pull/31774

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 65d03bae8b4f50cc3cbaa50640eaeab4cabd711f)

common/options.cc, doc: osd_snap_trim_sleep overrides other variants

A value > 0 for osd_snap_trim_sleep overrides the backend-specific
variants of osd_snap_trim_sleep.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 733df09fe5111e7beca75f8be0afb8669ef9a625)
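
A sketch of the override rule. The variant option names and the way a backend is picked from the rotational flags are assumptions inferred from the "hdd, ssd and hybrid" wording, not a copy of the OSD code.

```python
def effective_snap_trim_sleep(conf, store_is_rotational, journal_is_rotational):
    """Pick the snap trim sleep, letting the generic option win when it is > 0."""
    generic = conf["osd_snap_trim_sleep"]
    if generic > 0:
        return generic
    # Assumed selection: all-flash -> ssd, rotational data + flash journal -> hybrid.
    if not store_is_rotational and not journal_is_rotational:
        return conf["osd_snap_trim_sleep_ssd"]
    if store_is_rotational and not journal_is_rotational:
        return conf["osd_snap_trim_sleep_hybrid"]
    return conf["osd_snap_trim_sleep_hdd"]
```

Leaving the generic option at 0 keeps the per-backend values in effect.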

doc/rados/configuration/osd-config-ref.rst: document snap trim sleep

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit accf95e9dea257c3aaa64b7a36d077468d7c86ec)

osd: add hdd, ssd and hybrid variants for osd_snap_trim_sleep

This is better than the earlier default, which was set to 0.

Fixes: https://tracker.ceph.com/issues/40528
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 560fca12e695a817e1b7e46d365838ed871b64bd)

Conflicts:
src/osd/OSD.cc
src/osd/OSD.h
src/osd/PrimaryLogPG.h
- no OSD::get_osd_delete_sleep() in luminous, no OSD::get_recovery_max_active()
in luminous
- use cct->_conf->get_val instead of cct->_conf.get_val

osd: refuse to start if release > recorded min_osd_release + 2

If we try to start up the objectstore, we may make writeable changes to
(say) rocksdb that are not backwards compatible. This happens, for
example, if you start a mimic osd. Even if the compatset checks fail,
rocksdb may have written something that is not backwards compatible.

Fixes: http://tracker.ceph.com/issues/38076
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9f7713a905d67441b28371e4494e9447319d2129)

Conflicts:
src/ceph_osd.cc
- include common/version.h for ceph_release()
- use exit instead of forker.exit

osd: record require_osd_release in objectstore meta

Record the require_osd_release value from the OSDMap in the 'meta' portion
of the osd's metadata that can be accessed without actually mounting the
OSD. This will be useful as a safety gate to prevent you from mounting
an osd that is too new and may make incompatible changes to the store.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 482cdca55351ca85290f1b2adb3c0cdf78af411d)

Conflicts:
src/osd/OSD.cc
src/osd/OSD.h
- ignore differences in surrounding context, as they do not seem relevant to
the fix

mgr/telemetry: clear the event after being awakened by it

otherwise telemetry will have a busy-loop once it's signaled.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 0dddb20685007d990dc30daddca6212d36e5e308)
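
A minimal sketch of the busy-loop and its fix using Python's threading.Event. The loop shape and the callables are assumptions, not the module's actual serve() code.

```python
import threading


def serve(event, should_stop, send_report, interval):
    """Send a report, then sleep until the interval elapses or the event fires."""
    while not should_stop():
        send_report()
        event.wait(interval)
        # Without this clear(), an event that was set once stays set, so every
        # later wait() returns immediately and the loop spins.
        event.clear()
```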

mgr/telemetry: force re-opt-in if the report contents change

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 143e1f046909cb68d769ebbbaa80cb7106879997)

Conflicts:
doc/rados/operations/health-checks.rst
We don't have the crash module, hence neither its docs.
src/pybind/mgr/telemetry/module.py
Issues due to context

mgr/telemetry: less noise in the log

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 7f1897b238f5c55859229748b200bac715941cc3)

Conflicts:
src/pybind/mgr/telemetry/module.py
Due to context resulting from previous cherry-pick amends

mgr/telemetry: track telemetry report revisions

Assign revisions to track changes to the content of the telemetry
reports.

Track the revision when the user opts in or out.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit b93e4050261b962449c9721cea713616ce5e03bd)

Conflicts:
pybind/mgr/telemetry/modules.py
Due to set_module_option() -> set_config()
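
One way to picture the revision tracking, together with the re-opt-in behaviour added by the later "force re-opt-in" commit. The class, attribute names and revision number are hypothetical.

```python
REVISION = 3  # hypothetical revision of the current report contents


class OptState:
    """Remembers whether the user opted in and at which report revision."""

    def __init__(self):
        self.enabled = False
        self.last_opt_revision = 0

    def opt_in(self):
        self.enabled = True
        self.last_opt_revision = REVISION

    def opt_out(self):
        self.enabled = False
        self.last_opt_revision = REVISION

    def needs_re_opt_in(self, current_revision):
        """True if the user opted in to an older revision of the report."""
        return self.enabled and self.last_opt_revision < current_revision
```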

mgr/telemetry: specify license when opting in

Choosing not to include this in the docs so that the user is more likely
to see this interactively.  (That is...probably good?)

Choose sharing-1-0.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f1eac8ba4becfb0ad6a7e7a303fc45a9d6ab59ee)

Conflicts:
src/pybind/mgr/telemetry/module.py
          Slight conflicts due to past cherry-picks (or lack thereof)
          Using set_config() instead of set_module_option()

mgr/telemetry: move contact info to an 'ident' channel

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 35c273c9c3ebd00fbac8e1a9fd281333a24c5fe7)

Conflicts:
src/pybind/mgr/telemetry/module.py
Due to lack of 'crash' and 'devicehealth' modules, and a bit
on how we keep options (self.config[] vs class attributes)

mgr/telemetry: accept channel list to 'telemetry show'

Also include a 'channels_available' item so that a user can tell which
channels are available.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit e9fdb219c1b946dc2b346b11c5e5d4b04786345d)

Conflicts:
src/pybind/mgr/telemetry/module.py
Due to lack of 'crash' and 'devicehealth' modules

mgr/telemetry: always generate new report for 'telemetry show'

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f0762ce4470972ce8f555d8aab87362331a53afd)

mgr/telemetry: add separate channels

'basic' is the basic cluster stats (version, size, etc)
'crash' is the crash dumps.

By default these are both on, but they can be selectively enabled or
disabled.

New channels will follow.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit f3a3ccb52fa5a7b24f8d11138b5b51345865547e)

Conflicts:
src/pybind/mgr/telemetry/module.py
          Don't backport code related to the 'crash' module, and adjust
          how we read option variables (luminous goes through a config
          map, instead of master's that goes through class attributes)
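
A sketch of per-channel selection, assuming each channel has a collector callable. The function and the collector mapping are illustrative, not the module's real structure.

```python
def build_report(collectors, enabled_channels):
    """Assemble a report from only the channels the user left enabled.

    collectors: dict mapping channel name -> zero-argument callable.
    enabled_channels: set of channel names that are switched on.
    """
    return {
        channel: collect()
        for channel, collect in collectors.items()
        if channel in enabled_channels
    }
```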

mgr/telemetry: use cluster-provided timestamp unmolested

The cluster stamp is now ISO 8601; just use that.

(isoformat() puts a ':' in the +hh:mm timezone offset, which is slightly
different from what Ceph does; just pass Ceph's value through for
consistency.)

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 383006a5cc2da3e0b643a9dc600e75e1ce088bd6)

Conflicts:
src/pybind/mgr/telemetry/module.py
Due to missing scaffolding that exists in master but was not
backported to luminous.
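
The offset difference mentioned above can be seen with the standard library alone. The timestamp is made up, and the colon-free form is shown only for contrast.

```python
from datetime import datetime, timedelta, timezone

stamp = datetime(2019, 1, 2, 3, 4, 5, tzinfo=timezone(timedelta(hours=2)))

# isoformat() separates hours and minutes of the offset with a colon.
with_colon = stamp.isoformat()

# strftime's %z emits the offset without a colon.
without_colon = stamp.strftime("%Y-%m-%dT%H:%M:%S%z")
```

Passing the cluster's own string through avoids reformatting it into a slightly different spelling of the same instant.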

mgr/telemetry: default to reports every 24h; lower minimum

Allow more frequent telemetry reports.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 712987d53337e39ad871ee7abb38c2e2755fc75d)

Conflicts:
src/pybind/mgr/telemetry/module.py
          Past commit in master had introduced field types and a
          'minimum' value for the interval. We concluded that the field
          types commit does not affect the telemetry module in a
          significant way to force us to backport it, and the minimum
          value commit is introduced for the benefit of the dashboard
          (which, in luminous, does not have control over telemetry)

mgr/telemetry: add report_timestamp to sent reports

Received time may differ from report time, and correlating
to local cluster state events might be useful.

Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit a42c8e327c9f7d53b8c13cf51837c294bc4c643d)

mgr/telemetry: fix 'telemetry {on,off}'

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 497e00c4dadd9e29a792413a425671c061fa44c6)

mgr/telemetry: make 'telemetry show' readable by a human

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit ec78dcf516273b18ce6e92708da89466c9c0409d)

mgr/telemetry: check for errors when sending report

There was no error checking, and the server has been failing for
some time, but no one noticed.  Oops.

Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit de71f38a2c0b37a96970d6b2fd62ea19b20bdf46)

Conflicts:
src/pybind/mgr/telemetry/module.py
          mostly due to store_get/store_set not existing in luminous,
          so we rely on config_get/config_set instead.

mgr/telemetry: add 'telemetry on' and 'telemetry off' commands

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 6ab90c9cb09627649ca31f27c35e1c1efd6a6f12)

Conflicts:
src/pybind/mgr/telemetry/module.py
          master no longer has 'telemetry selftest' due to some other
          major changes that we did not backport, as they would require
          too many changes that were not, in an obvious manner, relevant
          for us.

mgr/telemetry: off by default

This way a user can enable the module and look at the output before
deciding to send it to anyone.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 9320bdb8ac06668719a9f910e12a9e9f4ed56405)

Conflicts:
src/pybind/mgr/telemetry/module.py
          We don't have some other scaffolding that exists on master,
          and we are not cherry-picking it because it changes
          significantly the module's code in a way that is not a clear
          advantage for the telemetry module (in 'luminous' context)

mgr/telemetry: fix total_objects

This field was removed from df output a while back in
342f309645df886fb96eb401634e38376553e6d9

Fixes: http://tracker.ceph.com/issues/37976
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit c849d7dfcc7306b8945d7a697fb76558ed50f983)