IvanGuan [Wed, 4 Mar 2020 13:04:03 +0000 (21:04 +0800)]
log: fix log timestamp precision that can't be set to milliseconds
The option log_coarse_timestamps is applied to Log::clock
successfully, but Log::clock has no effect on timestamp precision
because the clock that dout_impl really uses is Entry::clock. So we
should set Entry::clock from the log_coarse_timestamps option instead
of Log::clock. In addition, I think Log::clock can be removed
because I didn't see what it was for.
Sage Weil [Tue, 3 Mar 2020 21:59:33 +0000 (15:59 -0600)]
Merge PR #33667 into master
* refs/pull/33667/head:
mgr/orch: show placement in 'orch ls'
mgr/orch: fix SPEC alignment in 'orch ls'
mgr/orch: include spec ref in ServiceDescription
this reverts 9639acfefe09c87adc821bb5c5cc41974685331d, as the test does
make sense. what fails this test is that the machinery to marshal/unmarshal
exceptions fails to handle un-picklable exceptions. the previous commit
is supposed to use a fallback to handle them.
Kefu Chai [Tue, 3 Mar 2020 16:18:35 +0000 (00:18 +0800)]
mgr/orch: try harder when pickle fails to marshal an exception
pickle cannot marshal instances of a class not defined in the top level of
a module. for instance, `DriveGroupValidationError` is defined in a
submodule of `ceph` python module, that's why we cannot capture it.
this prevents the `ceph` command line from getting a proper error when
the command fails if the command is implemented using cross python
module call(s).
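A minimal sketch of the kind of fallback described above, not the actual mgr/orch code. `marshal_exception` and `make_unpicklable_error` are hypothetical names, and a class defined inside a function stands in here for an exception type that pickle cannot locate by name:

```python
import pickle


def marshal_exception(exc):
    """Pickle exc, falling back to a plain RuntimeError when that fails."""
    try:
        return pickle.dumps(exc)
    except Exception:
        # Assumed fallback: keep only the type name and the message, so the
        # receiving side still gets a readable error.
        summary = "{}: {}".format(type(exc).__name__, exc)
        return pickle.dumps(RuntimeError(summary))


def make_unpicklable_error():
    """Build an exception whose class pickle cannot look up by name."""
    class LocalValidationError(Exception):
        pass

    return LocalValidationError("bad drive group")
```

With this shape the caller always receives something it can unpickle, at the cost of losing the original exception type in the fallback case.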
Sage Weil [Tue, 3 Mar 2020 16:09:06 +0000 (10:09 -0600)]
common/ceph_time: tolerate mono time going backwards
Some kernels (and possibly some hardware?) can trigger a monotonic clock
that goes back in time. That, in turn, can lead to a negative monotonic
time span. This would trigger an assert.
Since this problem seems to be widespread, tolerate the case and interpret
it as a 0-length interval (vs something negative).
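The clamping behaviour can be sketched as follows. This is an illustration in Python only; the real change lives in the C++ common/ceph_time code, and `mono_span` is a hypothetical name:

```python
def mono_span(start, end):
    """Return the elapsed time between two monotonic readings.

    If the clock appears to have gone backwards (end < start), treat the
    interval as zero-length instead of returning a negative span.
    """
    if end < start:
        return 0.0
    return end - start
```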
Fixes: https://tracker.ceph.com/issues/44078
Fixes: https://tracker.ceph.com/issues/43365
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 2 Mar 2020 20:21:20 +0000 (14:21 -0600)]
mgr/cephadm: refresh configs when dependencies change
If a daemon config (e.g., prometheus) depends on other daemons'
existence, refresh the config if that list changes (e.g., new
node-exporter, mgr removed, etc.).
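One way to picture the check, as a sketch under assumptions rather than the cephadm implementation: remember the dependency list each config was last generated from and refresh when it differs. `ConfigTracker` and `needs_refresh` are hypothetical names:

```python
class ConfigTracker:
    """Remember the dependency list each daemon config was last built from."""

    def __init__(self):
        # daemon name -> sorted list of dependency names (assumed layout)
        self._last_deps = {}

    def needs_refresh(self, daemon, current_deps):
        """Return True, and record the new list, if the dependencies changed."""
        deps = sorted(current_deps)
        if self._last_deps.get(daemon) == deps:
            return False
        self._last_deps[daemon] = deps
        return True
```

Sorting before comparing means a mere reordering of the same daemons does not trigger a refresh.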
```
Traceback (most recent call last):
File "/home/user/src/ceph/src/pybind/mgr/cephadm/module.py", line 391, in do_work
res = self._on_complete_(*args, **kwargs)
File "/home/user/src/ceph/src/pybind/mgr/cephadm/module.py", line 455, in call_self
return f(self, *inner_args)
File "/home/user/src/ceph/src/pybind/mgr/cephadm/module.py", line 2576, in _create_alertmanager
custom_config=self._generate_alertmanager_config)
File "/home/user/src/ceph/src/pybind/mgr/cephadm/module.py", line 2051, in _create_daemon
stdin=json.dumps(cephadm_config))
File "/home/user/src/ceph/src/pybind/mgr/cephadm/module.py", line 1459, in _run_cephadm
code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:ERROR: alertmanager not implemented yet
```
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
Patrick Donnelly [Tue, 17 Dec 2019 23:10:34 +0000 (15:10 -0800)]
qa: add upgrade test for volume upgrade from legacy
This tests that volumes created using the ceph_volume_client.py library
continue to be accessible/function via the Nautilus/Octopus ceph-mgr
volumes plugin.
Fixes: https://tracker.ceph.com/issues/42723
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Tue, 3 Mar 2020 03:10:00 +0000 (11:10 +0800)]
crimson: do not capture unused variable
this silences the warning of:
```
../src/crimson/osd/osdmap_gate.cc:48:38: warning: lambda capture 'this' is not used [-Wunused-lambda-capture]
std::for_each(first, last, [epoch, this](auto& blocked_requests) {
~~^~~~
```
bk203 [Mon, 2 Mar 2020 13:19:56 +0000 (14:19 +0100)]
doc: update Zabbix template reference
The old link references a 2017 version of the template. I experienced problems using that version with the latest version of Ceph: Ceph would report "Failed to send data to Zabbix". After importing the newer 2019 version of the template into Zabbix, Ceph could send data again (due to changed Zabbix Trapper item keys). Propose replacing the link with one referencing the master branch of the template, so that the newest version is always referenced in the docs.
Jason Dillaman [Mon, 2 Mar 2020 20:34:22 +0000 (15:34 -0500)]
rbd-mirror: move resetting of snapshot replayer rescan variable
The `m_image_updated` boolean should be reset at the start of the
state checking loop now that we scan the local image meta and check
for forced-promotion of the local image.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Sage Weil [Thu, 27 Feb 2020 15:30:27 +0000 (09:30 -0600)]
compressor/lz4: rebuild if buffer is not contiguous
In older versions of lz4 (specifically < 1.8.2) bit errors
can be introduced when compressing from fragmented memory. The lz4
bug was fixed by this lz4 commit:
The error can be reproduced using the following command:
./frametest -v -i100000000 -s1659 -t31096808
It's actually a bug in the stream LZ4 API,
when starting a new stream
and providing a first chunk to complete with size < MINMATCH.
In which case, the chunk becomes a dictionary.
No hash was generated and stored,
but the chunk is accessible as default position 0 points to dictStart,
and position 0 is still within MAX_DISTANCE.
Then, next attempt to read 32-bits from position 0 fails.
The issue would have been mitigated by starting from index 64 KB,
effectively eliminating position 0 as too far away.
The proper fix is to eliminate such "dictionary" as too small.
Which is what this patch does.
This is a workaround to rebuild our input buffer into a contiguous buffer
if it is not already contiguous.
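The idea of the workaround, sketched in Python rather than in the C++ compressor code: if the input arrives as several fragments, copy them into one contiguous buffer before compressing. `rebuild_if_fragmented` is a hypothetical name and a list of `bytes` objects stands in for a fragmented buffer:

```python
def rebuild_if_fragmented(fragments):
    """Return one contiguous bytes object for the given fragments."""
    if len(fragments) == 1:
        # Already contiguous: hand back the single fragment without copying.
        return fragments[0]
    # Fragmented (or empty): copy everything into a single new buffer.
    return b"".join(fragments)
```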
Fixes: https://tracker.ceph.com/issues/39525
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Mon, 2 Mar 2020 16:30:03 +0000 (10:30 -0600)]
Merge PR #33523 into master
* refs/pull/33523/head:
mgr/orch: ServiceSpec: drop 'count'
mgr/rook: use spec.placement.count (instead of spec.count)
mgr/cephadm: make HostAssignment make sense
mgr/orch: PlacementSpec: do not combine all_hosts with anything else
mgr/orch: use PlacementSpec.from_strings() for all CLI commands
Jason Dillaman [Thu, 27 Feb 2020 19:50:59 +0000 (14:50 -0500)]
librbd: acquire exclusive lock from peer when removing
This solves an issue with snapshot-based mirroring when the
rbd-mirror daemon is the exclusive lock owner. For other cases,
it still checks for watchers before proceeding.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
We should return -EREMOTEIO if we don't have any primary images to sync to
since we want to display a warning. Additionally, don't attempt to sync a
remote snapshot against a primary (demoted) local snapshot since it would
have an invalid primary snapshot id.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>