amitkuma [Tue, 8 Aug 2017 18:41:13 +0000 (00:11 +0530)]
messages: Initialization of member variables
Fixes the coverity issues:
** 717271 Uninitialized scalar field
2. uninit_member: Non-static class member from_mds is not initialized
in this constructor nor in any functions that it calls.
4. uninit_member: Non-static class member dir_rep is not initialized
in this constructor nor in any functions that it calls.
CID 717271 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
6. uninit_member: Non-static class member discover is not initialized
in this constructor nor in any functions that it calls.
** 717272 Uninitialized scalar field
2. uninit_member: Non-static class member want_base_dir is not initialized
in this constructor nor in any functions that it calls.
CID 717272 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
4. uninit_member: Non-static class member want_xlocked is not initialized
in this constructor nor in any functions that it calls.
** 717274 Uninitialized scalar field
2. uninit_member: Non-static class member wanted_base_dir is not initialized
in this constructor nor in any functions that it calls.
4. uninit_member: Non-static class member wanted_xlocked is not initialized
in this constructor nor in any functions that it calls.
6. uninit_member: Non-static class member flag_error_dn is not initialized
in this constructor nor in any functions that it calls.
8. uninit_member: Non-static class member flag_error_dir is not initialized
in this constructor nor in any functions that it calls.
10. uninit_member: Non-static class member unsolicited is not initialized
in this constructor nor in any functions that it calls.
12. uninit_member: Non-static class member dir_auth_hint is not initialized
in this constructor nor in any functions that it calls.
CID 717274 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
14. uninit_member: Non-static class member starts_with is not initialized
in this constructor nor in any functions that it calls.
** 717275 Uninitialized scalar field
CID 717275 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR)
2. uninit_member: Non-static class member from is not initialized in this
constructor nor in any functions that it calls.
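One common way to address this class of report is to give each scalar member a default member initializer. A minimal sketch follows; the struct name, field types, and default values are assumptions for illustration, and only the field names come from the report above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical message class. The field names mirror the report, but the
// types and default values are assumptions, not the actual Ceph definitions.
struct DiscoverReplySketch {
  bool wanted_base_dir = false;
  bool wanted_xlocked = false;
  bool flag_error_dn = false;
  bool flag_error_dir = false;
  bool unsolicited = false;
  int32_t dir_auth_hint = 0;
  uint8_t starts_with = 0;

  // With default member initializers in place, the default constructor
  // no longer leaves any scalar field indeterminate.
  DiscoverReplySketch() = default;
};
```

Initializing at the point of declaration keeps every constructor covered, including ones added later.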
kungf [Thu, 10 Aug 2017 12:05:00 +0000 (20:05 +0800)]
mon: return directly after health_events_cleanup
When mon_health_to_clog is set to false, all health events are cleaned up,
so there is no need to check for changes to mon_health_to_clog_interval and
mon_health_to_clog_tick_interval.
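The control flow described here might look roughly like the sketch below. The struct, its members, and the handler signature are hypothetical stand-ins; only the option names and health_events_cleanup come from the commit message.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the monitor's health-to-clog state.
struct HealthClogSketch {
  bool to_clog_enabled = true;
  int events = 3;            // pending health events
  int interval_updates = 0;  // how many interval changes were processed

  void health_events_cleanup() { events = 0; }

  void handle_conf_change(const std::set<std::string>& changed) {
    if (changed.count("mon_health_to_clog") && !to_clog_enabled) {
      health_events_cleanup();
      return;  // everything is gone, so the interval options are irrelevant
    }
    if (changed.count("mon_health_to_clog_interval") ||
        changed.count("mon_health_to_clog_tick_interval")) {
      ++interval_updates;
    }
  }
};
```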
Alex Mikheev [Mon, 12 Jun 2017 08:32:38 +0000 (08:32 +0000)]
msg/async/rdma: fixes crash in fio
fio creates multiple CephContext in a single process.
Crashes happen because the rdma stack has global resources that
are still used by one ceph context after they have already been destroyed
by another context.
The commit removes global instances of RDMA dispatcher and infiniband
and makes them context (rdma stack) specific.
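The ownership change can be illustrated with a small sketch. All type names below are hypothetical simplifications; the real classes carry far more state, and the use of shared_ptr is an assumption made for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the infiniband and dispatcher objects.
struct InfinibandSketch {};

struct DispatcherSketch {
  explicit DispatcherSketch(std::shared_ptr<InfinibandSketch> ib_)
      : ib(std::move(ib_)) {}
  std::shared_ptr<InfinibandSketch> ib;
};

// Each stack (one per context) owns its own resources instead of reaching
// for process-wide globals, so destroying one stack cannot invalidate
// resources that another stack is still using.
struct RdmaStackSketch {
  std::shared_ptr<InfinibandSketch> ib = std::make_shared<InfinibandSketch>();
  std::shared_ptr<DispatcherSketch> dispatcher =
      std::make_shared<DispatcherSketch>(ib);
};
```

With this layout, two stacks in the same process never share an infiniband or dispatcher instance.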
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
Consider the following use case:
(1) randomly choose some OSDs (e.g., from different hosts) and reserve them for private use,
say, by grouping them into 'pool1'
(2) ceph osd crush set-device-class pool1 'OSDs from (1)'
(3) ceph osd crush rule create-replicated rule_for_pool1 default host pool1
(4) ceph osd pool rename pool1 pool2
(5) ceph osd crush class rename pool1 pool2
In this use case, we need to change a pool name safely, without any risk
of data migration. That is why the 'osd crush class rename' command
is still needed here.
David Zafman [Wed, 9 Aug 2017 15:43:57 +0000 (08:43 -0700)]
qa: Fix races with waiting for scrubs
trigger_scrub sets last_scrub_stamp backwards to
force a scheduled scrub. In a small window this stamp could get propagated
to the mgr. A test failure occurred because wait_for_scrub() was confused
by seeing a backward-moving date.
The most critical change is having wait_for_scrub() make sure that the
date advances past the previous value.
A test failed because the random backoff kept delaying the triggered scrub, so
set osd_scrub_backoff throughout.
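The key comparison can be sketched as follows. The actual helper lives in the qa shell scripts; this C++ version, its name, and its signature are illustrative assumptions, and the sleep between polls is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper: poll read_stamp() until last_scrub_stamp moves
// strictly past prev_stamp, or give up after max_polls attempts.
inline bool wait_for_scrub_sketch(int64_t prev_stamp,
                                  const std::function<int64_t()>& read_stamp,
                                  int max_polls) {
  for (int i = 0; i < max_polls; ++i) {
    // A plain "changed" check (!=) would be fooled by the backdated stamp
    // that trigger_scrub writes; requiring '>' is not.
    if (read_stamp() > prev_stamp) {
      return true;
    }
  }
  return false;
}
```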
Greg Farnum [Wed, 9 Aug 2017 21:34:44 +0000 (14:34 -0700)]
mdsmon: treat the osdmon correctly when doing plugged updates
Make sure it's writeable before invoking changes, and propose_pending()
on it when we're done.
Make the PaxosService::C_RetryMessage public so we can do this from FSCommands.
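The pattern described above (check writeability, apply the change, then propose) might be sketched like this. The types and function names are hypothetical simplifications; in particular, the retry callback only stands in for the role that C_RetryMessage plays.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Paxos-backed service such as the osdmon.
struct PaxosServiceSketch {
  bool writeable = false;
  int proposals = 0;
  std::vector<std::function<void()>> waiters;  // run again once writeable

  bool is_writeable() const { return writeable; }
  void wait_for_writeable(std::function<void()> retry) {
    waiters.push_back(std::move(retry));
  }
  void propose_pending() { ++proposals; }
};

// Returns false if the update was deferred, true if it was applied.
inline bool plugged_update_sketch(PaxosServiceSketch& osdmon,
                                  const std::function<void()>& change,
                                  std::function<void()> retry) {
  if (!osdmon.is_writeable()) {
    osdmon.wait_for_writeable(std::move(retry));  // try again later
    return false;
  }
  change();                  // stage the change in the pending state
  osdmon.propose_pending();  // then ask the service to commit it
  return true;
}
```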
David Zafman [Tue, 1 Aug 2017 22:19:01 +0000 (15:19 -0700)]
qa: ceph-helpers.sh fixes
Add missing teardown to clean up the test directory
Fix pgid after elimination of the initial default pool
Testing could never fail because the return value of run_tests was ignored
Sage Weil [Wed, 9 Aug 2017 20:40:43 +0000 (16:40 -0400)]
qa/suites/upgrade/jewel-x/parallel: thrash layout
We can't kill and restart osds because that will interfere with
the upgrade process. We can, however, thrash the layout by
tweaking osd weights and so on. This will exercise osd recovery
paths during the upgrade that aren't normally exercised (outside
of stress-split, which doesn't upgrade individual osds while they
are non-clean).
Sage Weil [Wed, 9 Aug 2017 16:50:57 +0000 (12:50 -0400)]
osd/PG: force rebuild of missing set on jewel upgrade
Previously we were detecting the need to rebuild missing based on
whether the "divergent_priors" omap key was present. Unfortunately,
jewel does not always set this, so it is not a reliable indicator.
(It only gets set if you actually have a divergent prior at some
point in the PG's lifetime on that OSD.)
Fix by using the info_struct_v on the PG to detect whether we need
to do the conversion. We didn't bump the value when we added
the missing persistence, but the fastinfo was also added during
the same period between jewel and kraken, so it will work just as
well.
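As a rough sketch, the detection reduces to a version comparison. The constant below is an assumed value chosen for illustration, and the function name is hypothetical; neither is taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value, for illustration only: the first info_struct_v written by
// releases that include fastinfo (and therefore also persist the missing set).
constexpr uint8_t kFirstStructVWithFastinfoSketch = 10;

// Hypothetical check: a PG written with an older struct version needs its
// missing set rebuilt, whether or not the "divergent_priors" key exists.
inline bool must_rebuild_missing_sketch(uint8_t info_struct_v) {
  return info_struct_v < kFirstStructVWithFastinfoSketch;
}
```

Unlike the omap key, the struct version is always written, which is what makes it a reliable indicator.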
Fixes: http://tracker.ceph.com/issues/20958
Signed-off-by: Sage Weil <sage@redhat.com>
Sage Weil [Tue, 8 Aug 2017 22:43:22 +0000 (18:43 -0400)]
mon/Elector: force election epoch bump on start
We are generally careful when bumping the epoch so that we can join
existing rounds. However, if we restart in the middle of an election,
and change versions, we need to be certain that our previous ACK (as
$version - 1) isn't accepted as truth for the restarted daemon (running
$version) keeping the same epoch.
The conservatism with bumping is to avoid spurious election cycles, but
mon restarts are more rare, and we need them here.
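The effect of the bump can be modelled with a toy example. Everything below is a hypothetical simplification of the elector, assuming only that an ACK is counted solely in the epoch it was sent for.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an ACK: the epoch it belongs to and the version of
// the daemon that sent it.
struct AckSketch {
  uint64_t epoch;
  int version;
};

struct ElectorSketch {
  uint64_t epoch = 0;

  // On start, always move past the persisted epoch so that a fresh round
  // begins and nothing from before the restart carries over.
  void start(uint64_t persisted_epoch) { epoch = persisted_epoch + 1; }

  // Only ACKs from the current epoch are counted.
  bool counts(const AckSketch& ack) const { return ack.epoch == epoch; }
};
```

An ACK sent by the old version in epoch N is therefore ignored once the restarted daemon is at epoch N + 1.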
Fixes: http://tracker.ceph.com/issues/20949
Signed-off-by: Sage Weil <sage@redhat.com>