Boris Ranto [Fri, 15 Aug 2014 17:34:27 +0000 (19:34 +0200)]
Fix -Wno-format and -Werror=format-security options clash
This causes build failures in the latest Fedora builds. ceph_test_librbd_fsx adds the -Wno-format cflag, but the default AM_CFLAGS already contain -Werror=format-security. Previous releases tolerated the combination, but the latest Fedora rawhide no longer does. ceph_test_librbd_fsx builds fine without -Wno-format on x86_64, so there is likely no need for the flag anymore.
Loic Dachary [Tue, 3 Feb 2015 15:14:23 +0000 (16:14 +0100)]
ceph.spec.in: junit always except for EPEL 6
The package was renamed a long time ago (around the Fedora 15
timeframe). The "junit4" name is only relevant for EPEL 6. For EPEL 7
and Fedora 20, the "junit" package has "Provides: junit4". And most
recently, in the junit package that ships in Fedora 21 and 22, the
package maintainer dropped the old Provides: line.
Sage Weil [Wed, 29 Apr 2015 19:34:25 +0000 (12:34 -0700)]
mon: prevent pool with snapshot state from being used as a tier
If we add a pool with snap state as a tier, the snap state gets clobbered
by OSDMap::Incremental::propogate_snaps_to_tiers(), and may prevent OSDs
from starting. Disallow this.
Conflicts:
qa/workunits/cephtool/test.sh
properly co-exist with "# make sure we can't create an ec pool tier"
src/mon/OSDMonitor.cc
properly co-exist with preceding "if (tp->ec_pool())"
(The changes to both files would have applied cleanly if
https://github.com/ceph/ceph/pull/5389 had not been merged first.)
Conflicts:
src/test/librados/TestCase.cc
for it of type ObjectIterator:
- use it->first instead of it->get_oid()
- use it->second instead of it->get_locator()
Sage Weil [Fri, 14 Nov 2014 06:32:20 +0000 (22:32 -0800)]
osd/ReplicatedPG: allow whiteout deletion with IGNORE_CACHE flag
If the client specifies IGNORE_CACHE, allow a regular DELETE to zap a
whiteout. Expand test case to verify this works.
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 34e4d24)
Conflicts:
src/test/librados/tier.cc
replaced NObjectIterator -> ObjectIterator
replaced cache_ioctx.nobjects_begin -> cache_ioctx.objects_begin
replaced cache_ioctx.nobjects_end -> cache_ioctx.objects_end
replace it->get_oid() with it->first for it of type ObjectIterator
Greg Farnum [Tue, 16 Jun 2015 15:13:41 +0000 (08:13 -0700)]
qa: update to newer Linux tarball
This should make newer gcc releases happier in their default configuration.
kernel.org is now distributing tarballs as .xz files so we change to that
as well when decompressing (it is supported by Ubuntu Precise so we should
be all good).
Sage Weil [Wed, 3 Jun 2015 18:57:34 +0000 (14:57 -0400)]
upstart: limit respawn to 3 in 30 mins (instead of 5 in 30s)
It may take tens of seconds to restart each time, so 5 in 30s does not stop
the crash-on-startup respawn loop in many cases. In particular, we'd like
to catch the case where the internal heartbeats fail.
This should be enough for all but the most sluggish of OSDs and capture
many cases of failure shortly after startup.
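The effect of the two limits can be modeled with a small sketch. This is a simplified sliding-window model written for illustration, not Upstart's actual accounting; the function name gives_up_at and the 20-second crash spacing are assumptions.

```python
from collections import deque


def gives_up_at(crash_times, limit, interval):
    """Return the time at which a supervisor stops respawning, or None.

    Simplified model (an assumption, not Upstart's real code): the
    supervisor gives up once more than `limit` crashes fall inside the
    trailing `interval` seconds.
    """
    window = deque()
    for t in crash_times:
        window.append(t)
        # Forget crashes that are older than the interval.
        while window and window[0] <= t - interval:
            window.popleft()
        if len(window) > limit:
            return t
    return None


# A daemon that needs 20 s per restart and crashes every time.
crashes = [20 * i for i in range(100)]
old_limit = gives_up_at(crashes, limit=5, interval=30)
new_limit = gives_up_at(crashes, limit=3, interval=1800)
```

Under this model the old limit never trips, because at most two crashes fit in any 30-second window, while the new limit stops the loop at the fourth crash.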
Samuel Just [Thu, 27 Aug 2015 18:08:33 +0000 (11:08 -0700)]
PG::handle_advance_map: on_pool_change after handling the map change
Otherwise, the is_active() checks in the hitset code can erroneously
return true, firing off repops stamped with the new epoch, which then get
cleared in the map change code. The filestore callbacks then pass the
interval check and call into a destroyed repop structure.
mon: MonitorDBStore: make get_next_key() work properly
We introduced a significant bug with 2cc7aee, when we fixed issue #11786.
Although that patch would fix the problem described in #11786, we
managed to not increment the iterator upon returning the current key.
This would have the iterator iterating over the same key, forever and
ever.
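A minimal sketch of the failure mode, using a hypothetical KeyCursor class rather than the real MonitorDBStore iterator: the buggy variant returns the current key without advancing, so every call yields the same key.

```python
class KeyCursor:
    """Hypothetical stand-in for a store iterator (not Ceph's API)."""

    def __init__(self, keys):
        self._keys = sorted(keys)
        self._pos = 0

    def valid(self):
        return self._pos < len(self._keys)

    def get_next_key_buggy(self):
        # Returns the current key but never advances the cursor.
        if not self.valid():
            return None
        return self._keys[self._pos]

    def get_next_key(self):
        # Returns the current key, then moves past it.
        if not self.valid():
            return None
        key = self._keys[self._pos]
        self._pos += 1
        return key


def drain(cursor, limit=10):
    """Collect keys until the cursor is exhausted or `limit` is reached."""
    out = []
    while len(out) < limit:
        key = cursor.get_next_key()
        if key is None:
            break
        out.append(key)
    return out
```

With the fixed method, drain() terminates once every key has been returned; a loop built on the buggy method would only stop at its safety limit.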
Sage Weil [Sun, 9 Aug 2015 14:46:10 +0000 (10:46 -0400)]
osd/PGLog: dirty_to is inclusive
There are only two callers of mark_dirty_to that do not pass max,
and they are both in the merge_log extending tail path. In that
case, we want to include the last version specified in the log
writeout. Fix the tail extending code to always specify the
last entry added, inclusive.
Josh Durgin [Mon, 24 Aug 2015 22:40:39 +0000 (15:40 -0700)]
config: skip lockdep for intentionally recursive md_config_t lock
lockdep can't handle recursive locks, resulting in false positive
reports for certain set_val_or_die() calls, like via
md_config_t::parse_argv() passed "-m".
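The idea can be sketched with a toy checker. The LockDep and Mutex classes below are hypothetical illustrations, not Ceph's classes; the lockdep flag is an assumed name for "do not register this lock with the checker".

```python
import threading


class LockDep:
    """Toy checker that rejects taking a tracked lock twice."""

    def __init__(self):
        self.held = set()

    def will_lock(self, name):
        if name in self.held:
            raise RuntimeError("recursive lock of %s" % name)
        self.held.add(name)

    def will_unlock(self, name):
        self.held.discard(name)


class Mutex:
    """Recursive mutex that can opt out of the checker (assumed flag)."""

    def __init__(self, name, checker, lockdep=True):
        self.name = name
        self.checker = checker
        self.lockdep = lockdep
        self._lock = threading.RLock()

    def __enter__(self):
        if self.lockdep:
            self.checker.will_lock(self.name)
        self._lock.acquire()
        return self

    def __exit__(self, *exc):
        self._lock.release()
        if self.lockdep:
            self.checker.will_unlock(self.name)
        return False


def nested_ok(mutex):
    """Return True if taking the mutex twice raises no false positive."""
    try:
        with mutex:
            with mutex:
                pass
        return True
    except RuntimeError:
        return False
```

The underlying lock is recursive in both cases; only the checker's bookkeeping differs, which is why skipping it removes the false positive without changing locking behavior.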
Loic Dachary [Thu, 13 Aug 2015 11:47:24 +0000 (13:47 +0200)]
osd: trigger the cache agent after a promotion
When a proxy read happens, the object promotion is done in parallel. The
agent_choose_mode function must be called to reconsider the situation
to protect against the following scenario:
* proxy read
* agent_choose_mode finds no object exists and the agent
goes idle
* object promotion happens
* the agent does not reconsider and eviction does not happen
although it should
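The scenario above can be sketched as a toy state machine. CacheAgent, its fields, and the mode names are hypothetical and far simpler than the real agent; the sketch only shows why the mode must be re-evaluated after a promotion.

```python
class CacheAgent:
    """Toy model of a cache agent (hypothetical, not the OSD code)."""

    def __init__(self):
        self.num_objects = 0
        self.mode = "idle"

    def choose_mode(self):
        # With no objects there is nothing to evict, so go idle.
        self.mode = "evict" if self.num_objects > 0 else "idle"

    def finish_promote(self, reconsider):
        # A promotion adds an object to the cache pool.
        self.num_objects += 1
        if reconsider:
            self.choose_mode()


def run_scenario(reconsider):
    agent = CacheAgent()
    agent.choose_mode()               # proxy read: no object yet, agent idles
    agent.finish_promote(reconsider)  # promotion completes in parallel
    return agent.mode
```

Without the extra call the agent stays idle even though the pool now holds an object; with it, the agent switches to eviction.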
Yehuda Sadeh [Wed, 26 Aug 2015 21:34:30 +0000 (14:34 -0700)]
rgw: init some manifest fields when handling explicit objs
Fixes: #11455
When dealing with an old manifest that has explicit objs, we also
need to set the head size and head object correctly so that
code that relies on this info doesn't break.
Jason Dillaman [Fri, 21 Aug 2015 15:32:39 +0000 (11:32 -0400)]
Objecter: pg_interval_t::is_new_interval needs pgid from previous pool
When increasing the pg_num of a pool, an assert would fail since the
calculated pgid seed would be for the pool's new pg_num value instead
of the previous pg_num value.
Fixes: #10399
Backport: infernalis, hammer, firefly
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit f20f7a23e913d09cc7fc22fb3df07f9938ddc144)
Conflicts: (hobject_t sort order not backported, trivial resolution)
src/osdc/Objecter.cc
src/osdc/Objecter.h
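Why the previous pg_num matters can be seen in a short sketch modeled on Ceph's ceph_stable_mod() helper. The Python names stable_mod, pg_num_mask, and pg_seed are illustrative, and the hash value 13 is an arbitrary example.

```python
def stable_mod(x, b, bmask):
    """Map x into [0, b) so that growing b only splits existing buckets."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def pg_num_mask(pg_num):
    """Smallest mask of the form 2^n - 1 that covers pg_num - 1."""
    return (1 << (pg_num - 1).bit_length()) - 1


def pg_seed(raw_hash, pg_num):
    """Placement seed for an object hash under a given pg_num."""
    return stable_mod(raw_hash, pg_num, pg_num_mask(pg_num))


raw_hash = 13
old_seed = pg_seed(raw_hash, 8)    # seed while the pool had pg_num = 8
new_seed = pg_seed(raw_hash, 16)   # seed after pg_num grew to 16
```

Here the new seed names a placement group that did not exist under the old pg_num, so reasoning about the previous interval has to use the seed computed from the previous value.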
Dan van der Ster [Tue, 18 Nov 2014 14:51:46 +0000 (15:51 +0100)]
ceph-disk: don't change the journal partition uuid
We observe that the new /dev/disk/by-partuuid/<journal_uuid>
symlink is not always created by udev when reusing a journal
partition. Fix by not changing the uuid of a journal partition
in this case -- we can reuse the existing uuid (and
journal_symlink) instead. We also now assert that the symlink
exists before further preparing the OSD.
Fixes: #10146
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Tested-by: Dan van der Ster <daniel.vanderster@cern.ch>
(cherry picked from commit 29eb1350b4acaeabfe1d2b19efedbce22641d8cc)
Dan van der Ster [Mon, 29 Sep 2014 11:20:10 +0000 (13:20 +0200)]
ceph-disk: set guid if reusing a journal partition
When reusing a journal partition (e.g. /dev/sda2) we should set a
new partition guid and link it correctly with the OSD. This way
the journal is symlinked by its persistent name and ceph-disk list
works correctly.
Xinze Chi [Fri, 3 Jul 2015 10:27:13 +0000 (18:27 +0800)]
mon/PGMonitor: bug fix pg monitor get crush rule
When some rules have been deleted, the index into the crush->rules array
does not always equal the pool's crush_ruleset.
Fixes: #12210
Reported-by: Ning Yao <zay11022@gmail.com>
Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
(cherry picked from commit 498793393c81c0a8e37911237969fba495a3a183)
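The mismatch between array index and ruleset id can be illustrated with a sketch. The rule table below is made-up data and find_rule_index is a hypothetical helper, not the CRUSH API.

```python
def find_rule_index(rules, ruleset):
    """Return the array index of the rule with this ruleset id, or -1."""
    for index, rule in enumerate(rules):
        if rule is not None and rule["ruleset"] == ruleset:
            return index
    return -1


# Made-up table: after a deletion and a later addition, slot 1 holds
# a rule whose ruleset id is 3, so index and id no longer line up.
rules = [
    {"name": "replicated", "ruleset": 0},
    {"name": "added-later", "ruleset": 3},
    {"name": "ec", "ruleset": 2},
]

pool_ruleset = 3
index = find_rule_index(rules, pool_ruleset)
```

Using pool_ruleset directly as an index would read past the end of this table, whereas the lookup by id finds slot 1.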
wuxingyi [Wed, 11 Mar 2015 09:34:40 +0000 (17:34 +0800)]
rgw/logrotate.conf: Rename service name
The service name for the Ceph RADOS gateway was changed to "ceph-radosgw".
The previous service name, "radosgw", causes the reload to fail, which
ultimately makes it impossible to write any log data to the log file.
Conflicts:
qa/workunits/cephtool/test.sh
no "# make sure we can't clobber snapshot state" tests in firefly
src/mon/OSDMonitor.cc
no tp->removed_snaps.empty() in firefly
mon: MonitorDBStore: get_next_key() only if prefix matches
get_next_key() had a bug in which we would always return the first key
from the iterator, regardless of whether its prefix had been specified
to the iterator.
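A minimal sketch of the intended behavior, assuming keys are (prefix, name) pairs in sorted order. The function signature and the example prefixes are illustrative, not the MonitorDBStore interface.

```python
def get_next_key(keys, pos, prefixes):
    """Return (key, new_pos) for the next key whose prefix was requested.

    Keys with other prefixes are skipped. When nothing matches,
    returns (None, len(keys)).
    """
    while pos < len(keys):
        prefix, name = keys[pos]
        pos += 1
        if prefix in prefixes:
            return (prefix, name), pos
    return None, pos


keys = [("auth", "1"), ("logm", "1"), ("paxos", "1"), ("paxos", "2")]
wanted = {"paxos"}

first, pos = get_next_key(keys, 0, wanted)
second, pos = get_next_key(keys, pos, wanted)
third, pos = get_next_key(keys, pos, wanted)
```

The buggy behavior described above corresponds to returning ("auth", "1") on the first call even though only "paxos" keys were requested.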
mon: PaxosService: call post_refresh() instead of post_paxos_update()
Whenever the monitor finishes committing a proposal, we call
Monitor::refresh_from_paxos() to nudge the services to refresh. Once
all services have refreshed, we would then call each service's
post_paxos_update().
However, due to an unfortunate, non-critical bug, some services (mainly
the LogMonitor) could have messages pending in their
'waiting_for_finished_proposal' callback queue [1], and we need to nudge
those callbacks.
This patch adds a new step during the refresh phase: instead of directly
calling the service's post_paxos_update(), we introduce a
PaxosService::post_refresh() which will call the service's
post_paxos_update() function first and then nudge those callbacks when
appropriate.
[1] - Given the monitor will send MLog messages to itself, and given the
service is not readable before its initial state is proposed and
committed, some of the initial MLogs would be stuck waiting for the
proposal to finish. However, by design, we only nudge those messages'
callbacks when an election finishes or, if the leader, when the proposal
finishes. On peons, however, we would only nudge those callbacks if an
election happened to be triggered, hence the need for an alternate path
to retry any message waiting for the initial proposal to finish.
Samuel Just [Fri, 24 Jul 2015 22:38:18 +0000 (15:38 -0700)]
Log::reopen_log_file: take m_flush_mutex
Otherwise, _flush() might continue to write to m_fd after it's closed.
This might cause log data to go to a data object if the filestore then
reuses the fd during that time.
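The locking pattern can be sketched as follows. This Log class is a simplified illustration, not Ceph's implementation; the point is that flush and reopen share one lock, so a flush can never write to a descriptor that reopen has already closed.

```python
import os
import threading


class Log:
    """Simplified logger whose flush and reopen paths share a lock."""

    def __init__(self, path):
        self._path = path
        self._flush_lock = threading.Lock()
        self._fd = self._open()

    def _open(self):
        flags = os.O_CREAT | os.O_WRONLY | os.O_APPEND
        return os.open(self._path, flags, 0o644)

    def flush(self, lines):
        # Holding the lock guarantees self._fd stays open while writing.
        with self._flush_lock:
            for line in lines:
                os.write(self._fd, line.encode() + b"\n")

    def reopen(self):
        # Close and reopen under the same lock, so no flush can be
        # in the middle of writing to the old descriptor.
        with self._flush_lock:
            os.close(self._fd)
            self._fd = self._open()

    def close(self):
        with self._flush_lock:
            os.close(self._fd)
```

A typical use is log rotation: the file is renamed, reopen() is called, and later flushes land in the newly created file.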