Loic Dachary [Tue, 3 Feb 2015 15:14:23 +0000 (16:14 +0100)]
ceph.spec.in: junit always except for EPEL 6
The package was renamed a long time ago (around the Fedora 15
timeframe). The "junit4" name is only relevant for EPEL 6. For EPEL 7
and Fedora 20, the "junit" package has "Provides: junit4". And most
recently, in the junit package that ships in Fedora 21 and 22, the
package maintainer dropped the old Provides: line.
Greg Farnum [Tue, 16 Jun 2015 15:13:41 +0000 (08:13 -0700)]
qa: update to newer Linux tarball
This should make newer gcc releases happier in their default configuration.
kernel.org is now distributing tarballs as .xz files so we change to that
as well when decompressing (it is supported by Ubuntu Precise so we should
be all good).
Xinze Chi [Fri, 3 Jul 2015 10:27:13 +0000 (18:27 +0800)]
mon/PGMonitor: bug fix pg monitor get crush rule
When some rules have been deleted earlier, the index into the crush->rules
array is not always equal to the crush_ruleset of the pool.
Fixes: #12210
Reported-by: Ning Yao <zay11022@gmail.com>
Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
(cherry picked from commit 498793393c81c0a8e37911237969fba495a3a183)
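The mismatch described in this commit can be illustrated with a minimal Python sketch. The `Rule` class, the `find_rule_by_ruleset` helper, and the array layout are hypothetical and are not Ceph's API; the sketch only shows why a lookup has to match on the ruleset id once array positions and ruleset ids have diverged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    ruleset: int  # the id a pool refers to through its crush_ruleset
    name: str


def find_rule_by_ruleset(rules: List[Optional[Rule]], ruleset: int) -> Optional[Rule]:
    """Scan the rule array and match on the ruleset id, not the array index."""
    for rule in rules:
        if rule is not None and rule.ruleset == ruleset:
            return rule
    return None


# Hypothetical layout after a deletion: slot 1 is empty, and the rule
# whose ruleset is 1 lives at index 2.
rules = [Rule(0, "default"), None, Rule(1, "ssd")]
pool_crush_ruleset = 1

by_index = rules[pool_crush_ruleset]                        # wrong: deleted slot
by_ruleset = find_rule_by_ruleset(rules, pool_crush_ruleset)  # intended rule
```

Indexing the array directly with the pool's crush_ruleset lands on the deleted slot here, while the scan finds the rule the pool actually refers to.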
wuxingyi [Wed, 11 Mar 2015 09:34:40 +0000 (17:34 +0800)]
rgw/logrotate.conf: Rename service name
The service name for the ceph rados gateway was changed to "ceph-radosgw".
The previous service name, "radosgw", would cause the reload to fail and
ultimately make it impossible to write any log data to the log file.
Conflicts:
qa/workunits/cephtool/test.sh
no "# make sure we can't clobber snapshot state" tests in firefly
src/mon/OSDMonitor.cc
no tp->removed_snaps.empty() in firefly
mon: PaxosService: call post_refresh() instead of post_paxos_update()
Whenever the monitor finishes committing a proposal, we call
Monitor::refresh_from_paxos() to nudge the services to refresh. Once
all services have refreshed, we would then call each service's
post_paxos_update().
However, due to an unfortunate, non-critical bug, some services (mainly
the LogMonitor) could have messages pending in their
'waiting_for_finished_proposal' callback queue [1], and we need to nudge
those callbacks.
This patch adds a new step during the refresh phase: instead of calling
the service's post_paxos_update() directly, we introduce a
PaxosService::post_refresh() which will call the service's
post_paxos_update() function first and then nudge those callbacks when
appropriate.
[1] - Given the monitor will send MLog messages to itself, and given the
service is not readable before its initial state is proposed and
committed, some of the initial MLogs would be stuck waiting for the
proposal to finish. However, by design, we only nudge those messages'
callbacks when an election finishes or, if the leader, when the proposal
finishes. On peons, however, we would only nudge those callbacks if an
election happened to be triggered, hence the need for an alternate path
to retry any message waiting for the initial proposal to finish.
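The ordering this commit describes can be sketched as follows. This is a toy Python model, not the monitor's C++ code: `PaxosServiceSketch` and its `readable` flag are assumptions made for illustration, and "when appropriate" is modeled simply as "the service is readable".

```python
from typing import Callable, List


class PaxosServiceSketch:
    """Toy stand-in for a monitor service; not the real PaxosService."""

    def __init__(self) -> None:
        self.readable = False
        self.waiting_for_finished_proposal: List[Callable[[], None]] = []
        self.calls: List[str] = []

    def post_paxos_update(self) -> None:
        self.calls.append("post_paxos_update")

    def post_refresh(self) -> None:
        # First run the service's own post-update hook ...
        self.post_paxos_update()
        # ... then retry callbacks still waiting on a finished proposal.
        # Assumption: "when appropriate" means the service is readable.
        if self.readable and self.waiting_for_finished_proposal:
            pending = self.waiting_for_finished_proposal
            self.waiting_for_finished_proposal = []
            for callback in pending:
                callback()
```

Swapping the pending list out before running it keeps a callback that re-queues itself from being invoked again in the same pass.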
Samuel Just [Fri, 24 Jul 2015 22:38:18 +0000 (15:38 -0700)]
Log::reopen_log_file: take m_flush_mutex
Otherwise, _flush() might continue to write to m_fd after it's closed.
This might cause log data to go to a data object if the filestore then
reuses the fd during that time.
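A minimal Python sketch of the locking pattern, assuming a single mutex shared by the flush and reopen paths. `LogSketch` is a hypothetical class, not Ceph's `Log`; it only shows that closing and reassigning the descriptor under the same lock as the write prevents a flush from writing to a descriptor that has already been closed.

```python
import os
import threading


class LogSketch:
    """Toy logger whose flush and reopen paths share one mutex."""

    _FLAGS = os.O_WRONLY | os.O_CREAT | os.O_APPEND

    def __init__(self, path: str) -> None:
        self._path = path
        self._flush_mutex = threading.Lock()
        self._fd = os.open(path, self._FLAGS, 0o644)

    def flush(self, data: bytes) -> None:
        # The write happens under the mutex, so the fd cannot be closed
        # (and its number reused elsewhere) in the middle of it.
        with self._flush_mutex:
            os.write(self._fd, data)

    def reopen_log_file(self) -> None:
        # Close and reassign under the same mutex as flush().
        with self._flush_mutex:
            os.close(self._fd)
            self._fd = os.open(self._path, self._FLAGS, 0o644)

    def close(self) -> None:
        with self._flush_mutex:
            os.close(self._fd)
```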
Jason Dillaman [Wed, 8 Apr 2015 23:06:52 +0000 (19:06 -0400)]
librbd: avoid blocking AIO API methods
Enqueue all AIO API methods within the new librbd thread pool to
reduce the possibility of any blocking operations. To maintain
backwards compatibility with the legacy return codes of the API's
AIO methods, it's still possible to block attempting to acquire
the snap_lock.
Haomai Wang [Mon, 1 Dec 2014 15:54:16 +0000 (23:54 +0800)]
CephContext: Add AssociatedSingletonObject to allow CephContext's singleton
If an object associated with CephContext wants to be created as a singleton,
it can inherit from AssociatedSingletonObject and implement a destructor to
be notified on destruction.
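The idea can be sketched in Python as a context that owns named singletons and notifies them when it is torn down. All names here (`ContextSketch`, `lookup_or_create`, `on_context_destroy`) are hypothetical and chosen for illustration; the commit itself does not spell out the C++ interface.

```python
from typing import Callable, Dict, List


class AssociatedSingletonObject:
    """Base class for objects whose lifetime is tied to a context."""

    def on_context_destroy(self) -> None:
        """Hook invoked when the owning context is destroyed."""


class ContextSketch:
    """Toy context that owns at most one object per name."""

    def __init__(self) -> None:
        self._singletons: Dict[str, AssociatedSingletonObject] = {}

    def lookup_or_create(
        self, name: str, factory: Callable[[], AssociatedSingletonObject]
    ) -> AssociatedSingletonObject:
        # Create the object on first use, then always return the same one.
        if name not in self._singletons:
            self._singletons[name] = factory()
        return self._singletons[name]

    def destroy(self) -> None:
        # Notify every associated object, then drop the references.
        for obj in self._singletons.values():
            obj.on_context_destroy()
        self._singletons.clear()


events: List[str] = []


class Worker(AssociatedSingletonObject):
    def on_context_destroy(self) -> None:
        events.append("worker destroyed")
```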
Changed $CEPH_MON to 127.0.0.1 -- the CEPH_MON was introduced after
firefly to allow tests to run in parallel. Back in firefly all tests
use the same port because 127.0.0.1 was hardcoded. We can't
conveniently backport all that's necessary for tests to run in
parallel, therefore we keep the 127.0.0.1 hardcoded.
Conflicts:
src/test/mon/osd-pool-create.sh
TEST_no_pool_delete() follows a different test than in master
Conflicts:
src/librados/IoCtxImpl.cc
In firefly, return value of objecter->pg_read() is not assigned to c->tid.
src/osdc/Objecter.cc
src/osdc/Objecter.h
There is no _op_submit_with_budget() function in firefly.
There is no Objecter::_finish_op() function in firefly.
In firefly, _take_op_budget() is called take_op_budget().
This fixes a problem wherein Calamari does not provide
popup drill-downs for warnings or errors when the summary
is missing.
Calamari gets health info from /api/v1/cluster/$FSID/health.
If the data here has a summary field, this summary is provided
in a popup window:
/api/v1/cluster/$FSID/health is populated (ultimately) with
status obtained via librados python bindings from the ceph
cluster. In the case where there's clock skew, the summary
field supplied by the ceph cluster is empty.
No summary field, no popup window with more health details.
Sage Weil [Wed, 18 Mar 2015 20:49:20 +0000 (13:49 -0700)]
os/chain_xattr: handle read on chunk-aligned xattr
If we wrote an xattr that was a multiple of a chunk, we will try to read
the next chunk and get ENODATA. If that happens bail out of the loop and
assume we've read the whole thing.
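The read loop can be sketched in Python under a few stated assumptions: the chunk size, the `name@N` naming scheme, and the dictionary-backed `get_chunk` helper are all hypothetical stand-ins for the real per-chunk xattr calls. The sketch shows the case from this commit: a value whose length is an exact multiple of the chunk size ends with a full chunk, so the reader probes one more chunk and must treat ENODATA as the end of the value.

```python
import errno
from typing import Dict

CHUNK = 4  # hypothetical chunk size, kept tiny for illustration


def chunk_name(name: str, index: int) -> str:
    # Hypothetical naming scheme for the chunks of one logical xattr.
    return name if index == 0 else f"{name}@{index}"


def get_chunk(store: Dict[str, bytes], key: str) -> bytes:
    """Stand-in for a per-chunk getxattr call; raises ENODATA if missing."""
    if key not in store:
        raise OSError(errno.ENODATA, "no such attribute")
    return store[key]


def chain_read(store: Dict[str, bytes], name: str) -> bytes:
    out = bytearray()
    index = 0
    while True:
        try:
            part = get_chunk(store, chunk_name(name, index))
        except OSError as e:
            if e.errno == errno.ENODATA and index > 0:
                # The previous chunk was full and was also the last one.
                break
            raise
        out += part
        if len(part) < CHUNK:
            break  # a short chunk marks the end of the value
        index += 1
    return bytes(out)
```

ENODATA on the very first chunk is still propagated, since that means the attribute does not exist at all.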
Yehuda Sadeh [Thu, 25 Jun 2015 21:31:03 +0000 (14:31 -0700)]
rgw: error out if frontend did not send all data
Fixes: #11851
The civetweb mg_write() doesn't return an error when it can't flush all data
to the user; it just returns the total number of bytes written. Modified the
client io to return the total number of bytes and to return an error if it
didn't send anything.
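A hedged Python sketch of the return-value convention described in this commit. The `write_data` wrapper and the choice of `-EIO` as the error code are assumptions made for illustration; the `write` callable stands in for a frontend call that reports how many bytes it managed to send.

```python
import errno
from typing import Callable


def write_data(write: Callable[[bytes], int], buf: bytes) -> int:
    """Return the number of bytes sent, or a negative errno if none were.

    `write` is a stand-in for the frontend's write call, which reports the
    byte count instead of failing when it cannot flush everything.
    """
    if not buf:
        return 0
    written = write(buf)
    if written <= 0:
        # Assumption: report a generic I/O error when nothing was sent.
        return -errno.EIO
    return written
```

A caller can compare the returned count with `len(buf)` to detect a partial send, and treat a negative value as a hard failure.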
Thorsten Behrens [Wed, 10 Dec 2014 10:53:43 +0000 (11:53 +0100)]
Unconditionally chown rados log file.
This fixes bnc#905047 (in a somewhat ad-hoc way). Sadly the log
file gets created from several places, so its existence does not
mean init-radosgw had actually run.
Nathan Cutler [Thu, 25 Jun 2015 20:37:52 +0000 (22:37 +0200)]
ceph.spec.in: use _udevrulesdir to eliminate conditionals
The conditionals governing where 50-rbd.rules is installed were not doing the
right thing on SUSE distros.
Start using the %_udevrulesdir RPM macro, while taking care that it is defined
and set to the right value. Use it to eliminate some conditionals around other
udev rules files as well.
Nathan Cutler [Tue, 16 Jun 2015 16:27:20 +0000 (18:27 +0200)]
ceph.spec.in: python-argparse only in Python 2.6
argparse is a widely-used Python module for parsing command-line arguments.
Ceph makes heavy use of Python scripts, both in the build environment and on
cluster nodes and clients.
Until Python 2.6, argparse was distributed separately from Python proper.
As of 2.7 it is part of the Python standard library.
Although the python package in a given distro may or may not Provide:
python-argparse, this cannot be relied upon.
Therefore, this commit puts appropriate conditionals around Requires:
python-argparse and BuildRequires: python-argparse. It does so for Red
Hat/CentOS and SUSE only, because the last Fedora version with Python 2.6
was Fedora 13, which is EOL.
argparse is required by both the ceph and ceph-common packages, but since ceph
requires ceph-common, the argparse Requires and BuildRequires need only appear
once, under ceph-common.
Signed-off-by: Zhiqiang Wang <wonzhq@hotmail.com>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 1417eded65f40bdb2a49c8252fcfffa383a7b965)
Sage Weil [Fri, 11 Jul 2014 18:31:22 +0000 (11:31 -0700)]
osd/osd_types: be pedantic about encoding last_force_op_resend without feature bit
The addition of the value is completely backward compatible, but if the
mon feature bits don't match it can cause monitor scrub noise (due to the
parallel OSDMap encoding). Avoid that by only adding the new field if the
feature (which was added 2 patches after the encoding, see
3152faf79f498a723ae0fe44301ccb21b15a96ab and
45e79a17a932192995f8328ae9f6e8a2a6348d10) is present.
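The approach can be sketched in Python as an encoder that appends the newer field only when a feature bit is set. `FEATURE_SKETCH_BIT`, the field layout, and `encode_pool` are hypothetical and do not reflect Ceph's actual encoding; the sketch only shows that the output then depends on the feature set alone, not on whether the encoder knows about the field.

```python
import struct

FEATURE_SKETCH_BIT = 1 << 0  # hypothetical feature bit, not a Ceph constant


def encode_pool(last_change: int, last_force_op_resend: int, features: int) -> bytes:
    """Encode a toy pool record, gating the newer field on a feature bit."""
    out = struct.pack("<I", last_change)
    if features & FEATURE_SKETCH_BIT:
        # Only emit the new field when the feature bit is present.
        out += struct.pack("<I", last_force_op_resend)
    return out
```

With the bit cleared, the encoded bytes are identical no matter what value the new field holds, which is the property that avoids spurious mismatches between encoders.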
Kefu Chai [Fri, 15 May 2015 14:50:36 +0000 (22:50 +0800)]
mon: always reply mdsbeacon
The MDS (Beacon) always expects a reply to its mdsbeacon messages from
the lead mon, and it uses the delay as a metric for the laggy-ness of the
Beacon. When it comes to the MDSMonitor on a peon, it will remove the route
session on seeing a reply (route message) from the leader, so a reply to
mdsbeacon will stop the peon from resending the mdsbeacon request to the
leader.
If the MDSMonitor re-forwards the unreplied requests after they are
outdated, there is a chance that requests reflecting an old or even wrong
state of the MDSs will mislead the lead monitor. For example, the MDSs which
sent the outdated messages could be dead.