Ken Dreyer [Tue, 14 Apr 2015 13:58:17 +0000 (07:58 -0600)]
debian: move ceph_argparse into ceph-common
Prior to this commit, if a user installed the "ceph-common" Debian
package without installing "ceph", then /usr/bin/ceph would crash
because it was missing the ceph_argparse library.
Ship the ceph_argparse library in "ceph-common" instead of "ceph". (This
was the intention of the original commit that moved argparse to "ceph", 2a23eac54957e596d99985bb9e187a668251a9ec)
http://tracker.ceph.com/issues/11388 Refs: #11388
Reported-by: Jens Rosenboom <j.rosenboom@x-ion.de> Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
(cherry picked from commit 110608e5bdd9e2f03020ad41f0c2d756684d4417)
Conflicts:
debian/ceph.install
There is no ceph_daemon.py in hammer
debian/control
Depends/Replaces/Breaks version adapted (from 9.0.0 to 0.94.1.6)
Samuel Just [Fri, 24 Jul 2015 22:38:18 +0000 (15:38 -0700)]
Log::reopen_log_file: take m_flush_mutex
Otherwise, _flush() might continue to write to m_fd after it's closed.
This might cause log data to go to a data object if the filestore then
reuses the fd during that time.
Kefu Chai [Mon, 25 May 2015 12:14:32 +0000 (20:14 +0800)]
mon: validate new crush for unknown names
* the "osd tree dump" command enumerates all buckets/osds found in either the
crush map or the osd map. but the newly set crushmap is not validated for
the dangling references, so we need to check to see if any item in new crush
map is referencing unknown type/name when a new crush map is sent to
monitor, reject it if any.
Kefu Chai [Tue, 26 May 2015 04:08:09 +0000 (12:08 +0800)]
crush/CrushTester: add check_name_maps() method
* check for dangling bucket name or type names referenced by the
buckets/items in the crush map.
* also check for the references from Item(0, 0, 0) which does not
necessarily exist in the crush map under testing. the rationale
behind this is: the "ceph osd tree" will also print stray OSDs
whose id is greater or equal to 0. so it would be useful to
check if the crush map offers the type name indexed by "0"
(the name of OSDs is always "OSD.{id}", so we don't need to
look up the name of an OSD item in the crushmap).
Yehuda Sadeh [Wed, 17 Jun 2015 18:35:18 +0000 (11:35 -0700)]
rgw: fix assignment of copy obj attributes
Fixes: #11563
Clarify the confusing usage of set_copy_attrs() by switching the source and
destinatiion params (attrs, src_attrs). Switch to use attrs instead of
src_attrs afterwards. In one of the cases we originally used the wrong
variable.
max_req_id was moved to RGWRados and changed to atomic64_t.
The same request id resulted in gc giving the same idtag to all objects
resulting in a leakage of rados objects. It only kept the last deleted object in
it's queue, the previous objects were never freed.
Jason Dillaman [Wed, 8 Apr 2015 23:06:52 +0000 (19:06 -0400)]
librbd: avoid blocking AIO API methods
Enqueue all AIO API methods within the new librbd thread pool to
reduce the possibility of any blocking operations. To maintain
backwards compatibility with the legacy return codes of the API's
AIO methods, it's still possible to block attempting to acquire
the snap_lock.
Ken Dreyer [Fri, 13 Mar 2015 22:08:35 +0000 (16:08 -0600)]
packaging: include ceph_perf_objectstore
The /usr/bin/ceph_perf_objectstore file is installed by default. Prior
to this commit it was missing from the packaging. This caused the RPM to
fail to build in mock.
Add ceph_perf_objectstore to the "ceph-test" RPM and Debian package.
If we end up developing further ceph_perf_* utilities, it would make
sense to glob them all with a wildcard, similar to what we are doing
with all the ceph_test_* utilities in ceph-test.
Yehuda Sadeh [Thu, 14 May 2015 00:05:22 +0000 (17:05 -0700)]
rgw: merge manifests correctly when there's prefix override
Fixes: #11622
Backport: hammer, firefly
Prefix override happens in a manifest when a rados object does not
conform to the generic prefix set on the manifest. When merging
manifests (specifically being used in multipart objects upload), we need
to check if the rule that we try to merge has a prefix that is the same
as the previous rule. Beforehand we checked if both had the same
override_prefix setting, but that might not apply as both manifests
might have different prefixes.
Jon Bernard [Fri, 8 May 2015 15:54:06 +0000 (11:54 -0400)]
common/admin_socket: close socket descriptor in destructor
Long-running processes that do not reuse a single client connection will
see accumulating file descriptors as a result of not closing the
listening socket. In this case, eventually the system will reach
file-max and subsequent connections will fail.
Samuel Just [Tue, 21 Apr 2015 06:45:57 +0000 (23:45 -0700)]
OSD: handle the case where we resurrected an old, deleted pg
Prior to giant, we would skip pgs in load_pgs which were not present in
the current osdmap. Those pgs would eventually refer to very old
osdmaps, which we no longer have causing the assertion failure in 11429
once the osd is finally upgraded to a version which does not skip the
pgs. Instead, if we do not have the map for the pg epoch, complain to
the osd log and skip the pg.
Objects that start with underscore need to have an object locator,
this is due to an old behavior that we need to retain. Some objects
might have been created without the locator. This tool creates a new
rados object with the appropriate locator.
Yehuda Sadeh [Fri, 27 Mar 2015 23:32:48 +0000 (16:32 -0700)]
rgw: generate new tag for object when setting object attrs
Fixes: #11256
Backport: firefly, hammer
Beforehand we were reusing the object's tag, which is problematic as
this tag is used for bucket index updates, and we might be clobbering a
racing update (like object removal).
Ken Dreyer [Wed, 22 Apr 2015 22:36:42 +0000 (16:36 -0600)]
init-radosgw: run RGW as root
The ceph-radosgw service fails to start if the httpd package is not
installed. This is because the init.d file attempts to start the RGW
process with the "apache" UID. If a user is running civetweb, there is
no reason for the httpd or apache2 package to be present on the system.
Switch the init scripts to use "root" as is done on Ubuntu.
http://tracker.ceph.com/issues/11453 Refs: #11453
Reported-by: Vickey Singh <vickey.singh22693@gmail.com> Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
(cherry picked from commit 47339c5ac352d305e68a58f3d744c3ce0fd3a2ac)
Boris Ranto [Mon, 13 Apr 2015 10:36:07 +0000 (12:36 +0200)]
Rework mds/Makefile.am to support a dencoder client build
The patch adds all the mds sources to DENCODER_SOURCES to allow a
dencoder client build. The patch also splits the Makefile.am file to
better accomodate the change.
Dencoder is built if ENABLE_CLIENT is set. However, the rgw/Makefile.am
populated DENCODER_SOURCES only if WITH_RADOSGW was set. The patch fixes
this and populates DENCODER_SOURES if ENABLE_CLIENT is set.
Sage Weil [Fri, 10 Apr 2015 15:43:45 +0000 (08:43 -0700)]
crush: fix has_v4_buckets()
alg, not type!
This bug made us incorrectly think we were using v4 features when user type
5 was being used. That's currently 'rack' with recent crush maps, but
was other types for clusters that were created with older versions. This
is clearly problematic as it will lock out non-hammer clients incorrectly,
breaking deployments on upgrade.
Guang Yang [Thu, 26 Feb 2015 08:13:12 +0000 (08:13 +0000)]
osd: fix negative degraded objects during backfilling
When there is deleting requests during backfilling, the reported number of degraded
objects could be negative, as the primary's num_objects is the latest (locally) but
the number for replicas might not reflect the deletings. A simple fix is to ignore
the negative subtracted value.
This can be done better in a separate script, which puts these in
CEPH_EXTRA_CONFIGURE_ARGS. In particular, this lets us enable
lttng for gitbuilder builds, but not release builds.
Jason Dillaman [Mon, 16 Mar 2015 22:40:49 +0000 (18:40 -0400)]
librbd: snap_remove should ignore -ENOENT errors
If the attempt to deregister the snapshot from the parent
image fails with -ENOENT, ignore the error as it is safe
to assume that the child is not associated with the parent.
Samuel Just [Thu, 26 Mar 2015 17:26:48 +0000 (10:26 -0700)]
ReplicatedPG::cancel_pull: requeue waiters as well
If we are in recovery_wait, we might not recover that object as part of
recover_primary for some time. Worse, if we are waiting on a backfill
which is blocked waiting on a copy_from on the missing object in
question, it can become a dead lock.
Fixes: 11244
Backport: firefly Signed-off-by: Samuel Just <sjust@redhat.com>
Sage Weil [Fri, 27 Mar 2015 22:35:21 +0000 (15:35 -0700)]
common: send cluster log messages to 'cluster' channel by default
The CLOG_CHANNEL_DEFAULT constant was being abused for two purposes:
- the default channel to log messages to
- the name of the config option key in the key/value pair string that is
used for the default option, e.g. "default=true foo=false bar=false"
Fix this by making the config option key CLOG_CONFIG_DEFAULT_KEY and
replacing throughout, and changing CLOG_CHANNEL_DEFAULT to "cluster" (as
it should be and has been historically).
Fixes: #11177 Signed-off-by: Sage Weil <sage@redhat.com>
Samuel Just [Tue, 24 Mar 2015 17:48:02 +0000 (10:48 -0700)]
PG: set/clear CREATING in Primary state entry/exit
Previously, we did not actually set it when we got a pg creation message from
the mon. It would actually get set on the first start_peering_interval after
that point. If we don't get that far, but do send a stat update to the mon, we
can end up with 11197. Instead, let's just set it and clear it upon entry into
and exit from the Primary state.
Fixes: 11197 Signed-off-by: Samuel Just <sjust@redhat.com>