Sage Weil [Fri, 6 May 2016 13:09:43 +0000 (09:09 -0400)]
osdc/Objecter: upper bound watch_check result
This way we always return a safe upper bound on the amount of time
since we did a check. Among other things, this prevents us from
returning a value of 0, which is confusing.
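
One way to obtain such a bound can be sketched as follows. This is a minimal Python illustration, not the Objecter's C++ code; the function name `ms_since_check_upper_bound` and its parameters are hypothetical. It assumes the result is reported in whole milliseconds: the elapsed time is truncated and one is added, so the value never understates the elapsed time and is never 0.

```python
def ms_since_check_upper_bound(last_check_s: float, now_s: float) -> int:
    """Return an upper bound, in whole milliseconds, on now_s - last_check_s.

    Hypothetical helper for illustration only.
    """
    elapsed_ms = (now_s - last_check_s) * 1000.0
    if elapsed_ms < 0:
        raise ValueError("now_s must not be earlier than last_check_s")
    # Truncate, then add one: the result is always greater than the true
    # elapsed time and always at least 1, so a caller never sees 0.
    return int(elapsed_ms) + 1
```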
Sage Weil [Tue, 3 May 2016 03:28:18 +0000 (23:28 -0400)]
osd: handle boot racing with NOUP set
This is a follow-on to 7139a232d26beef441ffbc13bc087baab3505ea8,
which handled the NOUP set + clear case when the OSD found out
about the flag being cleared. However, it's possible that the
flag will get cleared but the OSD won't get a map update (because
it hasn't subscribed and is not doing any work).
This means that it is *more* likely than before that we will
restart the boot process even though the monitor did successfully
mark us up. However, as before, it is unavoidable because there
is no notification of whether our boot request succeeds or not.
And it is still mostly harmless (an extra mark down + up cycle).
xinxin shu [Thu, 2 Jun 2016 06:13:09 +0000 (14:13 +0800)]
remove invalid objectmap flag when objectmap is disabled
Fixes: http://tracker.ceph.com/issues/16076
Signed-off-by: xinxin shu <shuxinxin@chinac.com>
(cherry picked from commit b2d475686ee7617bb2023d753941e3d6952f0878)
Samuel Just [Fri, 3 Jun 2016 00:13:09 +0000 (17:13 -0700)]
src/: remove all direct comparisons to get_max()
get_max() now returns a special singleton type from which hobject_t's
can be assigned and constructed, but which cannot be directly compared.
This patch also cleans up all such uses to use is_max() instead.
This should prevent some issues like 16113 by preventing us from
checking for max-ness by comparing against a sentinel value. The more
complete fix will be to make all fields of hobject_t private and enforce
a canonical max() representation that way. That patch will be hard to
backport, however, so we'll settle for this for now.
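
The idea of a sentinel that can be used for construction but not for comparison can be sketched as follows. This is a toy Python analogue, not the hobject_t code: the classes `MaxSentinel` and `HObject` are hypothetical, and where the C++ change makes a direct comparison impossible, this sketch can only raise an error at run time.

```python
class MaxSentinel:
    """Hypothetical stand-in for the special type returned by get_max()."""

    def __eq__(self, other):
        raise TypeError("use is_max() instead of comparing against get_max()")

    def __ne__(self, other):
        raise TypeError("use is_max() instead of comparing against get_max()")


def get_max() -> MaxSentinel:
    return MaxSentinel()


class HObject:
    """Toy object id with only the fields needed for the illustration."""

    def __init__(self, source=""):
        # Construction from the sentinel is allowed.
        if isinstance(source, MaxSentinel):
            self.name, self.max = "", True
        else:
            self.name, self.max = source, False

    def is_max(self) -> bool:
        return self.max

    def __eq__(self, other):
        # Comparison against the sentinel is rejected.
        if isinstance(other, MaxSentinel):
            raise TypeError("use is_max() instead of comparing against get_max()")
        if not isinstance(other, HObject):
            return NotImplemented
        return (self.name, self.max) == (other.name, other.max)
```

With this shape, `HObject(get_max()).is_max()` is the only way to ask whether an object is the maximum, so the check no longer depends on how the sentinel's fields happen to be encoded.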
Samuel Just [Fri, 3 Jun 2016 00:36:21 +0000 (17:36 -0700)]
hobject: compensate for non-canonical hobject_t::get_max() encodings
This closes a loophole that could allow a non-canonical in-memory
hobject_t::get_max() object which would return true for is_max(), but
false for *this == hobject_t::get_max().
Ramana Raja [Wed, 13 Apr 2016 08:33:51 +0000 (14:03 +0530)]
ceph_volume_client: evict client also based on mount path
Evict clients based on not just their auth ID, but also based on the
volume path mounted. This is needed for the Manila use-case, where
the clients using an auth ID are denied further access to a share.
John Spray [Wed, 11 May 2016 12:18:23 +0000 (13:18 +0100)]
client: report root's quota in statfs
When a user has mounted a quota-restricted inode
as the root, report that inode's quota status
as the filesystem statistics in statfs.
This allows us to have a fairly convincing illusion
that someone has a filesystem to themselves, when
they're really mounting a restricted part of
the larger global filesystem.
Fixes: http://tracker.ceph.com/issues/15599
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit b6d2b6d1a51969c210ae75fef93c71ac21f511a6)
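
The statfs behaviour described in the commit above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the client's actual code: the `StatFs` tuple, the function `statfs_for_root`, and the choice to treat a quota of 0 as "no quota" are all hypothetical.

```python
from typing import NamedTuple


class StatFs(NamedTuple):
    block_size: int   # bytes per block
    blocks: int       # total blocks reported
    blocks_free: int  # free blocks reported


def statfs_for_root(global_stats: StatFs, quota_max_bytes: int,
                    used_bytes: int) -> StatFs:
    """Report the mounted root's quota as the filesystem size, if it has one."""
    if quota_max_bytes <= 0:
        # Assumption: no byte quota on the root, so report the global numbers.
        return global_stats
    bs = global_stats.block_size
    total = quota_max_bytes // bs
    # Never report negative free space when usage exceeds the quota.
    free = max(quota_max_bytes - used_bytes, 0) // bs
    return StatFs(bs, total, free)
```

For example, with 4096-byte blocks, a 40960-byte quota and 8192 bytes used, the sketch reports 10 total blocks and 8 free blocks regardless of the size of the global filesystem.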
Loic Dachary [Thu, 26 May 2016 10:55:51 +0000 (12:55 +0200)]
ceph-disk: workaround gperftool hang
Temporary workaround: if ceph-osd --mkfs does not
complete within 5 minutes, assume it is blocked
because of https://github.com/gperftools/gperftools/issues/786
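
A timeout of this kind can be sketched in Python as follows. This is a minimal illustration, not the ceph-disk implementation: the helper `run_with_timeout` and the constant `MKFS_TIMEOUT_S` are hypothetical, and what the caller does after a timeout (retry, abort) is left open.

```python
import subprocess
import sys

MKFS_TIMEOUT_S = 5 * 60  # the five-minute limit mentioned above


def run_with_timeout(argv, timeout_s=MKFS_TIMEOUT_S):
    """Run argv; return True if it finished in time, False if it timed out.

    subprocess.run() kills the child when the timeout expires. A non-zero
    exit status still raises CalledProcessError because of check=True.
    """
    try:
        subprocess.run(argv, check=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__":
    # Stand-in command; a real caller would pass the mkfs command line.
    ok = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30)
    print("finished in time:", ok)
```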
Jason Dillaman [Wed, 25 May 2016 18:00:34 +0000 (14:00 -0400)]
rbd-mirror: stop stale replayers before starting new replayers
If the connection details are tweaked for a remote peer, stop
the existing replayer before potentially starting a new replayer
against the same remote.
Jason Dillaman [Tue, 24 May 2016 02:21:33 +0000 (22:21 -0400)]
journal: eliminate watch delay for object refetches
The randomized write sizes of the modified rbd-mirror stress
test result in a lot of journal objects with few entries.
Immediately fetch objects when performing a refetch check prior
to closing an empty object.
Jason Dillaman [Mon, 23 May 2016 18:57:03 +0000 (14:57 -0400)]
journal: keep active tag to assist with pruning watched objects
It's possible that there might be additional entries to prune in
objects that haven't been prefetched yet. Keep the active tag
to allow these entries to be pruned after they have been loaded.
Jason Dillaman [Mon, 23 May 2016 15:01:05 +0000 (11:01 -0400)]
journal: cleanup watch refetch flag handling
Clear the refetch required flag while scheduling the watch
and remove the stale object after the watch completes if still
empty. Previously, it was possible for the flag to become
out-of-sync with whether or not it was actually refreshed
and pruned.
Fixes: http://tracker.ceph.com/issues/15993
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ff2cc27ae592646b495bf1b614d35bd50c091a3d)
Ricardo Dias [Fri, 13 May 2016 15:44:53 +0000 (16:44 +0100)]
rbd-mirror: Unregister clients from non-primary images journal
A non-primary image may have registered clients on its journal
(for instance a primary image that was later demoted). We must
unregister the clients when disabling image mirroring with the
force option.
Mykola Golub [Wed, 25 May 2016 18:54:16 +0000 (21:54 +0300)]
test: workaround failure in journal.sh
With the changes to ensure that the commit position of a new
client is initialized to the minimum position of other clients,
the 'journal inspect/export' commands return zero records because
the master client has committed all of its entries.
Work around this by restoring the initial commit position after
writing to the image.
Robin H. Johnson [Fri, 20 May 2016 23:00:33 +0000 (16:00 -0700)]
rgw: fix manager selection when APIs customized
When modifying rgw_enable_apis per RGW instance, such as for staticsites, you
can end up with the RESTManager instance being null in some cases, which returns an
HTTP 405 MethodNotAllowed to all requests.
Example configuration to trigger the bug:
rgw_enable_apis = s3website
Backport: jewel
X-Note: Patch from Yehuda in private IRC discussion, 2016/05/20.
Fixes: http://tracker.ceph.com/issues/15973
Fixes: http://tracker.ceph.com/issues/15974
Signed-off-by: Robin H. Johnson <robin.johnson@dreamhost.com>
(cherry picked from commit 7c7a465b55f7100eab0f140bf54f9420abd1c776)
Kefu Chai [Fri, 13 May 2016 03:26:31 +0000 (11:26 +0800)]
osd/OpRequest: reset connection upon unregister
This helps to free the resources referenced by the connection; in the
case of MOSDOp these include, among other things, the OSD::Session and
the OSDMap. Freeing them earlier lets the osdmaps be trimmed in time.
Kefu Chai [Thu, 12 May 2016 12:28:11 +0000 (20:28 +0800)]
osd: reset session->osdmap if session is not waiting for a map anymore
We should release the osdmap reference once we are done with it,
otherwise it may be a very long time before that reference is updated
with a newer osdmap ref. This appears to be an OSDMap leak: the map is
held by a quiet OSD::Session forever.
The osdmap is not reset in OSD::session_notify_pg_create(), because its
only caller is wake_pg_waiters(), which will call
dispatch_session_waiting() later. dispatch_session_waiting() will
check the session->osdmap, and will also reset the osdmap if
session->waiting_for_pg.empty().
Ricardo Dias [Tue, 17 May 2016 17:04:28 +0000 (18:04 +0100)]
ceph.in: fix exception when pool name has non-ascii characters
When deleting a pool without the --i-really-really-mean-it option, if
the pool name has non-ascii characters, the format of the command
message raises a UnicodeEncodeError exception.
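
The class of failure can be reproduced with a short sketch. This is a Python 3 illustration under stated assumptions, not the ceph.in code: the function names and the message text are hypothetical, and the explicit ASCII encoding stands in for the implicit conversion that triggers the error in the original. Encoding as UTF-8 is shown as one way to avoid the exception.

```python
def confirm_message_ascii(pool_name: str) -> bytes:
    # Raises UnicodeEncodeError if pool_name contains non-ASCII characters.
    return "really delete pool '{}'?".format(pool_name).encode("ascii")


def confirm_message_utf8(pool_name: str) -> bytes:
    # UTF-8 can represent any pool name, so this does not raise.
    return "really delete pool '{}'?".format(pool_name).encode("utf-8")


if __name__ == "__main__":
    name = "p\u00f6\u00f6l"  # a pool name with non-ASCII characters
    try:
        confirm_message_ascii(name)
    except UnicodeEncodeError as exc:
        print("ascii encoding failed:", exc.reason)
    print(confirm_message_utf8(name).decode("utf-8"))
```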
Jason Dillaman [Thu, 19 May 2016 00:53:26 +0000 (20:53 -0400)]
rbd-mirror: disable librbd caching for replicated images
Each image has its own cache and each cache uses its own thread. With
a large replicated cluster, this could result in thousands of extra
threads and gigabytes of extra memory.
Fixes: http://tracker.ceph.com/issues/15930
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ea35f148257282fe3f3ae02fe7a26cf245cda952)
Boris Ranto [Wed, 4 May 2016 07:09:47 +0000 (09:09 +0200)]
rpm: Fix SELinux relabel on fedora
The SELinux userspace utilities stopped providing versions when they
switched to the CIL language. We need to use a different technique to
relabel the files.