David Zafman [Mon, 30 Sep 2013 22:53:35 +0000 (15:53 -0700)]
common, os: Perform xattr handling based on detected fs type
In FileStore::_detect_fs() store discovered filesystem type in m_fs_type
Add per-filesystem filestore_max_inline_xattr_size_* variants
Add per-filesystem filestore_max_inline_xattrs_* variants
New function set_xattr_limits_via_conf()
Set m_filestore_max_inline_xattr_size based on override or fs type
Set m_filestore_max_inline_xattrs based on override or fs type
Handle conf change of any relevant value by calling set_xattr_limits_via_conf()
Change filestore_max_inline_xattr_size to override if non-zero
Change filestore_max_inline_xattrs to override if non-zero
Fixes: #6143 Signed-off-by: David Zafman <david.zafman@inktank.com>
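
The selection rule described above (a non-zero generic option acts as an override, otherwise the per-filesystem variant applies) can be sketched as follows. This is a minimal illustration, not the FileStore code: the function name effective_limit, the table name, and the numeric defaults are placeholders assumed for the example.

```python
# Placeholder per-filesystem defaults, assumed for illustration only;
# the real values come from the filestore_max_inline_xattr_size_* options.
PER_FS_MAX_INLINE_XATTR_SIZE = {"xfs": 65536, "btrfs": 2048, "other": 512}


def effective_limit(override, fs_type, per_fs_defaults):
    """Pick the limit to use for the detected filesystem type.

    A non-zero override wins; zero means "use the per-filesystem default".
    Unknown filesystem types fall back to the "other" entry.
    """
    if override:
        return override
    return per_fs_defaults.get(fs_type, per_fs_defaults["other"])
```

Re-running the same function whenever a relevant option changes mirrors the role the commit gives to set_xattr_limits_via_conf().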
Sage Weil [Fri, 4 Oct 2013 04:27:36 +0000 (21:27 -0700)]
osd/ReplicatedPG: fix null deref on rollback_to whiteout check
Bring this whole if/else chain up one level so that we can capture both
ENOENT and whiteout in the same case. (And don't dereference the
pointer when we know it is NULL.)
Fixes: #6474 Signed-off-by: Sage Weil <sage@inktank.com>
mon: Monitor: drop client msg if no session exists and msg is not MAuth
If the sender is not a monitor and does not have a session yet, it must first
authenticate with the cluster. Therefore, the first message to the
monitor must be an MAuth. If not, we assume it's a stray message and
just drop it.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
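
The drop rule above reduces to a single predicate. The sketch below is only an illustration; the function name should_drop and the use of a string for the message type are assumptions, not the Monitor's actual dispatch code.

```python
def should_drop(msg_type, has_session, sender_is_monitor):
    """Return True when a message should be treated as stray and dropped.

    Only senders that are not monitors and have no session are affected,
    and an MAuth message is always let through so they can authenticate.
    """
    return (not sender_is_monitor) and (not has_session) and msg_type != "MAuth"
```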
mon: MonmapMonitor: make 'ceph mon add' idempotent
MonMap changes lead to bootstraps. Callbacks waiting for a proposal to
finish can have several fates, depending on what happens: finished, rerun
or aborted.
In the case of a bootstrap right after a monmap change, callbacks are
rerun. Considering we queued the message that led to the monmap change
on this queue, if instead of finishing it we end up rerunning it, we will
end up trying to perform the same modification twice -- the last one will
try to modify an already existing state and we will return just that:
whatever you're attempting to do has already been done.
This patch makes 'ceph mon add' completely idempotent. If one tries to
add an already existing monitor (i.e., same name, same ip:port), one
simply gets a 'monitor foo added', with return 0, no matter how many
times one runs the command.
Fixes: #5896 Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
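
A minimal sketch of the idempotent behaviour, assuming the monmap can be modelled as a dict from monitor name to ip:port. The function name mon_add and the EEXIST result for a conflicting name or address are assumptions made for the example; the commit message only specifies the "same name, same ip:port" case.

```python
import errno


def mon_add(monmap, name, addr):
    """Add a monitor to monmap (a dict of name -> "ip:port").

    Returns (retcode, message). Re-adding an identical entry succeeds
    without changing anything, so a rerun callback sees the same result.
    """
    if monmap.get(name) == addr:
        # Same name and same address: already done, report success.
        return 0, "monitor %s added" % name
    if name in monmap or addr in monmap.values():
        # Assumed behaviour: a partial match is a genuine conflict.
        return -errno.EEXIST, "monitor %s or address %s already in use" % (name, addr)
    monmap[name] = addr
    return 0, "monitor %s added" % name
```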
Sage Weil [Fri, 20 Sep 2013 00:57:14 +0000 (17:57 -0700)]
common/bloom_filter: unit tests
Fun facts:
- fpp = false positive probability
- fpp is a function of insert count only
- at .1% fpp, we pay about 2 bytes per insert
- at 1-2% fpp, we pay about 1 byte per insert
- at 15% fpp, we pay about .5 bytes per insert
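
The per-insert costs quoted above are close to what the textbook formula for an optimally sized Bloom filter gives, bits per element = -ln(fpp) / (ln 2)^2. The sketch below uses that standard formula; it is not taken from common/bloom_filter, whose sizing may round differently, and the function name is an assumption.

```python
import math


def bytes_per_insert(fpp):
    """Approximate storage cost per inserted element, in bytes.

    Uses the standard optimal Bloom filter sizing:
    bits per element = -ln(fpp) / (ln 2)^2.
    """
    bits = -math.log(fpp) / (math.log(2) ** 2)
    return bits / 8.0
```

For fpp values of 0.001, 0.01, 0.02 and 0.15 this evaluates to about 1.8, 1.2, 1.0 and 0.5 bytes, consistent with the figures listed in the commit message.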
David Zafman [Fri, 27 Sep 2013 00:42:13 +0000 (17:42 -0700)]
common, os, osd: Use common functions for safe file reading and writing
Add new safe_read_file() and safe_write_file() to update files atomically
Used in place of the original OSD::read_meta() and OSD::write_meta(), on which they are based
Used by read_superblock() and write_superblock()
Used by write_version_stamp() and version_stamp_is_valid()
Fixes: #6422 Signed-off-by: David Zafman <david.zafman@inktank.com>
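
One common way to update a file atomically is to write a temporary file, fsync it, rename it over the target, and fsync the containing directory. The sketch below shows that pattern under the assumption of a POSIX filesystem; it reuses the names safe_write_file and safe_read_file for readability, but the signatures and the ".tmp" suffix are assumptions and not the Ceph functions themselves.

```python
import os


def safe_write_file(base, name, data):
    """Atomically replace base/name with data (bytes)."""
    path = os.path.join(base, name)
    tmp = path + ".tmp"  # assumed temporary-file naming
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # make the new content durable before exposing it
    finally:
        os.close(fd)
    os.replace(tmp, path)  # atomic switch to the new content
    dirfd = os.open(base, os.O_RDONLY)
    try:
        os.fsync(dirfd)  # persist the rename itself
    finally:
        os.close(dirfd)


def safe_read_file(base, name, max_len):
    """Read at most max_len bytes from base/name."""
    with open(os.path.join(base, name), "rb") as f:
        return f.read(max_len)
```

A reader then sees either the old content or the new content, never a partially written file.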
* Update to the current state of the ghobject implementation and the fact
that they encode the shard_t. Although the pool also contains the shard
id, it is less relevant to understanding the implementation.
* Update with the erasure code plugin infrastructure and the example
plugin now in master.
* Move jerasure to a separate page to be expanded and link it from the
toc
* Kill the partial read and write notes as they will probably not be
implemented in the near future. Kill some of the notes because they
are no longer relevant.
Sage Weil [Tue, 1 Oct 2013 21:21:40 +0000 (14:21 -0700)]
osd: remove magical tmap -> omap conversion
This is incomplete and unfortunately unusable in its current state:
- it would only set USES_TMAP for old encoded object_info_t and tmapput,
but would NOT set it for tmapup
- a config option turned that off by default.
That means that the mds conversion from tmap -> omap won't be able to use
this because any existing cluster has tmap objects without the USES_TMAP
flag set. And we don't want to unconditionally try a tmap->omap conversion
on omap operations because there are lots of existing librados users out
there that will be negatively impacted by this.
Instead, the MDS will need to handle this conversion on the client side by
reading either tmap or omap objects and explicitly rewriting the content
with omap (while truncating the tmap data away).
Sage Weil [Wed, 2 Oct 2013 00:04:44 +0000 (17:04 -0700)]
osd: add ISDIRTY, UNDIRTY rados operations
ISDIRTY will query whether the dirty flag is set on an object. UNDIRTY
will explicitly clear it. Note that a user doing so will likely run afoul
of the caching code.
Sage Weil [Tue, 1 Oct 2013 19:12:55 +0000 (12:12 -0700)]
osd/ReplicatedPG: update all find_object_context() users to handle whiteouts
In each case, we treat the whiteout as if we got an ENOENT.
We do not change the semantics of bool exists to avoid breaking lots of
potentially fragile code. We are only interested in changing the
user-visible behavior of the object, not the way it is internally stored
or managed.
This will likely be refined as we grow actual users for whiteouts in the
pool caching code.
Sage Weil [Tue, 1 Oct 2013 16:28:29 +0000 (09:28 -0700)]
osdc/ObjectCacher: limit writeback IOs generated while holding lock
While analyzing a log from Mike Dawson I saw a long stall while librbd's
objectcacher was starting lots (many hundreds) of IOs. Limit the amount of
time we spend doing this at a time to allow IO replies to be processed so
that the cache remains responsive.
I'm not sure this warrants a tunable (which we would need to add for both
libcephfs and librbd).
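
The idea of bounding the work done per lock hold can be sketched as below. This assumes a simple fixed cap on IOs started per pass; the function name, the max_per_pass parameter and its default are hypothetical, and the actual ObjectCacher change may bound the work differently.

```python
import threading
from collections import deque


def flush_in_batches(dirty, lock, start_io, max_per_pass=10):
    """Start writeback for every entry in dirty (a deque).

    At most max_per_pass IOs are started per lock hold; the lock is
    released between passes so reply handlers get a chance to run.
    Returns the total number of IOs started.
    """
    started = 0
    while True:
        with lock:
            batch = 0
            while dirty and batch < max_per_pass:
                start_io(dirty.popleft())
                batch += 1
            started += batch
            if not dirty:
                return started
        # Lock is dropped here before the next pass begins.
```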
Yehuda Sadeh [Mon, 26 Aug 2013 18:16:08 +0000 (11:16 -0700)]
rgw: quiet down warning message
Fixes: #6123
We don't want to know about failing to read region map info
if it's not found, only if it failed with some other error. In
any case it's just a warning.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Dan Mick [Fri, 27 Sep 2013 05:24:37 +0000 (22:24 -0700)]
ceph_argparse.py: clean up error reporting when required param missing
Treat "need 1, got 0" as a special case, and change the message to
"missing required parameter <x>". Also, when failing for that reason,
print the command's concise description and its helptext.
Fixes: #6384 Signed-off-by: Dan Mick <dan.mick@inktank.com>
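
The special case described above can be illustrated with a small stand-alone check. Everything here (the ArgumentError class, check_count, and its parameters) is hypothetical and written for the example; it is not the ceph_argparse.py API.

```python
class ArgumentError(Exception):
    """Hypothetical error type for this sketch."""


def check_count(param_name, needed, got, concise, helptext):
    """Raise ArgumentError if fewer than needed values were supplied."""
    if got >= needed:
        return
    if needed == 1 and got == 0:
        # Special case: a friendlier message plus the command's
        # concise description and help text.
        raise ArgumentError(
            "missing required parameter %s\n%s\n%s" % (param_name, concise, helptext)
        )
    raise ArgumentError("%s: need %d, got %d" % (param_name, needed, got))
```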