common: Enforces the methods lru_pin() and lru_unpin()
If lru_pin() or lru_unpin() is called twice on the same object, the
pinned counter is incremented or decremented incorrectly: it counts more
or fewer pinned objects than there actually are, which corrupts the
balancing done by lru_adjust().
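A minimal sketch of the guard described above, assuming a per-object pinned flag and a single pinned counter. The names Object, PinCounter, pin() and unpin() are hypothetical and are not the actual Ceph LRU classes:

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; not the real Ceph LRU types.
struct Object {
  bool pinned = false;
};

struct PinCounter {
  uint32_t num_pinned = 0;

  // Pin only if the object is not already pinned, so a repeated call
  // cannot inflate the counter.
  void pin(Object &o) {
    if (o.pinned)
      return;
    o.pinned = true;
    ++num_pinned;
  }

  // Unpin only if the object is currently pinned, so a repeated call
  // cannot drive the counter below the real number of pinned objects.
  void unpin(Object &o) {
    if (!o.pinned)
      return;
    o.pinned = false;
    --num_pinned;
  }
};
```

With this guard, calling pin() twice and then unpin() twice on the same object leaves the counter at zero.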
common: Fixes issue with lru_clear() + add new test
The method lru_clear() must set the attribute lru_num to zero
after lru_top, lru_bot and lru_mid are reset, since lru_num
is the total number of elements found in all of them.
The test also ensures the correct behavior of the method
lru_adjust(): lru_touch() calls lru_adjust() every time
to balance lru_top and lru_bot according to the value of lru_midpoint.
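The invariant behind this fix can be sketched as follows, assuming a simplified LRU made of three lists and one element counter. SimpleLRU and its members are hypothetical and only mirror the names used in the message:

```cpp
#include <cstddef>
#include <list>

// Hypothetical simplified LRU; only the counter invariant is modelled.
struct SimpleLRU {
  std::list<int> top, mid, bot;
  std::size_t num = 0;  // total number of elements across top, mid and bot

  void insert_top(int v) { top.push_back(v); ++num; }
  void insert_bot(int v) { bot.push_back(v); ++num; }

  // Clearing the lists without resetting num would leave the counter stale.
  void clear() {
    top.clear();
    mid.clear();
    bot.clear();
    num = 0;
  }

  // The counter must always equal the combined size of the three lists.
  bool consistent() const {
    return num == top.size() + mid.size() + bot.size();
  }
};
```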
Loic Dachary [Fri, 13 Jun 2014 12:41:39 +0000 (14:41 +0200)]
tests: prevent kill race condition
When trying to kill a daemon, keep its pid in a variable instead of
retrieving it from the pidfile multiple times. It prevents the following
race condition:
* try to kill ceph-mon
* ceph-mon is in the process of dying and removed its pidfile
* try to kill ceph-mon fails because the pidfile is not found
* another ceph-mon is spawned and fails to bind the port
because the previous ceph-mon is still holding it
Sage Weil [Fri, 6 Jun 2014 20:31:29 +0000 (13:31 -0700)]
osd/OSDMap: do not require ERASURE_CODE feature of clients
Just because an EC pool exists in the cluster does not mean that the client
has to support the feature:
1) The way client IO is initiated is no different for EC pools than for
replicated pools.
2) People may add an EC pool to an existing cluster with old clients and
locking those old clients out is very rude when they are not using the
new pool.
3) The only direct client user of EC pools right now is rgw, and the new
versions already need to support various other features like CRUSH_V2
in order to work. These features are present in new kernels.
Fixes: #8556
Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil [Thu, 12 Jun 2014 23:44:53 +0000 (16:44 -0700)]
osd/OSDMap: make get_features() take an entity type
Make the helper that returns what features are required of the OSDMap take
an entity type argument, as the required features may vary between
components in the cluster.
Backport: firefly
Signed-off-by: Sage Weil <sage@inktank.com>
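A rough sketch of how a per-entity feature helper could look, combining this change with the ERASURE_CODE relaxation for clients described in the previous commit. The feature bit values, the EntityType enum and MapSketch are hypothetical and do not reproduce the real OSDMap code:

```cpp
#include <cstdint>

// Hypothetical feature bits for illustration; not Ceph's real values.
constexpr uint64_t FEATURE_CRUSH_V2     = 1ULL << 0;
constexpr uint64_t FEATURE_ERASURE_CODE = 1ULL << 1;

// Hypothetical entity types.
enum class EntityType { Client, Osd, Mon };

struct MapSketch {
  bool has_ec_pool = false;
  bool uses_crush_v2 = false;

  // Return the features required of the given kind of entity.
  uint64_t get_features(EntityType type) const {
    uint64_t features = 0;
    if (uses_crush_v2)
      features |= FEATURE_CRUSH_V2;
    // Clients initiate IO to EC pools the same way as to replicated
    // pools, so only non-client entities are required to have the EC bit.
    if (has_ec_pool && type != EntityType::Client)
      features |= FEATURE_ERASURE_CODE;
    return features;
  }
};
```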
Yehuda Sadeh [Wed, 11 Jun 2014 23:50:41 +0000 (16:50 -0700)]
rgw: set a default data extra pool name
Fixes: #8585
Have a default name for the data extra pool, otherwise it would be empty
which means that it'd default to the data pool name (which is a problem
with ec backends).
Greg Farnum [Tue, 20 May 2014 18:07:45 +0000 (11:07 -0700)]
FileStore: remove user_only options from getattrs through the ObjectStore stack
This sort of awareness belongs at a higher level in the stack -- as
evidenced by nobody using the option at this level. Remove it from the
implementations and the interface.
Greg Farnum [Tue, 20 May 2014 20:04:02 +0000 (13:04 -0700)]
FileStore: do not use user_only in collection_getattrs
There's no particular reason why any of the callers of collection_getattrs
want to avoid looking at Ceph's internal xattrs.
It looks like this flag (set in 1862ddd88548fd4609f4fa9715dbad42a84d3775) was
set this way by mistake.
And finally, we don't actually set xattrs on collections anymore, anyway.
Yehuda Sadeh [Wed, 11 Jun 2014 06:06:12 +0000 (23:06 -0700)]
rgw: chain to multiple cache entries in one call
This ensures that chained cache entries that depend on more than one raw
cache entry (bucket info cache depends on both the bucket entry point
and on the bucket info object), are chained and created atomically.
Somnath Roy [Wed, 11 Jun 2014 01:10:30 +0000 (18:10 -0700)]
PG: Added a const spg_t member to the PG class
The const spg_t member is instantiated in the constructor, so
get_pgid() can now reference it to return an spg_t instance
without needing pg_info (and thus without acquiring pg_lock).
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
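The idea can be sketched as below, assuming a simplified identifier type. PgId and PgSketch are hypothetical stand-ins for spg_t and PG, and get_info_pgid() exists only to contrast the locked path:

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for spg_t.
struct PgId {
  uint64_t pool;
  uint32_t seed;
};

class PgSketch {
public:
  explicit PgSketch(PgId id) : pg_id(id), info_pgid(id) {}

  // No lock needed: pg_id is const and set once in the constructor.
  PgId get_pgid() const { return pg_id; }

  // Mutable state still has to be read under the lock.
  PgId get_info_pgid() {
    std::lock_guard<std::mutex> l(lock);
    return info_pgid;
  }

private:
  const PgId pg_id;
  std::mutex lock;
  PgId info_pgid;  // stands in for the id held inside pg_info
};
```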
Somnath Roy [Tue, 10 Jun 2014 23:02:52 +0000 (16:02 -0700)]
ShardedTP: The config option changed
The config option for the sharded threadpool is changed from
osd_op_num_sharded_pool_threads to osd_op_num_threads_per_shard.
Along with osd_op_num_shards, this is much more user friendly when
configuring the number of op threads for the OSD.
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
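A small sketch of how the two options relate, under the assumption that the total number of op threads is the number of shards multiplied by the threads per shard. OsdOpConfig and total_op_threads() are hypothetical; only the field names come from the message:

```cpp
#include <cstdint>

// Hypothetical container; the field names are the config options above.
struct OsdOpConfig {
  uint32_t osd_op_num_shards;
  uint32_t osd_op_num_threads_per_shard;
};

// Assumption: every shard gets the same number of threads, so the
// total is a simple product.
constexpr uint32_t total_op_threads(const OsdOpConfig &c) {
  return c.osd_op_num_shards * c.osd_op_num_threads_per_shard;
}
```

For example, 5 shards with 2 threads each would give 10 op threads under this assumption.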
Steve Taylor [Tue, 10 Jun 2014 18:42:55 +0000 (12:42 -0600)]
Fix for bug #6700
When preparing OSD disks with colocated journals, the initialization process
fails when using dmcrypt. The kernel fails to re-read the partition table after
the storage partition is created because the journal partition is already in use
by dmcrypt. This fix unmaps the journal partition from dmcrypt and allows the
partition table to be read.
Signed-off-by: Stephen F Taylor <steveftaylor@gmail.com>
Haomai Wang [Sat, 7 Jun 2014 10:57:29 +0000 (18:57 +0800)]
Make KeyValueStore support set_alloc_hint op
Add a new config option that lets KeyValueStore support a configurable strip size.
The set_alloc_hint op can affect the strip size of the specified object:
the expected write size becomes the strip size of the object.
Sebastien Ponce [Thu, 6 Feb 2014 10:38:44 +0000 (11:38 +0100)]
Added unit test suite for the Rados striping API.
This includes tests for standard io and asynchronous io, similar to what is tested in the rados tests.
In addition, it includes in depth tests of the striping itself.
Sebastien Ponce [Thu, 5 Jun 2014 15:17:40 +0000 (17:17 +0200)]
Implementation of the radosstriper interface.
The user facing API is implemented in libradosstriper.cc and the backend in RadosStriperImpl.cc.
Details on how the code works are given in a comment at the top of RadosStriperImpl.cc.
Ilya Dryomov [Thu, 5 Jun 2014 06:08:42 +0000 (10:08 +0400)]
XfsFileStoreBackend: call ioctl(XFS_IOC_FSSETXATTR) less often
No need to call ioctl(XFS_IOC_FSSETXATTR) if extsize is already set to
the value we want or if any extents are allocated - XFS will refuse to
change extsize if that's the case.
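The decision described above can be sketched as a pure check, assuming the current extsize and the extent count have already been queried from the file. ExtsizeState and should_set_extsize() are hypothetical and do not call any XFS ioctl themselves:

```cpp
#include <cstdint>

// Hypothetical snapshot of the file's state, as a prior query would report it.
struct ExtsizeState {
  uint32_t current_extsize;    // extent size hint currently set on the file
  uint32_t allocated_extents;  // number of extents already allocated
};

// Return true only when setting the extent size hint is both needed
// and expected to succeed.
bool should_set_extsize(const ExtsizeState &s, uint32_t wanted) {
  if (s.current_extsize == wanted)
    return false;  // already set to the value we want
  if (s.allocated_extents > 0)
    return false;  // XFS would refuse to change extsize
  return true;
}
```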
Sage Weil [Thu, 5 Jun 2014 17:43:16 +0000 (10:43 -0700)]
include/atomic: make 32-bit atomic64_t unsigned
This fixes
In file included from test/perf_counters.cc:19:0:
./common/perf_counters.h: In member function ‘std::pair PerfCounters::perf_counter_data_any_d::read_avg() const’:
warning: ./common/perf_counters.h:156:36: comparison between signed and unsigned integer expressions [-Wsign-compare]
} while (avgcount2.read() != count);
^
Sage Weil [Thu, 5 Jun 2014 18:56:58 +0000 (11:56 -0700)]
ceph-objectstore-test: fix warning in collect_metadata test
In file included from test/objectstore/store_test.cc:33:0:
../src/gtest/include/gtest/gtest.h: In function ‘testing::AssertionResult testing::internal::CmpHelperNE(const char*, const char*, const T1&, const T2&) [with T1 = long unsigned int, T2 = int]’:
test/objectstore/store_test.cc:82:5: instantiated from here
warning: ../src/gtest/include/gtest/gtest.h:1379:1: comparison between signed and unsigned integer expressions [-Wsign-compare]
Sebastien Ponce [Tue, 4 Feb 2014 16:38:37 +0000 (17:38 +0100)]
Added a striper interface on top of rados called radosstriper.
This interface allows manipulating striped objects stored in a rados cluster with a standard open/read/write/stat/close/remove API.
Asynchronous APIs are also provided for data transfers and both C and C++ APIs are present.