git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years agodoc: add notes on using "ceph fs new" 2280/head
john [Mon, 18 Aug 2014 15:57:25 +0000 (16:57 +0100)]
doc: add notes on using "ceph fs new"

Signed-off-by: John Spray <john.spray@redhat.com>
10 years agoosd: fix theoretical use-after-free of OSDMap
Sage Weil [Sat, 16 Aug 2014 21:51:31 +0000 (14:51 -0700)]
osd: fix theoretical use-after-free of OSDMap

In practice, the map will remain pinned for a while, but this
will make coverity happy.

*** CID 1231685:  Use after free  (USE_AFTER_FREE)
/osd/OSD.cc: 6223 in OSD::handle_osd_map(MOSDMap *)()
6217
6218           if (o->test_flag(CEPH_OSDMAP_FULL))
6219            last_marked_full = e;
6220           pinned_maps.push_back(add_map(o));
6221
6222           bufferlist fbl;
>>>     CID 1231685:  Use after free  (USE_AFTER_FREE)
>>>     Calling "encode" dereferences freed pointer "o".
6223           o->encode(fbl);
6224
6225           hobject_t fulloid = get_osdmap_pobject_name(e);
6226           t.write(coll_t::META_COLL, fulloid, 0, fbl.length(), fbl);
6227           pin_map_bl(e, fbl);
6228           continue;

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2259 from ceph/wip-9039
Sage Weil [Sat, 16 Aug 2014 20:41:41 +0000 (13:41 -0700)]
Merge pull request #2259 from ceph/wip-9039

Wip 9039

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2217 from ceph/wip-problem-osds
Sage Weil [Sat, 16 Aug 2014 20:15:10 +0000 (13:15 -0700)]
Merge pull request #2217 from ceph/wip-problem-osds

mon: 'ceph osd blocked-by' for histogram of peers OSDs are waiting for

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoqa/workunits/rest/test.py: fix 'df' test to use total_used_bytes
Sage Weil [Sat, 16 Aug 2014 20:06:02 +0000 (13:06 -0700)]
qa/workunits/rest/test.py: fix 'df' test to use total_used_bytes

This changed back in ee2dbdb0f5e54fe6f9c5999c032063b084424c4c

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2271 from ceph/wip-9053
Sage Weil [Sat, 16 Aug 2014 16:18:19 +0000 (09:18 -0700)]
Merge pull request #2271 from ceph/wip-9053

paxos: fix problem with disjoint quorum members

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
10 years agoMerge pull request #2270 from ceph/wip-init-ceph
Alfredo Deza [Fri, 15 Aug 2014 23:42:59 +0000 (19:42 -0400)]
Merge pull request #2270 from ceph/wip-init-ceph

init-ceph: don't use bashism

Reviewed-by: Alfredo Deza <adeza@redhat.com>
10 years agoinit-ceph: don't use bashism 2270/head
Sage Weil [Fri, 15 Aug 2014 23:41:43 +0000 (16:41 -0700)]
init-ceph: don't use bashism

       -z STRING
              the length of STRING is zero

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2247 from ceph/wip-ceph-disk
Alfredo Deza [Fri, 15 Aug 2014 23:40:15 +0000 (19:40 -0400)]
Merge pull request #2247 from ceph/wip-ceph-disk

ceph-disk: fix various dmcrypt bugs

Reviewed-by: Alfredo Deza <adeza@redhat.com>
10 years agoMerge pull request #2269 from ceph/wip-osd-mon-feature
Loic Dachary [Fri, 15 Aug 2014 22:19:59 +0000 (00:19 +0200)]
Merge pull request #2269 from ceph/wip-osd-mon-feature

osd: fix mon feature requirement

Reviewed-by: Loic Dachary <loic@dachary.org>
10 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Fri, 15 Aug 2014 22:01:23 +0000 (15:01 -0700)]
Merge remote-tracking branch 'gh/next'

10 years agoFix -Wno-format and -Werror=format-security options clash
Boris Ranto [Fri, 15 Aug 2014 17:34:27 +0000 (19:34 +0200)]
Fix -Wno-format and -Werror=format-security options clash

This causes a build failure in the latest Fedora builds.
ceph_test_librbd_fsx adds the -Wno-format cflag, but the default AM_CFLAGS
already contain -Werror=format-security. Previous releases tolerated this,
but the latest Fedora rawhide no longer does. ceph_test_librbd_fsx builds
fine without -Wno-format on x86_64, so the flag is likely no longer needed.

Signed-off-by: Boris Ranto <branto@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoosd: fix feature requirement for mons 2269/head
Sage Weil [Fri, 15 Aug 2014 21:28:57 +0000 (14:28 -0700)]
osd: fix feature requirement for mons

These features should be set on the client_messenger, not
cluster_messenger.

Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2268 from ceph/wip-9119
Sage Weil [Fri, 15 Aug 2014 21:11:10 +0000 (14:11 -0700)]
Merge pull request #2268 from ceph/wip-9119

Wip 9119

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoReplicatedPG::maybe_handle_cache: do not forward RWORDERED reads 2268/head
Samuel Just [Thu, 14 Aug 2014 18:13:31 +0000 (11:13 -0700)]
ReplicatedPG::maybe_handle_cache: do not forward RWORDERED reads

Even with READFORWARD, we can't forward RWORDERED reads.

Fixes: #9119
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoReplicatedPG::cancel_copy: clear cop->obc
Samuel Just [Tue, 12 Aug 2014 23:41:38 +0000 (16:41 -0700)]
ReplicatedPG::cancel_copy: clear cop->obc

Otherwise, an objecter callback might still be hanging
onto this reference until after the flush.

Fixes: #8894
Introduced: 589b639af7c8834a1e6293d58d77a9c440107bc3
Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2264 from ceph/wip-crush-features
Sage Weil [Fri, 15 Aug 2014 20:55:36 +0000 (13:55 -0700)]
Merge pull request #2264 from ceph/wip-crush-features

do not require crush features for rules that aren't being used

Reviewed-by: Loic Dachary <loic@dachary.org>
10 years agounittest_osdmap: test EC rule and pool features 2264/head
Sage Weil [Fri, 15 Aug 2014 20:54:11 +0000 (13:54 -0700)]
unittest_osdmap: test EC rule and pool features

TODO: tiering feature bits.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2266 from kevincox/removewirehsark
Sage Weil [Fri, 15 Aug 2014 20:41:15 +0000 (13:41 -0700)]
Merge pull request #2266 from kevincox/removewirehsark

Remove Old Wireshark Dissectors

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2070 from somnathr/wip-sd-filestore-optimization
Samuel Just [Fri, 15 Aug 2014 20:37:54 +0000 (13:37 -0700)]
Merge pull request #2070 from somnathr/wip-sd-filestore-optimization

Wip sd filestore optimization

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoRemove Old Wireshark Dissectors 2266/head
Kevin Cox [Fri, 15 Aug 2014 19:27:13 +0000 (15:27 -0400)]
Remove Old Wireshark Dissectors

Remove the two old Wireshark plugins.  They do not build and are
superseded by the dissector which is inside Wireshark.

Signed-Off-By: Kevin Cox <kevincox@kevincox.ca>
10 years agoosd: only require crush features for rules that are actually used
Sage Weil [Fri, 15 Aug 2014 15:55:10 +0000 (08:55 -0700)]
osd: only require crush features for rules that are actually used

Often there will be a CRUSH rule present for erasure coding that uses the
new CRUSH steps or indep mode.  If these rules are not referenced by any
pool, we do not need clients to support the mapping behavior.  This is true
because the encoding has not changed; only the expected CRUSH output.
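The rule can be sketched as follows. This is a minimal illustration in Python, not the OSDMap code; the feature bit values and the function name are hypothetical, and pools are modeled as a plain mapping from pool name to CRUSH rule id.

```python
# Hypothetical feature bits, for illustration only (not Ceph's real values).
FEATURE_CRUSH_V2 = 1 << 0
FEATURE_CRUSH_V3 = 1 << 1


def required_crush_features(rule_features, pools):
    """Return the union of feature bits for rules that some pool references.

    rule_features: dict of rule id -> feature bitmask the rule needs
    pools:         dict of pool name -> rule id used by that pool
    """
    required = 0
    for rule_id in set(pools.values()):
        required |= rule_features.get(rule_id, 0)
    return required


# Rule 1 uses newer CRUSH behavior, but only counts once a pool uses it.
rule_features = {0: 0, 1: FEATURE_CRUSH_V2 | FEATURE_CRUSH_V3}
unused = required_crush_features(rule_features, {"rbd": 0})
used = required_crush_features(rule_features, {"rbd": 0, "ecpool": 1})
```

With only the "rbd" pool present, no extra feature is required even though rule 1 exists in the map.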

Fixes: #8963
Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agocrush: add is_v[23]_rule(ruleid) methods
Sage Weil [Fri, 15 Aug 2014 15:52:37 +0000 (08:52 -0700)]
crush: add is_v[23]_rule(ruleid) methods

Add methods to check if a *specific* rule uses v2 or v3 features.  Refactor
the existing checks to use these.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2213 from dachary/wip-9025-chunk-remapping
Loic Dachary [Fri, 15 Aug 2014 10:43:03 +0000 (12:43 +0200)]
Merge pull request #2213 from dachary/wip-9025-chunk-remapping

erasure-code: chunk remapping

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agomon/Paxos: share state and verify contiguity early in collect phase 2271/head
Sage Weil [Wed, 13 Aug 2014 23:17:02 +0000 (16:17 -0700)]
mon/Paxos: share state and verify contiguity early in collect phase

We verify peons are contiguous and share new paxos states to catch peons
up at the end of the round.  Do this each time we (potentially) get new
states via a collect message.  This will allow peons to be pulled forward
and remain contiguous when they otherwise would not have been able to.
For example, if

  mon.0 (leader)  20..30
  mon.1 (peon)    15..25
  mon.2 (peon)    28..40

If we got mon.1 first and then mon.2 second, we would store the new txns
and then boot mon.1 out at the end because 15..25 is not contiguous with
28..40.  However, with this change, we share 26..30 to mon.1 when we get
the collect, and then 31..40 when we get mon.2's collect, pulling them
both into the final quorum.

It also breaks the 'catch-up' work into smaller pieces, which ought to
smooth out latency a bit.
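The example above can be replayed with a small model. This is a sketch under simplifying assumptions, not the Paxos code: only last-committed versions are tracked, the function name is made up, and "sharing" is recorded as a (peon, from, to) tuple.

```python
def collect_with_early_sharing(leader_last, collects):
    """Share missing versions with every known peon after each collect reply.

    leader_last: the leader's last committed version
    collects:    list of (peon name, peon's last committed version),
                 in the order the collect replies arrive
    """
    peon_last = {}
    shares = []
    for name, last in collects:
        peon_last[name] = last
        # Learn any newer states the peon has.
        leader_last = max(leader_last, last)
        # Catch up every peon seen so far, not just the one that replied.
        for peon in sorted(peon_last):
            if peon_last[peon] < leader_last:
                shares.append((peon, peon_last[peon] + 1, leader_last))
                peon_last[peon] = leader_last
    return leader_last, shares


# mon.0 (leader) ends at 30, mon.1 at 25, mon.2 at 40.
final_last, shares = collect_with_early_sharing(30, [("mon.1", 25), ("mon.2", 40)])
```

In this model mon.1 receives 26..30 and then 31..40, so it is still contiguous with the leader when the round ends.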

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon/Paxos: verify all new peons are still contiguous at end of round
Sage Weil [Thu, 14 Aug 2014 23:55:58 +0000 (16:55 -0700)]
mon/Paxos: verify all new peons are still contiguous at end of round

During the collect phase we verify that each peon has overlapping or
contiguous versions as us (and can therefore be caught up with some
series of transactions).  However, we *also* assimilate any new states we
get from those peers, and that may move our own first_committed forward
in time.  This means that an early responder might have originally been
contiguous, but a later one moved us forward, and when the round finished
they were not contiguous any more.  This leads to a crash on the peon
when they get our first begin message.

For example:

 - we have 10..20
 - first peon has 5..15
   - ok!
 - second peon has 18..30
   - we apply this state
 - we are now 18..30
 - we finish the round
   - send commit to first peon (empty.. we aren't contiguous)
   - send no commit to second peon (we match)
 - we send a begin for state 31
   - first peon crashes (its lc is still 15)

Prevent this by checking at the end of the round if we are still
contiguous.  If not, bootstrap.  This is similar to the check we do above,
but reverse to make sure *we* aren't too far ahead of *them*.
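A minimal sketch of the end-of-round check, replaying the example above. It is a simplified model and not the monitor code: version ranges are (first, last) tuples, the function name is hypothetical, and assimilating a peer's state is modeled as adopting its range where it is newer.

```python
def run_collect_round(leader, peons):
    """Return the leader's final range and the peons it has left behind.

    leader: (first_committed, last_committed) of the leader
    peons:  list of (name, (first_committed, last_committed)) in reply order
    """
    first, last = leader
    accepted = []
    for name, (p_first, p_last) in peons:
        # Check during collect: ranges must overlap or be contiguous.
        if p_last < first - 1 or p_first > last + 1:
            continue
        accepted.append((name, p_last))
        # Assimilating newer state may move our own first_committed forward.
        if p_last > last:
            first, last = max(first, p_first), p_last
    # New check at the end of the round: are *we* too far ahead of *them*?
    too_far_behind = [name for name, p_last in accepted if p_last < first - 1]
    return (first, last), too_far_behind


final_range, stale = run_collect_round(
    (10, 20), [("first peon", (5, 15)), ("second peon", (18, 30))]
)
```

A non-empty result for the stale list is the condition under which the commit bootstraps instead of sending a begin.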

Fixes: #9053
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoerasure-code: remap chunks if not sequential 2213/head
Loic Dachary [Tue, 3 Jun 2014 17:27:26 +0000 (19:27 +0200)]
erasure-code: remap chunks if not sequential

If the remap vector is not empty, use it to figure out the sequence of
data chunks.
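As an illustration only (the function name and data layout are hypothetical, and the real code works on buffers rather than Python bytes), reading the data chunks in order with an optional remap vector might look like this:

```python
def data_chunks_in_order(chunks, chunk_mapping, k):
    """Concatenate the k data chunks in their logical order.

    chunks:        dict of chunk position -> bytes
    chunk_mapping: list of positions holding data chunks, or empty when
                   chunks are stored sequentially
    k:             number of data chunks
    """
    if chunk_mapping:
        positions = chunk_mapping[:k]
    else:
        positions = range(k)
    return b"".join(chunks[p] for p in positions)


chunks = {0: b"P", 1: b"a", 2: b"b", 3: b"c"}
remapped = data_chunks_in_order(chunks, [1, 2, 3], 3)
sequential = data_chunks_in_order(chunks, [], 3)
```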

http://tracker.ceph.com/issues/9025 Fixes: #9025

Signed-off-by: Loic Dachary <loic@dachary.org>
10 years agoerasure-code: parse function for the mapping parameter
Loic Dachary [Tue, 3 Jun 2014 20:20:29 +0000 (22:20 +0200)]
erasure-code: parse function for the mapping parameter

Each D letter is a data chunk. For instance:

    _DDD_DDD

is going to parse into:

   [ 1, 2, 3, 5, 6, 7 ]

the 0 and 4 positions are not used by chunks and do not show in the
mapping. Implement ErasureCode::parse to support a reasonable default
for the mapping parameter.
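The parsing described above can be sketched in a few lines of Python. This mirrors only the example in this message; the function name is an assumption and the actual ErasureCode::parse is C++.

```python
def parse_mapping(mapping):
    """Return the positions of the data chunks ('D') in a mapping string."""
    return [position for position, letter in enumerate(mapping) if letter == "D"]


positions = parse_mapping("_DDD_DDD")
```

Positions 0 and 4 hold no data chunk, so they are absent from the result.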

Signed-off-by: Loic Dachary <loic@dachary.org>
10 years agoerasure-code: ErasureCodeInterface::get_chunk_mapping()
Loic Dachary [Tue, 3 Jun 2014 15:45:47 +0000 (17:45 +0200)]
erasure-code: ErasureCodeInterface::get_chunk_mapping()

Add support for erasure code plugins that do not sequentially map the
chunks encoded to the corresponding index. This is mostly transparent to
the caller, except when it comes to retrieving the data chunks when
reading. For this purpose there needs to be a remapping function so the
caller has a way to figure out which chunks actually contain the data
and reorder them.

Signed-off-by: Loic Dachary <loic@dachary.org>
10 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Thu, 14 Aug 2014 23:02:22 +0000 (16:02 -0700)]
Merge remote-tracking branch 'gh/next'

10 years agoFileStore: Introduced a RLock instead of WLock 2070/head
Somnath Roy [Thu, 31 Jul 2014 22:03:53 +0000 (15:03 -0700)]
FileStore: Introduced a RLock instead of WLock

While calling index->collection_version, there is no need to
hold WLock at the index level. RLock should be sufficient.

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoFileStore: No need to hold Index lock during omap calls
Somnath Roy [Thu, 31 Jul 2014 21:56:42 +0000 (14:56 -0700)]
FileStore: No need to hold Index lock during omap calls

The Index lock is held during all the omap calls which is
not necessary.

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoFileStore: FDCache lookup is rearranged
Somnath Roy [Mon, 30 Jun 2014 08:54:36 +0000 (01:54 -0700)]
FileStore: FDCache lookup is rearranged

In lfn_open() there is no point in building the Index if the
cache lookup is successful and the caller is not asking for the Index.

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoFileStore: Index caching is introduced for performance improvement
Somnath Roy [Mon, 30 Jun 2014 08:28:07 +0000 (01:28 -0700)]
FileStore: Index caching is introduced for performance improvement

IndexManager now caches Index objects. An Index is only created if it is
not found in the cache. Previously, each op created an Index object, and
other ops requesting the same index had to wait until the previous op was
done. Also, the Index object was destroyed once the lookup finished.
An Index cache is now implemented to persist these Indexes, since creating
and destroying them for each op was a major performance hit. An RWlock has
been introduced in the CollectionIndex class; it is responsible for
synchronization between lookup and create.
Also, since these Index objects are persistent, there is no need to use
smart pointers, so Index is now a wrapper class around CollectionIndex*.
Users of Index are now responsible for locking explicitly before using it.
The Index object is now sufficient for locking; there is no need to hold
an IndexPath for locking. The function interfaces of lfn_open and lfn_find
are changed accordingly.

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoshared_cache: pass key (K) by const ref in interface methods
Somnath Roy [Mon, 30 Jun 2014 07:24:39 +0000 (00:24 -0700)]
shared_cache: pass key (K) by const ref in interface methods

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoFileStore: remove the fdcache_lock
Greg Farnum [Thu, 30 Jan 2014 22:27:04 +0000 (14:27 -0800)]
FileStore: remove the fdcache_lock

With the changes to the shared_cache, we no longer need the fdcache_lock
to prevent us from inserting a second fd for the same hobject into the cache.

Signed-off-by: Greg Farnum <greg@inktank.com>
Merged conflict fixed.

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
Conflicts:
src/os/FileStore.cc

10 years agoFDCache: implement a basic sharding of the FDCache
Greg Farnum [Mon, 3 Feb 2014 22:36:02 +0000 (14:36 -0800)]
FDCache: implement a basic sharding of the FDCache

This is just a basic sharding. A more sophisticated implementation would
rely on something other than luck for keeping the distribution equitable.
The minimum FDCache shard size is 1.
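A basic sharding of this kind can be sketched as follows. This is an illustration in Python, not the FDCache code: the class name is made up, each shard is a plain dict rather than an LRU, and the last sentence above is read here as "there is always at least one shard".

```python
class ShardedCache:
    """Spread keys over several independent shards chosen by hash."""

    def __init__(self, num_shards):
        # Never allow fewer than one shard.
        self.shards = [dict() for _ in range(max(1, num_shards))]

    def _shard(self, key):
        # Distribution depends entirely on how well the hash spreads keys.
        return self.shards[hash(key) % len(self.shards)]

    def add(self, key, value):
        self._shard(key)[key] = value

    def lookup(self, key):
        return self._shard(key).get(key)


cache = ShardedCache(4)
cache.add("object-a", 11)
cache.add("object-b", 12)
```

In the real cache each shard would have its own lock, so ops on different shards do not contend.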

Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoshared_cache: expose prior existence when inserting an element
Greg Farnum [Thu, 30 Jan 2014 22:21:52 +0000 (14:21 -0800)]
shared_cache: expose prior existence when inserting an element

The LRU now handles you attempting to insert multiple values for the
same key, by telling you that you've done so and returning the
existing value before it manages to muck up existing data.
The param 'existed' is not mandatory, default value is NULL.
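The behavior can be illustrated with a small Python model. It is not the shared_cache interface: the class name is hypothetical, and since Python has no out-parameters the 'existed' flag is returned alongside the value.

```python
class KeepFirstCache:
    """Cache that refuses to replace an existing value for a key."""

    def __init__(self):
        self._items = {}

    def add(self, key, value):
        """Insert value unless key is present; return (stored value, existed)."""
        if key in self._items:
            # Hand back the existing value and leave the cache untouched.
            return self._items[key], True
        self._items[key] = value
        return value, False


cache = KeepFirstCache()
first = cache.add("oid", "fd-1")
second = cache.add("oid", "fd-2")
```

The second caller learns that it lost the race and can discard its own value.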

Signed-off-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
10 years agoMerge pull request #2235 from kevincox/wireshark
Sage Weil [Thu, 14 Aug 2014 20:50:04 +0000 (13:50 -0700)]
Merge pull request #2235 from kevincox/wireshark

doc: Add documentation about Wireshark dissector.

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agorgw_admin: add --min-rewrite-stripe-size for object rewrite 2259/head
Yehuda Sadeh [Wed, 13 Aug 2014 01:30:03 +0000 (18:30 -0700)]
rgw_admin: add --min-rewrite-stripe-size for object rewrite

A new param to check whether the object requires restriping, by
checking whether a specific object stripe is bigger than the specified
size. By default it is set to 0, and in that case it'll always be
restriped. Having it set to 4M + 1 will make sure that only the objects
that weren't striped before (using default settings) will be restriped.
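One way to read the rule, as a sketch only: the function name is hypothetical, the 4M default stripe size is taken from the text above, and the comparison is assumed to be "at least the threshold", which is what makes 4M + 1 skip default-striped objects.

```python
DEFAULT_STRIPE_SIZE = 4 * 1024 * 1024  # 4M, per the default settings


def needs_restripe(stripe_size, min_rewrite_stripe_size=0):
    """Decide whether an object with the given stripe size gets restriped."""
    if min_rewrite_stripe_size == 0:
        # Default: always restripe.
        return True
    return stripe_size >= min_rewrite_stripe_size


threshold = DEFAULT_STRIPE_SIZE + 1
already_striped = needs_restripe(DEFAULT_STRIPE_SIZE, threshold)
never_striped = needs_restripe(10 * 1024 * 1024, threshold)
```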

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agodoc: Add documentation about Wireshark dissector. 2235/head
Kevin Cox [Thu, 14 Aug 2014 20:42:56 +0000 (16:42 -0400)]
doc: Add documentation about Wireshark dissector.

Signed-Off-By: Kevin Cox <kevincox@kevincox.ca>
10 years agorgw: fix compilation
Yehuda Sadeh [Thu, 14 Aug 2014 20:35:12 +0000 (13:35 -0700)]
rgw: fix compilation

RGWRadosPutObj couldn't refer to the ceph context.

Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agoshared_cache: use a single lookup for lookup() too
Greg Farnum [Thu, 30 Jan 2014 21:47:22 +0000 (13:47 -0800)]
shared_cache: use a single lookup for lookup() too

We didn't convert this one to use iterators before.

Signed-off-by: Greg Farnum <greg@inktank.com>
10 years agoqa/workunits/cephtool: verify setmaxosd doesn't let you clobber osds
Sage Weil [Thu, 14 Aug 2014 20:18:07 +0000 (13:18 -0700)]
qa/workunits/cephtool: verify setmaxosd doesn't let you clobber osds

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoOSDMonitor: Do not allow OSD removal using setmaxosd
Anand Bhat [Thu, 14 Aug 2014 04:22:56 +0000 (09:52 +0530)]
OSDMonitor: Do not allow OSD removal using setmaxosd

Description: Currently the setmaxosd command allows removal of OSDs by
providing a number less than the current max OSD number. This abruptly
removes OSDs, causing data loss as well as a kernel panic when kernel RBDs
are involved. The fix is to refuse the change if any OSD in the range
between the new max OSD number and the current max OSD number is part of
the cluster.
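The check can be sketched as below. This is an illustration in Python, not the OSDMonitor code; the function name is made up and the cluster's OSDs are modeled as a set of integer ids.

```python
def can_set_max_osd(new_max, current_max, existing_osds):
    """Allow the change unless it would cut off an OSD that exists.

    OSD ids are valid in the range [0, max), so shrinking from current_max
    to new_max drops the ids in [new_max, current_max).
    """
    return not any(new_max <= osd < current_max for osd in existing_osds)


existing = {0, 1, 2, 5}
shrink_ok = can_set_max_osd(6, 10, existing)
shrink_refused = can_set_max_osd(4, 10, existing)
```

Growing the maximum never removes anything, so it is always allowed in this model.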

Fixes: #8865
Signed-off-by: Anand Bhat <anand.bhat@sandisk.com>
10 years agorgw: pass set_mtime to copy_obj_data()
Yehuda Sadeh [Tue, 12 Aug 2014 23:33:57 +0000 (16:33 -0700)]
rgw: pass set_mtime to copy_obj_data()

Sometimes we need to set the mtime when copying object data (e.g., when
we rewrite the obj).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agorgw: copy_obj_data() uses atomic processor
Yehuda Sadeh [Tue, 12 Aug 2014 21:23:46 +0000 (14:23 -0700)]
rgw: copy_obj_data() uses atomic processor

Fixes: #9089
copy_obj_data was not using the current object write infrastructure,
which means that the end objects weren't striped.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agoMerge pull request #2257 from ceph/wip-8784
Sage Weil [Thu, 14 Aug 2014 18:27:13 +0000 (11:27 -0700)]
Merge pull request #2257 from ceph/wip-8784

rgw: call throttle_data() even if renew_state() failed

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agorgw: copy object data if target bucket is in a different pool
Yehuda Sadeh [Tue, 12 Aug 2014 20:36:11 +0000 (13:36 -0700)]
rgw: copy object data if target bucket is in a different pool

Fixes: #9039
Backport: firefly

The new manifest does not provide a way to put the head and the tail in
separate pools. In any case, if an object is copied between buckets in
different pools, we may really just want the object to be copied, rather
than reference counted.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
10 years agoMerge pull request #2251 from ceph/wip-9102
Sage Weil [Thu, 14 Aug 2014 15:36:29 +0000 (08:36 -0700)]
Merge pull request #2251 from ceph/wip-9102

ceph-disk: linter cleanup

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2255 from ceph/wip-9062
Sage Weil [Thu, 14 Aug 2014 13:50:07 +0000 (06:50 -0700)]
Merge pull request #2255 from ceph/wip-9062

msg/PipeConnection: make methods behave on 'anon' connection

Reviewed-by: John Spray <john.spray@redhat.com>
10 years agoMerge remote-tracking branch 'origin/next'
John Spray [Thu, 14 Aug 2014 13:44:06 +0000 (14:44 +0100)]
Merge remote-tracking branch 'origin/next'

10 years agoMerge pull request #2254 from ceph/wip-8725
John Spray [Thu, 14 Aug 2014 13:29:40 +0000 (14:29 +0100)]
Merge pull request #2254 from ceph/wip-8725

mds: fix MDSMap encoding to be backward-compatible

Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: John Spray <john.spray@redhat.com>
10 years agodoc: update kernel recommendations (avoid 3.15!)
Sage Weil [Thu, 14 Aug 2014 13:09:50 +0000 (06:09 -0700)]
doc: update kernel recommendations (avoid 3.15!)

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomsg/PipeConnection: make methods behave on 'anon' connection 2255/head
Sage Weil [Thu, 14 Aug 2014 00:52:25 +0000 (17:52 -0700)]
msg/PipeConnection: make methods behave on 'anon' connection

The monitor does a create_anon_connection() to create a pseudo Connection
object for forwarded messages.  If we try to call mark_down or similar
on one of these we should silently ignore the operation, not crash.

If we try to send a message, still crash (explicitly assert); the caller
should probably know better.

Fixes: #9062
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon/Paxos: put source mon id in a temp variable
Sage Weil [Wed, 13 Aug 2014 23:01:01 +0000 (16:01 -0700)]
mon/Paxos: put source mon id in a temp variable

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomds/MDSMap: fix incompat version for encoding 2254/head
Sage Weil [Wed, 13 Aug 2014 22:05:05 +0000 (15:05 -0700)]
mds/MDSMap: fix incompat version for encoding

Back in 8f7900a09c8e490c9cd3a6f92ed1f0eb1f47f2a9 we added the new fields
before the 'extended' section, which made the encoding incompatible.
Instead, add them at the end--old clients don't care whether the enabled
flag is set or what the 'fs name' is.

Fixes: #8725
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomds/MDSMap: drop trailing else in favor of early return
Sage Weil [Wed, 13 Aug 2014 22:03:03 +0000 (15:03 -0700)]
mds/MDSMap: drop trailing else in favor of early return

This keeps the old-version special cases in one place and makes it obvious
what the current/forward-looking path is.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge remote-tracking branch 'upstream/next'
Samuel Just [Wed, 13 Aug 2014 21:11:12 +0000 (14:11 -0700)]
Merge remote-tracking branch 'upstream/next'

10 years agoMerge pull request #2252 from ceph/wip-9087
Samuel Just [Wed, 13 Aug 2014 21:10:45 +0000 (14:10 -0700)]
Merge pull request #2252 from ceph/wip-9087

test/system/systest_runnable.cc: debugging on start and end

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge remote-tracking branch 'gh/next'
Sage Weil [Wed, 13 Aug 2014 21:10:31 +0000 (14:10 -0700)]
Merge remote-tracking branch 'gh/next'

10 years agotest/system/systest_runnable.cc: debugging on start and end 2252/head
Samuel Just [Wed, 13 Aug 2014 20:57:13 +0000 (13:57 -0700)]
test/system/systest_runnable.cc: debugging on start and end

Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoceph-disk: linter cleanup 2251/head
Alfredo Deza [Wed, 13 Aug 2014 19:50:20 +0000 (15:50 -0400)]
ceph-disk: linter cleanup

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
10 years agomon: fix divide by zero when pg_num adjusted and no osds
Sage Weil [Wed, 13 Aug 2014 20:31:10 +0000 (13:31 -0700)]
mon: fix divide by zero when pg_num adjusted and no osds

Fixes: #9052
Backport: firefly, dumpling
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon: fix potential divide by zero on can_mark_{down,out}
Sage Weil [Wed, 13 Aug 2014 20:15:04 +0000 (13:15 -0700)]
mon: fix potential divide by zero on can_mark_{down,out}

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon: fix divide by zero when pg_num adjusted and no osds
Sage Weil [Wed, 13 Aug 2014 20:15:36 +0000 (13:15 -0700)]
mon: fix divide by zero when pg_num adjusted and no osds

Fixes: #9101
Backport: firefly, dumpling
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agomon: fix potential divide by zero on can_mark_{down,out}
Sage Weil [Wed, 13 Aug 2014 20:15:04 +0000 (13:15 -0700)]
mon: fix potential divide by zero on can_mark_{down,out}

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2236 from ceph/wip-9055
Sage Weil [Wed, 13 Aug 2014 19:54:40 +0000 (12:54 -0700)]
Merge pull request #2236 from ceph/wip-9055

ceph_test_rados_api_tier: fix cache pool cleanup during test

Reviewed-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2222 from ceph/wip-9029
Sage Weil [Wed, 13 Aug 2014 19:40:58 +0000 (12:40 -0700)]
Merge pull request #2222 from ceph/wip-9029

mds: Make min/max UID configurable for who is allowed to create a snapsh...

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agomds: Revert from mds_mksnap_ setting to mds_snap_ settings 2222/head
Wido den Hollander [Wed, 13 Aug 2014 19:07:59 +0000 (21:07 +0200)]
mds: Revert from mds_mksnap_ setting to mds_snap_ settings

10 years agoceph-disk: warn about falling back to sgdisk (once) 2247/head
Sage Weil [Wed, 13 Aug 2014 19:00:50 +0000 (12:00 -0700)]
ceph-disk: warn about falling back to sgdisk (once)

This way the user knows something funny might be up if dmcrypt is in use.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph-disk: only fall back to sgdisk for 'list' if blkid seems old
Sage Weil [Wed, 13 Aug 2014 18:40:34 +0000 (11:40 -0700)]
ceph-disk: only fall back to sgdisk for 'list' if blkid seems old

If the blkid doesn't show us any ID_PART_ENTRY_* fields but we know it is
a GPT partition, *then* fall back.  Otherwise, don't bother.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph-disk: add get_partition_base() helper
Sage Weil [Wed, 13 Aug 2014 18:39:47 +0000 (11:39 -0700)]
ceph-disk: add get_partition_base() helper

Return the base devices/disk for a partition device.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph-disk: display information about dmcrypted data and journal volumes
Sage Weil [Wed, 13 Aug 2014 00:26:07 +0000 (17:26 -0700)]
ceph-disk: display information about dmcrypted data and journal volumes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2249 from ceph/wip-9096
Samuel Just [Wed, 13 Aug 2014 17:48:32 +0000 (10:48 -0700)]
Merge pull request #2249 from ceph/wip-9096

osd: fix require_same_peer_instance from fast_dispatch

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Loic Dachary <loic@dachary.org>
10 years agoosd/ReplicatedPG: only do agent mode calculations for positive values
Sage Weil [Wed, 13 Aug 2014 17:34:53 +0000 (10:34 -0700)]
osd/ReplicatedPG: only do agent mode calculations for positive values

After a split we can get negative values here.  Only do the arithmetic if
we have a valid (positive) value that won't throw the floating point
unit for a loop.

Fixes: #9082
Tested-by: Karan Singh <karan.singh@csc.fi>
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: fix some line wrapping 2249/head
Sage Weil [Wed, 13 Aug 2014 16:38:07 +0000 (09:38 -0700)]
osd: fix some line wrapping

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: fix require_same_peer_instance from fast_dispatch
Sage Weil [Wed, 13 Aug 2014 15:30:25 +0000 (08:30 -0700)]
osd: fix require_same_peer_instance from fast_dispatch

The mark-down of old peers needs to take the session_dispatch_lock in order
to safely clear the Session ref cycle.  However, for fast dispatch callers,
that lock is already held.  Pass a flag down from the callers indicating
whether we need to take the additional lock.

Fixes: #9096
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoosd: inline require_osd_up_peer
Sage Weil [Wed, 13 Aug 2014 15:20:42 +0000 (08:20 -0700)]
osd: inline require_osd_up_peer

There is only one caller.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2233 from majianpeng/fix1
Sage Weil [Wed, 13 Aug 2014 04:22:25 +0000 (21:22 -0700)]
Merge pull request #2233 from majianpeng/fix1

os/chain_xattr: Remove all old xattr entries when overwriting the xattr.

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2230 from ceph/wip-fsx-flatten
Sage Weil [Wed, 13 Aug 2014 04:17:12 +0000 (21:17 -0700)]
Merge pull request #2230 from ceph/wip-fsx-flatten

test_librbd_fsx: also flatten as part of randomize_parent_overlap

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2234 from kevincox/net-docs
Sage Weil [Wed, 13 Aug 2014 04:14:10 +0000 (21:14 -0700)]
Merge pull request #2234 from kevincox/net-docs

doc: Initial network docs.

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2237 from ceph/wip-8560
Sage Weil [Wed, 13 Aug 2014 04:06:16 +0000 (21:06 -0700)]
Merge pull request #2237 from ceph/wip-8560

mon: instrument paxos

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
10 years agomon/Paxos: add perfcounters for most paxos operations 2237/head
Sage Weil [Sun, 10 Aug 2014 21:41:19 +0000 (14:41 -0700)]
mon/Paxos: add perfcounters for most paxos operations

I'm focusing primarily on the ones that result in IO here.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2242 from majianpeng/fix4
Sage Weil [Wed, 13 Aug 2014 04:01:09 +0000 (21:01 -0700)]
Merge pull request #2242 from majianpeng/fix4

utime: class Clock doesn't exist, so remove its declaration in class utime_t

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoceph-disk: move fs mount probe into a helper
Sage Weil [Wed, 13 Aug 2014 00:25:42 +0000 (17:25 -0700)]
ceph-disk: move fs mount probe into a helper

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoceph-disk: use partition type UUIDs, and blkid
Sage Weil [Wed, 13 Aug 2014 00:25:10 +0000 (17:25 -0700)]
ceph-disk: use partition type UUIDs, and blkid

Use blkid to give us the GPT partition type.  This lets us distinguish
between dmcrypt and non-dmcrypt partitions.  Fake it if blkid doesn't
give us what we want and try with sgdisk.  This isn't perfect (it can't
tell between dmcrypt and not dmcrypt), but such is life, and we are better
off than before.

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoRevert "ReplicatedPG: do not pass cop into C_Copyfrom"
Samuel Just [Tue, 12 Aug 2014 23:34:30 +0000 (16:34 -0700)]
Revert "ReplicatedPG: do not pass cop into C_Copyfrom"

The ref was introduced in 589b639af7c8834a1e6293d58d77a9c440107bc3
and is actually necessary to keep the buffers around.

This reverts commit 300b5e8ecbb7526b55e2cb5eeba81fd501a8b652.

10 years agoReplicatedPG: do not pass cop into C_Copyfrom
Samuel Just [Tue, 12 Aug 2014 19:20:28 +0000 (12:20 -0700)]
ReplicatedPG: do not pass cop into C_Copyfrom

We do not know when the objecter will finally let go of this Context.  Thus, we
cannot know whether it will happen before the flush, at which point the
object_context held by the cop must have been released.

Also, we simply don't need it, process_copy_chunk already works in terms of the
tid!

Fixes: #8894
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2246 from ceph/wip-9064
Sage Weil [Tue, 12 Aug 2014 22:50:59 +0000 (15:50 -0700)]
Merge pull request #2246 from ceph/wip-9064

ReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered 2246/head
Samuel Just [Tue, 12 Aug 2014 22:24:26 +0000 (15:24 -0700)]
ReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered

We cannot redirect a RW ordered read.

Fixes: #9064
Introduced: 0ed3adc1e0a74bf9548d1d956aece11f019afee0
Signed-off-by: Samuel Just <sam.just@inktank.com>
10 years agoMerge pull request #2245 from dachary/wip-9085-isa-link
Sage Weil [Tue, 12 Aug 2014 21:37:28 +0000 (14:37 -0700)]
Merge pull request #2245 from dachary/wip-9085-isa-link

erasure-code: isa plugin must link with ErasureCode.cc

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoerasure-code: isa plugin must link with ErasureCode.cc 2245/head
Loic Dachary [Tue, 12 Aug 2014 16:46:29 +0000 (18:46 +0200)]
erasure-code: isa plugin must link with ErasureCode.cc

Otherwise it will not get the methods it needs. A test is added to check
the plugin loads as expected, from the command line. The test is not run
if the isa plugin is not found, which happens on platforms that are not
supported.

Signed-off-by: Loic Dachary <loic@dachary.org>
10 years agoceph-disk: fix log syntax error
Sage Weil [Tue, 12 Aug 2014 20:53:16 +0000 (13:53 -0700)]
ceph-disk: fix log syntax error

  File "/usr/sbin/ceph-disk", line 303, in command_check_call
    LOG.info('Running command: %s' % ' '.join(arguments))
TypeError: sequence item 2: expected string, NoneType found
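The failure above comes from joining a list that contains None. The sketch below reproduces it and shows one defensive way to format the log line; it is an assumption for illustration and not necessarily what this commit changed, and the helper name and sample arguments are made up.

```python
def format_command(arguments):
    """Build the log line, converting every argument to a string first."""
    return 'Running command: %s' % ' '.join(str(a) for a in arguments)


arguments = ['sgdisk', '--new', None]

# A plain join raises TypeError as soon as it meets a non-string item.
try:
    ' '.join(arguments)
    plain_join_failed = False
except TypeError:
    plain_join_failed = True

line = format_command(arguments)
```

Logging "None" keeps the process alive but can hide the real bug, which is the caller passing None as an argument in the first place.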

Backport: firefly
Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge pull request #2239 from ceph/wip-8912
Sage Weil [Tue, 12 Aug 2014 19:41:31 +0000 (12:41 -0700)]
Merge pull request #2239 from ceph/wip-8912

librbd: fix error path cleanup for opening an image

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agodoc/changelog: v0.67.10 notes
Sage Weil [Tue, 12 Aug 2014 19:36:47 +0000 (12:36 -0700)]
doc/changelog: v0.67.10 notes

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agoMerge branch 'wip-8860'
Sage Weil [Tue, 12 Aug 2014 19:22:31 +0000 (12:22 -0700)]
Merge branch 'wip-8860'

Reviewed-by: Sage Weil <sage@redhat.com>
10 years agoinit-ceph: conditionally update after argparsing
Alfredo Deza [Fri, 8 Aug 2014 14:16:20 +0000 (10:16 -0400)]
init-ceph: conditionally update after argparsing

Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
10 years agodoc/release-notes: v0.67.10
Sage Weil [Tue, 12 Aug 2014 18:30:48 +0000 (11:30 -0700)]
doc/release-notes: v0.67.10

Signed-off-by: Sage Weil <sage@redhat.com>
10 years agorgw: call throttle_data() even if renew_state() failed 2257/head
Yehuda Sadeh [Tue, 12 Aug 2014 18:17:47 +0000 (11:17 -0700)]
rgw: call throttle_data() even if renew_state() failed

Otherwise we're going to leak the aio callback handle.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>