git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
10 years ago Objecter: failed assert(tick_event==NULL) at osdc/Objecter.cc 4175/head
Zhiqiang Wang [Wed, 25 Mar 2015 08:32:44 +0000 (16:32 +0800)]
Objecter: failed assert(tick_event==NULL) at osdc/Objecter.cc

When the Objecter timer erases the tick_event from its events queue and
calls tick() to dispatch it, if the Objecter::rwlock is held by shutdown(),
it waits there to get the rwlock. However, inside the shutdown function,
it checks the tick_event and tries to cancel it. The cancel_event function
returns false since tick_event is already removed from the events queue. Thus
tick_event is not set to NULL in shutdown(). Later the tick function returns
earlier and doesn't set tick_event to NULL either. This leads to the assertion
failure.

This is a regression introduced by an incorrect conflict resolution when
d790833 was backported.
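
The sequence described above can be replayed deterministically in a small
model. This is only a sketch: ToyTimer, ToyObjecter and the
clear_on_early_return switch are hypothetical names, not Ceph code, and the
switch shows one plausible way to avoid the stale pointer rather than the
change that was actually committed.

```python
class ToyTimer:
    """Hypothetical stand-in for the timer's event queue."""

    def __init__(self):
        self.events = set()

    def add_event(self, event):
        self.events.add(event)

    def cancel_event(self, event):
        # Cancelling only succeeds while the event is still queued.
        if event in self.events:
            self.events.remove(event)
            return True
        return False


class ToyObjecter:
    """Hypothetical model of the shutdown()/tick() interaction."""

    def __init__(self, timer, clear_on_early_return):
        self.timer = timer
        self.clear_on_early_return = clear_on_early_return
        self.initialized = True
        self.tick_event = object()
        timer.add_event(self.tick_event)

    def shutdown(self):
        self.initialized = False
        # tick_event is only cleared when the cancel actually succeeds.
        if self.tick_event is not None and self.timer.cancel_event(self.tick_event):
            self.tick_event = None

    def tick(self):
        if not self.initialized:
            # Early return after shutdown; optionally clear the stale event.
            if self.clear_on_early_return:
                self.tick_event = None
            return
        self.tick_event = None  # normal path; rescheduling is omitted


def replay_race(clear_on_early_return):
    """Return True if tick_event ends up cleared after the race."""
    timer = ToyTimer()
    objecter = ToyObjecter(timer, clear_on_early_return)
    # 1. The timer dequeues the event before dispatching it.
    timer.events.remove(objecter.tick_event)
    # 2. shutdown() gets the lock first; its cancel_event() call returns False.
    objecter.shutdown()
    # 3. tick() finally runs and returns early.
    objecter.tick()
    return objecter.tick_event is None
```

With clear_on_early_return disabled the model ends with a non-NULL
tick_event, which is the state the assertion rejects.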

Fixes: #11183
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
10 years ago Merge pull request #4127 from dzafman/wip-11176-giant
Loic Dachary [Mon, 23 Mar 2015 19:39:26 +0000 (20:39 +0100)]
Merge pull request #4127 from dzafman/wip-11176-giant

ceph-objectstore-tool: Output only unsupported features when incompatible

Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years ago Merge pull request #4097 from dachary/wip-10497-giant
Loic Dachary [Sun, 22 Mar 2015 22:11:46 +0000 (23:11 +0100)]
Merge pull request #4097 from dachary/wip-10497-giant

librados: c api does not translate op flag

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4096 from dachary/wip-9617-giant
Loic Dachary [Sun, 22 Mar 2015 22:11:26 +0000 (23:11 +0100)]
Merge pull request #4096 from dachary/wip-9617-giant

objecter shutdown races with msg dispatch

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4095 from dachary/wip-9675-giant
Loic Dachary [Sun, 22 Mar 2015 22:11:03 +0000 (23:11 +0100)]
Merge pull request #4095 from dachary/wip-9675-giant

splitting a pool doesn't start when rule_id != ruleset_id

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4094 from dachary/wip-9891-giant
Loic Dachary [Sun, 22 Mar 2015 22:10:42 +0000 (23:10 +0100)]
Merge pull request #4094 from dachary/wip-9891-giant

Assertion: os/DBObjectMap.cc: 1214: FAILED assert(0)

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4093 from dachary/wip-9915-giant
Loic Dachary [Sun, 22 Mar 2015 22:10:25 +0000 (23:10 +0100)]
Merge pull request #4093 from dachary/wip-9915-giant

osd: eviction logic reversed

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4092 from dachary/wip-9985-giant
Loic Dachary [Sun, 22 Mar 2015 22:09:28 +0000 (23:09 +0100)]
Merge pull request #4092 from dachary/wip-9985-giant

osd: incorrect atime calculation

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4091 from dachary/wip-9986-giant
Loic Dachary [Sun, 22 Mar 2015 22:08:41 +0000 (23:08 +0100)]
Merge pull request #4091 from dachary/wip-9986-giant

objecter: map epoch skipping broken

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4090 from dachary/wip-10059-giant
Loic Dachary [Sun, 22 Mar 2015 22:08:16 +0000 (23:08 +0100)]
Merge pull request #4090 from dachary/wip-10059-giant

osd/ECBackend.cc: 876: FAILED assert(0)

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4089 from dachary/wip-10080-giant
Loic Dachary [Sun, 22 Mar 2015 22:07:52 +0000 (23:07 +0100)]
Merge pull request #4089 from dachary/wip-10080-giant

Pipe::connect() cause osd crash when osd reconnect to its peer

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4088 from dachary/wip-6003-giant
Loic Dachary [Sun, 22 Mar 2015 22:07:20 +0000 (23:07 +0100)]
Merge pull request #4088 from dachary/wip-6003-giant

journal Unable to read past sequence 406 ...

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4082 from dachary/wip-10106-giant
Loic Dachary [Sun, 22 Mar 2015 22:06:51 +0000 (23:06 +0100)]
Merge pull request #4082 from dachary/wip-10106-giant

rgw acl response should start with <?xml version=1.0 ?>

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4078 from dachary/wip-11157-giant
Loic Dachary [Sun, 22 Mar 2015 22:06:23 +0000 (23:06 +0100)]
Merge pull request #4078 from dachary/wip-11157-giant

doc,tests: force checkout of submodules

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4077 from dachary/wip-10150-giant
Loic Dachary [Sun, 22 Mar 2015 22:05:20 +0000 (23:05 +0100)]
Merge pull request #4077 from dachary/wip-10150-giant

osd/ReplicatedPG.cc: 10853: FAILED assert(r >= 0) (in _scan_range)

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4076 from dachary/wip-10153-giant
Loic Dachary [Sun, 22 Mar 2015 22:04:51 +0000 (23:04 +0100)]
Merge pull request #4076 from dachary/wip-10153-giant

Rados.shutdown() dies with Illegal instruction (core dumped)

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4074 from dachary/wip-10220-giant
Loic Dachary [Sun, 22 Mar 2015 22:04:25 +0000 (23:04 +0100)]
Merge pull request #4074 from dachary/wip-10220-giant

mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #3548 from ceph/wip-10643
Loic Dachary [Sun, 22 Mar 2015 22:03:35 +0000 (23:03 +0100)]
Merge pull request #3548 from ceph/wip-10643

mon: MDSMonitor: missing backports for giant

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
Reviewed-by: Loic Dachary <ldachary@redhat.com>
10 years ago Merge pull request #4053 from dachary/wip-8011-giant
Loic Dachary [Sun, 22 Mar 2015 21:12:58 +0000 (22:12 +0100)]
Merge pull request #4053 from dachary/wip-8011-giant

osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4052 from dachary/wip-10844-giant
Loic Dachary [Sun, 22 Mar 2015 21:12:40 +0000 (22:12 +0100)]
Merge pull request #4052 from dachary/wip-10844-giant

mon: caps validation should rely on EntityName instead of entity_name_t

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4050 from dachary/wip-10817-giant
Loic Dachary [Sun, 22 Mar 2015 21:12:15 +0000 (22:12 +0100)]
Merge pull request #4050 from dachary/wip-10817-giant

WorkQueue: make timeout when calling WaitInterval configurable

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4049 from dachary/wip-10787-giant
Loic Dachary [Sun, 22 Mar 2015 21:11:43 +0000 (22:11 +0100)]
Merge pull request #4049 from dachary/wip-10787-giant

mon: OSDMonitor::map_cache is buggy, send_incremental is not conservative

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
Reviewed-by: Sage Weil <sage@redhat.com>
10 years ago Merge pull request #4048 from dachary/wip-10770-giant
Loic Dachary [Sun, 22 Mar 2015 21:09:28 +0000 (22:09 +0100)]
Merge pull request #4048 from dachary/wip-10770-giant

rgw: pending bucket index operations are not cancelled correctly

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4046 from dachary/wip-10723-giant
Loic Dachary [Sun, 22 Mar 2015 21:09:06 +0000 (22:09 +0100)]
Merge pull request #4046 from dachary/wip-10723-giant

rados python binding leaks Ioctx objects

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4044 from dachary/wip-10617-giant
Loic Dachary [Sun, 22 Mar 2015 21:08:45 +0000 (22:08 +0100)]
Merge pull request #4044 from dachary/wip-10617-giant

osd: pgs for deleted pools don't finish getting removed if osd restarts

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4034 from dachary/wip-10475-giant
Loic Dachary [Sun, 22 Mar 2015 21:08:20 +0000 (22:08 +0100)]
Merge pull request #4034 from dachary/wip-10475-giant

rgw: Swift API. Support for X-Remove-Container-Meta-{key} header.

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago Merge pull request #4033 from dachary/wip-10471-giant
Loic Dachary [Sun, 22 Mar 2015 21:07:53 +0000 (22:07 +0100)]
Merge pull request #4033 from dachary/wip-10471-giant

rgw: index swift keys appropriately

Reviewed-by: Abhishek Lekshmanan <abhishek.lekshmanan@ril.com>
10 years ago ceph-objectstore-tool: Output only unsupported features when incompatible 4127/head
David Zafman [Fri, 20 Mar 2015 23:57:40 +0000 (16:57 -0700)]
ceph-objectstore-tool: Output only unsupported features when incompatible

Fixes: #11176
Backport: firefly, giant

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 5b23f5b5892b36fb7d06efc0d77e64a24ef6e8c9)

10 years ago Merge pull request #3971 from ceph/giant-11053
John Spray [Thu, 19 Mar 2015 22:07:58 +0000 (22:07 +0000)]
Merge pull request #3971 from ceph/giant-11053

mds: fix assertion caused by system clock backwards

Reviewed-by: John Spray <john.spray@redhat.com>
10 years ago rgw: Swift API. Support for X-Remove-Container-Meta-{key} header. 4034/head
Dmytro Iurchenko [Tue, 3 Feb 2015 15:54:38 +0000 (17:54 +0200)]
rgw: Swift API. Support for X-Remove-Container-Meta-{key} header.

Fixes: #10475
Backport: hammer, firefly
Reported-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Dmytro Iurchenko <diurchenko@mirantis.com>
(cherry picked from commit f67bfa24fd6f69c2fcc0987eba8b6b426dd78320)

Conflicts:
src/rgw/rgw_rest.h
        trivial merge: prototype of an unrelated function changed
src/rgw/rgw_op.cc
        s/is_object_op/!(s->object == NULL)/

10 years ago librados: Translate operation flags from C APIs 4097/head
Matt Richards [Thu, 8 Jan 2015 21:16:17 +0000 (13:16 -0800)]
librados: Translate operation flags from C APIs

The operation flags in the public C API are a distinct enum
and need to be translated to Ceph OSD flags, as happens in
the C++ API. It seems like the C enum and the C++ enum consciously
use the same values, so I reused the C++ translation function.
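
A minimal sketch of the kind of translation described above, mapping
public API flag bits onto internal OSD flag bits. The constant names and
values here are hypothetical and chosen to differ so that the mapping is
visible; they are not the librados or OSD definitions.

```python
# Hypothetical public-API flag bits (not the real librados values).
PUBLIC_FLAG_EXCL = 0x1
PUBLIC_FLAG_FAILOK = 0x2

# Hypothetical internal OSD flag bits (not the real Ceph values).
OSD_FLAG_EXCL = 0x10
OSD_FLAG_FAILOK = 0x20

_PUBLIC_TO_OSD = {
    PUBLIC_FLAG_EXCL: OSD_FLAG_EXCL,
    PUBLIC_FLAG_FAILOK: OSD_FLAG_FAILOK,
}


def translate_flags(public_flags):
    """Translate a bitmask of public flags into a bitmask of OSD flags."""
    osd_flags = 0
    for public_bit, osd_bit in _PUBLIC_TO_OSD.items():
        if public_flags & public_bit:
            osd_flags |= osd_bit
    return osd_flags
```

Passing the public bitmask through unchanged would only work while both
enums happen to share values, which is why an explicit mapping is safer.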

Signed-off-by: Matthew Richards <mattjrichards@gmail.com>
(cherry picked from commit 49d114f1fff90e5c0f206725a5eb82c0ba329376)

10 years ago Objecter: check the 'initialized' atomic_t safely 4096/head
Josh Durgin [Tue, 30 Sep 2014 01:17:29 +0000 (18:17 -0700)]
Objecter: check the 'initialized' atomic_t safely

shutdown() resets initialized to 0, but we can still receive messages
after this point, so fix message handlers to skip messages in this
case instead of asserting.

Also read initialized while holding Objecter::rwlock to avoid races
where e.g. handle_osd_map() checks initialized -> 1, continues,
shutdown() is called, sets initialized to 0, then handle_osd_map()
goes about its business and calls op_submit(), which would fail the
assert(initialized.read()) check. Similar races existed in other
message handlers which change Objecter state.
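
The pattern can be sketched as follows: the handler reads the flag while
holding the same lock that shutdown() takes, and drops the message instead
of asserting. ToyClient and its members are hypothetical, and a plain
mutex stands in for the read-write lock.

```python
import threading


class ToyClient:
    """Hypothetical client whose handlers may run after shutdown()."""

    def __init__(self):
        self.lock = threading.Lock()
        self.initialized = True
        self.submitted = []

    def shutdown(self):
        with self.lock:
            self.initialized = False

    def handle_map(self, epoch):
        """Process a message; return False if it was skipped."""
        with self.lock:
            # The flag is read under the lock, so shutdown() cannot slip in
            # between this check and the work below.
            if not self.initialized:
                return False  # skip the message rather than assert
            self.submitted.append(epoch)
            return True
```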

The Objecter is not destroyed until after its Messenger in
the MDS, OSD, and librados, so this should be safe.

Fixes: #9617
Backport: giant
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit e506f896a9217324ab7a7865989f4454562aed5f)

Conflicts:
src/osdc/Objecter.cc
        context changed: Objecter::tick() did not have
        assert(initialized.read())

10 years ago Objecter: init with a constant of the correct type
Josh Durgin [Tue, 30 Sep 2014 01:12:50 +0000 (18:12 -0700)]
Objecter: init with a constant of the correct type

Just a tiny cleanup.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit 1feba200aae7d9a042cda705c3de8fba2fc82331)

10 years ago CrushWrapper: pick a ruleset same as rule_id 4095/head
Xiaoxi Chen [Wed, 20 Aug 2014 07:35:44 +0000 (15:35 +0800)]
CrushWrapper: pick a ruleset same as rule_id

Originally in the add_simple_ruleset function, the ruleset_id
is not reused but rule_id is reused. So after some add/remove
against rules, the newly created rule is likely to have
ruleset!=rule_id.

We don't want this to happen because we are trying to hold the constraint
that ruleset == rule_id.
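
One way to keep that constraint is to pick the smallest number that is
free both as a rule id and as a ruleset id, and use it for both. The helper
below is a hypothetical sketch of that idea, not the CrushWrapper code.

```python
def pick_shared_id(rule_ids, ruleset_ids):
    """Return the smallest id unused as a rule id and as a ruleset id."""
    candidate = 0
    while candidate in rule_ids or candidate in ruleset_ids:
        candidate += 1
    return candidate
```

For example, with rule ids {0, 2} and ruleset ids {0, 1} in use, ids 0, 1
and 2 are each taken on one side, so the new rule and ruleset both get 3.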

Signed-off-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
(cherry picked from commit 78e84f34da83abf5a62ae97bb84ab70774b164a6)

10 years ago DBObjectMap: lock header_lock on sync() 4094/head
Samuel Just [Fri, 20 Feb 2015 21:43:46 +0000 (13:43 -0800)]
DBObjectMap: lock header_lock on sync()

Otherwise, we can race with another thread updating state.seq
resulting in the old, smaller value getting persisted.  If there
is a crash at that time, we will reuse a sequence number, resulting
in an inconsistent node tree and bug #9891.

Fixes: 9891
Backport: giant, firefly, dumpling
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 2b63dd25fc1c73fa42e52e9ea4ab5a45dd9422a0)

Conflicts:
src/os/DBObjectMap.cc
        because we have state.v = 1; instead of state.v = 2;

10 years ago osd: cache tiering: fix the atime logic of the eviction 4093/head
Zhiqiang Wang [Tue, 28 Oct 2014 01:37:11 +0000 (09:37 +0800)]
osd: cache tiering: fix the atime logic of the eviction

Reported-by: Xinze Chi <xmdxcxz@gmail.com>
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
(cherry picked from commit 622c5ac41707069ef8db92cb67c9185acf125d40)

10 years ago osd/ReplicatedPG: fix compile error 4092/head
Sage Weil [Sat, 1 Nov 2014 02:33:59 +0000 (19:33 -0700)]
osd/ReplicatedPG: fix compile error

From 1fef4c3d541cba360738437420ebfa2447d5802e.

Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 4a9ad7dc2da6f4fa6a64235776a3f1d2799aef60)

10 years ago Get the current atime of the object in cache pool for eviction
Xinze Chi [Wed, 29 Oct 2014 07:11:11 +0000 (07:11 +0000)]
Get the current atime of the object in cache pool for eviction

Because if there are multiple atimes in agent_state for the same object, we should use the most recent one.

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
(cherry picked from commit 1fef4c3d541cba360738437420ebfa2447d5802e)

10 years ago osdc/Objecter: Fix a bug of dead looping in Objecter::handle_osd_map 4091/head
Ding Dinghua [Thu, 30 Oct 2014 06:58:42 +0000 (14:58 +0800)]
osdc/Objecter: Fix a bug of dead looping in Objecter::handle_osd_map

If current map epoch is less than oldest epoch, current map epoch
should step up to oldest epoch.
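
As a rough illustration of that rule, the helper below computes which
epochs a client would apply given the range of maps that is still
available. The function and parameter names are hypothetical.

```python
def epochs_to_apply(current_epoch, oldest_available, newest_available):
    """List the map epochs to apply, jumping over a gap if there is one."""
    next_epoch = current_epoch + 1
    if next_epoch < oldest_available:
        # The intermediate maps are gone; step up to the oldest one instead
        # of waiting forever for an epoch that will never arrive.
        next_epoch = oldest_available
    return list(range(next_epoch, newest_available + 1))
```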

Fixes: #9986
Signed-off-by: Ding Dinghua <dingdinghua85@gmail.com>
(cherry picked from commit e0166a23c2cf655bfb4cf873be021a14d9b9be27)

10 years ago osdc/Objecter: e shouldn't be zero in Objecter::handle_osd_map
Ding Dinghua [Thu, 30 Oct 2014 06:58:05 +0000 (14:58 +0800)]
osdc/Objecter: e shouldn't be zero in Objecter::handle_osd_map

Signed-off-by: Ding Dinghua <dingdinghua85@gmail.com>
(cherry picked from commit 31c584c8ba022cd44fe2872d221f3026618cefab)

10 years ago PG: always clear_primary_state on new interval, but only clear pg temp if not primary 4090/head
Samuel Just [Wed, 19 Nov 2014 16:20:16 +0000 (08:20 -0800)]
PG: always clear_primary_state on new interval, but only clear pg temp if not primary

Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit f692bfe076b8ddb679c6d1a6ea78cc47f0876326)

10 years ago PG: always clear_primary_state when leaving Primary
Samuel Just [Fri, 14 Nov 2014 23:44:20 +0000 (15:44 -0800)]
PG: always clear_primary_state when leaving Primary

Otherwise, entries from the log collection process might leak into the next
epoch, where we might end up choosing a different authoritative log.  In this
case, it resulted in us not rolling back to log entries on one of the replicas
prior to trying to recover from an affected object due to the peer_missing not
being cleared.

Fixes: #10059
Backport: giant, firefly, dumpling
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit c87bde64dfccb5d6ee2877cc74c66fc064b1bcd7)

10 years ago SimpleMessenger: allow RESETSESSION whenever we forget an endpoint 4089/head
Greg Farnum [Tue, 2 Dec 2014 23:17:57 +0000 (15:17 -0800)]
SimpleMessenger: allow RESETSESSION whenever we forget an endpoint

In the past (e229f8451d37913225c49481b2ce2896ca6788a2) we decided to disable
reset of lossless Pipes, because lossless peers resetting caused trouble and
they can't forget about each other. But they actually can: if mark_down()
is called.

I can't figure out how else we could forget about a remote endpoint, so I think
it's okay if we tell them we reset in order to clean up state. That's desirable
so that we don't get into strange situations with out-of-whack counters.

Fixes: #10080
Backport: giant, firefly, dumpling

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
(cherry picked from commit 8cd1fdd7a778eb84cb4d7161f73bc621cc394261)

10 years ago FileJournal: fix journalq population in do_read_entry() 4088/head
Samuel Just [Fri, 6 Feb 2015 17:52:29 +0000 (09:52 -0800)]
FileJournal: fix journalq population in do_read_entry()

Fixes: 6003
Backport: dumpling, firefly, giant
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit bae1f3eaa09c4747b8bfc6fb5dc673aa6989b695)

Conflicts:
src/os/FileJournal.cc
        because reinterpret_cast was added near two hunks after firefly

10 years ago rgw: flush xml header on get acl request 4082/head
Yehuda Sadeh [Sat, 31 Jan 2015 02:42:40 +0000 (18:42 -0800)]
rgw: flush xml header on get acl request

Fixes: #10106
Backport: firefly, giant

dump_start() updates the formatter with the appropriate prefix, however,
we never flushed the formatter.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit eb45f861343162e018968b8c56693a8c6f5b2cab)

10 years ago doc,tests: force checkout of submodules 4078/head
Loic Dachary [Wed, 18 Mar 2015 23:32:39 +0000 (00:32 +0100)]
doc,tests: force checkout of submodules

When updating submodules, always checkout even if the HEAD is the
desired commit hash (update --force) to avoid the following:

    * a directory gmock exists in hammer
    * a submodule gmock replaces the directory gmock in master
    * checkout master + submodule update : gmock/.git is created
    * checkout hammer : the gmock directory still contains the .git from
      master because it did not exist at the time and checkout won't
      remove untracked directories
    * checkout master + submodule update : git rev-parse HEAD is
      at the desired commit although the content of the gmock directory
      is from hammer

http://tracker.ceph.com/issues/11157 Fixes: #11157

Signed-off-by: Loic Dachary <ldachary@redhat.com>
10 years ago ReplicatedPG::scan_range: an object can disappear between the list and the attr get 4077/head
Samuel Just [Thu, 11 Dec 2014 21:05:54 +0000 (13:05 -0800)]
ReplicatedPG::scan_range: an object can disappear between the list and the attr get

The first item in the range is often last_backfill, upon which writes
can be occurring.  It's trimmed off on the primary side anyway.

Fixes: 10150
Backport: dumpling, firefly, giant
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit dce6f288ad541fe7f0ef8374301cd712dd3bfa39)

10 years ago common: do not unlock rwlock on destruction 4076/head
Federico Simoncelli [Sat, 15 Nov 2014 14:14:04 +0000 (14:14 +0000)]
common: do not unlock rwlock on destruction

According to pthread_rwlock_unlock(3p):

 Results are undefined if the read-write lock rwlock is not held
 by the calling thread.

and:

 https://sourceware.org/bugzilla/show_bug.cgi?id=17561

 Calling pthread_rwlock_unlock on an rwlock which is not locked
 is undefined.

Calling pthread_rwlock_unlock on RWLock destruction could cause
undefined behavior for two reasons:

- the lock is acquired by another thread (undefined)
- the lock is not acquired (undefined)

Moreover since glibc-2.20 calling pthread_rwlock_unlock on a
rwlock that is not locked results in a SIGILL that kills the
application.

This patch removes the pthread_rwlock_unlock call on destruction
and replaces it with an assertion to check that the RWLock is
not in use.

Any code that relied on the implicit release is now going to
break the assertion, e.g.:

 {
   RWLock l;
   l.get(for_write);
 } // implicit release, wrong.
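
The effect of the new check can be modelled with a small bookkeeping
class. This is a hypothetical, non-blocking sketch (CheckedRWLock and its
methods are not the Ceph RWLock API); an explicit destroy() call stands in
for the destructor.

```python
class CheckedRWLock:
    """Hypothetical lock wrapper that tracks who holds it."""

    def __init__(self):
        self.readers = 0
        self.writer = False
        self.destroyed = False

    def get_read(self):
        if self.writer:
            raise RuntimeError("lock is held for write")
        self.readers += 1

    def get_write(self):
        if self.writer or self.readers:
            raise RuntimeError("lock is already held")
        self.writer = True

    def unlock(self):
        if self.writer:
            self.writer = False
        elif self.readers:
            self.readers -= 1
        else:
            raise RuntimeError("unlock of a lock that is not held")

    def destroy(self):
        # No implicit unlock here: a lock that is still held is reported
        # as an error instead of being silently released.
        if self.writer or self.readers:
            raise RuntimeError("lock destroyed while still held")
        self.destroyed = True
```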

Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
(cherry picked from commit cf2104d4d991361c53f6e2fea93b69de10cd654b)

10 years ago mon: Paxos: reset accept timeout before submitting work to the store 4074/head
Joao Eduardo Luis [Wed, 10 Dec 2014 17:46:35 +0000 (17:46 +0000)]
mon: Paxos: reset accept timeout before submitting work to the store

Otherwise we may trigger the timeout while waiting for the work to be
committed to the store -- and it would only take the write to take a bit
longer than 10 seconds (default accept timeout).

We do wait for the work to be properly committed to the store before
extending the lease though.

Fixes: #10220
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit 18534615f184ba56b441fd1d4242eb06debdfe13)

10 years ago mon: MonitorDBStore: allow randomly injecting random delays on writes
Joao Eduardo Luis [Tue, 9 Dec 2014 17:35:47 +0000 (17:35 +0000)]
mon: MonitorDBStore: allow randomly injecting random delays on writes

Adds two new config options:

mon_inject_transaction_delay_probability : DOUBLE (0.0-1.0, default: 0.0)
mon_inject_transaction_delay_max : DOUBLE (seconds, default: 10.0)

If probability is set to a value greater than 0, just before applying
the transaction, the store will decide whether to inject a delay,
randomly choosing a value between 0 and the max.
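
The decision described above might look roughly like this. It is a sketch
only: the function name is hypothetical, its two arguments mirror the two
config options, and the random source is passed in so the behaviour can be
reproduced.

```python
import random


def pick_injected_delay(probability, max_delay, rng):
    """Return the delay in seconds to inject before applying a transaction."""
    # With the default probability of 0.0 no delay is ever injected.
    if probability > 0.0 and rng.random() < probability:
        return rng.uniform(0.0, max_delay)
    return 0.0
```

A caller would sleep for the returned number of seconds, for example
pick_injected_delay(0.25, 10.0, random.Random()).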

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit beaa04e4119765d5775a6c48fd072dd95c984e3b)

10 years ago ShardedThreadPool: make wait timeout on empty queue configurable 4050/head
Samuel Just [Tue, 10 Feb 2015 01:41:19 +0000 (17:41 -0800)]
ShardedThreadPool: make wait timeout on empty queue configurable

Fixes: 10818
Backport: giant
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 7002f934e6664daa995ca0629c0ea3bae1c6bddf)

10 years ago WorkQueue: make wait timeout on empty queue configurable
Samuel Just [Tue, 10 Feb 2015 01:11:38 +0000 (17:11 -0800)]
WorkQueue: make wait timeout on empty queue configurable

Fixes: 10817
Backport: giant, firefly, dumpling
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 5aa6f910843e98a05bfcabe6f29d612cf335edbf)

10 years ago PGLog: include rollback_info_trimmed_to in (read|write)_log
Samuel Just [Thu, 20 Nov 2014 23:15:08 +0000 (15:15 -0800)]
PGLog: include rollback_info_trimmed_to in (read|write)_log

Fixes: #10157
Backport: firefly, giant
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 1fe8b846641486cc294fe7e1d2450132c38d2dba)

10 years ago mon: MonCap: take EntityName instead when expanding profiles 4052/head
Joao Eduardo Luis [Wed, 11 Feb 2015 23:36:01 +0000 (23:36 +0000)]
mon: MonCap: take EntityName instead when expanding profiles

entity_name_t is tightly coupled to the messenger, while EntityName is
tied to auth.  When expanding profiles we want to tie the profile
expansion to the entity that was authenticated.  Otherwise we may run
into weird behavior such as caps validation failing because a given
client messenger inst does not match the auth entity it used.

e.g., running

ceph --name osd.0 config-key exists foo daemon-private/osd.X/foo

has entity_name_t 'client.12345' and EntityName 'osd.0'.  Using
entity_name_t during profile expansion would not allow the client access
to daemon-private/osd.X/foo (client.12345 != osd.X).
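
A toy version of the check makes the difference concrete. The prefix
layout and both helpers are assumptions made for illustration, with osd.0
standing in for osd.X; this is not the MonCap implementation.

```python
def private_prefix(entity):
    """Key prefix a daemon profile would grant to the given entity name."""
    return "daemon-private/" + entity + "/"


def may_access(entity, key):
    """True if the key lies under the entity's private prefix."""
    return key.startswith(private_prefix(entity))
```

Expanding the profile with the authenticated name 'osd.0' grants access to
daemon-private/osd.0/foo, while expanding it with the messenger name
'client.12345' does not.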

Fixes: #10844
Backport: firefly,giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit 87544f68b88fb3dd17c519de3119a9ad9ab21dfb)

10 years ago mon: Monitor: stash auth entity name in session
Joao Eduardo Luis [Fri, 14 Nov 2014 21:03:54 +0000 (21:03 +0000)]
mon: Monitor: stash auth entity name in session

Backport: giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit ca8e1efc0be9bffcfbdce5593526d257aa498062)

10 years ago ReplicatedPG: fail a non-blocking flush if the object is being scrubbed 4053/head
Samuel Just [Thu, 20 Nov 2014 22:27:39 +0000 (14:27 -0800)]
ReplicatedPG: fail a non-blocking flush if the object is being scrubbed

Fixes: #8011
Backport: firefly, giant
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 9b26de3f3653d38dcdfc5b97874089f19d2a59d7)

10 years ago Merge pull request #4042 from dachary/wip-10546-giant
Sage Weil [Tue, 17 Mar 2015 17:52:01 +0000 (10:52 -0700)]
Merge pull request #4042 from dachary/wip-10546-giant

ceph time check start round bug in monitor.cc

10 years ago Merge pull request #4047 from dachary/wip-10762-giant
Sage Weil [Tue, 17 Mar 2015 17:50:26 +0000 (10:50 -0700)]
Merge pull request #4047 from dachary/wip-10762-giant

mon: osd gets marked down twice

10 years ago Merge pull request #4041 from dachary/wip-10512-giant
Sage Weil [Tue, 17 Mar 2015 17:49:53 +0000 (10:49 -0700)]
Merge pull request #4041 from dachary/wip-10512-giant

osd: cancel_flush requeues blocked events after blocking event

10 years ago Merge pull request #4031 from dachary/wip-10353-giant
Sage Weil [Tue, 17 Mar 2015 17:47:26 +0000 (10:47 -0700)]
Merge pull request #4031 from dachary/wip-10353-giant

crush: set_choose_tries = 100 for erasure code rulesets

10 years ago Merge pull request #4029 from dachary/wip-9910-giant
Sage Weil [Tue, 17 Mar 2015 17:47:08 +0000 (10:47 -0700)]
Merge pull request #4029 from dachary/wip-9910-giant

msg/Pipe: discard delay queue before incoming queue

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
10 years ago Merge pull request #4030 from dachary/wip-10351-giant
Sage Weil [Tue, 17 Mar 2015 17:44:53 +0000 (10:44 -0700)]
Merge pull request #4030 from dachary/wip-10351-giant

mount.ceph: avoid spurious error message

Reviewed-by: John Spray <john.spray@redhat.com>
10 years ago Merge pull request #4028 from dachary/wip-10259-giant
Sage Weil [Tue, 17 Mar 2015 17:44:00 +0000 (10:44 -0700)]
Merge pull request #4028 from dachary/wip-10259-giant

osd_types: op_queue_age_hist and fs_perf_stat should be in osd_stat_t::o...

10 years ago Merge pull request #4027 from dachary/wip-10257-giant
Sage Weil [Tue, 17 Mar 2015 17:42:10 +0000 (10:42 -0700)]
Merge pull request #4027 from dachary/wip-10257-giant

  mon: PGMonitor: several stats output error fixes

10 years ago Merge pull request #3998 from dzafman/wip-10677
Sage Weil [Tue, 17 Mar 2015 17:41:56 +0000 (10:41 -0700)]
Merge pull request #3998 from dzafman/wip-10677

Fix ceph command manpage to match ceph -h (giant)

Reviewed-by: Xinxin Shu <xinxin.shu@intel.com>
10 years ago Merge pull request #3921 from sponce/wip-11078-giant
Sage Weil [Tue, 17 Mar 2015 17:40:02 +0000 (10:40 -0700)]
Merge pull request #3921 from sponce/wip-11078-giant

Fix libstriprados::stat, use strtoll instead of strtol

10 years ago Merge pull request #3819 from tchaikov/giant-pg-leak-10421
Sage Weil [Tue, 17 Mar 2015 17:36:28 +0000 (10:36 -0700)]
Merge pull request #3819 from tchaikov/giant-pg-leak-10421

osd: fix PG leak in SnapTrimWQ._clear()

10 years ago Merge pull request #3771 from ceph/wip-10883-giant
Sage Weil [Tue, 17 Mar 2015 17:35:37 +0000 (10:35 -0700)]
Merge pull request #3771 from ceph/wip-10883-giant

osd: Fix FileJournal wrap to get header out first

10 years ago Merge pull request #3637 from sponce/wip-10758-giant
Sage Weil [Tue, 17 Mar 2015 17:35:29 +0000 (10:35 -0700)]
Merge pull request #3637 from sponce/wip-10758-giant

Backport of pull request 3633 to giant : Fixed write_full behavior in libradosstriper

10 years ago mon/OSDMonitor: do not trust small values in osd epoch cache 4049/head
Sage Weil [Thu, 12 Feb 2015 21:49:50 +0000 (13:49 -0800)]
mon/OSDMonitor: do not trust small values in osd epoch cache

If the epoch cache says the osd has epoch 100 and the osd is asking for
epoch 200+, do not send it 100+.
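
In other words, the starting epoch should never fall below what the OSD
asked for. A hypothetical helper expressing that rule is sketched below;
trusting the cache when it is ahead of the request is an assumption of the
sketch, not something this commit message states.

```python
def first_epoch_to_send(cached_epoch, requested_first):
    """Pick the first map epoch to send to an OSD."""
    # A cached value smaller than the request is ignored: the OSD already
    # told us it does not need anything before requested_first.
    return max(cached_epoch + 1, requested_first)
```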

Fixes: #10787
Backport: giant, firefly
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit a5759e9b97107488a8508f36adf9ca1aba3fae07)

10 years ago rgw: send appropriate op to cancel bucket index pending operation 4048/head
Yehuda Sadeh [Thu, 5 Feb 2015 17:33:26 +0000 (09:33 -0800)]
rgw: send appropriate op to cancel bucket index pending operation

Fixes: #10770
Backport: firefly, giant

Reported-by: baijiaruo <baijiaruo@126.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit dfee96e3aebcaeef18c721ab73f0460eba69f1c7)

Conflicts:
src/rgw/rgw_rados.cc
        resolved by manual s/ADD/CANCEL/

10 years ago mon: ignore osd failures from before up_from 4047/head
Sage Weil [Thu, 5 Feb 2015 11:07:50 +0000 (03:07 -0800)]
mon: ignore osd failures from before up_from

If the failure was generated for an instance of the OSD prior to when
it came up, ignore it.

This probably causes a fair bit of unnecessary flapping in the wild...

Backport: giant, firefly
Fixes: #10762
Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 400ac237d35d0d1d53f240fea87e8483c0e2a7f5)

10 years ago rados.py: keep reference to python callbacks 4046/head
Josh Durgin [Tue, 10 Feb 2015 04:50:23 +0000 (20:50 -0800)]
rados.py: keep reference to python callbacks

If we don't keep a reference to these, the librados aio calls will
segfault since the python-level callbacks will have been garbage
collected. Passing them to aio_create_completion() does not take a
reference to them. Keep a reference in the python Completion object
associated with the request, since they need the same lifetime.
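
The lifetime rule can be illustrated with ctypes alone, without loading
librados. ToyCompletion and CALLBACK_TYPE are hypothetical names; the point
of the sketch is only that the ctypes function wrapper is stored on the
object that represents the request.

```python
import ctypes

# Hypothetical C callback signature: void (*)(int).
CALLBACK_TYPE = ctypes.CFUNCTYPE(None, ctypes.c_int)


class ToyCompletion:
    """Hypothetical completion object that owns its C-callable callback."""

    def __init__(self, oncomplete):
        # Storing the wrapper as an attribute keeps it alive for as long as
        # the completion exists. A wrapper that is only passed to C and
        # then dropped may be garbage collected before C calls it.
        self._c_oncomplete = CALLBACK_TYPE(oncomplete)

    def fire(self, result):
        # Stand-in for the C library invoking the stored function pointer.
        self._c_oncomplete(result)
```

Because the wrapper refers only to the user's callable and not back to the
completion, this layout does not create a reference cycle.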

This fixes a regression from 60b019f69aa0e39d276c669698c92fc890599f50.

Fixes: #10775
Backport: dumpling, firefly, giant
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit 36d37aadbbbece28d70e827511f1a473d851463d)
(cherry picked from commit 5f1245e131e33a98572408c8223deed2c7cf7b75)

10 years ago Fix memory leak in python rados bindings
Billy Olsen [Mon, 2 Feb 2015 23:24:59 +0000 (16:24 -0700)]
Fix memory leak in python rados bindings

A circular reference was inadvertently created when using the
CFUNCTYPE binding for callbacks for the asynchronous i/o callbacks.
This commit refactors the usage of the callbacks such that the
Ioctx object does not have a class reference to the callbacks.

Fixes: #10723
Backport: giant, firefly, dumpling
Signed-off-by: Billy Olsen <billy.olsen@gmail.com>
Reviewed-by: Dan Mick <dmick@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit 60b019f69aa0e39d276c669698c92fc890599f50)

10 years ago osd: do not ignore deleted pgs on startup 4044/head
Sage Weil [Fri, 23 Jan 2015 18:47:44 +0000 (10:47 -0800)]
osd: do not ignore deleted pgs on startup

These need to get instantiated so that we can complete the removal process.

Fixes: #10617
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 879fd0c192f5d3c6afd36c2df359806ea95827b8)

10 years ago mon: Monitor: fix timecheck rounds period 4042/head
Joao Eduardo Luis [Fri, 30 Jan 2015 11:37:28 +0000 (11:37 +0000)]
mon: Monitor: fix timecheck rounds period

Fixes: #10546
Backports: dumpling?,firefly,giant

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit 2e749599ac6e1060cf553b521761a93fafbf65bb)

10 years ago osd: requeue blocked op before flush it was blocked on 4041/head
Sage Weil [Mon, 12 Jan 2015 01:28:04 +0000 (17:28 -0800)]
osd: requeue blocked op before flush it was blocked on

If we have request A (say, cache-flush) that blocks things, and then
request B that gets blocked on it, and we have an interval change, then we
need to requeue B first, then A, so that the resulting queue will keep
A before B and preserve the order.
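
Since requeueing puts an op at the front of the queue, the order of the
requeue calls is the reverse of the order that results. A small model with
a deque shows this; the function name is hypothetical.

```python
from collections import deque


def requeue_after_interval_change(queue, blocking_op, blocked_ops):
    """Put the blocked ops and then the blocking op back at the front."""
    # Blocked ops go back first, last one first, so that they keep their
    # relative order once they are all at the front of the queue.
    for op in reversed(blocked_ops):
        queue.appendleft(op)
    # The blocking op is requeued last and therefore ends up ahead of them.
    queue.appendleft(blocking_op)
```

Starting from a queue holding C, requeueing B and then A leaves A, B, C.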

This was observed on this firefly run:

  ubuntu@teuthology:/a/sage-2015-01-09_21:43:43-rados-firefly-distro-basic-multi/694675

Backport: giant, firefly
Fixes: #10512
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 11bdfb4131ecac16d4a364d651c6cf5d1d28c702)

10 years ago rgw: index swift keys appropriately 4033/head
Yehuda Sadeh [Wed, 7 Jan 2015 21:56:14 +0000 (13:56 -0800)]
rgw: index swift keys appropriately

Fixes: #10471
Backport: firefly, giant

We need to index the swift keys by the full uid:subuser when decoding
the json representation, to keep it in line with how we store it when
creating it through other mechanisms.

Reported-by: hemant burman <hemant.burman@gmail.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
(cherry picked from commit 478629bd2f3f32afbe6e93eaebb8a8fa01af356f)

10 years ago crush: set_choose_tries = 100 for erasure code rulesets 4031/head
Loic Dachary [Wed, 17 Dec 2014 15:06:55 +0000 (16:06 +0100)]
crush: set_choose_tries = 100 for erasure code rulesets

It is common for people to try to map 9 OSDs in a Ceph cluster that has only
9 OSDs in total. The default number of tries (50) frequently leads to bad
mappings for this use case. Changing it to 100 makes no significant difference
in CPU performance, as tested manually by running crushtool on one million
mappings.

http://tracker.ceph.com/issues/10353 Fixes: #10353

Signed-off-by: Loic Dachary <ldachary@redhat.com>
(cherry picked from commit 2f87ac807f3cc7ac55d9677d2051645bf5396a62)

10 years ago mount.ceph: avoid spurious error message 4030/head
Yan, Zheng [Sat, 3 Jan 2015 07:29:29 +0000 (15:29 +0800)]
mount.ceph: avoid spurious error message

/etc/mtab in most modern distributions is a symbolic link to
/proc/self/mounts.

Fixes: #10351
Signed-off-by: Yan, Zheng <zyan@redhat.com>
(cherry picked from commit bdd0e3c4bda97fe18487a58dd173a7dff752e1a2)

10 years ago msg/Pipe: discard delay queue before incoming queue 4029/head
Sage Weil [Wed, 29 Oct 2014 21:45:11 +0000 (14:45 -0700)]
msg/Pipe: discard delay queue before incoming queue

Shut down the delayed delivery before the incoming queue in case the
DelayedDelivery thread is busy queuing messages.

Fixes: #9910
Signed-off-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit f7431cc3c25878057482007beb874c9d4473883e)

10 years ago osd_types: op_queue_age_hist and fs_perf_stat should be in osd_stat_t::operator== 4028/head
Samuel Just [Fri, 5 Dec 2014 23:29:52 +0000 (15:29 -0800)]
osd_types: op_queue_age_hist and fs_perf_stat should be in osd_stat_t::operator==

Fixes: #10259
Backport: giant, firefly, dumpling
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit 1ac17c0a662e6079c2c57edde2b4dc947f547f57)

10 years ago mon: PGMonitor: skip zeroed osd stats on get_rule_avail() 4027/head
Joao Eduardo Luis [Mon, 19 Jan 2015 18:49:15 +0000 (18:49 +0000)]
mon: PGMonitor: skip zeroed osd stats on get_rule_avail()

Fixes: #10257
Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit b311e7c36273efae39aa2602c1f8bd90d39e5975)

10 years ago mon: PGMonitor: available size 0 if no osds on pool's ruleset
Joao Eduardo Luis [Fri, 16 Jan 2015 18:13:05 +0000 (18:13 +0000)]
mon: PGMonitor: available size 0 if no osds on pool's ruleset

get_rule_avail() may return a value < 0, but we were using the result
blindly, assuming it would always be unsigned.  We would end up with weird
values if the ruleset had no osds.
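
A minimal sketch of why a negative return value turns into a "weird" number
once it is treated as unsigned, and how clamping to 0 avoids that. The
function `rule_avail_or_error` and the value 1024 are hypothetical stand-ins;
the real signature of get_rule_avail() is not shown here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: returns a negative value when the ruleset has no OSDs.
int64_t rule_avail_or_error(bool has_osds) { return has_osds ? 1024 : -1; }

// Using the result blindly as unsigned: -1 wraps around to 2^64 - 1.
uint64_t avail_unchecked(bool has_osds) {
  return static_cast<uint64_t>(rule_avail_or_error(has_osds));
}

// Checking the sign first and reporting 0 available when the call fails.
uint64_t avail_checked(bool has_osds) {
  int64_t avail = rule_avail_or_error(has_osds);
  return avail < 0 ? 0 : static_cast<uint64_t>(avail);
}
```

With no OSDs, the unchecked version reports the largest 64-bit value, whereas
the checked version reports an available size of 0.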

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit 8be6a6ab2aa5a000a39c73a98b11a0ab32fffa1c)

10 years ago mon: PGMonitor: fix division by zero on stats dump
Joao Eduardo Luis [Fri, 16 Jan 2015 18:12:42 +0000 (18:12 +0000)]
mon: PGMonitor: fix division by zero on stats dump

Signed-off-by: Joao Eduardo Luis <joao@redhat.com>
(cherry picked from commit 50547dc3c00b7556e26b9a44ec68640c5c3a2384)

10 years ago doc: Fix ceph command manpage to match ceph -h (giant) 3998/head
David Zafman [Sat, 14 Mar 2015 02:16:47 +0000 (19:16 -0700)]
doc: Fix ceph command manpage to match ceph -h (giant)

Fixes: #10677
Signed-off-by: David Zafman <dzafman@redhat.com>

10 years ago doc: Minor fixes to ceph command manpage
David Zafman [Fri, 13 Mar 2015 23:50:13 +0000 (16:50 -0700)]
doc: Minor fixes to ceph command manpage

Fixes: #10676
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 7e85722fd4c89715fc2ed79697c82d65d7ebf287)

10 years ago doc: Fix ceph command manpage to match ceph -h (firefly)
David Zafman [Thu, 12 Mar 2015 18:39:52 +0000 (11:39 -0700)]
doc: Fix ceph command manpage to match ceph -h (firefly)

Improve synopsis section
Fixes: #10676
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 9ac488c1eb0e30511079ba05aaf11c79615b3940)

Conflicts:

man/ceph.8 (took incoming version)

10 years ago doc: Changes format style in ceph to improve readability as html.
Nilamdyuti Goswami [Thu, 18 Dec 2014 11:41:22 +0000 (17:11 +0530)]
doc: Changes format style in ceph to improve readability as html.

Signed-off-by: Nilamdyuti Goswami <ngoswami@redhat.com>
(cherry picked from commit 8b796173063ac9af8c21364521fc5ee23d901196)

10 years ago mds: fix assertion caused by system clock backwards 3971/head
Yan, Zheng [Tue, 10 Mar 2015 11:55:57 +0000 (19:55 +0800)]
mds: fix assertion caused by system clock backwards

Fixes: #11053
Signed-off-by: Yan, Zheng <zyan@redhat.com>

10 years ago doc: Adds man page for ceph.
Nilamdyuti Goswami [Fri, 12 Dec 2014 20:54:41 +0000 (02:24 +0530)]
doc: Adds man page for ceph.

Signed-off-by: Nilamdyuti Goswami <ngoswami@redhat.com>
(cherry picked from commit 76da87a64ca6b3cc0ceeaf63e19a9f440d6f4161)

10 years ago Fix libstriprados::stat, use strtoll instead of strtol 3921/head
Dongmao Zhang [Fri, 14 Nov 2014 10:48:58 +0000 (18:48 +0800)]
Fix libstriprados::stat, use strtoll instead of strtol

The return value (long int) of strict_strtol is too small for an unstriped
object.
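
A minimal sketch of the underlying issue, assuming a platform where long is
32 bits wide. It uses the standard `std::strtoll` rather than Ceph's own
strict_strtoll helper, and `parse_size` and `fits_in_32bit_long` are
hypothetical names introduced for illustration:

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Parse a decimal size into a long long.
// Returns false on overflow, empty input, or trailing garbage.
bool parse_size(const char* s, long long* out) {
  errno = 0;
  char* end = nullptr;
  long long v = std::strtoll(s, &end, 10);
  if (errno == ERANGE || end == s || *end != '\0') return false;
  *out = v;
  return true;
}

// True if the value would survive a round trip through a 32-bit long.
bool fits_in_32bit_long(long long v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}
```

For example, a 5 GiB object size ("5368709120") parses correctly as a long
long but would not fit in a 32-bit long.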

Signed-off-by: Dongmao Zhang <deanraccoon@gmail.com>
(cherry picked from commit fe6679dca479fc24806d7e57ab0108a516cd6d55)

10 years ago Fix libstriprados::remove, use strtoll instead of strtol
Dongmao Zhang [Wed, 10 Dec 2014 10:55:28 +0000 (18:55 +0800)]
Fix libstriprados::remove, use strtoll instead of strtol

Signed-off-by: Dongmao Zhang <deanraccoon@gmail.com>
(cherry picked from commit 78a15ee4c61fdadccb1921e861748400cc651862)

10 years ago Objecter::_op_submit_with_budget: add timeout before call
Samuel Just [Mon, 2 Feb 2015 21:57:00 +0000 (13:57 -0800)]
Objecter::_op_submit_with_budget: add timeout before call

Objecter::_send_op depends on the ontimeout field being filled in
to avoid #10340 and #9582.

Fixes: #10340
Signed-off-by: Samuel Just <sjust@redhat.com>
(cherry picked from commit cfcfafcb0f33994dbda1efe478ef3ab822ff50d4)

10 years ago osd: fix PG leak in SnapTrimWQ._clear() 3819/head
Kefu Chai [Tue, 10 Feb 2015 08:29:45 +0000 (16:29 +0800)]
osd: fix PG leak in SnapTrimWQ._clear()

Fixes: #10421
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 01e154d592d6cdbf3f859cf1b4357e803536a6b4)

10 years ago 0.87.1 v0.87.1
Jenkins [Mon, 23 Feb 2015 20:02:04 +0000 (12:02 -0800)]
0.87.1

10 years ago osd: Fix FileJournal wrap to get header out first 3771/head
David Zafman [Thu, 19 Feb 2015 00:21:12 +0000 (16:21 -0800)]
osd: Fix FileJournal wrap to get header out first

Correct and restore the assert that was removed.

Caused by f46b1b473fce0322a672b16c7739e569a45054b6
Fixes: #10883
Backport: dumpling, firefly, giant

Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 970bb4901f93575709421b5b25c3eff213de61b8)

10 years ago Merge pull request #3731 from liewegas/wip-10834-giant
Loic Dachary [Tue, 17 Feb 2015 00:09:54 +0000 (01:09 +0100)]
Merge pull request #3731 from liewegas/wip-10834-giant

osd: tolerate sessionless con in fast dispatch path

Reviewed-by: Loic Dachary <ldachary@redhat.com>

10 years ago osd: tolerate sessionless con in fast dispatch path 3731/head
Sage Weil [Tue, 2 Dec 2014 02:15:59 +0000 (18:15 -0800)]
osd: tolerate sessionless con in fast dispatch path

We can now get a session cleared from a Connection at any time.  Change
the assert to an if in ms_fast_dispatch to cope.  It's pretty rare, but it
can happen, especially with delay injection.  In particular, a racing
thread can call mark_down() on us.
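
A minimal sketch of the assert-to-if change, assuming a connection that holds
an optional reference to its session. `Session`, `Connection`, and
`fast_dispatch` are simplified, hypothetical stand-ins for illustration, and
the sketch is single-threaded, so it does not model the race itself:

```cpp
#include <cassert>
#include <memory>

// Hypothetical, simplified types; not the real OSD or messenger classes.
struct Session { int dispatched = 0; };
struct Connection { std::shared_ptr<Session> session; };

// Returns true if the message was dispatched, false if it was dropped
// because the connection no longer has a session.
bool fast_dispatch(Connection& con) {
  // Take one reference up front so the session stays alive while in use.
  std::shared_ptr<Session> s = con.session;
  if (!s) {
    return false;  // previously this case would have tripped an assert
  }
  ++s->dispatched;
  return true;
}
```

A connection whose session has been cleared is then skipped instead of
aborting the process.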

Fixes: #10209
Backport: giant
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 01df2227125abf94571b4b0c7bccca57098ed2dc)

10 years ago qa: use correct binary path on rpm-based systems
Josh Durgin [Mon, 2 Feb 2015 15:43:35 +0000 (16:43 +0100)]
qa: use correct binary path on rpm-based systems

Fixes: #10715
Signed-off-by: Josh Durgin <jdurgin@redhat.com>
(cherry picked from commit 05ce2aa1bf030ea225300b48e7914577a412b38c)