git.apps.os.sepia.ceph.com Git - ceph.git/log
9 years agomon: change mon_osd_min_down_reporters from 1 -> 2
Sage Weil [Sat, 14 Nov 2015 03:34:12 +0000 (22:34 -0500)]
mon: change mon_osd_min_down_reporters from 1 -> 2

This makes more sense to me.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/OSDMonitor: simplify failure reporters vs reports logic
Sage Weil [Sat, 14 Nov 2015 03:27:14 +0000 (22:27 -0500)]
mon/OSDMonitor: simplify failure reporters vs reports logic

Since each OSD only sends a failure report for a given peer once,
we don't need to count reports vs reporters separately.  (This was
probably a bad idea anyway.)  Remove this logic and the associated
config option.

Reported-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
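
A minimal sketch of the simplified rule described above, written in Python rather than the monitor's C++. Because each OSD reports a given peer only once, it is enough to keep the set of distinct reporters per target. `FailureTracker` and its method names are hypothetical and used only for illustration; the threshold stands in for mon_osd_min_down_reporters.

```python
from collections import defaultdict


class FailureTracker:
    """Toy model: remember which OSDs have reported each peer as failed."""

    def __init__(self, min_down_reporters=2):
        self.min_down_reporters = min_down_reporters
        # target osd id -> set of reporter osd ids
        self.reporters = defaultdict(set)

    def report_failure(self, target, reporter):
        # A set makes a repeated report from the same OSD harmless.
        self.reporters[target].add(reporter)

    def should_mark_down(self, target):
        return len(self.reporters[target]) >= self.min_down_reporters


tracker = FailureTracker(min_down_reporters=2)
tracker.report_failure(target=7, reporter=1)
tracker.report_failure(target=7, reporter=1)  # duplicate, no effect
one_reporter = tracker.should_mark_down(7)
tracker.report_failure(target=7, reporter=3)
two_reporters = tracker.should_mark_down(7)
```

With a threshold of 2, a single reporter is not enough to mark osd 7 down; a second, distinct reporter is.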
9 years agoosd: simplify pg creation
Sage Weil [Sat, 14 Nov 2015 03:11:17 +0000 (22:11 -0500)]
osd: simplify pg creation

We used to have a complicated pg creation process in which we
would query any previous mappings for the pg before we created the
new 'empty' pg locally.  The tracking of the prior mappings was
very simple (and broken), but it didn't really matter because the
mon would resend pg create messages periodically.  Now it doesn't,
so that broke.

However, none of this is necessary: the PG peering process does
all of the same things.  Namely, it

- enumerates past intervals
- determines which ones may have been rw
- queries OSDs from each one to gather any potential changes

This is a more robust version of what the creation code was (or
should have been) doing.  So, let's rip it all out and let
peering handle it.  As long as the newly instantiated PG sets
last_epoch_started and _clean to the created epoch we will probe
and consider all of these prior mappings and find any previous
instance of the PG (if one existed).

Yay for removing unnecessary code!

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/MonClient: make _sub_got behave if we "got" old stuff
Sage Weil [Fri, 13 Nov 2015 18:03:16 +0000 (13:03 -0500)]
mon/MonClient: make _sub_got behave if we "got" old stuff

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/OSDMonitor: fix oldest_map in send_incremental
Sage Weil [Wed, 11 Nov 2015 03:19:48 +0000 (22:19 -0500)]
mon/OSDMonitor: fix oldest_map in send_incremental

This should be the oldest map on the sender (like every other
place that generates an MOSDMap message).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: avoid useless pg gets when pool is deleted
Sage Weil [Mon, 12 Oct 2015 02:06:33 +0000 (22:06 -0400)]
mon/PGMonitor: avoid useless pg gets when pool is deleted

If the .0 pg no longer exists, we know the entire pool was
deleted, and can avoid querying every other pg.  (This is a good
thing because leveldb and rocksdb can be very slow to query
missing keys.)

Signed-off-by: Sage Weil <sage@redhat.com>
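
The shortcut can be sketched as follows, assuming a plain dict stands in for the key/value store and pg ids are strings of the form "pool.seed". The function names are hypothetical; this is an illustration of the idea, not the PGMonitor code.

```python
def pool_was_deleted(store, pool_id):
    """Return True if the pool's ".0" pg is gone from the store."""
    return f"{pool_id}.0" not in store


def load_pgs(store, pgids):
    """Fetch pgs, skipping every lookup for pools known to be deleted."""
    deleted = {}  # pool id -> bool, so each pool is probed only once
    found = {}
    for pgid in pgids:
        pool_id = pgid.split(".")[0]
        if pool_id not in deleted:
            deleted[pool_id] = pool_was_deleted(store, pool_id)
        if deleted[pool_id]:
            continue  # avoid a slow lookup of a missing key
        if pgid in store:
            found[pgid] = store[pgid]
    return found


store = {"1.0": "stat-a", "1.1": "stat-b"}  # pool 2 no longer exists
result = load_pgs(store, ["1.0", "1.1", "2.0", "2.1", "2.2"])
```

Pool 2 costs one probe (for "2.0") instead of three.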
9 years agomon/PGMonitor: revamp how pg creates are tracked
Sage Weil [Thu, 8 Oct 2015 16:13:40 +0000 (12:13 -0400)]
mon/PGMonitor: revamp how pg creates are tracked

Previously we were calculating and managing in-core state that
wasn't committed as part of the pg_map, leading to all sorts of
ugliness that didn't really work.  Instead,

 * set mapping in all creating pgs in the committed pg_map
 * make all pg create message sending be based on committed state
 * update mappings for creating pgs every time we consume a new
   osdmap, so that we have a reliable/stable epoch to attach to
   it.

In particular, having that stable epoch means we have a reference
we can put in the pg create message that will also be used for
the subscription version.  That way OSDs get consistent creates
from any mon.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: only send pg create messages to up osds
Sage Weil [Thu, 8 Oct 2015 16:12:34 +0000 (12:12 -0400)]
mon/PGMonitor: only send pg create messages to up osds

If the OSD is down it will ignore the message.  If it gets marked up, we
will eventually consume that map and call check_subs().

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: only churn mapping_epoch if the primary changes
Sage Weil [Wed, 7 Oct 2015 05:07:34 +0000 (01:07 -0400)]
mon/PGMonitor: only churn mapping_epoch if the primary changes

This results in fewer resent pg create messages.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: a bunch of cosmetic cleanup
Sage Weil [Fri, 9 Oct 2015 21:25:00 +0000 (17:25 -0400)]
mon/PGMonitor: a bunch of cosmetic cleanup

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: drop old creating_pgs_by_osd
Sage Weil [Wed, 7 Oct 2015 04:39:41 +0000 (00:39 -0400)]
mon/PGMonitor: drop old creating_pgs_by_osd

Obsoleted by creating_pgs_by_osd_epoch.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: reduce mon_subscribe messages
Sage Weil [Sat, 14 Nov 2015 17:57:05 +0000 (12:57 -0500)]
osd: reduce mon_subscribe messages

1. MonClient remembers our subscriptions; only indicate we want
osd_pg_creates once, in init.

2. We don't need to re-request the latest osdmap each time we
reconnect.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/MonClient: only send new subscriptions
Sage Weil [Wed, 7 Oct 2015 04:09:18 +0000 (00:09 -0400)]
mon/MonClient: only send new subscriptions

Instead of resending all subscriptions, only send the new ones.  This
avoids races like

 - ask for 4+
 - mon sends maps 4-50
 - ask for 4+ and something else
 - mon has to resend same maps and the other thing

Signed-off-by: Sage Weil <sage@redhat.com>
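
A rough Python model of the behaviour described above, assuming the client keeps one map of subscriptions already sent and one of subscriptions still pending. `SubscriptionState`, `want` and `renew` are hypothetical names, not the MonClient interface.

```python
class SubscriptionState:
    def __init__(self):
        self.sub_sent = {}  # what -> start epoch already sent to the mon
        self.sub_new = {}   # what -> start epoch not yet sent

    def want(self, what, start):
        """Queue a subscription; return True if something new must be sent."""
        if self.sub_sent.get(what) == start or self.sub_new.get(what) == start:
            return False
        self.sub_new[what] = start
        return True

    def renew(self):
        """Return only the pending subscriptions and mark them as sent."""
        to_send = dict(self.sub_new)
        self.sub_sent.update(to_send)
        self.sub_new.clear()
        return to_send


subs = SubscriptionState()
subs.want("osdmap", 4)
first = subs.renew()
subs.want("osdmap", 4)          # already sent, nothing new
subs.want("osd_pg_creates", 0)
second = subs.renew()
```

The second message carries only the new item, so the mon has no reason to resend maps 4-50.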
9 years agomon/PGMonitor: send pg creates via persistent subscriptions, not spam
Sage Weil [Wed, 7 Oct 2015 01:39:33 +0000 (21:39 -0400)]
mon/PGMonitor: send pg creates via persistent subscriptions, not spam

Generate and send pg create messages only for those OSDs that have
subscribed on this monitor.  This is N times more efficient (where there
are N monitors) than the previous method.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: only map and send pg creates post paxos update
Sage Weil [Wed, 7 Oct 2015 03:57:50 +0000 (23:57 -0400)]
mon/PGMonitor: only map and send pg creates post paxos update

These other call sites are no longer needed.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: remove map_pg_creates, send_pg_creates commands
Sage Weil [Fri, 9 Oct 2015 21:22:01 +0000 (17:22 -0400)]
mon/PGMonitor: remove map_pg_creates, send_pg_creates commands

These shouldn't be triggered manually.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomessages/MOSDPGCreate: make it more readable
Sage Weil [Wed, 7 Oct 2015 03:58:28 +0000 (23:58 -0400)]
messages/MOSDPGCreate: make it more readable

1- include the epoch
2- drop the 'pg'
3- hide the timestamp

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: subscribe to all pg creates, not just once on start
Sage Weil [Wed, 7 Oct 2015 00:48:38 +0000 (20:48 -0400)]
osd: subscribe to all pg creates, not just once on start

We want to know about all future pg creations, not just those pending
when we start.  (This only helps once the mon knows how to do this...)

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: track creating_pgs_by_osd_epoch
Sage Weil [Wed, 7 Oct 2015 00:37:06 +0000 (20:37 -0400)]
mon/PGMonitor: track creating_pgs_by_osd_epoch

Track pg creations, grouped by the first epoch they mapped to a particular
OSD.  This will be necessary to send messages only for new creations.

Signed-off-by: Sage Weil <sage@redhat.com>
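
One way to picture the grouping is a nested map from OSD to epoch to pg set. The sketch below is a Python illustration under that assumption; the helper names and the "newer than the last epoch sent" comparison are hypothetical, not taken from the commit.

```python
from collections import defaultdict

# osd id -> epoch at which the pg first mapped to that osd -> set of pg ids
creating_pgs_by_osd_epoch = defaultdict(lambda: defaultdict(set))


def note_creating(osd, epoch, pgid):
    creating_pgs_by_osd_epoch[osd][epoch].add(pgid)


def creates_since(osd, last_sent_epoch):
    """Return pgs whose mapping epoch is newer than what the osd was sent."""
    pending = set()
    for epoch, pgs in creating_pgs_by_osd_epoch[osd].items():
        if epoch > last_sent_epoch:
            pending |= pgs
    return pending


note_creating(osd=3, epoch=10, pgid="1.0")
note_creating(osd=3, epoch=12, pgid="1.1")
note_creating(osd=3, epoch=15, pgid="2.0")
new_for_osd3 = creates_since(osd=3, last_sent_epoch=10)
```

Grouping by epoch lets the sender skip everything the OSD has already been told about.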
9 years agomon/PGMap: assert our pg counts don't go negative
Sage Weil [Thu, 8 Oct 2015 16:15:01 +0000 (12:15 -0400)]
mon/PGMap: assert our pg counts don't go negative

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/OSDMonitor: do not prime pg_temp for creating pgs
Sage Weil [Thu, 8 Oct 2015 16:14:49 +0000 (12:14 -0400)]
mon/OSDMonitor: do not prime pg_temp for creating pgs

It will be less work for the old primary to ignore the create message
and the new one to query it and find nothing than for the slightly more
complicated peering and removal process to happen.  Also, this reduces
bloat in the OSDMap a bit.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: note mapping_epoch for creating pgs
Sage Weil [Tue, 6 Oct 2015 22:52:22 +0000 (18:52 -0400)]
mon/PGMonitor: note mapping_epoch for creating pgs

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: let peon mons send the osdmap replies
Sage Weil [Thu, 17 Sep 2015 01:44:04 +0000 (21:44 -0400)]
mon: let peon mons send the osdmap replies

Currently the leader mon often replies to OSDs by sending a set of
incremental OSDmaps (e.g., in response to an osd boot or failure).

Instead, send a small message to the proxying peon mon (if any)
with the epoch to start from and let *them* generate a suitable
reply.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomsg/simple/Pipe: show keepalives at level 2
Sage Weil [Tue, 6 Oct 2015 19:37:31 +0000 (15:37 -0400)]
msg/simple/Pipe: show keepalives at level 2

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: set mon_subscribe_interval to a day
Sage Weil [Tue, 6 Oct 2015 19:35:58 +0000 (15:35 -0400)]
mon: set mon_subscribe_interval to a day

This is only needed for legacy clients to avoid confusing them--
we don't actually need the renewals at all.  Make them infrequent
to reduce mon load.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: only ack subscriptions (and renew) if client or mon is old
Sage Weil [Tue, 6 Oct 2015 19:25:02 +0000 (15:25 -0400)]
mon: only ack subscriptions (and renew) if client or mon is old

Old clients expect an ack so they can schedule renewal; send it for
them only.

Old mons expect renewals.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: remove old subscribe renewal-based timeouts
Sage Weil [Tue, 6 Oct 2015 19:19:33 +0000 (15:19 -0400)]
mon: remove old subscribe renewal-based timeouts

This is no longer needed/used.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: small cleanup in _ms_dispatch
Sage Weil [Tue, 6 Oct 2015 19:18:21 +0000 (15:18 -0400)]
mon: small cleanup in _ms_dispatch

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: new session_timeout mechanism that is not subscribe-based
Sage Weil [Tue, 6 Oct 2015 19:11:03 +0000 (15:11 -0400)]
mon: new session_timeout mechanism that is not subscribe-based

Simplify the session liveness detection:

 - renew on any message
 - renew on keepalive[2] messages (lightweight ping in msgr)

Signed-off-by: Sage Weil <sage@redhat.com>
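
The liveness rule above can be sketched in a few lines of Python. The class, the function names and the 300-second timeout are assumptions for illustration only.

```python
class Session:
    def __init__(self, now):
        self.last_activity = now

    def renew(self, now):
        self.last_activity = now


def on_message(session, now):
    session.renew(now)  # any message shows the peer is alive


def on_keepalive(session, now):
    session.renew(now)  # so does a lightweight messenger-level ping


def is_expired(session, now, session_timeout):
    return now - session.last_activity > session_timeout


s = Session(now=0.0)
on_keepalive(s, now=250.0)
alive_at_400 = not is_expired(s, now=400.0, session_timeout=300.0)
expired_at_600 = is_expired(s, now=600.0, session_timeout=300.0)
```

No subscription renewal is involved: the session stays alive as long as either kind of traffic keeps arriving.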
9 years agomsg: make last_keepalive[_ack] lock safe
Sage Weil [Tue, 6 Oct 2015 19:10:02 +0000 (15:10 -0400)]
msg: make last_keepalive[_ack] lock safe

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomsg: track stamp of last keepalive[2] received
Sage Weil [Tue, 6 Oct 2015 19:08:57 +0000 (15:08 -0400)]
msg: track stamp of last keepalive[2] received

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agocommon: mirror leveldb default tuning w/ rocksdb
Sage Weil [Tue, 6 Oct 2015 18:38:47 +0000 (14:38 -0400)]
common: mirror leveldb default tuning w/ rocksdb

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/MonClient: don't send log if we're reconnecting
Sage Weil [Tue, 6 Oct 2015 18:38:30 +0000 (14:38 -0400)]
mon/MonClient: don't send log if we're reconnecting

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon: disabled rocksdb compression when used as the backend
Sage Weil [Fri, 2 Oct 2015 13:15:33 +0000 (09:15 -0400)]
mon: disabled rocksdb compression when used as the backend

This significantly reduced CPU utilization on the bigbang scale
testing cluster at CERN.  Note that it is already disabled for
leveldb by default (in ceph_mon.cc).

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: cap adjusted max mon report interval at 2/3 of timeout
Sage Weil [Fri, 2 Oct 2015 13:06:29 +0000 (09:06 -0400)]
osd: cap adjusted max mon report interval at 2/3 of timeout

This ensures that we don't throttle back mon reports so much that
the mon times out due to no pg stat reports.  Since there is
little value in having a lower max anyway, just set this at an
upper bound (relative to the mon's timeout value).

Signed-off-by: Sage Weil <sage@redhat.com>
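
As a worked example of the cap, assuming the adjusted interval is the configured maximum multiplied by a backoff factor (the function name and all numbers below are illustrative, not Ceph defaults):

```python
def capped_max_report_interval(configured_max, backoff, mon_timeout):
    """Scale the max interval by the backoff, capped at 2/3 of the mon timeout."""
    adjusted = configured_max * backoff
    return min(adjusted, mon_timeout * 2.0 / 3.0)


unthrottled = capped_max_report_interval(configured_max=120, backoff=1.0, mon_timeout=900)
throttled = capped_max_report_interval(configured_max=120, backoff=8.0, mon_timeout=900)
```

With a 900 s timeout the cap is 600 s, so an 8x backoff (960 s) is clamped while the unscaled 120 s interval is left alone.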
9 years agoosd: protect mon reporting with mon_report_lock
Sage Weil [Wed, 30 Sep 2015 01:03:53 +0000 (21:03 -0400)]
osd: protect mon reporting with mon_report_lock

We need an exclusive lock over paths that update state related to
mon reports, lest they step on fields like up_thru_*, *stats_ack*,
last_mon_report, and so on.  Everybody still needs a read lock
on map_lock too to get a stable OSDMap epoch.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: fix reconnect behavior from booting state
Sage Weil [Mon, 23 Nov 2015 13:38:44 +0000 (08:38 -0500)]
osd: fix reconnect behavior from booting state

We don't need to restart the boot process unless we are in preboot;
if we are in booting state we just need to resend the boot
message.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: move the monitor report to OSD::tick_without_osd_lock
Guang Yang [Tue, 22 Sep 2015 20:59:28 +0000 (20:59 +0000)]
osd: move the monitor report to OSD::tick_without_osd_lock

Fixes: #12722
Reviewed-by: Guang Yang <yguang@yahoo-inc.com>
9 years agoosd: _got_mon_epochs - refactor the lock scope to avoid a race (which fails make check)
Guang Yang [Tue, 29 Sep 2015 22:26:14 +0000 (22:26 +0000)]
osd: _got_mon_epochs - refactor the lock scope to avoid a race (which fails make check)

Reviewed-by: Guang Yang <yguang@yahoo-inc.com>
9 years agoosd: don't send dup subscribes so much
Sage Weil [Mon, 28 Sep 2015 21:22:01 +0000 (17:22 -0400)]
osd: don't send dup subscribes so much

The subscribe MonClient service is stateful--we don't need to
force a new subscribe send unless sub_want() says we need to.

Keep forcing it for instances where we request an *old* map.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: introduce explicit preboot stage
Sage Weil [Wed, 23 Sep 2015 21:58:15 +0000 (17:58 -0400)]
osd: introduce explicit preboot stage

We want to separate the stage where we do a bunch of work
prior to booting (but intend to eventually boot), like when we
get maps and wait to be healthy, from the point after we've sent
the boot message while we are just waiting for a response (so that
we can avoid resending that boot message needlessly).

- start at PREBOOT in start_boot()
- transition to BOOTING in _send_boot()
- only call _preboot() while in PREBOOT state

Signed-off-by: Sage Weil <sage@redhat.com>
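
The three transitions listed above can be modelled as a tiny state machine. This is a Python sketch with a hypothetical `ToyOSD` class; the `healthy` flag stands in for "we have our maps and are ready to boot".

```python
from enum import Enum


class State(Enum):
    PREBOOT = 1
    BOOTING = 2


class ToyOSD:
    def __init__(self):
        self.state = None
        self.boot_messages_sent = 0

    def start_boot(self):
        self.state = State.PREBOOT

    def preboot(self, healthy):
        # Only do pre-boot work while we are actually in PREBOOT.
        if self.state is not State.PREBOOT:
            return
        if healthy:
            self.send_boot()

    def send_boot(self):
        self.state = State.BOOTING
        self.boot_messages_sent += 1


osd = ToyOSD()
osd.start_boot()
osd.preboot(healthy=False)  # still waiting, nothing sent
osd.preboot(healthy=True)   # sends the boot message
osd.preboot(healthy=True)   # ignored: already BOOTING
```

Separating the two states is what prevents the third call from resending the boot message.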
9 years agoosd: skip osdmap version query if we can
Sage Weil [Wed, 23 Sep 2015 21:31:50 +0000 (17:31 -0400)]
osd: skip osdmap version query if we can

If we get OSDmaps from the mon we *also* learn the oldest/newest
map epochs; no need to query again.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: make [_]maybe_boot lockless variant
Sage Weil [Wed, 23 Sep 2015 21:33:28 +0000 (17:33 -0400)]
osd: make [_]maybe_boot lockless variant

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: only send boot if booting on getversion completion
Sage Weil [Tue, 22 Sep 2015 15:16:15 +0000 (11:16 -0400)]
osd: only send boot if booting on getversion completion

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: do not resend pg_temp requests
Sage Weil [Fri, 18 Sep 2015 17:27:49 +0000 (13:27 -0400)]
osd: do not resend pg_temp requests

Send each pg_temp request once (per mon session); no need to
resend everything that is pending every time.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: do not send dup failure reports
Sage Weil [Fri, 18 Sep 2015 01:45:16 +0000 (21:45 -0400)]
osd: do not send dup failure reports

If a failure report is already pending, we do not need to resend
it.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: resend pending failure reports with a new mon session
Sage Weil [Fri, 18 Sep 2015 01:48:30 +0000 (21:48 -0400)]
osd: resend pending failure reports with a new mon session

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: fix send_failures() locking
Sage Weil [Fri, 18 Sep 2015 01:42:53 +0000 (21:42 -0400)]
osd: fix send_failures() locking

It is unsafe to check failure_queue.empty() without the lock.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: backoff the max reporting interval, too
Sage Weil [Thu, 17 Sep 2015 21:47:54 +0000 (17:47 -0400)]
osd: backoff the max reporting interval, too

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: no need for regular send_pg_temps
Sage Weil [Thu, 17 Sep 2015 21:47:43 +0000 (17:47 -0400)]
osd: no need for regular send_pg_temps

This is done by process_peering_events.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: just send alive when it is queued
Sage Weil [Fri, 18 Sep 2015 18:24:27 +0000 (14:24 -0400)]
osd: just send alive when it is queued

No need to futz with last_mon_report or resend it again later.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: fix pg stat reporting
Sage Weil [Wed, 16 Sep 2015 15:00:57 +0000 (11:00 -0400)]
osd: fix pg stat reporting

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: inline do_mon_report
Sage Weil [Fri, 18 Sep 2015 18:24:05 +0000 (14:24 -0400)]
osd: inline do_mon_report

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: limit number of pg stat updates in flight
Sage Weil [Tue, 15 Sep 2015 20:41:03 +0000 (16:41 -0400)]
osd: limit number of pg stat updates in flight

There is no reason to heavily pipeline this.  If the mon is slow
committing them we should go slow too.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: fix pg_stats_queue lock protection
Sage Weil [Tue, 15 Sep 2015 20:34:34 +0000 (16:34 -0400)]
osd: fix pg_stats_queue lock protection

We are indirectly relying on osd_lock, but that may no longer
work for us in the future.  Use the stats lock instead.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: scale mon report interval with timeout backoff
Sage Weil [Tue, 15 Sep 2015 20:26:54 +0000 (16:26 -0400)]
osd: scale mon report interval with timeout backoff

If we have had to scale the backoff by 3x because the mon is
loaded, scale the min report interval accordingly.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: keep count of outstanding pg stat updates to mon
Sage Weil [Tue, 15 Sep 2015 20:26:06 +0000 (16:26 -0400)]
osd: keep count of outstanding pg stat updates to mon

Count how many stat updates we have in flight.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: no stats outstanding when we reset the session
Sage Weil [Tue, 15 Sep 2015 20:16:22 +0000 (16:16 -0400)]
osd: no stats outstanding when we reset the session

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: remove old stats backoff mechanism
Sage Weil [Tue, 15 Sep 2015 20:15:29 +0000 (16:15 -0400)]
osd: remove old stats backoff mechanism

This would only backoff 2x the configured rate, and is less
robust than the new backoff + decay approach.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd: exponential backoff on pg stats ack timeout
Sage Weil [Tue, 15 Sep 2015 20:08:02 +0000 (16:08 -0400)]
osd: exponential backoff on pg stats ack timeout

If we don't get a timely response to our pg stats update we fail
the mon connection and reconnect to a new mon.  If the mons aren't
responding because they are overloaded (for example, because they
are overwhelmed with stats updates) this just makes the problem
worse.

Mitigate the situation by doing an exponential backoff on the
timeout.  When we do successfully send an update, slowly decay the
timeout back to the initial value.

Signed-off-by: Sage Weil <sage@redhat.com>
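
A minimal sketch of backoff with decay, in Python. The class name, the multiplicative decay, and every constant here are assumptions chosen to make the arithmetic easy to follow; the commit does not state the actual factors.

```python
class StatsAckTimeout:
    def __init__(self, initial=30.0, factor=2.0, decay=0.5):
        self.initial = initial
        self.factor = factor    # multiply the timeout on each missed ack
        self.decay = decay      # shrink it again after each success
        self.current = initial

    def on_timeout(self):
        self.current *= self.factor

    def on_ack(self):
        # Decay toward, but never below, the initial value.
        self.current = max(self.initial, self.current * self.decay)


t = StatsAckTimeout(initial=30.0, factor=2.0, decay=0.5)
t.on_timeout()
t.on_timeout()
after_backoff = t.current   # 30 -> 60 -> 120
t.on_ack()
after_one_ack = t.current   # 120 -> 60
t.on_ack()
t.on_ack()
after_recovery = t.current  # floors at the initial 30
```

An overloaded mon therefore sees fewer reconnects, and a healthy one gradually gets the original timeout back.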
9 years agomessage/MLog: include seq in print
Sage Weil [Tue, 15 Sep 2015 19:38:52 +0000 (15:38 -0400)]
message/MLog: include seq in print

...so we can disambiguate which log message(s) we have.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosd/OSDMap: cache values for in, up osds
Sage Weil [Mon, 14 Sep 2015 21:11:31 +0000 (17:11 -0400)]
osd/OSDMap: cache values for in, up osds

We already do this for num_osd; do the same for the up and in
counts.

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agomon/PGMonitor: avoid iterating over all pgs to find stale
Sage Weil [Mon, 14 Sep 2015 21:04:23 +0000 (17:04 -0400)]
mon/PGMonitor: avoid iterating over all pgs to find stale

Instead of iterating over all pgs when an osd goes down, make a
set of all osds that might have gone down, and only check the pgs
that they manage.  This is more efficient, especially for large
clusters with large numbers of OSDs.

Signed-off-by: Sage Weil <sage@redhat.com>
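
The idea can be sketched in Python as a lookup driven by the set of possibly-down OSDs. The per-OSD index, the function name, and the "primary is down" test are assumptions made for this illustration.

```python
def find_stale_candidates(pgs_by_osd, pg_primary, down_osds):
    """Return pgs whose primary is one of the OSDs that may have gone down."""
    stale = set()
    for osd in down_osds:
        # Visit only the pgs this OSD manages, not every pg in the cluster.
        for pgid in pgs_by_osd.get(osd, ()):
            if pg_primary[pgid] == osd:
                stale.add(pgid)
    return stale


pgs_by_osd = {0: {"1.0", "1.1"}, 1: {"1.1", "1.2"}, 2: {"1.2"}}
pg_primary = {"1.0": 0, "1.1": 1, "1.2": 2}
stale = find_stale_candidates(pgs_by_osd, pg_primary, down_osds={1})
```

The work is proportional to the pgs on the affected OSDs rather than to the total pg count.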
9 years agoMerge pull request #6574 from yuyuyu101/fix-broken-kinects
Sage Weil [Sat, 14 Nov 2015 02:14:56 +0000 (21:14 -0500)]
Merge pull request #6574 from yuyuyu101/fix-broken-kinects

kv/KineticStore: fix broken split_key

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6571 from dachary/wip-test-run-cli
Sage Weil [Sat, 14 Nov 2015 00:27:45 +0000 (19:27 -0500)]
Merge pull request #6571 from dachary/wip-test-run-cli

tests: restore run-cli-tests

9 years agoMerge pull request #6578 from dachary/wip-13785-debian-rbd-replay
Loic Dachary [Fri, 13 Nov 2015 21:04:56 +0000 (22:04 +0100)]
Merge pull request #6578 from dachary/wip-13785-debian-rbd-replay

build/ops: rbd-replay moved from ceph-test-dbg to ceph-common-dbg

Reviewed-by: Ken Dreyer <kdreyer@redhat.com>
9 years agoMerge pull request #4737 from kylinstorage/wip-temp-based-object-eviction
Sage Weil [Fri, 13 Nov 2015 20:39:54 +0000 (15:39 -0500)]
Merge pull request #4737 from kylinstorage/wip-temp-based-object-eviction

osd: improve temperature calculation for cache tier agent

Reviewed-by: Sage Weil
9 years agoMerge pull request #6422 from xiexingguo/xxg-wip13639
Sage Weil [Fri, 13 Nov 2015 20:22:08 +0000 (15:22 -0500)]
Merge pull request #6422 from xiexingguo/xxg-wip13639

librados: fix potential null pointer access when do pool_snap_list

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
9 years agoMerge pull request #6486 from XinzeChi/wip-multiple-finisher
Sage Weil [Fri, 13 Nov 2015 20:20:48 +0000 (15:20 -0500)]
Merge pull request #6486 from XinzeChi/wip-multiple-finisher

osd: FileStore: support multiple ondisk finish and apply finishers

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Samuel Just <sjust@redhat.com>
9 years agoMerge pull request #6518 from kylinstorage/wip-trivial-optimization
Sage Weil [Fri, 13 Nov 2015 20:20:13 +0000 (15:20 -0500)]
Merge pull request #6518 from kylinstorage/wip-trivial-optimization

osd: optimize scrub subset_last_update calculation

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agobuild/ops: rbd-replay moved from ceph-test-dbg to ceph-common-dbg 6578/head
Loic Dachary [Fri, 13 Nov 2015 18:10:28 +0000 (19:10 +0100)]
build/ops: rbd-replay moved from ceph-test-dbg to ceph-common-dbg

http://tracker.ceph.com/issues/13785 Fixes: #13785

Signed-off-by: Loic Dachary <loic@dachary.org>
9 years agotests: avoid bashism 6571/head
Loic Dachary [Fri, 13 Nov 2015 17:47:31 +0000 (18:47 +0100)]
tests: avoid bashism

The shell used by the cli tests is not always bash. Not using the
here-word is also more readable in this specific case.

Signed-off-by: Loic Dachary <loic@dachary.org>
9 years agorbd: hardcode application name into help
Jason Dillaman [Fri, 13 Nov 2015 15:20:19 +0000 (10:20 -0500)]
rbd: hardcode application name into help

Avoid dynamically detecting the application name and instead hardcode
the rbd CLI name into the help output.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
9 years agokv/KineticStore: Fix broken split_key 6574/head
Haomai Wang [Fri, 13 Nov 2015 17:04:11 +0000 (01:04 +0800)]
kv/KineticStore: Fix broken split_key

Introduced by PR #6312

Signed-off-by: Haomai Wang <haomai@xsky.com>
9 years agotests: restore run-cli-tests
Loic Dachary [Fri, 13 Nov 2015 15:23:27 +0000 (16:23 +0100)]
tests: restore run-cli-tests

e4ca468 moved src/test/run-cli-tests from check-local to check_SCRIPTS
but did not add it to the TESTS variable.

Signed-off-by: Loic Dachary <loic@dachary.org>
9 years agoradosgw-admin: fix cli tests 6569/head
Sage Weil [Fri, 13 Nov 2015 15:06:18 +0000 (10:06 -0500)]
radosgw-admin: fix cli tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoosdmaptool: fix cli tests
Sage Weil [Fri, 13 Nov 2015 15:05:53 +0000 (10:05 -0500)]
osdmaptool: fix cli tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agocrushtool: fix cli tests
Sage Weil [Fri, 13 Nov 2015 14:53:37 +0000 (09:53 -0500)]
crushtool: fix cli tests

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agocrushtool: fix cli test help
Sage Weil [Fri, 13 Nov 2015 14:27:28 +0000 (09:27 -0500)]
crushtool: fix cli test help

Signed-off-by: Sage Weil <sage@redhat.com>
9 years agoMerge pull request #6532 from dachary/wip-mailmap
Loic Dachary [Fri, 13 Nov 2015 14:14:11 +0000 (15:14 +0100)]
Merge pull request #6532 from dachary/wip-mailmap

mailmap: Ubuntu Kylin name changed to Kylin Cloud

Reviewed-by: Li Wang <li.wang@kylin-cloud.com>
9 years agoMerge pull request #5848 from storage-zuiwanyuan/wip-nonblock-connect
Sage Weil [Fri, 13 Nov 2015 14:04:47 +0000 (09:04 -0500)]
Merge pull request #5848 from storage-zuiwanyuan/wip-nonblock-connect

msg/async: support of non-block connect in async messenger

Reviewed-by: Haomai Wang <haomai@xsky.com>
9 years agoMerge pull request #6478 from yuyuyu101/wip-13666
Sage Weil [Fri, 13 Nov 2015 14:03:35 +0000 (09:03 -0500)]
Merge pull request #6478 from yuyuyu101/wip-13666

msg/async: let receiver ack message ASAP

Reviewed-by: Sage Weil <sage@redhat.com>
9 years agomsg/async: support of non-block connect in async messenger 5848/head
Jianhui Yuan [Fri, 13 Nov 2015 07:36:36 +0000 (15:36 +0800)]
msg/async: support of non-block connect in async messenger

Fixes: #12802
Signed-off-by: Jianhui Yuan <zuiwanyuan@gmail.com>
9 years agoMerge pull request #6534 from kylinstorage/wip-trivial-scrub-cleanup
Kefu Chai [Fri, 13 Nov 2015 07:28:39 +0000 (15:28 +0800)]
Merge pull request #6534 from kylinstorage/wip-trivial-scrub-cleanup

osd: clarify the scrub result report

Reviewed-by: Kefu Chai <kchai@redhat.com>
9 years agoscrub: clarify the result report 6534/head
Li Wang [Fri, 13 Nov 2015 07:00:09 +0000 (15:00 +0800)]
scrub: clarify the result report

It may happen that the authoritative object is such that
auth.size != be_get_ondisk_size(auth_oi.size); in that case,
clarify the error report.

Signed-off-by: Li Wang <li.wang@kylin-cloud.com>
9 years agoMerge branch 'wip-py3'
Josh Durgin [Fri, 13 Nov 2015 04:15:48 +0000 (20:15 -0800)]
Merge branch 'wip-py3'

pybind: a few more python 3 fixes for rbd and rados

Reviewed-by: David Coles <dcoles@gaikai.com>
9 years agopybind/rados: return pool_reverse_lookup() result as a string
Josh Durgin [Thu, 12 Nov 2015 08:57:36 +0000 (00:57 -0800)]
pybind/rados: return pool_reverse_lookup() result as a string

This makes it symmetric with create_pool() in python 3.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/test_rbd: convert a few more str to bytes for py3
Josh Durgin [Thu, 12 Nov 2015 07:59:21 +0000 (23:59 -0800)]
pybind/test_rbd: convert a few more str to bytes for py3

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/rbd: encode snap_rename args for py3
Josh Durgin [Thu, 12 Nov 2015 07:57:16 +0000 (23:57 -0800)]
pybind/rbd: encode snap_rename args for py3

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/rbd: decode stat() and list_children() results for py3
Josh Durgin [Thu, 12 Nov 2015 07:56:14 +0000 (23:56 -0800)]
pybind/rbd: decode stat() and list_children() results for py3

For stat(), only block_name_prefix is filled in - parent and
parent_pool are always blank.

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/rbd: decode parent_info() to str types for py3
Josh Durgin [Thu, 12 Nov 2015 03:05:59 +0000 (19:05 -0800)]
pybind/rbd: decode parent_info() to str types for py3

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/test_rbd: fix map() usage for py3 compat
Josh Durgin [Thu, 12 Nov 2015 03:02:12 +0000 (19:02 -0800)]
pybind/test_rbd: fix map() usage for py3 compat

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
9 years agopybind/test_rbd: use // for division for py3
Josh Durgin [Thu, 12 Nov 2015 08:06:14 +0000 (00:06 -0800)]
pybind/test_rbd: use // for division for py3

Signed-off-by: Josh Durgin <jdurgin@redhat.com>
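
For context, a small illustration of why `//` matters; the variable names are made up for the example and do not come from test_rbd.

```python
size = 10
half_true = size / 2    # float on Python 3 (5.0); int on Python 2 (5)
half_floor = size // 2  # floor division: an int on both for int operands
```

Code that needs an integer, such as a byte offset or a length, should use `//` so it behaves the same on both interpreters.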
9 years agoMerge branch 'pybind3' of https://github.com/dcoles/ceph into wip-pybind3
Josh Durgin [Fri, 13 Nov 2015 03:32:42 +0000 (19:32 -0800)]
Merge branch 'pybind3' of https://github.com/dcoles/ceph into wip-pybind3

pybind: Add Python 3 support for rados and rbd modules

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Conflicts:
src/pybind/rbd.py (new create args, minor fix to work with py3)

9 years agoceph: Make stdout/stderr always output Unicode (UTF-8) 6315/head
David Coles [Wed, 11 Nov 2015 22:06:45 +0000 (14:06 -0800)]
ceph: Make stdout/stderr always output Unicode (UTF-8)

If a stream is not interactive then, under Python 2, the encoding for
stdout/stderr may be None. This means that it's not possible to print Unicode
characters since the encoding will fall back to ASCII.

This explicitly makes sys.stdout/sys.stderr always use UTF-8 encoding for
strings, regardless of the system's locale or whether the console is
interactive.
This matches the existing tests that assume that output of non-ASCII pool names
will be UTF-8 encoded.

When outputting raw binary data (such as the CRUSH-map), we must bypass the
codec and write directly to raw streams (since the new stream will only accept
ASCII byte-strings or Unicode strings).

Signed-off-by: David Coles <dcoles@gaikai.com>
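
A sketch of the general technique using the standard `codecs` module, with an in-memory `BytesIO` standing in for a non-interactive stdout. Whether the ceph script wraps its streams exactly this way is an assumption; the snippet only shows a UTF-8 writer alongside a raw-bytes bypass.

```python
import codecs
import io

raw = io.BytesIO()                    # stands in for the underlying byte stream
out = codecs.getwriter("utf-8")(raw)  # text written here is encoded as UTF-8
out.write(u"pool-\u00e9\n")           # non-ASCII text goes through the codec
raw.write(b"\x00\x01")                # binary data is written to the raw stream
captured = raw.getvalue()
```

Text and binary output end up in the same stream, but only the text passes through the encoder.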
9 years agopybind: Add decode_cstr helper function
David Coles [Tue, 27 Oct 2015 20:32:44 +0000 (13:32 -0700)]
pybind: Add decode_cstr helper function

This function attempts to decode a C-style string into a Python Unicode string.
It accepts an optional "size" parameter for the string length, otherwise it is
assumed that the string is NUL-terminated.

If the pointer is NULL, then this function returns None.

Signed-off-by: David Coles <dcoles@gaikai.com>
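
A helper with the behaviour described above might look like the following. This is a sketch built on `ctypes.string_at`; the parameter names and the UTF-8 default are assumptions, and the actual helper in the bindings may differ.

```python
import ctypes


def decode_cstr(addr, size=-1, encoding="utf-8"):
    """Decode the C string at address `addr` into text, or None if NULL."""
    if not addr:
        return None
    # string_at reads `size` bytes, or up to the NUL terminator when size is -1.
    return ctypes.string_at(addr, size).decode(encoding)


buf = ctypes.create_string_buffer(u"pool-\u00e9".encode("utf-8"))
addr = ctypes.addressof(buf)
whole = decode_cstr(addr)
prefix = decode_cstr(addr, size=4)
null_case = decode_cstr(None)
```

Passing `size` is useful when the C API returns a length instead of guaranteeing a terminator.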
9 years agopybind: Add test for creating pool by raw UTF-8
David Coles [Tue, 20 Oct 2015 17:57:46 +0000 (10:57 -0700)]
pybind: Add test for creating pool by raw UTF-8

Some clients try providing non-ASCII pool names by sending raw encoded bytes.
This check ensures that we still support this behaviour for Python 2.

In Python 3, bytestrings will fail since strings are Unicode strings and thus
clients should use Unicode escapes instead.

Signed-off-by: David Coles <dcoles@gaikai.com>
9 years agopybind: Import cstr from the rados module
David Coles [Tue, 20 Oct 2015 17:55:44 +0000 (10:55 -0700)]
pybind: Import cstr from the rados module

Since rados is required for rbd, we can avoid duplication of code across these
two modules.

Signed-off-by: David Coles <dcoles@gaikai.com>
9 years agopybind: Don't encode str on Python 2
David Coles [Tue, 20 Oct 2015 02:42:18 +0000 (19:42 -0700)]
pybind: Don't encode str on Python 2

If you attempt to call encode on a non-ASCII string, then a UnicodeDecodeError
will be raised.

Since str on Python 2 is an 8-bit string, it's possible that it's already UTF-8
encoded. As such we should just pass it through to the C API unmodified.

On Python 3 or if the user explicitly uses unicode, then we'll encode it to
UTF-8 for them.

Signed-off-by: David Coles <dcoles@gaikai.com>
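
The pass-through rule can be sketched as below. The function name `cstr` matches the helper mentioned elsewhere in this log, but the signature and body here are assumptions for illustration.

```python
def cstr(val, name, encoding="utf-8"):
    """Return a byte string suitable for passing to a C API."""
    if isinstance(val, bytes):
        # Python 2 `str` (or explicit bytes): assume it is already encoded.
        return val
    if hasattr(val, "encode"):
        # Text: `unicode` on Python 2, `str` on Python 3.
        return val.encode(encoding)
    raise TypeError("%s must be a string" % name)


from_text = cstr(u"\u00e9", "pool_name")
from_bytes = cstr(b"\xc3\xa9", "pool_name")
```

Both calls yield the same UTF-8 bytes, and already-encoded input is never run through `encode` a second time.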
9 years agoMerge branch 'wip-13504' of https://github.com/trociny/ceph
Josh Durgin [Thu, 12 Nov 2015 22:08:31 +0000 (14:08 -0800)]
Merge branch 'wip-13504' of https://github.com/trociny/ceph

rbd: API: options on image create

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Conflicts:
src/test/librbd/test_librbd.cc (trivial, two tests added at end of file)