git.apps.os.sepia.ceph.com Git - ceph.git/log
7 years ago osd: some debug output in identify_split_children
Sage Weil [Fri, 9 Mar 2018 04:05:14 +0000 (22:05 -0600)]
osd: some debug output in identify_split_children

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: do final pg delete transaction on pg sequencer
Sage Weil [Thu, 8 Mar 2018 17:50:44 +0000 (11:50 -0600)]
osd/PG: do final pg delete transaction on pg sequencer

Simpler, cleaner.  Also, this way we flush before returning.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: better debug output in identify_splits
Sage Weil [Wed, 7 Mar 2018 21:09:30 +0000 (15:09 -0600)]
osd: better debug output in identify_splits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: handle NOUP flag vs boot race
Sage Weil [Wed, 7 Mar 2018 21:09:16 +0000 (15:09 -0600)]
osd: handle NOUP flag vs boot race

If we digest maps that show a NOUP flag change *and* we also go active,
there is no need to restart the boot process--we can just go/stay active.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites/rados/singleton/all/recovery_preemption: make test more reliable
Sage Weil [Tue, 27 Feb 2018 22:25:21 +0000 (16:25 -0600)]
qa/suites/rados/singleton/all/recovery_preemption: make test more reliable

A 30 second run did only 7000 ops, which means ~50 log entries per pg...
not enough to trigger backfill.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites/rados/singleton/all/mon-seesaw: whitelist PG_AVAILABILITY
Sage Weil [Wed, 28 Feb 2018 16:17:09 +0000 (10:17 -0600)]
qa/suites/rados/singleton/all/mon-seesaw: whitelist PG_AVAILABILITY

The seesaw might delay pg creation by more than 60s.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: ensure an actual transaction gets queued for recovery finish
Sage Weil [Mon, 26 Feb 2018 19:45:28 +0000 (13:45 -0600)]
osd/PG: ensure an actual transaction gets queued for recovery finish

Otherwise, this context gets leaked and lost.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: close split vs query race in consume_map
Sage Weil [Fri, 23 Feb 2018 19:39:40 +0000 (13:39 -0600)]
osd: close split vs query race in consume_map

Consider the race:

- shard 0 consumes epoch E
- shard 1 consumes epoch E
  - shard 1 pg P will split to C
- shard 0 processes query on C, returns DNE
- shard 0 primes slot C

Close the race by priming split children before consuming the map into each
OSDShard.  That way the query will either (1) arrive before E and before
slot C is primed and wait for E, or (2) find the slot present with
waiting_for_split true.

Signed-off-by: Sage Weil <sage@redhat.com>
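
A minimal single-threaded sketch of the ordering described in the commit above.
ShardModel, SlotModel, prime_split and handle_query are hypothetical names used
only for illustration; they are not Ceph's OSDShard API, and the real code also
involves locking that is left out here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of one shard's slot table (not Ceph's OSDShard).
struct SlotModel {
  bool waiting_for_split = false;
};

struct ShardModel {
  unsigned epoch = 0;
  std::map<std::string, SlotModel> slots;

  // Step 1: prime a slot for a child that the upcoming map will split off.
  void prime_split(const std::string& child) {
    slots[child].waiting_for_split = true;
  }

  // Step 2: only after priming does the shard advance to the new epoch.
  void consume_map(unsigned e) {
    assert(e >= epoch);  // maps only move forward in this model
    epoch = e;
  }

  // Decide what happens to a query for `pgid` that requires epoch `sent`.
  std::string handle_query(const std::string& pgid, unsigned sent) const {
    if (sent > epoch) return "wait_for_map";  // case (1): wait for E
    auto it = slots.find(pgid);
    if (it == slots.end()) return "DNE";
    if (it->second.waiting_for_split) return "wait_for_split";  // case (2)
    return "deliver";
  }
};
```

If consume_map() ran before prime_split(), a query at epoch E would find no
slot and report DNE, which is the race the commit describes.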
7 years ago osd: improve documentation for event queue ordering and requeueing rules
Sage Weil [Fri, 23 Feb 2018 19:18:53 +0000 (13:18 -0600)]
osd: improve documentation for event queue ordering and requeueing rules

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: flush sequencer/collection on shutdown
Sage Weil [Fri, 23 Feb 2018 15:19:13 +0000 (09:19 -0600)]
osd/PG: flush sequencer/collection on shutdown

This should catch any in-flight work we have.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: move shutdown into PG
Sage Weil [Fri, 23 Feb 2018 15:17:12 +0000 (09:17 -0600)]
osd/PG: move shutdown into PG

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/osd_types: fix pg_t::pool() return type (uint64_t -> int64_t)
Sage Weil [Fri, 23 Feb 2018 14:58:32 +0000 (08:58 -0600)]
osd/osd_types: fix pg_t::pool() return type (uint64_t -> int64_t)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago mon/OSDMonitor: disallow pg_num changes until after pool is created
Sage Weil [Fri, 23 Feb 2018 14:52:42 +0000 (08:52 -0600)]
mon/OSDMonitor: disallow pg_num changes until after pool is created

The pg create handling OSD code does not handle races between a mon create
message and a split message.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: set send_notify on child
Sage Weil [Thu, 22 Feb 2018 15:18:28 +0000 (09:18 -0600)]
osd/PG: set send_notify on child

If we are a non-primary, we need to ensure the split children send
notifies.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill broken _process optimization; simplify null pg flow
Sage Weil [Thu, 22 Feb 2018 03:10:54 +0000 (21:10 -0600)]
osd: kill broken _process optimization; simplify null pg flow

- drop fast queue to waiting list optimization: it breaks ordering and is
  a useless optimization
- restructure so that we don't drop the lock and revalidate the world if
  pg == nullptr

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix fast pg create vs limits
Sage Weil [Wed, 21 Feb 2018 03:23:25 +0000 (21:23 -0600)]
osd: fix fast pg create vs limits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: (pre)publish map before distributing to shards (and pgs)
Sage Weil [Wed, 21 Feb 2018 03:14:27 +0000 (21:14 -0600)]
osd: (pre)publish map before distributing to shards (and pgs)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: update numpg_* counters when removing a pg
Sage Weil [Tue, 20 Feb 2018 21:49:46 +0000 (15:49 -0600)]
osd: update numpg_* counters when removing a pg

Usually on a pg create we see an OSDMap update; on PG removal completion
we may not.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: decrement deleting pg count in _delete_some
Sage Weil [Tue, 20 Feb 2018 21:49:19 +0000 (15:49 -0600)]
osd: decrement deleting pg count in _delete_some

The exit() method for ToDelete state doesn't run on PG destruction.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: clear shard osdmaps during shutdown
Sage Weil [Tue, 20 Feb 2018 21:33:09 +0000 (15:33 -0600)]
osd: clear shard osdmaps during shutdown

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: make safe osdmap accessor for OSDShard
Sage Weil [Tue, 20 Feb 2018 21:20:39 +0000 (15:20 -0600)]
osd: make safe osdmap accessor for OSDShard

The advance_pg needs to get the shard osdmap without racing against
consume_map().

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: clean up mutex naming for OSDShard
Sage Weil [Tue, 20 Feb 2018 21:20:00 +0000 (15:20 -0600)]
osd: clean up mutex naming for OSDShard

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago common/tracked_int_ptr: fix operator= return value
Sage Weil [Sun, 18 Feb 2018 20:36:28 +0000 (14:36 -0600)]
common/tracked_int_ptr: fix operator= return value

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix pg removal vs _process race
Sage Weil [Sun, 18 Feb 2018 02:27:30 +0000 (20:27 -0600)]
osd: fix pg removal vs _process race

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: lookup_*pg must return PGRef
Sage Weil [Fri, 16 Feb 2018 21:53:43 +0000 (15:53 -0600)]
osd: lookup_*pg must return PGRef

Otherwise it is fundamentally unsafe, as the PG might get destroyed out
from under us without a reference.

Signed-off-by: Sage Weil <sage@redhat.com>
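
A small sketch of why a lookup should hand back an owning reference. It uses
std::shared_ptr as a stand-in for a reference-counted PG handle; PGModel,
PGRefModel and RegistryModel are hypothetical names for illustration and do
not reflect how Ceph actually defines PGRef.

```cpp
#include <cassert>
#include <map>
#include <memory>

struct PGModel {
  int id;
};

// Stand-in for a reference-counted PG handle (assumption for illustration).
using PGRefModel = std::shared_ptr<PGModel>;

struct RegistryModel {
  std::map<int, PGRefModel> pgs;

  void add(int id) { pgs[id] = std::make_shared<PGModel>(PGModel{id}); }
  void remove(int id) { pgs.erase(id); }

  // Returns an owning reference (or null), never a raw pointer.
  PGRefModel lookup(int id) const {
    auto it = pgs.find(id);
    if (it == pgs.end()) return nullptr;
    return it->second;
  }
};

// The reference taken before remove() keeps the object alive afterwards.
inline int use_after_remove() {
  RegistryModel r;
  r.add(7);
  PGRefModel ref = r.lookup(7);
  r.remove(7);
  assert(ref && ref.use_count() == 1);  // the caller is now the only owner
  return ref->id;
}
```

With a raw pointer return type, the access after remove() would touch a
destroyed object; with the owning reference it stays valid until the caller
lets go.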
7 years ago osd: kill pass-through _open_pg
Sage Weil [Fri, 9 Feb 2018 22:15:00 +0000 (16:15 -0600)]
osd: kill pass-through _open_pg

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove old min pg epoch tracking
Sage Weil [Fri, 9 Feb 2018 22:14:42 +0000 (16:14 -0600)]
osd: remove old min pg epoch tracking

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: remove RecoveryCtx on_applied and on_commit
Sage Weil [Fri, 9 Feb 2018 22:05:46 +0000 (16:05 -0600)]
osd/PG: remove RecoveryCtx on_applied and on_commit

These were awkward and unnecessary.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register delete completion directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:07:44 +0000 (16:07 -0600)]
osd/PG: register delete completion directly on Transaction

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: register split completion directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:04:30 +0000 (16:04 -0600)]
osd: register split completion directly on Transaction

No need to use wonky RecoveryCtx C_Contexts

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: drop unused context list accessors for RecoveryCtx
Sage Weil [Fri, 9 Feb 2018 22:00:28 +0000 (16:00 -0600)]
osd/PG: drop unused context list accessors for RecoveryCtx

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register recovery finish context directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:02:08 +0000 (16:02 -0600)]
osd/PG: register recovery finish context directly on Transaction

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: drop unused activate() context list arg
Sage Weil [Fri, 9 Feb 2018 22:00:10 +0000 (16:00 -0600)]
osd/PG: drop unused activate() context list arg

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register flush completions directly on the Transaction
Sage Weil [Fri, 9 Feb 2018 21:58:15 +0000 (15:58 -0600)]
osd/PG: register flush completions directly on the Transaction

No need for an awkward list passed as an arg; all of these callbacks end up
on the Transaction anyway.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: wait for pg epochs based on shard tracking
Sage Weil [Fri, 9 Feb 2018 21:37:59 +0000 (15:37 -0600)]
osd: wait for pg epochs based on shard tracking

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: index pg (slots) by map epoch within each shard
Sage Weil [Fri, 9 Feb 2018 21:31:14 +0000 (15:31 -0600)]
osd: index pg (slots) by map epoch within each shard

This will replace the epoch tracking in OSDService shortly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: link back to pg slot
Sage Weil [Fri, 9 Feb 2018 21:04:32 +0000 (15:04 -0600)]
osd/PG: link back to pg slot

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: OSDShard::pg_slot -> OSDShardPGSlot
Sage Weil [Fri, 9 Feb 2018 20:42:25 +0000 (14:42 -0600)]
osd: OSDShard::pg_slot -> OSDShardPGSlot

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: change pg_slots unordered_map to use unique_ptr<>
Sage Weil [Fri, 9 Feb 2018 20:39:34 +0000 (14:39 -0600)]
osd: change pg_slots unordered_map to use unique_ptr<>

This avoids moving slots around in memory in the unordered_map... they can
be big!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove some unused methods
Sage Weil [Fri, 9 Feb 2018 20:22:23 +0000 (14:22 -0600)]
osd: remove some unused methods

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove created_pgs tracking in RecoveryCtx
Sage Weil [Fri, 9 Feb 2018 19:16:19 +0000 (13:16 -0600)]
osd: remove created_pgs tracking in RecoveryCtx

Not needed or used!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix PG::ch init
Sage Weil [Fri, 9 Feb 2018 19:13:07 +0000 (13:13 -0600)]
osd: fix PG::ch init

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use _attach_pg and _detach_pg helpers; keep PG::osd_shard ptr
Sage Weil [Thu, 8 Feb 2018 22:54:36 +0000 (16:54 -0600)]
osd: use _attach_pg and _detach_pg helpers; keep PG::osd_shard ptr

Consolidate num_pgs updates (and fix a counting bug along the way).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove old split tracking machinery
Sage Weil [Thu, 8 Feb 2018 22:27:21 +0000 (16:27 -0600)]
osd: remove old split tracking machinery

This infrastructure is no longer used; simpler split tracking now lives in
the shards' pg_slots directly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure consume_map in terms of shards
Sage Weil [Thu, 8 Feb 2018 22:23:04 +0000 (16:23 -0600)]
osd: restructure consume_map in terms of shards

- new split priming machinery
- new primed split cleanup on pg removal
- cover the pg creation path

The old split tracking is now totally unused; will be removed in the next
patch.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: pass sdata into dequeue_peering_evt (and dequeue_delete)
Sage Weil [Thu, 8 Feb 2018 19:45:37 +0000 (13:45 -0600)]
osd: pass sdata into dequeue_peering_evt (and dequeue_delete)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: pass data into OpQueueItem::run()
Sage Weil [Thu, 8 Feb 2018 19:36:41 +0000 (13:36 -0600)]
osd: pass data into OpQueueItem::run()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill pg_map
Sage Weil [Tue, 6 Feb 2018 20:35:17 +0000 (14:35 -0600)]
osd: kill pg_map

Split doesn't work quite right; num_pgs count is probably off.  But, things
mostly work.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: rename OSDShard waiting_for_pg_osdmap -> osdmap
Sage Weil [Tue, 6 Feb 2018 17:27:36 +0000 (11:27 -0600)]
osd: rename OSDShard waiting_for_pg_osdmap -> osdmap

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use _get_pgs() where possible; avoid touching pg_map directly
Sage Weil [Tue, 6 Feb 2018 16:51:42 +0000 (10:51 -0600)]
osd: use _get_pgs() where possible; avoid touching pg_map directly

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: get _get_pgs() and _get_pgids()
Sage Weil [Tue, 6 Feb 2018 16:50:31 +0000 (10:50 -0600)]
osd: get _get_pgs() and _get_pgids()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove get_mapped_pools command
Sage Weil [Tue, 6 Feb 2018 14:48:24 +0000 (08:48 -0600)]
osd: remove get_mapped_pools command

No in-tree users.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move ShardedOpWQ::ShardData -> OSDShard
Sage Weil [Tue, 6 Feb 2018 00:22:40 +0000 (18:22 -0600)]
osd: move ShardedOpWQ::ShardData -> OSDShard

Soon we will destroy pg_map!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill _open_lock_pg
Sage Weil [Mon, 5 Feb 2018 21:54:19 +0000 (15:54 -0600)]
osd: kill _open_lock_pg

Move lock call to caller.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill _create_lock_pg
Sage Weil [Mon, 5 Feb 2018 21:52:56 +0000 (15:52 -0600)]
osd: kill _create_lock_pg

Unused.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: do not release recovery_ops_reserved on requeue
Sage Weil [Fri, 2 Feb 2018 21:50:25 +0000 (15:50 -0600)]
osd: do not release recovery_ops_reserved on requeue

This doesn't make sense.. although it's the same behavior as
luminous.

The point of the releases here is that if we drop something that is in
the queue we drop the recovery_ops_reserved counter by that much.  However,
if something is in the queue and waiting, and we wake it back up, there
is no net change to _reserved... which is only decremented when we
actually dequeue something.

Signed-off-by: Sage Weil <sage@redhat.com>
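
The accounting rule in the commit above can be sketched as a tiny counter
model. RecoveryReservationModel and its methods are hypothetical, and the
assumption that the counter is incremented at enqueue time is mine; the commit
message only states when it is decremented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the recovery_ops_reserved bookkeeping.
struct RecoveryReservationModel {
  std::uint64_t reserved = 0;  // models recovery_ops_reserved
  std::uint64_t queued = 0;    // ops currently sitting in the queue

  // Assumption: reservation happens when work is queued.
  void enqueue(std::uint64_t n) {
    reserved += n;
    queued += n;
  }

  // An item is removed from the queue without ever running: release it.
  void drop(std::uint64_t n) {
    assert(n <= queued && n <= reserved);
    queued -= n;
    reserved -= n;
  }

  // A waiting item is woken and put back: it is still queued, so there is
  // no net change to the reserved count.
  void requeue(std::uint64_t /*n*/) {}

  // An item is actually dequeued for processing: release it.
  void dequeue(std::uint64_t n) {
    assert(n <= queued && n <= reserved);
    queued -= n;
    reserved -= n;
  }
};
```

Releasing in requeue() as well would decrement the counter twice for the same
op, once at requeue and again at dequeue.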
7 years ago osd: debug recovery_ops_reserved
Sage Weil [Fri, 2 Feb 2018 21:26:52 +0000 (15:26 -0600)]
osd: debug recovery_ops_reserved

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move PG peering waiters into op wq
Sage Weil [Fri, 2 Feb 2018 16:11:49 +0000 (10:11 -0600)]
osd: move PG peering waiters into op wq

This resolves problems with a peering event being delivered triggering
advance_pg which triggers a requeue of waiting events that are requeued
*behind* the event we are processing.  It also reduces the number of
wait lists by one, yay!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: store ec profile with final pool
Sage Weil [Thu, 1 Feb 2018 19:58:15 +0000 (13:58 -0600)]
osd: store ec profile with final pool

We need this to reinstantiate semi-deleted ec backends.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: ignore RecoveryDone in ReplicaActive too
Sage Weil [Thu, 1 Feb 2018 18:59:29 +0000 (12:59 -0600)]
osd/PG: ignore RecoveryDone in ReplicaActive too

This can be missed on a RepRecovering -> RepNotRecovering ->
RepWaitBackfillReserved transition.  Catch any straggler events in
ReplicaActive.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/osd_types: include epoch_sent in pg_query_t operator<<
Sage Weil [Tue, 30 Jan 2018 14:56:46 +0000 (08:56 -0600)]
osd/osd_types: include epoch_sent in pg_query_t operator<<

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure pg waiting more
Sage Weil [Fri, 2 Feb 2018 16:04:44 +0000 (10:04 -0600)]
osd: restructure pg waiting more

Wait by epoch.  This is less kludgey than before!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure pg waiting
Sage Weil [Fri, 19 Jan 2018 19:23:01 +0000 (13:23 -0600)]
osd: restructure pg waiting

Rethink the way we wait for PGs.  We need to order peering events relative to
each other; keep them in a separate queue in the pg_slot.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: normal command uses slow dispatch (it can send messages)
Sage Weil [Fri, 19 Jan 2018 14:51:07 +0000 (08:51 -0600)]
osd: normal command uses slow dispatch (it can send messages)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/OSD,PG: get_osdmap()->get_epoch() -> get_osdmap_epoch()
Sage Weil [Thu, 18 Jan 2018 22:31:46 +0000 (16:31 -0600)]
osd/OSD,PG: get_osdmap()->get_epoch() -> get_osdmap_epoch()

Avoid wrangling shared_ptr!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: misc fixes
Sage Weil [Wed, 17 Jan 2018 16:23:15 +0000 (10:23 -0600)]
osd: misc fixes

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill disk_tp, recovery_gen_wq
Sage Weil [Tue, 16 Jan 2018 22:48:37 +0000 (16:48 -0600)]
osd: kill disk_tp, recovery_gen_wq

Progress!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move recovery contexts to normal wq
Sage Weil [Tue, 16 Jan 2018 22:42:28 +0000 (16:42 -0600)]
osd: move recovery contexts to normal wq

We have a specific PGRecoveryContext type/event--even though we are just
calling a GenContext--so that we can distinguish the event type properly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove _lookup_lock_pg_with_map_lock_held()
Sage Weil [Thu, 4 Jan 2018 18:53:39 +0000 (12:53 -0600)]
osd: remove _lookup_lock_pg_with_map_lock_held()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: new MOSDScrub2 message with spg_t, fast dispatch
Sage Weil [Thu, 4 Jan 2018 18:10:41 +0000 (12:10 -0600)]
osd: new MOSDScrub2 message with spg_t, fast dispatch

Send new message to mimic+ OSDs.  Fast dispatch it at the OSD.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: request scrub via a state machine event
Sage Weil [Thu, 4 Jan 2018 18:09:56 +0000 (12:09 -0600)]
osd/PG: request scrub via a state machine event

Continuing effort to make PG interactions event based.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use peering events for forced recovery
Sage Weil [Thu, 18 Jan 2018 22:09:52 +0000 (16:09 -0600)]
osd: use peering events for forced recovery

The mgr code is updated to send spg_t's instead of pg_t's (and is slightly
refactored/cleaned).

The PG events are added to the Primary state, unless we're also in the
Clean substate, in which case they are ignored.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/OSDMap: get_primary_shard() variant that returns primary *and* shard
Sage Weil [Thu, 4 Jan 2018 16:48:41 +0000 (10:48 -0600)]
osd/OSDMap: get_primary_shard() variant that returns primary *and* shard

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: prime pg_slots for to-be-split children
Sage Weil [Wed, 3 Jan 2018 17:48:37 +0000 (11:48 -0600)]
osd: prime pg_slots for to-be-split children

Once we know which PGs are about to be created, we instantiate their
pg_slot and mark them waiting_pg, which blocks all incoming events until
the split completes, the PG is installed, and we call wake_pg_waiters().

Signed-off-by: Sage Weil <sage@redhat.com>
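
A rough sketch of the blocking behaviour described in the commit above: a
primed slot holds incoming events until the PG is installed and the waiters
are woken. PrimedSlotModel and install_pg_and_wake are hypothetical names;
the real pg_slot and wake_pg_waiters() carry more state than shown here.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model of a primed pg slot for a to-be-split child.
struct PrimedSlotModel {
  bool waiting_pg = true;               // primed: no PG installed yet
  std::deque<std::string> blocked;      // events held back while waiting
  std::vector<std::string> delivered;   // events that reached the PG

  // Incoming events are parked while the slot is waiting for its PG.
  void enqueue(const std::string& ev) {
    if (waiting_pg) {
      blocked.push_back(ev);
    } else {
      delivered.push_back(ev);
    }
  }

  // The split has completed: install the PG and release the parked events
  // in their original arrival order.
  void install_pg_and_wake() {
    assert(waiting_pg);
    waiting_pg = false;
    while (!blocked.empty()) {
      delivered.push_back(blocked.front());
      blocked.pop_front();
    }
  }
};
```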
7 years ago osd: remove obsolete slow dispatch path for most messages
Sage Weil [Wed, 3 Jan 2018 03:39:03 +0000 (21:39 -0600)]
osd: remove obsolete slow dispatch path for most messages

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch M[Mon]Command
Sage Weil [Wed, 3 Jan 2018 14:52:16 +0000 (08:52 -0600)]
osd: fast dispatch M[Mon]Command

These just get dumped onto a work queue.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch ping
Sage Weil [Wed, 3 Jan 2018 03:37:30 +0000 (21:37 -0600)]
osd: fast dispatch ping

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago mon/OSDMonitor: send MOSDPGCreate2 to mimic+ osds
Sage Weil [Wed, 3 Jan 2018 03:30:03 +0000 (21:30 -0600)]
mon/OSDMonitor: send MOSDPGCreate2 to mimic+ osds

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: handle MOSDPGCreate2 messages (fast dispatch!)
Sage Weil [Wed, 3 Jan 2018 03:29:50 +0000 (21:29 -0600)]
osd: handle MOSDPGCreate2 messages (fast dispatch!)

Add a new MOSDPGCreate2 message that sends the spg_t (not just pg_t) and
includes only the info we need.  Fast dispatch it.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/OSDMapMapping: a getter that returns a spg_t
Sage Weil [Wed, 3 Jan 2018 03:26:52 +0000 (21:26 -0600)]
osd/OSDMapMapping: a getter that returns a spg_t

Note whether a pool is erasure so that we can generate an appropriate
spg_t for a mapping.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: send pg creations through normal pg queue
Sage Weil [Tue, 2 Jan 2018 22:51:41 +0000 (16:51 -0600)]
osd: send pg creations through normal pg queue

Queue a null event tagged with create_info, eliminating the special
legacy path.

These are still not fast dispatch because we need an spg (not pg) to queue
an event, and we need a current osdmap in order to calculate that.  That
isn't possible/a good idea in fast dispatch.  In a subsequent patch we'll
create a new pg create message that includes the correct information and
can be fast dispatched, allowing this path to die off post-nautilus.

Also, improve things so that we ack the pg creation only after the PG has
gone active, meaning it is fully replicated (by at least min_size PGs).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix max pg check for peer events
Sage Weil [Tue, 2 Jan 2018 22:44:40 +0000 (16:44 -0600)]
osd: fix max pg check for peer events

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use atomic for pg_map_size
Sage Weil [Tue, 2 Jan 2018 22:42:36 +0000 (16:42 -0600)]
osd: use atomic for pg_map_size

This avoids the need for pg_map_lock in the max pg check.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PGPeeringEvent: note mon- vs peer-initiated pg creates
Sage Weil [Tue, 2 Jan 2018 22:36:27 +0000 (16:36 -0600)]
osd/PGPeeringEvent: note mon- vs peer-initiated pg creates

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch peering events (part 2)
Sage Weil [Tue, 2 Jan 2018 21:36:52 +0000 (15:36 -0600)]
osd: fast dispatch peering events (part 2)

This actually puts the remaining peering events into fast dispatch.  The
only remaining event is the pg create from the mon.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch peering events (part 1)
Sage Weil [Tue, 2 Jan 2018 21:35:44 +0000 (15:35 -0600)]
osd: fast dispatch peering events (part 1)

This is a big commit that lays out the infrastructure changes to fast
dispatch the remaining peering events.  It's hard to separate it all out
so this probably doesn't quite build; it's just easier to review as a
separate patch.

- lock ordering for pg_map has changed:
  before:
    OSD::pg_map_lock
      PG::lock
        ShardData::lock

  after:
    PG::lock
      ShardData::lock
        OSD::pg_map_lock

- queue items are now annotated with whether they can proceed without a
pg at all (e.g., query) or can instantiate a pg (e.g., notify log etc).

- There is some wonkiness around getting the initial Initialize event to
a newly-created PG.  I don't love it but it gets the job done for now.

Signed-off-by: Sage Weil <sage@redhat.com>
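
One way to picture the "after" lock ordering from the commit above is a
rank-checked mutex that asserts locks are only taken in increasing rank.
OrderedMutex and the rank numbers are hypothetical and are not part of Ceph;
this is a debugging-style sketch of a lock hierarchy, not the OSD's locking
code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical mutex wrapper that enforces a fixed acquisition order
// per thread: a lock may only be taken while holding lower-ranked locks.
class OrderedMutex {
 public:
  explicit OrderedMutex(int rank) : rank_(rank) {}

  void lock() {
    assert(rank_ > held_rank() && "lock order violation");
    m_.lock();
    prev_ = held_rank();   // remember what this thread held before
    held_rank() = rank_;
  }

  void unlock() {
    held_rank() = prev_;   // restore the previous rank for this thread
    m_.unlock();
  }

  // Highest rank currently held by the calling thread (0 if none).
  static int current_rank() { return held_rank(); }

 private:
  static int& held_rank() {
    thread_local int r = 0;
    return r;
  }

  std::mutex m_;
  const int rank_;
  int prev_ = 0;
};

// Takes the three locks in the "after" order: PG::lock, then
// ShardData::lock, then OSD::pg_map_lock.
inline int lock_in_new_order() {
  OrderedMutex pg_lock(1), shard_lock(2), pg_map_lock(3);
  std::lock_guard<OrderedMutex> a(pg_lock);
  std::lock_guard<OrderedMutex> b(shard_lock);
  std::lock_guard<OrderedMutex> c(pg_map_lock);
  return OrderedMutex::current_rank();
}
```

Taking pg_map_lock first and then pg_lock, as in the "before" order, would
trip the assertion under these ranks.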
7 years ago osd: queue null events without PG lock
Sage Weil [Wed, 20 Dec 2017 12:55:43 +0000 (06:55 -0600)]
osd: queue null events without PG lock

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move part of wake_pg_waiters into helper
Sage Weil [Mon, 18 Dec 2017 19:55:45 +0000 (13:55 -0600)]
osd: move part of wake_pg_waiters into helper

We'll need this shortly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use MTrim peering event for trimming
Sage Weil [Sat, 16 Dec 2017 00:55:03 +0000 (18:55 -0600)]
osd: use MTrim peering event for trimming

This is simpler and cleaner than handling log trimming as a special case.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch backfill and recovery reservation events
Sage Weil [Wed, 6 Dec 2017 03:34:58 +0000 (21:34 -0600)]
osd: fast dispatch backfill and recovery reservation events

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move M{Backfill,Recovery}Reserve event logic into message
Sage Weil [Wed, 6 Dec 2017 03:33:40 +0000 (21:33 -0600)]
osd: move M{Backfill,Recovery}Reserve event logic into message

Better encapsulation!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago messages/MOSDPeeringOp: add
Sage Weil [Wed, 6 Dec 2017 03:32:51 +0000 (21:32 -0600)]
messages/MOSDPeeringOp: add

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: move peering event type out of PG class
Sage Weil [Wed, 6 Dec 2017 02:13:48 +0000 (20:13 -0600)]
osd/PG: move peering event type out of PG class

We will create these directly from peering Messages shortly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: keep epoch, not map ref, of last osdmap for last persisted epoch
Sage Weil [Thu, 4 Jan 2018 14:46:35 +0000 (08:46 -0600)]
osd/PG: keep epoch, not map ref, of last osdmap for last persisted epoch

No need to pin the map in memory!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: remove old update_store_on_load()
Sage Weil [Fri, 9 Feb 2018 21:49:50 +0000 (15:49 -0600)]
osd/PG: remove old update_store_on_load()

This isn't needed post-luminous.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago Merge tag 'v13.0.2'
Sage Weil [Tue, 3 Apr 2018 15:08:22 +0000 (10:08 -0500)]
Merge tag 'v13.0.2'

v13.0.2

7 years ago Merge PR #21180 into master
Patrick Donnelly [Tue, 3 Apr 2018 13:51:18 +0000 (06:51 -0700)]
Merge PR #21180 into master

* refs/pull/21180/head:
vstart_runner: examine check_status before error

Reviewed-by: John Spray <john.spray@redhat.com>
7 years ago Merge pull request #20460 from colletj/v1_image_creation_disallow
Jason Dillaman [Tue, 3 Apr 2018 13:17:21 +0000 (09:17 -0400)]
Merge pull request #20460 from colletj/v1_image_creation_disallow

librbd: disallow creation of v1 image format

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
7 years ago Merge pull request #21202 from tchaikov/wip-rbd-replay
Jason Dillaman [Tue, 3 Apr 2018 13:14:39 +0000 (09:14 -0400)]
Merge pull request #21202 from tchaikov/wip-rbd-replay

rbd-replay: remove boost dependency

Reviewed-by: Jason Dillaman <dillaman@redhat.com>
7 years ago Merge pull request #21142 from dragonylffly/wip-fix-ebusy
Jason Dillaman [Tue, 3 Apr 2018 11:39:15 +0000 (07:39 -0400)]
Merge pull request #21142 from dragonylffly/wip-fix-ebusy

rbd-nbd: fix ebusy when do map

Reviewed-by: Jason Dillaman <dillaman@redhat.com>