git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
7 years ago os/ObjectStore: add has_contexts()
Sage Weil [Tue, 3 Apr 2018 01:57:04 +0000 (20:57 -0500)]
os/ObjectStore: add has_contexts()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: clean up osdmap_lock
Sage Weil [Mon, 2 Apr 2018 20:44:57 +0000 (15:44 -0500)]
osd: clean up osdmap_lock

Comment rules and take lock during init/shutdown (nice but not necessary).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago PendingReleaseNotes: note pg_num adjustments during pool create
Sage Weil [Mon, 2 Apr 2018 20:33:36 +0000 (15:33 -0500)]
PendingReleaseNotes: note pg_num adjustments during pool create

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago PendingReleaseNotes: mons before osds to avoid force-recovery issue
Sage Weil [Mon, 2 Apr 2018 19:39:41 +0000 (14:39 -0500)]
PendingReleaseNotes: mons before osds to avoid force-recovery issue

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: drop obsolete Pred
Sage Weil [Mon, 2 Apr 2018 14:24:16 +0000 (09:24 -0500)]
osd: drop obsolete Pred

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: OSDShard: sdata_op_ordering_lock -> shard_lock
Sage Weil [Mon, 2 Apr 2018 14:24:03 +0000 (09:24 -0500)]
osd: OSDShard: sdata_op_ordering_lock -> shard_lock

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: OSDShard: sdata_lock -> sdata_wait_lock
Sage Weil [Mon, 2 Apr 2018 14:19:51 +0000 (09:19 -0500)]
osd: OSDShard: sdata_lock -> sdata_wait_lock

This is only used for waiting.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: rename OSDPGShard::osdmap -> shard_osdmap
Sage Weil [Mon, 2 Apr 2018 14:18:26 +0000 (09:18 -0500)]
osd: rename OSDPGShard::osdmap -> shard_osdmap

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move sdata_op_ordering_lock declaration
Sage Weil [Mon, 2 Apr 2018 14:12:43 +0000 (09:12 -0500)]
osd: move sdata_op_ordering_lock declaration

So that comment makes sense

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: reduce debug level for pg epoch min
Sage Weil [Mon, 2 Apr 2018 14:12:25 +0000 (09:12 -0500)]
osd: reduce debug level for pg epoch min

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: accessors for num_pgs
Sage Weil [Mon, 2 Apr 2018 14:08:51 +0000 (09:08 -0500)]
osd: accessors for num_pgs

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix old wake_pg_waiters references
Sage Weil [Mon, 2 Apr 2018 13:56:38 +0000 (08:56 -0500)]
osd: fix old wake_pg_waiters references

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix 'stale' message
Sage Weil [Mon, 2 Apr 2018 13:50:07 +0000 (08:50 -0500)]
osd: fix 'stale' message

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: constify arg for handle_pg_create_info, maybe_wait_for_max_pg
Sage Weil [Mon, 2 Apr 2018 13:45:29 +0000 (08:45 -0500)]
osd: constify arg for handle_pg_create_info, maybe_wait_for_max_pg

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: constify arg to prime_splits
Sage Weil [Mon, 2 Apr 2018 13:44:19 +0000 (08:44 -0500)]
osd: constify arg to prime_splits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: constify arg to identify_splits
Sage Weil [Mon, 2 Apr 2018 13:43:55 +0000 (08:43 -0500)]
osd: constify arg to identify_splits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: drop unused pushes_to_free variable on _process
Sage Weil [Mon, 2 Apr 2018 13:33:37 +0000 (08:33 -0500)]
osd: drop unused pushes_to_free variable on _process

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: handle pushes_to_free in consume_map
Sage Weil [Mon, 2 Apr 2018 13:33:19 +0000 (08:33 -0500)]
osd: handle pushes_to_free in consume_map

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: synchronously remove pgids when pool tombstone is missing or invalid
Sage Weil [Mon, 2 Apr 2018 13:20:43 +0000 (08:20 -0500)]
osd: synchronously remove pgids when pool tombstone is missing or invalid

This is needed for upgraded clusters (e.g., v13.0.2 clusters with a
missing ec_profile or upgraded clusters with partially-deleted pools/pgs).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites: mon warn on pool no app = false for api tests
Sage Weil [Tue, 13 Mar 2018 13:57:29 +0000 (08:57 -0500)]
qa/suites: mon warn on pool no app = false for api tests

Among other things, the list.cc tests set pg_num, which waits for the
cluster to become healthy.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites/rados/basic/tasks/rados_api_tests: debug ms = 1
Sage Weil [Tue, 13 Mar 2018 01:25:35 +0000 (20:25 -0500)]
qa/suites/rados/basic/tasks/rados_api_tests: debug ms = 1

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: periodically request newer map from mon if waiting peering events
Sage Weil [Mon, 12 Mar 2018 14:16:12 +0000 (09:16 -0500)]
osd: periodically request newer map from mon if waiting peering events

If we have peering events waiting on a newer map than we have, request it
from the mon.  Do this periodically in tick so that we normally wait to get
it from a peer first.

This avoids a deadlock situation where we are, say, waiting for a newer
map to create a pg but never get the map to do it (because the
cluster is idle).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use rctx transaction for PG removal
Sage Weil [Fri, 9 Mar 2018 01:46:03 +0000 (19:46 -0600)]
osd: use rctx transaction for PG removal

In the normal case, queue up the removal work on the rctx transaction.

For the final cleanup, since we need to block, dispatch it ourselves, and
do not do so in OSD.cc.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: some debug output in identify_split_children
Sage Weil [Fri, 9 Mar 2018 04:05:14 +0000 (22:05 -0600)]
osd: some debug output in identify_split_children

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: do final pg delete transaction on pg sequencer
Sage Weil [Thu, 8 Mar 2018 17:50:44 +0000 (11:50 -0600)]
osd/PG: do final pg delete transaction on pg sequencer

Simpler, cleaner.  Also, this way we flush before returning.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: better debug output in identify_splits
Sage Weil [Wed, 7 Mar 2018 21:09:30 +0000 (15:09 -0600)]
osd: better debug output in identify_splits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: handle NOUP flag vs boot race
Sage Weil [Wed, 7 Mar 2018 21:09:16 +0000 (15:09 -0600)]
osd: handle NOUP flag vs boot race

If we digest maps that show a NOUP flag change *and* we also go active,
there is no need to restart the boot process--we can just go/stay active.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites/rados/singleton/all/recovery_preemption: make test more reliable
Sage Weil [Tue, 27 Feb 2018 22:25:21 +0000 (16:25 -0600)]
qa/suites/rados/singleton/all/recovery_preemption: make test more reliable

A 30 second run did only 7000 ops, which means ~50 log entries per pg...
not enough to trigger backfill.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago qa/suites/rados/singleton/all/mon-seesaw: whitelist PG_AVAILABILITY
Sage Weil [Wed, 28 Feb 2018 16:17:09 +0000 (10:17 -0600)]
qa/suites/rados/singleton/all/mon-seesaw: whitelist PG_AVAILABILITY

The seesaw might delay pg creation by more than 60s.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: ensure an actual transaction gets queued for recovery finish
Sage Weil [Mon, 26 Feb 2018 19:45:28 +0000 (13:45 -0600)]
osd/PG: ensure an actual transaction gets queued for recovery finish

Otherwise, this context gets leaked and lost.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: close split vs query race in consume_map
Sage Weil [Fri, 23 Feb 2018 19:39:40 +0000 (13:39 -0600)]
osd: close split vs query race in consume_map

Consider the race:

- shard 0 consumes epoch E
- shard 1 consumes epoch E
  - shard 1 pg P will split to C
- shard 0 processes query on C, returns DNE
- shard 0 primes slot C

Close race by priming split children before consuming map into each
OSDShard.  That way the query will either (1) arrive before E and before
slot C is primed and wait for E, or (2) find the slot present with
waiting_for_split true.

Signed-off-by: Sage Weil <sage@redhat.com>
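
The ordering described in the "close split vs query race" message above can be sketched roughly as follows. This is a toy model, not the OSD code: Slot, Shard, ChildPlacement, advance_all and the string outcomes are hypothetical names invented for illustration; only the idea of priming split children before any shard consumes the new epoch comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Ceph's real types.
struct Slot {
  bool waiting_for_split = false;
};

struct Shard {
  unsigned epoch = 0;
  std::map<std::string, Slot> slots;

  // Create (or mark) the slot for a child PG that a split will produce.
  void prime_split(const std::string& child) {
    slots[child].waiting_for_split = true;
  }

  void consume_map(unsigned e) { epoch = e; }

  // Outcome of a query for `pgid` that was sent at `query_epoch`.
  std::string handle_query(const std::string& pgid, unsigned query_epoch) const {
    if (query_epoch > epoch) return "wait-for-map";
    auto it = slots.find(pgid);
    if (it == slots.end()) return "DNE";
    return it->second.waiting_for_split ? "wait-for-split" : "deliver";
  }
};

// (shard index, child pgid)
using ChildPlacement = std::pair<std::size_t, std::string>;

// Prime every split child on its shard first; only then let shards consume e.
void advance_all(std::vector<Shard>& shards, unsigned e,
                 const std::vector<ChildPlacement>& children) {
  for (const auto& c : children) {
    assert(c.first < shards.size());
    shards[c.first].prime_split(c.second);
  }
  for (auto& s : shards) s.consume_map(e);
}
```

In this model a query for child C sent at epoch E either sees "wait-for-map" (the shard has not consumed E yet) or "wait-for-split" (the slot was primed before E was consumed); it cannot observe "DNE" for a primed child.
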
7 years ago osd: improve documentation for event queue ordering and requeueing rules
Sage Weil [Fri, 23 Feb 2018 19:18:53 +0000 (13:18 -0600)]
osd: improve documentation for event queue ordering and requeueing rules

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: flush sequencer/collection on shutdown
Sage Weil [Fri, 23 Feb 2018 15:19:13 +0000 (09:19 -0600)]
osd/PG: flush sequencer/collection on shutdown

This should catch any in-flight work we have.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: move shutdown into PG
Sage Weil [Fri, 23 Feb 2018 15:17:12 +0000 (09:17 -0600)]
osd/PG: move shutdown into PG

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/osd_types: fix pg_t::pool() return type (uint64_t -> int64_t)
Sage Weil [Fri, 23 Feb 2018 14:58:32 +0000 (08:58 -0600)]
osd/osd_types: fix pg_t::pool() return type (uint64_t -> int64_t)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago mon/OSDMonitor: disallow pg_num changes until after pool is created
Sage Weil [Fri, 23 Feb 2018 14:52:42 +0000 (08:52 -0600)]
mon/OSDMonitor: disallow pg_num changes until after pool is created

The pg create handling OSD code does not handle races between a mon create
message and a split message.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: set send_notify on child
Sage Weil [Thu, 22 Feb 2018 15:18:28 +0000 (09:18 -0600)]
osd/PG: set send_notify on child

If we are a non-primary, we need to ensure the split children send
notifies.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill broken _process optimization; simplify null pg flow
Sage Weil [Thu, 22 Feb 2018 03:10:54 +0000 (21:10 -0600)]
osd: kill broken _process optimization; simplify null pg flow

- drop fast queue to waiting list optimization: it breaks ordering and is
  a useless optimization
- restructure so that we don't drop the lock and revalidate the world if
  pg == nullptr

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix fast pg create vs limits
Sage Weil [Wed, 21 Feb 2018 03:23:25 +0000 (21:23 -0600)]
osd: fix fast pg create vs limits

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: (pre)publish map before distributing to shards (and pgs)
Sage Weil [Wed, 21 Feb 2018 03:14:27 +0000 (21:14 -0600)]
osd: (pre)publish map before distributing to shards (and pgs)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: update numpg_* counters when removing a pg
Sage Weil [Tue, 20 Feb 2018 21:49:46 +0000 (15:49 -0600)]
osd: update numpg_* counters when removing a pg

Usually on a pg create we see an OSDMap update; on PG removal completion
we may not.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: decrement deleting pg count in _delete_some
Sage Weil [Tue, 20 Feb 2018 21:49:19 +0000 (15:49 -0600)]
osd: decrement deleting pg count in _delete_some

The exit() method for ToDelete state doesn't run on PG destruction.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: clear shard osdmaps during shutdown
Sage Weil [Tue, 20 Feb 2018 21:33:09 +0000 (15:33 -0600)]
osd: clear shard osdmaps during shutdown

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: make safe osdmap accessor for OSDShard
Sage Weil [Tue, 20 Feb 2018 21:20:39 +0000 (15:20 -0600)]
osd: make safe osdmap accessor for OSDShard

The advance_pg needs to get the shard osdmap without racing against
consume_map().

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: clean up mutex naming for OSDShard
Sage Weil [Tue, 20 Feb 2018 21:20:00 +0000 (15:20 -0600)]
osd: clean up mutex naming for OSDShard

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago common/tracked_int_ptr: fix operator= return value
Sage Weil [Sun, 18 Feb 2018 20:36:28 +0000 (14:36 -0600)]
common/tracked_int_ptr: fix operator= return value

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix pg removal vs _process race
Sage Weil [Sun, 18 Feb 2018 02:27:30 +0000 (20:27 -0600)]
osd: fix pg removal vs _process race

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: lookup_*pg must return PGRef
Sage Weil [Fri, 16 Feb 2018 21:53:43 +0000 (15:53 -0600)]
osd: lookup_*pg must return PGRef

Otherwise it is fundamentally unsafe, as the PG might get destroyed out
from under us without a reference.

Signed-off-by: Sage Weil <sage@redhat.com>
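
To illustrate why the lookup in the "lookup_*pg must return PGRef" message above should hand back a counted reference, here is a minimal sketch. It assumes std::shared_ptr as a stand-in for whatever reference-counted handle PGRef really is; PG, PGRegistry and their methods are hypothetical and do not mirror the OSD's interfaces.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical miniature; std::shared_ptr stands in for a refcounted PGRef.
struct PG {
  int id;
  explicit PG(int i) : id(i) {}
};
using PGRef = std::shared_ptr<PG>;

class PGRegistry {
  std::map<int, PGRef> pgs_;

public:
  void add(int id) { pgs_[id] = std::make_shared<PG>(id); }

  // Returning a counted reference keeps the PG alive for the caller even if
  // it is erased from the registry right after the lookup.
  PGRef lookup(int id) const {
    auto it = pgs_.find(id);
    if (it == pgs_.end()) return PGRef();
    return it->second;
  }

  void remove(int id) { pgs_.erase(id); }
};
```

Had lookup() returned a raw PG*, a concurrent remove() could destroy the object while the caller was still using it; with a counted reference the object lives until the last holder lets go.
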
7 years ago osd: kill pass-through _open_pg
Sage Weil [Fri, 9 Feb 2018 22:15:00 +0000 (16:15 -0600)]
osd: kill pass-through _open_pg

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove old min pg epoch tracking
Sage Weil [Fri, 9 Feb 2018 22:14:42 +0000 (16:14 -0600)]
osd: remove old min pg epoch tracking

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: remove RecoveryCtx on_applied and on_commit
Sage Weil [Fri, 9 Feb 2018 22:05:46 +0000 (16:05 -0600)]
osd/PG: remove RecoveryCtx on_applied and on_commit

These were awkward and unnecessary.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register delete completion directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:07:44 +0000 (16:07 -0600)]
osd/PG: register delete completion directly on Transaction

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: register split completion directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:04:30 +0000 (16:04 -0600)]
osd: register split completion directly on Transaction

No need to use wonky RecoveryCtx C_Contexts

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: drop unused context list accessors for RecoveryCtx
Sage Weil [Fri, 9 Feb 2018 22:00:28 +0000 (16:00 -0600)]
osd/PG: drop unused context list accessors for RecoveryCtx

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register recovery finish context directly on Transaction
Sage Weil [Fri, 9 Feb 2018 22:02:08 +0000 (16:02 -0600)]
osd/PG: register recovery finish context directly on Transaction

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: drop unused activate() context list arg
Sage Weil [Fri, 9 Feb 2018 22:00:10 +0000 (16:00 -0600)]
osd/PG: drop unused activate() context list arg

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: register flush completions directly on the Transaction
Sage Weil [Fri, 9 Feb 2018 21:58:15 +0000 (15:58 -0600)]
osd/PG: register flush completions directly on the Transaction

No need for an awkward list passed as an arg; all of these callbacks end up
on the Transaction anyway.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: wait for pg epochs based on shard tracking
Sage Weil [Fri, 9 Feb 2018 21:37:59 +0000 (15:37 -0600)]
osd: wait for pg epochs based on shard tracking

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: index pg (slots) by map epoch within each shard
Sage Weil [Fri, 9 Feb 2018 21:31:14 +0000 (15:31 -0600)]
osd: index pg (slots) by map epoch within each shard

This will replace the epoch tracking in OSDService shortly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: link back to pg slot
Sage Weil [Fri, 9 Feb 2018 21:04:32 +0000 (15:04 -0600)]
osd/PG: link back to pg slot

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: OSDShard::pg_slot -> OSDShardPGSlot
Sage Weil [Fri, 9 Feb 2018 20:42:25 +0000 (14:42 -0600)]
osd: OSDShard::pg_slot -> OSDShardPGSlot

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: change pg_slots unordered_map to use unique_ptr<>
Sage Weil [Fri, 9 Feb 2018 20:39:34 +0000 (14:39 -0600)]
osd: change pg_slots unordered_map to use unique_ptr<>

This avoids moving slots around in memory in the unordered_map... they can
be big!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove some unused methods
Sage Weil [Fri, 9 Feb 2018 20:22:23 +0000 (14:22 -0600)]
osd: remove some unused methods

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove created_pgs tracking in RecoveryCtx
Sage Weil [Fri, 9 Feb 2018 19:16:19 +0000 (13:16 -0600)]
osd: remove created_pgs tracking in RecoveryCtx

Not needed or used!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fix PG::ch init
Sage Weil [Fri, 9 Feb 2018 19:13:07 +0000 (13:13 -0600)]
osd: fix PG::ch init

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use _attach_pg and _detach_pg helpers; keep PG::osd_shard ptr
Sage Weil [Thu, 8 Feb 2018 22:54:36 +0000 (16:54 -0600)]
osd: use _attach_pg and _detach_pg helpers; keep PG::osd_shard ptr

Consolidate num_pgs updates (and fix a counting bug along the way).

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove old split tracking machinery
Sage Weil [Thu, 8 Feb 2018 22:27:21 +0000 (16:27 -0600)]
osd: remove old split tracking machinery

This infrastructure is no longer used; simpler split tracking now lives in
the shards' pg_slots directly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure consume_map in terms of shards
Sage Weil [Thu, 8 Feb 2018 22:23:04 +0000 (16:23 -0600)]
osd: restructure consume_map in terms of shards

- new split priming machinery
- new primed split cleanup on pg removal
- cover the pg creation path

The old split tracking is now totally unused; will be removed in the next
patch.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: pass sdata into dequeue_peering_evt (and dequeue_delete)
Sage Weil [Thu, 8 Feb 2018 19:45:37 +0000 (13:45 -0600)]
osd: pass sdata into dequeue_peering_evt (and dequeue_delete)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: pass data into OpQueueItem::run()
Sage Weil [Thu, 8 Feb 2018 19:36:41 +0000 (13:36 -0600)]
osd: pass data into OpQueueItem::run()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill pg_map
Sage Weil [Tue, 6 Feb 2018 20:35:17 +0000 (14:35 -0600)]
osd: kill pg_map

Split doesn't work quite right; num_pgs count is probably off.  But, things
mostly work.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: rename OSDShard waiting_for_pg_osdmap -> osdmap
Sage Weil [Tue, 6 Feb 2018 17:27:36 +0000 (11:27 -0600)]
osd: rename OSDShard waiting_for_pg_osdmap -> osdmap

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use _get_pgs() where possible; avoid touching pg_map directly
Sage Weil [Tue, 6 Feb 2018 16:51:42 +0000 (10:51 -0600)]
osd: use _get_pgs() where possible; avoid touching pg_map directly

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: get _get_pgs() and _get_pgids()
Sage Weil [Tue, 6 Feb 2018 16:50:31 +0000 (10:50 -0600)]
osd: get _get_pgs() and _get_pgids()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove get_mapped_pools command
Sage Weil [Tue, 6 Feb 2018 14:48:24 +0000 (08:48 -0600)]
osd: remove get_mapped_pools command

No in-tree users.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move ShardedOpWQ::ShardData -> OSDShard
Sage Weil [Tue, 6 Feb 2018 00:22:40 +0000 (18:22 -0600)]
osd: move ShardedOpWQ::ShardData -> OSDShard

Soon we will destroy pg_map!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill _open_lock_pg
Sage Weil [Mon, 5 Feb 2018 21:54:19 +0000 (15:54 -0600)]
osd: kill _open_lock_pg

Move lock call to caller.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill _create_lock_pg
Sage Weil [Mon, 5 Feb 2018 21:52:56 +0000 (15:52 -0600)]
osd: kill _create_lock_pg

Unused.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: do not release recovery_ops_reserved on requeue
Sage Weil [Fri, 2 Feb 2018 21:50:25 +0000 (15:50 -0600)]
osd: do not release recovery_ops_reserved on requeue

This doesn't make sense, although it's the same behavior as
luminous.

The point of the releases here is that if we drop something that is in
the queue we drop the recovery_ops_reserved counter by that much.  However,
if something is in the queue and waiting, and we wake it back up, there
is no net change to _reserved... which is only decremented when we
actually dequeue something.

Signed-off-by: Sage Weil <sage@redhat.com>
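
The accounting rule in the "do not release recovery_ops_reserved on requeue" message above can be modeled with a small counter. This is only a sketch under assumptions: RecoveryReservations and its method names are hypothetical, and the point at which the real counter is incremented is assumed here to be enqueue time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter model, not the OSD's real bookkeeping.
struct RecoveryReservations {
  std::uint64_t reserved = 0;

  // Assumption: a queued recovery item reserves `pushes` when it is enqueued.
  void enqueue(std::uint64_t pushes) { reserved += pushes; }

  // Item is discarded while still queued: give back what it reserved.
  void drop_queued(std::uint64_t pushes) {
    assert(reserved >= pushes);
    reserved -= pushes;
  }

  // Item moves from a waiting list back onto the queue: no net change.
  void requeue(std::uint64_t /*pushes*/) {}

  // Item is actually dequeued for processing: release its share.
  void dequeue(std::uint64_t pushes) {
    assert(reserved >= pushes);
    reserved -= pushes;
  }
};
```

Releasing in requeue() as well would subtract the same reservation twice once the item is eventually dequeued, which is the double counting the commit message argues against.
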
7 years ago osd: debug recovery_ops_reserved
Sage Weil [Fri, 2 Feb 2018 21:26:52 +0000 (15:26 -0600)]
osd: debug recovery_ops_reserved

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move PG peering waiters into op wq
Sage Weil [Fri, 2 Feb 2018 16:11:49 +0000 (10:11 -0600)]
osd: move PG peering waiters into op wq

This resolves problems with a peering event being delivered triggering
advance_pg which triggers a requeue of waiting events that are requeued
*behind* the event we are processing.  It also reduces the number of
wait lists by one, yay!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: store ec profile with final pool
Sage Weil [Thu, 1 Feb 2018 19:58:15 +0000 (13:58 -0600)]
osd: store ec profile with final pool

We need this to reinstantiate semi-deleted ec backends.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: ignore RecoveryDone in ReplicaActive too
Sage Weil [Thu, 1 Feb 2018 18:59:29 +0000 (12:59 -0600)]
osd/PG: ignore RecoveryDone in ReplicaActive too

This can be missed on a RepRecovering -> RepNotRecovering ->
RepWaitBackfillReserved transition.  Catch any straggler events in
ReplicaActive.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/osd_types: include epoch_sent in pg_query_t operator<<
Sage Weil [Tue, 30 Jan 2018 14:56:46 +0000 (08:56 -0600)]
osd/osd_types: include epoch_sent in pg_query_t operator<<

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure pg waiting more
Sage Weil [Fri, 2 Feb 2018 16:04:44 +0000 (10:04 -0600)]
osd: restructure pg waiting more

Wait by epoch.  This is less kludgey than before!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: restructure pg waiting
Sage Weil [Fri, 19 Jan 2018 19:23:01 +0000 (13:23 -0600)]
osd: restructure pg waiting

Rethink the way we wait for PGs.  We need to order peering events relative to
each other; keep them in a separate queue in the pg_slot.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: normal command uses slow dispatch (it can send messages)
Sage Weil [Fri, 19 Jan 2018 14:51:07 +0000 (08:51 -0600)]
osd: normal command uses slow dispatch (it can send messages)

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/OSD,PG: get_osdmap()->get_epoch() -> get_osdmap_epoch()
Sage Weil [Thu, 18 Jan 2018 22:31:46 +0000 (16:31 -0600)]
osd/OSD,PG: get_osdmap()->get_epoch() -> get_osdmap_epoch()

Avoid wrangling shared_ptr!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: misc fixes
Sage Weil [Wed, 17 Jan 2018 16:23:15 +0000 (10:23 -0600)]
osd: misc fixes

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: kill disk_tp, recovery_gen_wq
Sage Weil [Tue, 16 Jan 2018 22:48:37 +0000 (16:48 -0600)]
osd: kill disk_tp, recovery_gen_wq

Progress!

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: move recovery contexts to normal wq
Sage Weil [Tue, 16 Jan 2018 22:42:28 +0000 (16:42 -0600)]
osd: move recovery contexts to normal wq

We have a specific PGRecoveryContext type/event--even though we are just
calling a GenContext--so that we can distinguish the event type properly.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: remove _lookup_lock_pg_with_map_lock_held()
Sage Weil [Thu, 4 Jan 2018 18:53:39 +0000 (12:53 -0600)]
osd: remove _lookup_lock_pg_with_map_lock_held()

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: new MOSDScrub2 message with spg_t, fast dispatch
Sage Weil [Thu, 4 Jan 2018 18:10:41 +0000 (12:10 -0600)]
osd: new MOSDScrub2 message with spg_t, fast dispatch

Send new message to mimic+ OSDs.  Fast dispatch it at the OSD.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/PG: request scrub via a state machine event
Sage Weil [Thu, 4 Jan 2018 18:09:56 +0000 (12:09 -0600)]
osd/PG: request scrub via a state machine event

Continuing effort to make PG interactions event based.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: use peering events for forced recovery
Sage Weil [Thu, 18 Jan 2018 22:09:52 +0000 (16:09 -0600)]
osd: use peering events for forced recovery

The mgr code is updated to send spg_t's instead of pg_t's (and is slightly
refactored/cleaned).

The PG events are added to the Primary state, unless we're also in the
Clean substate, in which case they are ignored.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd/OSDMap: get_primary_shard() variant that returns primary *and* shard
Sage Weil [Thu, 4 Jan 2018 16:48:41 +0000 (10:48 -0600)]
osd/OSDMap: get_primary_shard() variant that returns primary *and* shard

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: prime pg_slots for to-be-split children
Sage Weil [Wed, 3 Jan 2018 17:48:37 +0000 (11:48 -0600)]
osd: prime pg_slots for to-be-split children

Once we know which PGs are about to be created, we instantiate their
pg_slot and mark them waiting_pg, which blocks all incoming events until
the split completes, the PG is installed, and we call wake_pg_waiters().

Signed-off-by: Sage Weil <sage@redhat.com>
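
The blocking behavior described in the "prime pg_slots for to-be-split children" message above might look roughly like this in miniature. PrimedSlot and its members are hypothetical; only the waiting_pg flag and the idea of waking waiters once the PG is installed are taken from the commit message, and events are modeled as plain strings.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical slot model; events are plain strings for illustration.
struct PrimedSlot {
  bool waiting_pg = false;             // set when primed for a split child
  bool has_pg = false;                 // true once the child PG is installed
  std::deque<std::string> waiting;     // events parked while waiting_pg is set
  std::vector<std::string> delivered;  // events handed to the PG, in order

  void prime() { waiting_pg = true; }

  void enqueue(const std::string& ev) {
    if (waiting_pg) {
      waiting.push_back(ev);
    } else {
      delivered.push_back(ev);
    }
  }

  // Split finished: install the PG, then release parked events in FIFO order.
  void install_pg_and_wake() {
    has_pg = true;
    waiting_pg = false;
    while (!waiting.empty()) {
      delivered.push_back(waiting.front());
      waiting.pop_front();
    }
  }
};
```

Events that arrive between prime() and install_pg_and_wake() are held back and then delivered in arrival order, so nothing reaches the child PG before it exists.
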
7 years ago osd: remove obsolete slow dispatch path for most messages
Sage Weil [Wed, 3 Jan 2018 03:39:03 +0000 (21:39 -0600)]
osd: remove obsolete slow dispatch path for most messages

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch M[Mon]Command
Sage Weil [Wed, 3 Jan 2018 14:52:16 +0000 (08:52 -0600)]
osd: fast dispatch M[Mon]Command

These just get dumped onto a work queue.

Signed-off-by: Sage Weil <sage@redhat.com>
7 years ago osd: fast dispatch ping
Sage Weil [Wed, 3 Jan 2018 03:37:30 +0000 (21:37 -0600)]
osd: fast dispatch ping

Signed-off-by: Sage Weil <sage@redhat.com>