
author	Kefu Chai <kchai@redhat.com>
	Mon, 9 May 2016 06:15:36 +0000 (14:15 +0800)
committer	Kefu Chai <tchaikov@gmail.com>
	Mon, 9 May 2016 12:07:51 +0000 (20:07 +0800)
commit	369db9930887d75b498927da9c97733bff4472b6
tree	d4f48b3b071b88a8fa950c112426203c2ba19f47
parent	646a6a28c31c379ecd7a37eaa0e0641d1a93c728

osd: remove all stale osdmaps in handle_osd_map()

in a large cluster, the OSD is more likely to fail to trim the cached
osdmaps in a timely manner. and sometimes it is simply unable to keep up
with the incoming osdmaps when skip_maps is true, so the osdmap cache
can keep growing to over 250GB in size. this change makes the OSD:

* publish_superblock() before trimming the osdmaps, so other osdmap
  consumers of OSDService.superblock won't access the osdmaps being
  removed.
* trim all stale osdmaps in batches of conf->osd_target_transaction_size
  if skip_maps is true. in my test, this happens when the osd only
  receives osdmaps from the monitor occasionally, because the osd happens
  to be chosen when the monitor wants to share a new osdmap with a random
  osd.
* always use dedicated transaction(s) for trimming osdmaps. so even in
  the normal case where we are able to trim all stale osdmaps in a
  single batch, a separate transaction is used. we could piggyback
  the commits for removing maps, but we keep it this way for simplicity.
* use std::min() instead of MIN() for type safety
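
The points above can be illustrated with a minimal, self-contained sketch. It is not the code from src/osd/OSD.cc: `OsdSketch`, `Transaction`, `Superblock`, `commit()` and the parameters of `trim_maps()` are hypothetical stand-ins, the object store is modelled as a plain std::map, and `batch_size` plays the role of conf->osd_target_transaction_size. It only shows the shape of the idea: publish the superblock before removing maps, use a dedicated transaction per batch, and keep trimming across batches only when skip_maps is true.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for an object-store transaction: it only records
// which osdmap epochs it would remove.
struct Transaction {
  std::vector<epoch_t> removed;
  void remove_map(epoch_t e) { removed.push_back(e); }
};

struct Superblock {
  epoch_t oldest_map = 0;
};

// Hypothetical, heavily simplified model of the OSD state involved.
struct OsdSketch {
  Superblock superblock;                 // the OSD's working copy
  Superblock published;                  // the copy other consumers read
  std::map<epoch_t, std::string> store;  // stored osdmaps, keyed by epoch
  std::vector<Transaction> committed;    // one entry per dedicated transaction

  // Publish the superblock first, then apply the removals, so readers of
  // the published copy never look up an epoch that is being removed.
  void commit(Transaction& t) {
    published = superblock;
    for (epoch_t e : t.removed) store.erase(e);
    committed.push_back(std::move(t));
    t = Transaction{};
  }

  // Trim every epoch below min(oldest_wanted, cache_lower_bound).
  void trim_maps(epoch_t oldest_wanted, epoch_t cache_lower_bound,
                 std::size_t batch_size, bool skip_maps) {
    assert(batch_size > 0);
    // std::min requires both arguments to have the same type, unlike a
    // MIN() macro, which would silently compare mixed types.
    const epoch_t bound = std::min(oldest_wanted, cache_lower_bound);
    Transaction t;
    for (epoch_t e = superblock.oldest_map; e < bound; ++e) {
      t.remove_map(e);
      superblock.oldest_map = e + 1;
      if (t.removed.size() >= batch_size) {
        commit(t);
        if (!skip_maps) return;  // normal case: one batch per call
      }
    }
    if (!t.removed.empty()) commit(t);  // flush the final partial batch
  }
};
```

In this sketch, a call with skip_maps set to true drains the whole stale range in several small transactions, while a call with skip_maps set to false stops after the first full batch and leaves the rest for later calls.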

Fixes: http://tracker.ceph.com/issues/13990
Signed-off-by: Kefu Chai <kchai@redhat.com>

src/osd/OSD.cc
src/osd/OSD.h