git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years agoclient: check cap ID when handling cap export message 1714/head
Yan, Zheng [Mon, 21 Apr 2014 08:26:33 +0000 (16:26 +0800)]
client: check cap ID when handling cap export message

Handle the following sequence of events:
- mds0 exports an inode to mds1. client receives the cap import
  message from mds1. caps from mds0 are removed while handling
  the cap import message.
- mds1 exports an inode to mds0. client receives the cap export
  message from mds1. handle_cap_export() adds placeholder caps
  for mds0
- client receives the first cap export message (for exporting
  inode from mds0 to mds1)

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
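
The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual client code: the `Cap` and `Inode` structs and `handle_cap_export_sketch()` are hypothetical stand-ins, and the only idea taken from the message is that an export message is ignored unless its cap ID matches the cap currently held for that MDS.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-ins for the client's per-inode cap state.
struct Cap {
  uint64_t cap_id;  // identifier assigned by the issuing MDS
};

struct Inode {
  std::map<int, Cap> caps;  // keyed by MDS rank
};

// Returns true if the export was applied, false if it was ignored as stale.
bool handle_cap_export_sketch(Inode& in, int from_mds, uint64_t msg_cap_id) {
  assert(from_mds >= 0);
  auto it = in.caps.find(from_mds);
  if (it == in.caps.end() || it->second.cap_id != msg_cap_id)
    return false;  // late message: the cap it refers to is no longer held
  in.caps.erase(it);
  return true;
}
```

With this shape, a delayed export message from mds0 cannot remove a placeholder cap that was added for mds0 later under a different cap ID.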
11 years agoclient: avoid releasing caps that are being used
Yan, Zheng [Tue, 22 Apr 2014 02:26:50 +0000 (10:26 +0800)]
client: avoid releasing caps that are being used

To avoid releasing caps that are being used, encode_inode_release()
should send implemented caps to MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoMerge pull request #1713 from ceph/wip-7439
Samuel Just [Wed, 23 Apr 2014 00:36:20 +0000 (17:36 -0700)]
Merge pull request #1713 from ceph/wip-7439

Wip 7439

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoReplicatedPG: handle ec pools in mark_all_unfound_lost 1713/head
Samuel Just [Tue, 22 Apr 2014 21:56:08 +0000 (14:56 -0700)]
ReplicatedPG: handle ec pools in mark_all_unfound_lost

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG: enable mark_unfound_lost delete for ec pools
Samuel Just [Tue, 22 Apr 2014 19:45:28 +0000 (12:45 -0700)]
ReplicatedPG: enable mark_unfound_lost delete for ec pools

revert is tricky to implement at this time for ec pools, so
we'll instead just implement delete for ec pools.

Fixes: #7439
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoqa/workunits/rbd/copy.sh: skip some tests when tiering is enabled
Sage Weil [Tue, 22 Apr 2014 16:42:16 +0000 (09:42 -0700)]
qa/workunits/rbd/copy.sh: skip some tests when tiering is enabled

The rados ls bit doesn't work.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoqa/workunits/rbd/copy.sh: fix test
Sage Weil [Tue, 22 Apr 2014 16:37:32 +0000 (09:37 -0700)]
qa/workunits/rbd/copy.sh: fix test

I broke this in commit 9d64ac66082bd108ec3c2a74e2e77475b5564eae.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge pull request #1691 from ceph/wip-8139
Sage Weil [Tue, 22 Apr 2014 19:40:02 +0000 (12:40 -0700)]
Merge pull request #1691 from ceph/wip-8139

osd_types: pg_t: allow is_split to handle checks for splits prior to the most recent

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoECBackend: use std::swap for boost::optional
Samuel Just [Tue, 22 Apr 2014 17:21:55 +0000 (10:21 -0700)]
ECBackend: use std::swap for boost::optional

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1710 from ceph/wip-coverity
Yehuda Sadeh [Tue, 22 Apr 2014 16:02:33 +0000 (09:02 -0700)]
Merge pull request #1710 from ceph/wip-coverity

a couple coverity fixes

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoMerge pull request #1711 from ceph/wip-coverity-respawn
Sage Weil [Tue, 22 Apr 2014 15:37:21 +0000 (08:37 -0700)]
Merge pull request #1711 from ceph/wip-coverity-respawn

mds: make strncpy in ::respawn safer

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomds: make strncpy in ::respawn safer 1711/head
John Spray [Tue, 22 Apr 2014 15:31:27 +0000 (16:31 +0100)]
mds: make strncpy in ::respawn safer

Previous code assumed null terminated argv[0]
was not longer than PATH_MAX and the resulting
strncpy was not strictly safe.

Modify the bounds to ensure that copy will not
result in an unterminated string if argv[0]
is oversized.

Signed-off-by: John Spray <john.spray@inktank.com>
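
A bounded copy of the kind described here might look like the sketch below. The helper name `copy_bounded()` is an assumption for illustration and is not taken from the Ceph source; the point is only that the copy length is capped at one less than the buffer size and the terminator is written explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy src into dst (capacity cap bytes), always leaving dst NUL-terminated.
// Returns true if src fit without truncation.
bool copy_bounded(char* dst, std::size_t cap, const char* src) {
  assert(dst != nullptr && src != nullptr && cap > 0);
  std::strncpy(dst, src, cap - 1);  // leave room for the terminator
  dst[cap - 1] = '\0';              // strncpy does not terminate on overflow
  return std::strlen(src) < cap;
}
```

The return value lets a caller detect truncation instead of silently continuing with a shortened path.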
11 years agoosd/osd_types: RWState: initialize snaptrimmer_write_marker 1710/head
Sage Weil [Tue, 22 Apr 2014 15:29:58 +0000 (08:29 -0700)]
osd/osd_types: RWState: initialize snaptrimmer_write_marker

** CID 1204295:  Uninitialized scalar field  (UNINIT_CTOR)
/osd/osd_types.h: 2716 in ObjectContext::RWState::RWState()()

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosdc/Objecter: drop unused field
Sage Weil [Tue, 22 Apr 2014 15:28:52 +0000 (08:28 -0700)]
osdc/Objecter: drop unused field

This was missed by 860d72770cdf092c027d50f4ee03bed76c975599.

** CID 1204296:  Uninitialized scalar field  (UNINIT_CTOR)
/osdc/Objecter.h: 1165 in Objecter::Op::Op(const object_t &, const
object_locator_t &, std::vector<OSDOp, std::allocator<OSDOp>> &, int, Context *,
Context *, unsigned long *)()

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: a bit of prose about firefly
Sage Weil [Tue, 22 Apr 2014 01:33:00 +0000 (18:33 -0700)]
doc/release-notes: a bit of prose about firefly

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: pg_interval_t: include primaries in operator<< 1691/head
Sage Weil [Sun, 20 Apr 2014 05:08:41 +0000 (22:08 -0700)]
osd/osd_types: pg_interval_t: include primaries in operator<<

Also make up vs acting explicit.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: pg_interval_t: include up_primary in pg_interval_t
Sage Weil [Sun, 20 Apr 2014 05:06:48 +0000 (22:06 -0700)]
osd/osd_types: pg_interval_t: include up_primary in pg_interval_t

Nothing uses this, but it triggers a new interval, which makes it confusing
when it is not recorded in the interval itself.  Let's add it now.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd/osd_types: pg_interval_t: dump primary
Sage Weil [Sun, 20 Apr 2014 05:05:27 +0000 (22:05 -0700)]
osd/osd_types: pg_interval_t: dump primary

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: change in up set primary constitutes a peering interval change
Sage Weil [Sun, 20 Apr 2014 05:04:33 +0000 (22:04 -0700)]
osd: change in up set primary constitutes a peering interval change

In several places, a change in the up_primary triggers a new peering
interval, but the places that actually generate the new past intervals,
including check_new_interval(), did not enforce that.  This becomes
somewhat obvious when you see that those callers are ignoring the
up_primary output argument for pg_to_up_acting_osds().

Fix this by adding arguments to check_new_interval and fixing the callers
to pass them in properly.  Add a unit test case to verify this.

Note that the past interval struct itself does not record who the
up_primary was; possibly it should.

Fixes: #8139
Signed-off-by: Sage Weil <sage@inktank.com>
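
The rule this commit enforces can be illustrated with a small sketch. The `Mapping` struct and `is_new_interval_sketch()` are hypothetical simplifications; the real check_new_interval() takes more arguments and also records the past interval. The sketch only shows that the primaries are compared alongside the up and acting sets.

```cpp
#include <vector>

// Hypothetical snapshot of a PG mapping at one OSDMap epoch.
struct Mapping {
  std::vector<int> up, acting;
  int up_primary = -1;
  int acting_primary = -1;
};

// True if the change from prev to cur should start a new peering interval.
bool is_new_interval_sketch(const Mapping& prev, const Mapping& cur) {
  return prev.up != cur.up ||
         prev.acting != cur.acting ||
         prev.up_primary != cur.up_primary ||  // the case this fix covers
         prev.acting_primary != cur.acting_primary;
}
```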
11 years agoosd: use parent pgid (as appropriate) in generate_past_intervals()
Sage Weil [Fri, 18 Apr 2014 22:48:33 +0000 (15:48 -0700)]
osd: use parent pgid (as appropriate) in generate_past_intervals()

Feed in the ancestor pg_t (if any) when we are looking at intervals for
previous maps that may have preceded a recent split.

Fixes: #8139
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1651 from enovance/wip-brag
Sage Weil [Tue, 22 Apr 2014 03:49:43 +0000 (20:49 -0700)]
Merge pull request #1651 from enovance/wip-brag

Few bug fixes in ceph-brag

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agodoc/release-notes: v0.80
Sage Weil [Tue, 22 Apr 2014 01:20:56 +0000 (18:20 -0700)]
doc/release-notes: v0.80

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1707 from ceph/wip-rbd-test
Josh Durgin [Mon, 21 Apr 2014 23:53:35 +0000 (16:53 -0700)]
Merge pull request #1707 from ceph/wip-rbd-test

rbd: fix tests for cache pools

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoqa/workunit/rbd/import_export.sh: skip list-objects tests with tiering 1707/head
Sage Weil [Mon, 21 Apr 2014 23:47:10 +0000 (16:47 -0700)]
qa/workunit/rbd/import_export.sh: skip list-objects tests with tiering

Listing objects isn't reliable with cache pools; skip that part of the
test if we see that rbd has tiering enabled.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoqa/workunit/rbd/copy.sh: do not delete/recreate rbd pool
Sage Weil [Mon, 21 Apr 2014 23:26:23 +0000 (16:26 -0700)]
qa/workunit/rbd/copy.sh: do not delete/recreate rbd pool

Among other things, it breaks when tiering is enabled.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agodoc: Fixed syntax to include 'pool'.
John Wilkins [Mon, 21 Apr 2014 22:43:23 +0000 (15:43 -0700)]
doc: Fixed syntax to include 'pool'.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years agoPG::PriorSet: consider lost osds in up_now for pcontdec
Samuel Just [Sun, 20 Apr 2014 23:45:12 +0000 (16:45 -0700)]
PG::PriorSet: consider lost osds in up_now for pcontdec

Otherwise, the pg will remain down even as osds are marked lost.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1703 from ceph/wip-7942
Samuel Just [Mon, 21 Apr 2014 22:13:22 +0000 (15:13 -0700)]
Merge pull request #1703 from ceph/wip-7942

Wip 7942

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoReplicatedPG::do_op: check for blocked snapset obj 1703/head
Samuel Just [Wed, 16 Apr 2014 17:37:01 +0000 (10:37 -0700)]
ReplicatedPG::do_op: check for blocked snapset obj

Otherwise, we might use an invalid snapset in find_object_context.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG: in trim, grab w locks on obc and snapset_obc
Samuel Just [Mon, 14 Apr 2014 23:20:39 +0000 (16:20 -0700)]
ReplicatedPG: in trim, grab w locks on obc and snapset_obc

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG: if we get ENOENT on clone, remove clone from snapset
Samuel Just [Fri, 18 Apr 2014 23:51:34 +0000 (16:51 -0700)]
ReplicatedPG: if we get ENOENT on clone, remove clone from snapset

Fixes: #7916
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG: do not create whiteout clones
Samuel Just [Tue, 8 Apr 2014 21:03:59 +0000 (14:03 -0700)]
ReplicatedPG: do not create whiteout clones

First, make_writeable treats whiteout heads like snapdir for
cloning purposes.  Second, to ensure that we send the correct
deletes on flush to the backing pool, we instead use oi.snaps
on any clone we are flushing to infer the snaps during which
head did not exist and send a delete as appropriate prior to
the copy_from.

Normally, we'd have a problem if the delete and the copy_from
completed, but an interval change intervened before the dirty
flag was cleared since we'd end up re-deleting the object.
To avoid that, we use the CEPH_OSD_FLAG_ORDERSNAP flag.

Additionally, we will use the correct snap_seq on the delete
or flush as appropriate to ensure that the previous clone
gets created with the same clone id as in the cache pool.

Fixes: #7942
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG,rados: add CEPH_OSD_[COPY_FROM]_MAP_SNAP_TO_CLONE
Samuel Just [Tue, 8 Apr 2014 21:27:33 +0000 (14:27 -0700)]
ReplicatedPG,rados: add CEPH_OSD_[COPY_FROM]_MAP_SNAP_TO_CLONE

When promoting a clone, we want to use the provided snapid to specify
the clone id directly.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1705 from ceph/wip-8124
Sage Weil [Mon, 21 Apr 2014 21:28:43 +0000 (14:28 -0700)]
Merge pull request #1705 from ceph/wip-8124

Wip 8124

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoqa/workunits/cephtool/test.sh: make set pg_num test non-racy
Sage Weil [Mon, 21 Apr 2014 21:18:21 +0000 (14:18 -0700)]
qa/workunits/cephtool/test.sh: make set pg_num test non-racy

Loop while the pool is creating.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoReplicatedPG: do not use shard for hit_set object names 1705/head
Samuel Just [Sun, 20 Apr 2014 20:16:36 +0000 (13:16 -0700)]
ReplicatedPG: do not use shard for hit_set object names

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::agent_load_hit_sets: take ondisk_read_lock
Samuel Just [Sat, 19 Apr 2014 01:11:55 +0000 (18:11 -0700)]
ReplicatedPG::agent_load_hit_sets: take ondisk_read_lock

Otherwise, the hit_set might not yet be written due to a recently
completed recovery.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoPG,PGLog: update hit_set during peering
Samuel Just [Fri, 18 Apr 2014 22:14:33 +0000 (15:14 -0700)]
PG,PGLog: update hit_set during peering

Fixes: #8124
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoosd/: propagate hit_set history with repop
Samuel Just [Thu, 17 Apr 2014 19:27:07 +0000 (12:27 -0700)]
osd/: propagate hit_set history with repop

We don't actually send the whole info on each repop, just the log
entries, updated stats, and a few other bits.  For hit_set ops, we need
to also communicate the new hit_set history status atomically with the
log entries and the transaction.  Thus, we add a channel for an optional
pg_hit_set_history_t field in PGBackend::submit_transaction interface
and associated messages and implementations to update the hit_set info
field along with the log entries.

This also means that hit_set_(persist|trim) update an
updated_hit_set_history field on the OpContext instead of directly
modifying the info field.

Fixes: #8124
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoencoding: use unqualified name for encode/decode in boost::optional encoding
Samuel Just [Thu, 17 Apr 2014 19:23:21 +0000 (12:23 -0700)]
encoding: use unqualified name for encode/decode in boost::optional encoding

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoECMsgTypes::ECSubWrite: fix at_version indentation
Samuel Just [Thu, 17 Apr 2014 18:03:19 +0000 (11:03 -0700)]
ECMsgTypes::ECSubWrite: fix at_version indentation

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoosd: track the number of hit_set archive objects in a pg
Samuel Just [Mon, 21 Apr 2014 17:52:58 +0000 (10:52 -0700)]
osd: track the number of hit_set archive objects in a pg

Also, use this value in agent_choose_mode instead of the max
number.

Related: #8124
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::hit_set_persist: clean up degraded check
Samuel Just [Thu, 17 Apr 2014 01:02:27 +0000 (18:02 -0700)]
ReplicatedPG::hit_set_persist: clean up degraded check

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::mark_all_unfound_lost: delete local copy if necessary
Samuel Just [Thu, 20 Mar 2014 22:42:41 +0000 (15:42 -0700)]
ReplicatedPG::mark_all_unfound_lost: delete local copy if necessary

There might be a local copy for an EC pool in the DELETE case.  The replica
copies should be already handled by merge_log.

Fixes: #7439
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agobuffer: adjust #include order
Sage Weil [Sat, 19 Apr 2014 00:33:52 +0000 (17:33 -0700)]
buffer: adjust #include order

The pthread.h include is somehow clobbering things, although it is not
clear how.  :(

Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1696 from ceph/wip-8097
Samuel Just [Fri, 18 Apr 2014 22:12:09 +0000 (15:12 -0700)]
Merge pull request #1696 from ceph/wip-8097

buffer: use Mutex instead of Spinlock for raw crcs

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1697 from ceph/wip-num_objects_omap
Sage Weil [Fri, 18 Apr 2014 21:24:06 +0000 (14:24 -0700)]
Merge pull request #1697 from ceph/wip-num_objects_omap

osd_types::object_stat_sum_t: fix add/sub for num_objects_omap

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1695 from ceph/wip-8153
Sage Weil [Fri, 18 Apr 2014 21:09:37 +0000 (14:09 -0700)]
Merge pull request #1695 from ceph/wip-8153

Revert "ReplicatedPG::get_snapset_context: assert snap obj is not missin...

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoRevert "ReplicatedPG::get_snapset_context: assert snap obj is not missing" 1695/head
Samuel Just [Fri, 18 Apr 2014 20:59:22 +0000 (13:59 -0700)]
Revert "ReplicatedPG::get_snapset_context: assert snap obj is not missing"

This breaks mark_lost_unfound_revert.

This reverts commit 0d2177a18071ad9c9581826a43751c36bab5b2db.

11 years agoMerge pull request #1693 from ceph/wip-7997
Sage Weil [Fri, 18 Apr 2014 20:54:30 +0000 (13:54 -0700)]
Merge pull request #1693 from ceph/wip-7997

mon: fix get_version race (more)

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agotest: handle the create-pg delay when testing cache split syntax
Greg Farnum [Fri, 18 Apr 2014 18:01:40 +0000 (11:01 -0700)]
test: handle the create-pg delay when testing cache split syntax

Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1692 from ceph/wip-7784
Sage Weil [Fri, 18 Apr 2014 18:47:59 +0000 (11:47 -0700)]
Merge pull request #1692 from ceph/wip-7784

mon: OSDMonitor: HEALTH_WARN on 'mon osd down out interval == 0'

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: OSDMonitor: HEALTH_WARN on 'mon osd down out interval == 0' 1692/head
Joao Eduardo Luis [Fri, 18 Apr 2014 18:15:52 +0000 (19:15 +0100)]
mon: OSDMonitor: HEALTH_WARN on 'mon osd down out interval == 0'

A 'status' or 'health' request will return a HEALTH_WARN whenever the
monitor handling the request has the option set to zero.

Fixes: 7784
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomon: wait for PaxosService readable in handle_get_version 1693/head
Sage Weil [Fri, 18 Apr 2014 18:12:23 +0000 (11:12 -0700)]
mon: wait for PaxosService readable in handle_get_version

We were waiting for the election to finish, but we need to *also* wait for
paxos to recover.  Being a peon or leader is not sufficient and we may
return a map that is still old.

Fixes: #7997
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1676 from ceph/wip-8092
Sage Weil [Fri, 18 Apr 2014 04:21:59 +0000 (21:21 -0700)]
Merge pull request #1676 from ceph/wip-8092

Wip 8092

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1678 from ceph/wip-8108
Sage Weil [Fri, 18 Apr 2014 04:19:33 +0000 (21:19 -0700)]
Merge pull request #1678 from ceph/wip-8108

osd: OSDMap: have osdmap json dump print valid boolean instead of string

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoosd_types: pg_t: add get_ancestor() method
Sage Weil [Fri, 18 Apr 2014 04:05:49 +0000 (21:05 -0700)]
osd_types: pg_t: add get_ancestor() method

Give us the ancestor for when the pool had a past value for pg_num.

Signed-off-by: Sage Weil <sage@inktank.com>
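
One way to compute such an ancestor is sketched below, assuming PG seeds are folded with a "stable modulo" against the old pg_num. The function names and the exact mask derivation are assumptions for illustration and may differ from the actual pg_t::get_ancestor() implementation.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent v (0 for v == 0).
static unsigned bits_needed(uint32_t v) {
  unsigned n = 0;
  while (v) { ++n; v >>= 1; }
  return n;
}

// Stable modulo: maps x into [0, b) so that growing b only moves
// values into newly created slots.  mask must be 2^k - 1 with b <= mask + 1.
static uint32_t stable_mod(uint32_t x, uint32_t b, uint32_t mask) {
  return (x & mask) < b ? (x & mask) : (x & (mask >> 1));
}

// Seed of the PG that contained `seed` when the pool had old_pg_num PGs.
uint32_t ancestor_seed(uint32_t seed, uint32_t old_pg_num) {
  assert(old_pg_num > 0);
  uint32_t mask = (1u << bits_needed(old_pg_num - 1)) - 1;
  return stable_mod(seed, old_pg_num, mask);
}
```

For example, under these assumptions a PG with seed 13 maps back to seed 5 for a pool that previously had 8 PGs.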
11 years agoMerge pull request #1683 from ceph/wip-mds-op-prio
Gregory Farnum [Fri, 18 Apr 2014 00:53:40 +0000 (17:53 -0700)]
Merge pull request #1683 from ceph/wip-mds-op-prio

mds: dynamically adjust priority of committing dirfrags

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agoMerge pull request #1689 from ceph/wip-8091
Sage Weil [Thu, 17 Apr 2014 21:51:18 +0000 (14:51 -0700)]
Merge pull request #1689 from ceph/wip-8091

Wip 8091

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoReplicatedPG::recover_replicas: do not recover clones while snap obj is missing 1689/head
Samuel Just [Tue, 15 Apr 2014 21:17:33 +0000 (14:17 -0700)]
ReplicatedPG::recover_replicas: do not recover clones while snap obj is missing

Otherwise, we cannot safely read the snapset for the clone.

Fixes: #8091
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoosd_types::object_stat_sum_t: fix add/sub for num_objects_omap 1697/head
Samuel Just [Thu, 17 Apr 2014 20:30:30 +0000 (13:30 -0700)]
osd_types::object_stat_sum_t: fix add/sub for num_objects_omap

Introduced in a130a4452e4fb159dc62fb417077d98dc9ebd621
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1688 from ceph/wip-8048
Samuel Just [Thu, 17 Apr 2014 20:18:21 +0000 (13:18 -0700)]
Merge pull request #1688 from ceph/wip-8048

osd/ReplicatedPG: check clones for degraded

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1685 from ceph/wip-8132
Sage Weil [Thu, 17 Apr 2014 20:18:01 +0000 (13:18 -0700)]
Merge pull request #1685 from ceph/wip-8132

mon: set leader commands prior to first election

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agoosd/ReplicatedPG: check clones for degraded 1688/head
Sage Weil [Thu, 17 Apr 2014 20:11:54 +0000 (13:11 -0700)]
osd/ReplicatedPG: check clones for degraded

We check whether the head is degraded, and we check whether a clone is
unreadable, but in the case where we have a cache op on a degraded object,
we don't check.  That leads to an assert when the repop hits the replica
and the object is in the peer's missing set.

Fix this by adding a check on the clone when write_ordered is true.  Note
that checking write_ordered is better than whether it is a cache op because
we want to preserve write ordering even for reads that are flagged by the
client.

Fixes: #8048
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1674 from ceph/wip-8086
Sage Weil [Thu, 17 Apr 2014 19:49:58 +0000 (12:49 -0700)]
Merge pull request #1674 from ceph/wip-8086

ReplicatedPG::agent_work: skip hitset objects before getting object cont...

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1687 from ceph/wip-8130
Yehuda Sadeh [Thu, 17 Apr 2014 17:50:40 +0000 (10:50 -0700)]
Merge pull request #1687 from ceph/wip-8130

osdc/Objecter: fix osd target for newly-homeless op

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years agoosdc/Objecter: fix osd target for newly-homeless op 1687/head
Sage Weil [Thu, 17 Apr 2014 17:48:26 +0000 (10:48 -0700)]
osdc/Objecter: fix osd target for newly-homeless op

If we recalculate the mapping and find that there is no primary, we need
to set the 'osd' field to -1.  Otherwise, the caller will try to resend
to a dead session with bad results.

This was introduced in the refactor 860d72770c.

Fixes: #8130
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1684 from onlyjob/debian
Sage Weil [Thu, 17 Apr 2014 17:07:40 +0000 (10:07 -0700)]
Merge pull request #1684 from onlyjob/debian

spelling corrections

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1671 from ceph/wip-7699
Sage Weil [Thu, 17 Apr 2014 17:05:22 +0000 (10:05 -0700)]
Merge pull request #1671 from ceph/wip-7699

mds: Fix respawn (add path resolution)

Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoMerge pull request #1677 from ceph/wip-poolset-noblock
Sage Weil [Thu, 17 Apr 2014 17:03:26 +0000 (10:03 -0700)]
Merge pull request #1677 from ceph/wip-poolset-noblock

mon: Don't block on EAGAIN from `osd pool set`

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agomon: set leader commands prior to first election 1685/head
Sage Weil [Thu, 17 Apr 2014 16:33:44 +0000 (09:33 -0700)]
mon: set leader commands prior to first election

If we have just started and receive a command, we currently will reply with
EINVAL because the leader commands are empty.  Note that this race is very
difficult to reach because the (old) peon needs to forward a command to
the mon while it still thinks it has quorum, and the message needs to get
sent after the leader mon has restarted and reset its connection but before
it has declared a new election.

To fix this, we should assume at startup time that our commands are
valid.  If it is an internal command that does not require quorum, that
is fine.  If it does require quorum, we will retry the command after the
election completes and we will revalidate the command then.

Fixes: #8132
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agomon: EBUSY instead of EAGAIN when pgs creating 1677/head
John Spray [Thu, 17 Apr 2014 14:28:22 +0000 (15:28 +0100)]
mon: EBUSY instead of EAGAIN when pgs creating

In 69321bf, EAGAIN changed behaviour to block indefinitely
rather than returning to user.  Change the return for
`osd pool set` operations that are blocked by creating PGs
to return EBUSY instead of EAGAIN, so that they are excepted
from this blocking behaviour.

Signed-off-by: John Spray <john.spray@inktank.com>
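
The difference between the two return codes can be sketched as below. `run_command_sketch()` is a hypothetical dispatcher, not the monitor's real command path, and it uses a bounded retry count purely so the example terminates; the behaviour described above retries indefinitely on EAGAIN.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>

// Hypothetical command dispatcher: retries while the handler reports
// -EAGAIN, and hands every other result (including -EBUSY) to the caller.
int run_command_sketch(const std::function<int()>& handler, int max_tries) {
  assert(max_tries > 0);
  int r = -EAGAIN;
  for (int i = 0; i < max_tries && r == -EAGAIN; ++i)
    r = handler();
  return r;
}
```

A handler that returns -EBUSY is therefore called once and its result goes straight back to the user.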
11 years agoMerge pull request #1675 from guangyy/wip-bench
Gregory Farnum [Thu, 17 Apr 2014 04:57:41 +0000 (21:57 -0700)]
Merge pull request #1675 from guangyy/wip-bench

Make rados/rest bench work for multiple write instances without metadata conflict.

Reviewed-by: Greg Farnum <greg@inktank.com>
11 years agospelling corrections 1684/head
Dmitry Smirnov [Thu, 17 Apr 2014 02:43:30 +0000 (12:43 +1000)]
spelling corrections

11 years agoMerge pull request #1681 from ceph/wip-8043
Samuel Just [Thu, 17 Apr 2014 01:16:11 +0000 (18:16 -0700)]
Merge pull request #1681 from ceph/wip-8043

mon/OSDMonitor: require force argument to split a cache pool

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1682 from ceph/wip-8020
Sage Weil [Thu, 17 Apr 2014 01:13:01 +0000 (18:13 -0700)]
Merge pull request #1682 from ceph/wip-8020

OSD: split pg stats during pg split

Reviewed-by: Sage Weil <sage@inktank.com>
11 years agoOSD: split pg stats during pg split 1682/head
Samuel Just [Mon, 7 Apr 2014 23:37:46 +0000 (16:37 -0700)]
OSD: split pg stats during pg split

Fixes: #8020
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoosd_types::osd_stat_sum_t: fix floor for num_objects_omap
Samuel Just [Thu, 17 Apr 2014 01:04:35 +0000 (18:04 -0700)]
osd_types::osd_stat_sum_t: fix floor for num_objects_omap

Introduced in a130a4452e4fb159dc62fb417077d98dc9ebd621
Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge branch 'wip-8100'
David Zafman [Wed, 16 Apr 2014 22:09:09 +0000 (15:09 -0700)]
Merge branch 'wip-8100'

Reviewed-by: Mark Nelson <mark.nelson@inktank.com>
11 years agocommon/obj_bencher: Fix error return check from read that is negative on error
David Zafman [Wed, 16 Apr 2014 21:02:13 +0000 (14:02 -0700)]
common/obj_bencher: Fix error return check from read that is negative on error

Fixed read return value in d99f1d9f68db41231e0ffff4082b05d6d095c231

Fixes: #8100
Signed-off-by: David Zafman <david.zafman@inktank.com>
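
The convention this fix relies on, a byte count on success and a negative errno value on failure, can be sketched as follows. `ReadResult` and `classify_read()` are hypothetical names used only for illustration and do not appear in obj_bencher.

```cpp
#include <cerrno>

// Hypothetical classification of a read return value.
struct ReadResult {
  bool ok;    // true when the read succeeded
  int bytes;  // bytes read on success, 0 otherwise
  int err;    // positive errno value on failure, 0 otherwise
};

ReadResult classify_read(int r) {
  if (r < 0)
    return {false, 0, -r};  // negative return: an error, not a length
  return {true, r, 0};
}
```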
11 years agoMerge pull request #1680 from ceph/wip-7786
Sage Weil [Wed, 16 Apr 2014 18:49:58 +0000 (11:49 -0700)]
Merge pull request #1680 from ceph/wip-7786

civetweb: update subproject

11 years agoosd/ReplicatedPG: add missing whitespace in debug output
David Zafman [Wed, 16 Apr 2014 18:08:23 +0000 (11:08 -0700)]
osd/ReplicatedPG: add missing whitespace in debug output

Signed-off-by: David Zafman <david.zafman@inktank.com>
11 years agomds: dynamically adjust priority of committing dirfrags 1683/head
Yan, Zheng [Wed, 16 Apr 2014 02:53:01 +0000 (10:53 +0800)]
mds: dynamically adjust priority of committing dirfrags

Adjust priority of committing dirfrags according to number of
expiring log segments. The more expiring log segments, the higher the
priority, because it means the MDS is not trimming log segments quickly
enough.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agomds: fix cap revoke confirmation 1676/head
Yan, Zheng [Wed, 16 Apr 2014 05:35:39 +0000 (13:35 +0800)]
mds: fix cap revoke confirmation

When the _revokes list is emptied, it doesn't mean that the client has
released the revoking caps. It's possible that the client was flushing
dirty metadata.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoUse string instead of char* when saving arguments for rest-bench 1675/head
Guang Yang [Wed, 16 Apr 2014 01:28:16 +0000 (01:28 +0000)]
Use string instead of char* when saving arguments for rest-bench

11 years agoReplicatedPG::get_snapset_context: assert snap obj is not missing
Samuel Just [Tue, 15 Apr 2014 21:14:31 +0000 (14:14 -0700)]
ReplicatedPG::get_snapset_context: assert snap obj is not missing

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agomon/OSDMonitor: require force argument to split a cache pool 1681/head
Sage Weil [Tue, 15 Apr 2014 20:57:21 +0000 (13:57 -0700)]
mon/OSDMonitor: require force argument to split a cache pool

There are several perils when splitting a cache pool:

 - split invalidates pg stats, which disables the agent
 - a scrub must be manually triggered post-split to rebuild stats
 - the pool may fill the OSDs during that period.
 - or, the pool may end up beyond the 'full' mark and once scrub does
   complete and the agent activates we may block IO for a long time while
   we catch up with flush/evict

Make it a bit harder for users to shoot themselves in the foot.

Fixes: #8043
Signed-off-by: Sage Weil <sage@inktank.com>
11 years agoosd: OSDMap: have osdmap json dump print valid boolean instead of string 1678/head
Joao Eduardo Luis [Tue, 15 Apr 2014 16:55:18 +0000 (17:55 +0100)]
osd: OSDMap: have osdmap json dump print valid boolean instead of string

Fixes: 8108
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years agomds: Fix respawn (add path resolution) 1671/head
John Spray [Mon, 14 Apr 2014 16:14:42 +0000 (17:14 +0100)]
mds: Fix respawn (add path resolution)

Previously assumed that ceph-mds executable was in
PWD - now use /proc/self/exe to find the
executable wherever it may be.  Leave in old version
as a fallback for non-linux environments.

Also add a 'respawn' command so that it's easy to test
respawn with `ceph mds tell <id> respawn`

Fixes: #7966
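
Resolving the executable path through /proc/self/exe can be sketched like this. `resolve_exe_path()` is a hypothetical helper, and falling back to argv[0] is an assumption made for the example; the commit itself keeps its previous lookup as the non-Linux fallback.

```cpp
#include <limits.h>
#include <string>
#include <unistd.h>

// Resolve the running executable's path via /proc/self/exe (Linux only);
// fall back to the supplied argv[0] when the link cannot be read.
std::string resolve_exe_path(const char* argv0) {
  char buf[PATH_MAX];
  ssize_t n = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
  if (n > 0) {
    buf[n] = '\0';  // readlink does not NUL-terminate its output
    return std::string(buf);
  }
  return std::string(argv0 ? argv0 : "");
}
```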
11 years agomds: share max size to client who is allowed for WR cap
Yan, Zheng [Tue, 15 Apr 2014 08:06:07 +0000 (16:06 +0800)]
mds: share max size to client who is allowed for WR cap

WR cap is allowed for the loner client when filelock is in excl->mix
state. MDS should share max size with the loner client in this case.
Otherwise the client may wait for the max size forever.

Fixes: #8092
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years agoMake rados/rest bench work for multiple write instances without metadata conflict.
Guang Yang [Tue, 15 Apr 2014 07:48:37 +0000 (07:48 +0000)]
Make rados/rest bench work for multiple write instances without metadata conflict.
Signed-off-by: Guang Yang <yguang@yahoo-inc.com>
11 years agoMerge pull request #1666 from ceph/wip-mds
Yan, Zheng [Tue, 15 Apr 2014 00:13:01 +0000 (08:13 +0800)]
Merge pull request #1666 from ceph/wip-mds

Wip mds

11 years agoReplicatedPG::process_copy_chunk: don't check snaps if we got head
Samuel Just [Tue, 8 Apr 2014 17:47:55 +0000 (10:47 -0700)]
ReplicatedPG::process_copy_chunk: don't check snaps if we got head

Even if we are promoting a clone, we may be reading from head.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::finish_promote: soid.clone may have been trimmed, fix assert
Samuel Just [Wed, 9 Apr 2014 22:57:37 +0000 (15:57 -0700)]
ReplicatedPG::finish_promote: soid.clone may have been trimmed, fix assert

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::agent_work: skip if head is missing
Samuel Just [Fri, 11 Apr 2014 21:35:39 +0000 (14:35 -0700)]
ReplicatedPG::agent_work: skip if head is missing

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::cancel_flush: requeue dup_ops even if !op
Samuel Just [Fri, 11 Apr 2014 00:38:22 +0000 (17:38 -0700)]
ReplicatedPG::cancel_flush: requeue dup_ops even if !op

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoReplicatedPG::_rollback_to: fix comment, clone certainly could be missing
Samuel Just [Fri, 11 Apr 2014 00:18:32 +0000 (17:18 -0700)]
ReplicatedPG::_rollback_to: fix comment, clone certainly could be missing

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1673 from ceph/wip-stress-watch
Samuel Just [Mon, 14 Apr 2014 23:12:31 +0000 (16:12 -0700)]
Merge pull request #1673 from ceph/wip-stress-watch

ceph_test_stress_watch: test over cache pool

Reviewed-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years agoMerge pull request #1667 from ceph/wip-8089
Samuel Just [Mon, 14 Apr 2014 23:11:47 +0000 (16:11 -0700)]
Merge pull request #1667 from ceph/wip-8089

osd: fix dup request handling for ENOENT and cache ops

Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years agoMerge pull request #1654 from ceph/wip-7940
Samuel Just [Mon, 14 Apr 2014 23:10:42 +0000 (16:10 -0700)]
Merge pull request #1654 from ceph/wip-7940

Wip 7940

Reviewed-by: Samuel Just <sam.just@inktank.com>