git.apps.os.sepia.ceph.com Git - ceph.git/log
ceph.git
11 years ago Merge pull request #792 from ceph/wip-doc-openstack
Sage Weil [Fri, 1 Nov 2013 23:07:21 +0000 (16:07 -0700)]
Merge pull request #792 from ceph/wip-doc-openstack

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years ago Merge pull request #766 from kri5/wip-5374
Sage Weil [Fri, 1 Nov 2013 23:06:52 +0000 (16:06 -0700)]
Merge pull request #766 from kri5/wip-5374

Rebase of Wip 5374 against master

Reviewed-by: Yehuda Sadeh <yehdua@inktank.com>
11 years ago doc: Removed the Folsom reference. 792/head
John Wilkins [Thu, 31 Oct 2013 00:53:37 +0000 (17:53 -0700)]
doc: Removed the Folsom reference.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Restored show_image_direct and added a link to older versions.
John Wilkins [Thu, 31 Oct 2013 00:44:24 +0000 (17:44 -0700)]
doc: Restored show_image_direct and added a link to older versions.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc: Removed nova-volume, early Ceph references and Folsom references.
John Wilkins [Thu, 31 Oct 2013 00:21:14 +0000 (17:21 -0700)]
doc: Removed nova-volume, early Ceph references and Folsom references.

fixes: 5006

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
11 years ago doc/release-notes: formatting
Sage Weil [Wed, 30 Oct 2013 21:19:45 +0000 (14:19 -0700)]
doc/release-notes: formatting

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago doc/release-notes: fix formatting
Sage Weil [Wed, 30 Oct 2013 21:08:08 +0000 (14:08 -0700)]
doc/release-notes: fix formatting

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago doc/release-notes: fix indentation
Sage Weil [Tue, 29 Oct 2013 23:10:16 +0000 (16:10 -0700)]
doc/release-notes: fix indentation

ERROR: /srv/autobuild-ceph/gitbuilder.git/build/doc/release-notes.rst:48: Unexpected indentation.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge branch 'next'
Gary Lowell [Wed, 30 Oct 2013 18:41:10 +0000 (18:41 +0000)]
Merge branch 'next'

11 years ago Merge branch 'next' of jenkins:ceph/ceph into next
Gary Lowell [Wed, 30 Oct 2013 18:34:42 +0000 (18:34 +0000)]
Merge branch 'next' of jenkins:ceph/ceph into next

11 years ago ceph: Release resource before return in BackedObject::download()
Li Wang [Wed, 30 Oct 2013 13:32:34 +0000 (21:32 +0800)]
ceph: Release resource before return in BackedObject::download()

Close file before return

Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago ceph: Fix memory leak in chain_listxattr
Li Wang [Wed, 30 Oct 2013 08:39:09 +0000 (16:39 +0800)]
ceph: Fix memory leak in chain_listxattr

Free allocated memory before return

Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Fix memory leak in Backtrace::print()
Li Wang [Wed, 30 Oct 2013 08:18:10 +0000 (16:18 +0800)]
Fix memory leak in Backtrace::print()

Free already allocated memory if short of memory

Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago v0.72-rc1 v0.72-rc1
Gary Lowell [Wed, 30 Oct 2013 00:45:10 +0000 (00:45 +0000)]
v0.72-rc1

11 years ago Merge pull request #788 from ceph/wip-6605
João Eduardo Luís [Tue, 29 Oct 2013 23:34:14 +0000 (16:34 -0700)]
Merge pull request #788 from ceph/wip-6605

mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record

Fixes: #6605
Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge remote-tracking branch 'gh/wip-crush-hook'
Sage Weil [Tue, 29 Oct 2013 22:06:13 +0000 (15:06 -0700)]
Merge remote-tracking branch 'gh/wip-crush-hook'

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years ago upstart, sysvinit: use ceph-crush-location hook
Sage Weil [Tue, 29 Oct 2013 18:08:58 +0000 (11:08 -0700)]
upstart, sysvinit: use ceph-crush-location hook

Instead of hard-coding a check in ceph.conf and some reasonable
defaults, defer this work to ceph-crush-location, and allow users to
specify their own hook with alternative logic.

This can be helpful in a number of cases, like:

 - rack (or other) information included in hostname and easily parsed
   out by a hook
 - multiple types of devices in each host, resulting in 'parallel'
   crush trees (e.g., one for hdd, one for ssd)

Signed-off-by: Sage Weil <sage@inktank.com>
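
A hook of the kind described above could look roughly like the following Python sketch. It is not the ceph-crush-location script itself: the hostname convention ("rack12-node03"), the function name, and the use of the device class as the root name are all assumptions made for illustration. It only shows how rack information might be parsed out of a hostname and how a per-device-class suffix could yield 'parallel' trees, emitted as a key=value pair list.

```python
import re


def crush_location(hostname, device_class=None):
    """Build a key=value CRUSH location string (hypothetical naming scheme)."""
    pairs = []
    # Assumed hostname convention: "<rack>-<node>", e.g. "rack12-node03".
    match = re.match(r"^(rack\d+)-", hostname)
    if match:
        pairs.append(("rack", match.group(1)))
    # Assumption: suffix the host bucket with the device class so that
    # hdd and ssd devices on one machine land in separate trees.
    host = f"{hostname}-{device_class}" if device_class else hostname
    pairs.append(("host", host))
    pairs.append(("root", device_class or "default"))
    return " ".join(f"{key}={value}" for key, value in pairs)
```

For example, crush_location("rack12-node03", "ssd") yields "rack=rack12 host=rack12-node03-ssd root=ssd", while a hostname without a rack prefix falls back to host and root only.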
11 years ago ceph-crush-location: new crush location hook
Sage Weil [Tue, 29 Oct 2013 18:03:04 +0000 (11:03 -0700)]
ceph-crush-location: new crush location hook

This generalizes the bit of code that builds a key=value pair list to
update an entity's CRUSH location.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Revert "ceph-crush-location: new crush location hook"
Sage Weil [Tue, 29 Oct 2013 20:58:14 +0000 (13:58 -0700)]
Revert "ceph-crush-location: new crush location hook"

This reverts commit fc49065d855cfd74cb861d294f3464dd616e82ee.

Merged to wrong branch; my bad!

11 years ago Revert "upstart, sysvinit: use ceph-crush-location hook"
Sage Weil [Tue, 29 Oct 2013 20:58:10 +0000 (13:58 -0700)]
Revert "upstart, sysvinit: use ceph-crush-location hook"

This reverts commit 111a37efb19cb46a48d669bc9866c29b4015a889.

11 years ago mon: OSDMonitor: fix comparison between signed and unsigned integer warning 787/head 788/head
Joao Eduardo Luis [Tue, 29 Oct 2013 20:36:05 +0000 (20:36 +0000)]
mon: OSDMonitor: fix comparison between signed and unsigned integer warning

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record
Joao Eduardo Luis [Tue, 29 Oct 2013 20:35:25 +0000 (20:35 +0000)]
mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record

Fixes: #6605
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon: OSDMonitor: fix some annoying whitespace
Joao Eduardo Luis [Tue, 29 Oct 2013 20:30:37 +0000 (20:30 +0000)]
mon: OSDMonitor: fix some annoying whitespace

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago Merge pull request #779 from ceph/wip-crush-hook
Loic Dachary [Tue, 29 Oct 2013 19:24:05 +0000 (12:24 -0700)]
Merge pull request #779 from ceph/wip-crush-hook

upstart,sysvinit: allow 'osd crush location hook' script to determine osd crush position

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years ago upstart, sysvinit: use ceph-crush-location hook 779/head
Sage Weil [Tue, 29 Oct 2013 18:08:58 +0000 (11:08 -0700)]
upstart, sysvinit: use ceph-crush-location hook

Instead of hard-coding a check in ceph.conf and some reasonable
defaults, defer this work to ceph-crush-location, and allow users to
specify their own hook with alternative logic.

This can be helpful in a number of cases, like:

 - rack (or other) information included in hostname and easily parsed
   out by a hook
 - multiple types of devices in each host, resulting in 'parallel'
   crush trees (e.g., one for hdd, one for ssd)

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago ceph-crush-location: new crush location hook
Sage Weil [Tue, 29 Oct 2013 18:03:04 +0000 (11:03 -0700)]
ceph-crush-location: new crush location hook

This generalizes the bit of code that builds a key=value pair list to
update an entity's CRUSH location.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #786 from ceph/wip-6673
Sage Weil [Tue, 29 Oct 2013 17:16:52 +0000 (10:16 -0700)]
Merge pull request #786 from ceph/wip-6673

mon/PGMonitor: always send pg creations after mapping

Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon/PGMonitor: always send pg creations after mapping 786/head
Sage Weil [Tue, 29 Oct 2013 17:10:21 +0000 (10:10 -0700)]
mon/PGMonitor: always send pg creations after mapping

At some point in the dumpling cycle I separated the map stage from the
send stage.  We can send the creates any time we have a non-zero osdmap
epoch, and are in good shape as long as we do the map step after the
osdmap is loaded (hence the post_paxos_update).

Some background:

We originally introduced the map-but-don't send in a2fe0137, at which
point all was well because we only called it on ceph-mon startup.

Later, this turned into post_paxos_update in e635c478, at which point
it was now called by a running monitor.. but we didn't add in the
send_pg_creates().  This is where this bug stems from.

This particular path is responsible for the stalled test referenced in
bug #6673.

Backport: dumpling
Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago mon/OSDMonitor: fix signedness warning on poolid
Sage Weil [Tue, 29 Oct 2013 15:59:06 +0000 (08:59 -0700)]
mon/OSDMonitor: fix signedness warning on poolid

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge remote-tracking branch 'upstream/next'
Samuel Just [Tue, 29 Oct 2013 15:27:54 +0000 (08:27 -0700)]
Merge remote-tracking branch 'upstream/next'

11 years ago ReplicatedPG::recover_backfill: update last_backfill to max() when backfill is complete
Samuel Just [Tue, 29 Oct 2013 06:05:30 +0000 (23:05 -0700)]
ReplicatedPG::recover_backfill: update last_backfill to max() when backfill is complete

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: src_obcs can now be empty
Samuel Just [Tue, 29 Oct 2013 06:35:00 +0000 (23:35 -0700)]
ReplicatedPG: src_obcs can now be empty

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago Merge remote-tracking branch 'upstream/next'
Samuel Just [Tue, 29 Oct 2013 05:51:04 +0000 (22:51 -0700)]
Merge remote-tracking branch 'upstream/next'

11 years ago Merge pull request #773 from dachary/wip-6614
Sage Weil [Tue, 29 Oct 2013 04:15:32 +0000 (21:15 -0700)]
Merge pull request #773 from dachary/wip-6614

common: rebuild_page_aligned sometimes rebuilds unaligned

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #780 from ceph/wip-6585
athanatos [Tue, 29 Oct 2013 04:11:27 +0000 (21:11 -0700)]
Merge pull request #780 from ceph/wip-6585

Wip 6585

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
11 years ago Merge pull request #775 from ceph/wip-readdirend
Sage Weil [Tue, 29 Oct 2013 04:01:27 +0000 (21:01 -0700)]
Merge pull request #775 from ceph/wip-readdirend

mds: fix readdir end check

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago mds: fix readdir end check 775/head
Yan, Zheng [Sun, 27 Oct 2013 09:11:11 +0000 (17:11 +0800)]
mds: fix readdir end check

If the last item in the directory is a remote link and the corresponding
inode is not in cache, the readdir reply will not contain the last item.
But iterator 'it' is equal to dir->end() in this case, which causes the 'end'
flag of the readdir reply to be set to true.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
11 years ago Merge pull request #777 from ceph/wip-scripts
Josh Durgin [Tue, 29 Oct 2013 00:12:26 +0000 (17:12 -0700)]
Merge pull request #777 from ceph/wip-scripts

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
11 years ago osd/ReplicatedPG: use MIN for backfill_pos 780/head
Sage Weil [Mon, 28 Oct 2013 23:39:09 +0000 (16:39 -0700)]
osd/ReplicatedPG: use MIN for backfill_pos

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago Merge pull request #772 from ceph/wip-5612
Loic Dachary [Mon, 28 Oct 2013 23:13:34 +0000 (16:13 -0700)]
Merge pull request #772 from ceph/wip-5612

init-ceph, upstart: make crush update on osd start time out

Reviewed-by: Loic Dachary <loic@dachary.org>
11 years ago ReplicatedPG: recover_backfill: don't prematurely adjust last_backfill
Samuel Just [Mon, 28 Oct 2013 23:09:59 +0000 (16:09 -0700)]
ReplicatedPG: recover_backfill: don't prematurely adjust last_backfill

We can't adjust last_backfill to object x until x has been fully
backfilled.  pending_backfill_updates contains all those backfills
started, but which have not yet been reflected in pinfo.last_update.
backfills_in_flight contains those backfills which have not yet
completed.  Thus, we can adjust last_update to the largest entry
in pending_backfill_updates not in backfills_in_flight.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: add empty stat when we remove an object in recover_backfill
Samuel Just [Mon, 28 Oct 2013 23:03:25 +0000 (16:03 -0700)]
ReplicatedPG: add empty stat when we remove an object in recover_backfill

Subsequent updates to that object need to have their stats added
to the backfill info stats atomically with the last_backfill
update.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: replace backfill_pos with last_backfill_started
Samuel Just [Mon, 28 Oct 2013 22:53:24 +0000 (15:53 -0700)]
ReplicatedPG: replace backfill_pos with last_backfill_started

last_backfill_started reflects what pinfo.last_backfill will be
once all currently outstanding backfills complete.  backfill_pos
was tricky since we couldn't correctly initialize it without
doing the first backfill scan pair.

In recover_backfill, we rescan from last_backfill_started rather
than from backfill_pos.  This ensures that we capture all clones
created between last_backfill_started and what previously had been
backfill_pos without special handling in make_writeable.  The main
downside is that we will tend to "rescan" last_backfill_started.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago PG::BackfillInfo: introduce trim_to
Samuel Just [Mon, 28 Oct 2013 22:49:58 +0000 (15:49 -0700)]
PG::BackfillInfo: introduce trim_to

We'll use this to trim off last_backfill_started since it'll
often be included in rescans.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago PG::BackfillInterval: use trim() in pop_front()
Samuel Just [Mon, 28 Oct 2013 22:49:23 +0000 (15:49 -0700)]
PG::BackfillInterval: use trim() in pop_front()

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG::prepare_transaction: info.last_backfill is inclusive
Samuel Just [Mon, 28 Oct 2013 22:22:37 +0000 (15:22 -0700)]
ReplicatedPG::prepare_transaction: info.last_backfill is inclusive

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago upstart: fail osd start if crush update fails 772/head
Sage Weil [Mon, 28 Oct 2013 22:56:36 +0000 (15:56 -0700)]
upstart: fail osd start if crush update fails

If the update for the CRUSH position fails for some reason, do not
start the OSD.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago init-ceph: make crush update on osd start time out
Sage Weil [Mon, 28 Oct 2013 22:56:15 +0000 (15:56 -0700)]
init-ceph: make crush update on osd start time out

If the monitor is not currently available, this crush update would block
forever, preventing the OSD and (potentially) the rest of the system
from starting up.  Instead, make it time out after 10 seconds and then
abort startup.  This prevents startup of an OSD if we failed to update
the CRUSH position for some reason.

In fact, do not start up the OSD if the CRUSH update fails for any
reason--not just a timeout!

Works-around: #5612
Signed-off-by: Sage Weil <sage@inktank.com>
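
The behaviour described above (give the CRUSH update 10 seconds, and refuse to start the OSD if the update fails for any reason) can be sketched in Python as follows. The real change lives in the init-ceph shell script; the function names and the `start` callable here are hypothetical, and the commands to run are left to the caller.

```python
import subprocess
import sys


def run_ok(cmd, timeout_s=10):
    """Return True only if cmd exits with status 0 within timeout_s seconds."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising on timeout.
        return False
    except OSError:
        # The command could not be executed at all.
        return False
    return completed.returncode == 0


def start_osd(update_cmd, start, timeout_s=10):
    """Call start() only after the CRUSH update command has succeeded."""
    if not run_ok(update_cmd, timeout_s):
        print("crush update failed or timed out; not starting osd",
              file=sys.stderr)
        return False
    start()
    return True
```

Treating a timeout and a non-zero exit status identically mirrors the commit's second paragraph: any failure of the update aborts startup, not just a hang.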
11 years ago Merge pull request #771 from ceph/wip-ceph-context
Sage Weil [Mon, 28 Oct 2013 21:48:24 +0000 (14:48 -0700)]
Merge pull request #771 from ceph/wip-ceph-context

ceph_context: replace semaphore

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago pybind: use find_library for libcephfs and librbd 777/head
Noah Watkins [Mon, 28 Oct 2013 21:37:07 +0000 (14:37 -0700)]
pybind: use find_library for libcephfs and librbd

Use find_library to avoid assumptions about platform shared library
naming conventions.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years ago Merge pull request #778 from ceph/wip-6621
Sage Weil [Mon, 28 Oct 2013 21:28:25 +0000 (14:28 -0700)]
Merge pull request #778 from ceph/wip-6621

radosgw-admin: accept negative values for quota params

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago radosgw-admin: accept negative values for quota params 778/head
Yehuda Sadeh [Mon, 28 Oct 2013 20:36:45 +0000 (13:36 -0700)]
radosgw-admin: accept negative values for quota params

and document that in the usage output.

Fixes: #6621
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago Merge pull request #760 from ceph/wip-6585
athanatos [Mon, 28 Oct 2013 20:50:34 +0000 (13:50 -0700)]
Merge pull request #760 from ceph/wip-6585

Wip 6585

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: no need to clear repop->*obc
Samuel Just [Mon, 28 Oct 2013 20:49:23 +0000 (13:49 -0700)]
ReplicatedPG: no need to clear repop->*obc

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago Merge remote-tracking branch 'upstream/next' into wip-obc
Samuel Just [Mon, 28 Oct 2013 20:47:39 +0000 (13:47 -0700)]
Merge remote-tracking branch 'upstream/next' into wip-obc

11 years ago doc/release-notes: emperor blurb
Sage Weil [Mon, 28 Oct 2013 20:53:11 +0000 (13:53 -0700)]
doc/release-notes: emperor blurb

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago ReplicatedBackend: don't hold ObjectContexts in pull completion callback 760/head
Samuel Just [Mon, 28 Oct 2013 18:02:34 +0000 (11:02 -0700)]
ReplicatedBackend: don't hold ObjectContexts in pull completion callback

We need flushing the sequencer to ensure that all Contexts which hold
ObjectContextRefs have been run or deleted.
C_ReplicatedBackend_OnPullComplete, however, gets queued in a second
work queue in order to avoid performing expensive push related reads
in the FileStore finisher.

Rather than keep the object contexts around, we instead put off
removing the object from the pulling map until the callback
fires and read the object context out of the pulling map.  This
way the ObjectContextRef will be cleaned up along with the rest
of the pulling map in on_change.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: put repops even in TrimObjects
Samuel Just [Sun, 27 Oct 2013 03:21:25 +0000 (20:21 -0700)]
ReplicatedPG: put repops even in TrimObjects

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: improved on_flushed error output
Samuel Just [Sun, 27 Oct 2013 01:24:41 +0000 (18:24 -0700)]
ReplicatedPG: improved on_flushed error output

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago PG: call on_flushed on FlushEvt
Samuel Just [Sat, 26 Oct 2013 23:46:22 +0000 (16:46 -0700)]
PG: call on_flushed on FlushEvt

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago PG,ReplicatedPG: remove the waiting_for_backfill_peer mechanism
Samuel Just [Sat, 26 Oct 2013 00:58:31 +0000 (17:58 -0700)]
PG,ReplicatedPG: remove the waiting_for_backfill_peer mechanism

See previous patch.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: have make_writeable adjust backfill_pos
Samuel Just [Sat, 26 Oct 2013 00:58:10 +0000 (17:58 -0700)]
ReplicatedPG: have make_writeable adjust backfill_pos

If we are writing to backfill_pos and create a clone, we end
up failing to send the transaction creating the clone to the
backfill peer.  This is fine as long as we end up backfilling
the clone.  To that end, we simply add the clone to
backfill_info and adjust backfill_pos accordingly.  This is less
brittle than the waiting_for_backfill_pos mechanism since it
works even if we wait between that check and issuing the repop,
which can happen for copy_from.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedBackend: fix failed push error output
Samuel Just [Sat, 26 Oct 2013 23:52:32 +0000 (16:52 -0700)]
ReplicatedBackend: fix failed push error output

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG,osd_types: move rw tracking from its own map to ObjectContext
Samuel Just [Sat, 26 Oct 2013 23:52:16 +0000 (16:52 -0700)]
ReplicatedPG,osd_types: move rw tracking from its own map to ObjectContext

We also modify recovering to hold a reference to the recovering obc
in order to ensure that our backfill_read_lock doesn't outlive the
obc.

ReplicatedPG::op_applied no longer clears repop->obc since we need
it to live until the op is finally cleaned up.  This is fine since
repop->obc is now an ObjectContextRef and can clean itself up.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago osd_types,OpRequest: move osd_req_id into OpRequest
Samuel Just [Sat, 26 Oct 2013 00:36:40 +0000 (17:36 -0700)]
osd_types,OpRequest: move osd_req_id into OpRequest

This way I can have OpRequest included from osd_types.h.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago OpRequest: move method implementations into cc
Samuel Just [Sat, 26 Oct 2013 00:35:49 +0000 (17:35 -0700)]
OpRequest: move method implementations into cc

I need to remove the osd_types.h include.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago ReplicatedPG: reset new_obs and new_snapset in execute_ctx
Samuel Just [Fri, 25 Oct 2013 01:52:59 +0000 (18:52 -0700)]
ReplicatedPG: reset new_obs and new_snapset in execute_ctx

This way, if execute_ctx is rerun on the same OpContext, we
won't erroneously reuse a stale snapset/object_info.

Signed-off-by: Samuel Just <sam.just@inktank.com>
11 years ago fix the bug if we set pgp_num=-1 using "ceph osd pool set data|metadata|rbd -1"
huangjun [Mon, 28 Oct 2013 04:12:26 +0000 (12:12 +0800)]
   fix the bug if we set pgp_num=-1 using "ceph osd pool set data|metadata|rbd -1"
   will set the pgp_num to a huge number.

Signed-off-by: huangjun <hjwsm1989@gmail.com>
(cherry picked from commit bf198e673fd876e34006d3c83f0479454e6295aa)
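
Why -1 turns into a huge number can be illustrated with a short Python sketch. It assumes the value ends up in a 32-bit unsigned field, which the commit message does not state; the helper names and the validation rule are hypothetical and are not the actual fix.

```python
def as_uint32(value):
    """Reinterpret an integer as a 32-bit unsigned value (assumed width)."""
    return value & 0xFFFFFFFF


def parse_pg_count(text):
    """Hypothetical guard: reject non-positive counts before any conversion."""
    count = int(text)
    if count <= 0:
        raise ValueError(f"pg count must be positive, got {count}")
    return count
```

Under that assumption, as_uint32(-1) is 4294967295, which is the kind of value the bug report describes; validating the signed value first avoids the wrap-around.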

11 years ago Merge pull request #776 from hjwsm1989/master
Sage Weil [Mon, 28 Oct 2013 20:29:05 +0000 (13:29 -0700)]
Merge pull request #776 from hjwsm1989/master

   fix the bug if we set pgp_num=-1 using "ceph osd pool set data|metada...

11 years ago ceph_context: use condition variable for wake-up 771/head
Noah Watkins [Sat, 12 Oct 2013 21:39:19 +0000 (14:39 -0700)]
ceph_context: use condition variable for wake-up

reopen_logs is called from normal thread context (as opposed to an async
signal handler), so we grab a mutex and condition variables. The mutex
use also allows us to avoid volatile variables since the mutex will add
any necessary barriers.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
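
The mutex-plus-condition-variable pattern described above can be sketched in Python with threading.Condition. The real code is C++ inside ceph_context; the class and method names below are hypothetical. The point is that the flags are only read or written while holding the lock, so no volatile-style tricks are needed.

```python
import threading


class ReopenLogsSignal:
    """Wake a service thread via a condition variable (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()  # wraps its own lock
        self._reopen = False
        self._exit = False

    def request_reopen(self):
        # Called from normal thread context, so taking a lock is safe.
        with self._cond:
            self._reopen = True
            self._cond.notify()

    def request_exit(self):
        with self._cond:
            self._exit = True
            self._cond.notify()

    def wait(self):
        """Block until work arrives; return "exit" or "reopen"."""
        with self._cond:
            # Loop to cope with spurious wake-ups.
            while not (self._reopen or self._exit):
                self._cond.wait()
            if self._exit:
                return "exit"
            self._reopen = False
            return "reopen"
```

A service thread would loop on wait(), reopening log files on "reopen" and returning on "exit".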
11 years ago test: Use a portable syntax for seq(1)
Alan Somers [Fri, 11 Oct 2013 20:48:32 +0000 (13:48 -0700)]
test: Use a portable syntax for seq(1)

Use a portable syntax for seq(1).  GNU seq has a default INCR of 1, but
BSD seq has a default INCR of either +1 or -1, depending on the other
arguments.  INCR must be explicitly specified for portability.

This bug is the reason that I was running into the segfaults whose
fix I reported as BUG #6510.

Signed-off-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years ago test: Change interpreter from /bin/bash to /bin/sh
Alan Somers [Fri, 11 Oct 2013 20:49:36 +0000 (13:49 -0700)]
test: Change interpreter from /bin/bash to /bin/sh

Change interpreter from /bin/bash to /bin/sh.  bash is not guaranteed
to be installed on all Unix systems, and it's not guaranteed to be
installed into /bin either.

There are other scripts that specify /bin/bash; they need to be
examined one by one to look for bashisms.  This was the only one I had
to modify to get unit tests working.

Signed-off-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years ago test: Use portable arguments to /usr/bin/env
Alan Somers [Mon, 28 Oct 2013 18:26:49 +0000 (11:26 -0700)]
test: Use portable arguments to /usr/bin/env

Don't use the "--ignore-environment" option to env.  It is a
nonstandard GNU extension.  The standard version is "-i".

Signed-off-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
11 years ago pybind: use find_library to look for librados
Noah Watkins [Sat, 12 Oct 2013 21:53:14 +0000 (14:53 -0700)]
pybind: use find_library to look for librados

Uses find_library to search for librados, rather than using the soname.
For instance, on OSX librados is named librados.2.dylib.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
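
The approach can be sketched with the standard ctypes helpers. This is a minimal illustration rather than the actual pybind code; the wrapper name load_shared_library is hypothetical, while find_library and CDLL are the real ctypes calls.

```python
from ctypes import CDLL
from ctypes.util import find_library


def load_shared_library(name):
    """Locate a library by its short name (e.g. "rados") and load it."""
    # find_library applies the platform's own naming rules, so the caller
    # never spells out "librados.so.2" or "librados.2.dylib".
    path = find_library(name)
    if path is None:
        raise OSError(f"shared library {name!r} not found")
    return CDLL(path)
```

Calling load_shared_library("rados") would then pick whatever file name the platform uses, or raise OSError when the library is not installed.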
11 years ago doc/release-notes: v0.72 draft release notes
Sage Weil [Mon, 28 Oct 2013 17:36:44 +0000 (10:36 -0700)]
doc/release-notes: v0.72 draft release notes

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago fix the bug if we set pgp_num=-1 using "ceph osd pool set data|metadata|rbd -1" 776/head
huangjun [Mon, 28 Oct 2013 04:12:26 +0000 (12:12 +0800)]
   fix the bug if we set pgp_num=-1 using "ceph osd pool set data|metadata|rbd -1"
   will set the pgp_num to a huge number.

Signed-off-by: huangjun <hjwsm1989@gmail.com>
11 years ago ReplicatedPG: take and drop read locks when doing backfill
Greg Farnum [Wed, 23 Oct 2013 18:28:45 +0000 (11:28 -0700)]
ReplicatedPG: take and drop read locks when doing backfill

All our interfaces are in place, so now we can actually take and
drop the locks.
1) Take locks in ReplicatedPG::recover_backfill. This is the entry
into the backfill code path, and covers all objects which are
added to backfills_in_flight (via prep_backfill_object_push()). If we
can't get the lock right away, we stop the backfill movement there
until we can do so.
2) Drop the locks in ReplicatedPG::on_peer_recover(), called when the
push is completed.
2b) Further drop the locks on all backfills_in_flight objects in
_clear_recovery_state(), for when we cancel peering.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago PG: switch the start_recovery_ops interface to specify work to do as a param
Greg Farnum [Tue, 22 Oct 2013 00:36:04 +0000 (17:36 -0700)]
PG: switch the start_recovery_ops interface to specify work to do as a param

We previously inferred whether there was useful work to be done
by looking at the number of ops started, but with the upcoming
introduction of the rw_manager read locking on backfill, we could
start no ops while still having work to do. Switch around the
interfaces to specify these as separate pieces of information.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: implement the RWTracker mechanisms for backfill read locking
Greg Farnum [Mon, 21 Oct 2013 21:27:32 +0000 (14:27 -0700)]
ReplicatedPG: implement the RWTracker mechanisms for backfill read locking

We want backfill to take read locks on the objects it's pushing. Add
a get_backfill_read(hobject_t) function, a corresponding drop_backfill_read(),
and a backfill_waiting_on_read member in ObjState. Check that member when
getting a write lock, and in put_write(). Tell callers to requeue the recovery
if necessary, and clean up the backfill block when its read lock is dropped.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: separate RWTracker's waitlist from getting locks
Greg Farnum [Mon, 21 Oct 2013 21:02:57 +0000 (14:02 -0700)]
ReplicatedPG: separate RWTracker's waitlist from getting locks

This way we can try and get locks which aren't associated with
an OpRequest.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago common: add an hobject_t::is_min() function
Greg Farnum [Mon, 21 Oct 2013 21:11:28 +0000 (14:11 -0700)]
common: add an hobject_t::is_min() function

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago rgw: Use JSONFormatter to use keystone API 766/head
Christophe Courtaut [Tue, 9 Jul 2013 21:32:33 +0000 (23:32 +0200)]
rgw: Use JSONFormatter to use keystone API

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years ago rgw: Use keystone password to validate token too
Christophe Courtaut [Thu, 4 Jul 2013 07:57:56 +0000 (09:57 +0200)]
rgw: Use keystone password to validate token too

Adds the alternative use of password, instead of admin token,
to validate tokens.

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
11 years ago rgw: Adds passwd alternative to keystone admin token
Christophe Courtaut [Wed, 3 Jul 2013 18:48:12 +0000 (20:48 +0200)]
rgw: Adds passwd alternative to keystone admin token

http://tracker.ceph.com/issues/5374 Fixes #5374

This adds options parsing to have a user, password and tenant,
to be able to ask for a token.
This token is then used to authenticate against keystone, instead
of relying on the admin token.
Otherwise, you can still use the admin token to authenticate.
This doesn't change the existing behaviour.

Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
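
As a rough illustration of the flow described above, the Python sketch below builds a password-based token request and falls back to the admin token otherwise. The JSON layout follows the Keystone v2.0 identity API as an assumption; it is not taken from this commit. The function names and the precedence between the two credential types are likewise hypothetical.

```python
import json


def build_token_request(user, password, tenant):
    """Serialize a password-based token request (assumed Keystone v2.0 layout)."""
    body = {
        "auth": {
            "passwordCredentials": {"username": user, "password": password},
            "tenantName": tenant,
        }
    }
    return json.dumps(body)


def pick_credentials(admin_token=None, user=None, password=None, tenant=None):
    """Return ("password", request_body) or ("admin_token", token).

    Assumption: user/password/tenant win when all three are set;
    otherwise the admin token remains usable, as before.
    """
    if user and password and tenant:
        return ("password", build_token_request(user, password, tenant))
    if admin_token:
        return ("admin_token", admin_token)
    raise ValueError("no keystone credentials configured")
```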
11 years ago Merge pull request #672 from liammonahan/wip-defer-to-bucket-acls
Yehuda Sadeh [Sun, 27 Oct 2013 02:24:59 +0000 (19:24 -0700)]
Merge pull request #672 from liammonahan/wip-defer-to-bucket-acls

Add a configurable to allow bucket perms to be checked before key perms

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
11 years ago Merge pull request #765 from ceph/wip-6635
Sage Weil [Sat, 26 Oct 2013 00:53:30 +0000 (17:53 -0700)]
Merge pull request #765 from ceph/wip-6635

mon: OSDMonitor: Make 'osd pool rename' idempotent

Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
11 years ago mon/OSDMonitor: make racing dup pool rename behave 765/head
Sage Weil [Sat, 26 Oct 2013 00:45:06 +0000 (17:45 -0700)]
mon/OSDMonitor: make racing dup pool rename behave

If we get dup pool rename requests that are racing, make sure the second
one comes back with 'success' if the rename entry already exists in the
pending_inc map.

Signed-off-by: Sage Weil <sage@inktank.com>
11 years ago common: rebuild_page_aligned sometimes rebuilds unaligned 773/head
Loic Dachary [Sat, 26 Oct 2013 00:11:43 +0000 (02:11 +0200)]
common: rebuild_page_aligned sometimes rebuilds unaligned

rebuild_page_aligned relies on rebuild to create memory that is aligned
according to list::is_page_aligned(). However, when the bufferlist only
contains a single ptr and its size is not list::is_n_page_size(),
rebuild will not create the expected aligned bufferlist.

The allocation of the ptr is moved out of rebuild which is now given the
ptr as an argument. The rebuild_page_aligned function always requires an
aligned ptr with buffer::create_page_aligned(_len) for consistency.

The test

    bufferlist bl;
    bufferptr ptr(buffer::create_page_aligned(2));
    ptr.set_offset(1);
    ptr.set_length(1);
    bl.append(ptr);
    EXPECT_FALSE(bl.is_page_aligned());
    bl.rebuild_page_aligned();
    EXPECT_FALSE(bl.is_page_aligned());

demonstrated the problem. It was assumed to be a feature but should have
been identified as a bug. The last line is replaced with

    EXPECT_TRUE(bl.is_page_aligned());

Most tests related to is_page_aligned() wrongfully assumed that

    bufferptr ptr(2);

is never page aligned. Most of the time it is not but sometimes it is
when the pointer address is by chance on a CEPH_PAGE_SIZE boundary,
which triggered #6614. Non-aligned ptrs are created as follows instead:

    bufferptr ptr(buffer::create_page_aligned(2));
    ptr.set_offset(1);
    ptr.set_length(1);

http://tracker.ceph.com/issues/6614 fixes: #6614

Signed-off-by: Loic Dachary <loic@dachary.org>
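
The reasoning about alignment can be checked with plain integer arithmetic. In this Python sketch PAGE_SIZE is an assumed value standing in for CEPH_PAGE_SIZE, and both helpers are hypothetical. An arbitrary address may land on a page boundary by chance, whereas a page-aligned base plus an offset of 1 never can.

```python
PAGE_SIZE = 4096  # assumed stand-in for CEPH_PAGE_SIZE


def is_page_aligned(address):
    """True when address sits exactly on a page boundary."""
    return address % PAGE_SIZE == 0


def unaligned_address(aligned_base):
    """Mimic set_offset(1) on a page-aligned buffer: never aligned."""
    if not is_page_aligned(aligned_base):
        raise ValueError("base address must be page aligned")
    return aligned_base + 1
```

This is why the tests now start from create_page_aligned() and shift by one byte instead of hoping that a small allocation happens to be unaligned.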
11 years ago mon: OSDMonitor: Make 'osd pool rename' idempotent
Joao Eduardo Luis [Fri, 25 Oct 2013 02:33:53 +0000 (03:33 +0100)]
mon: OSDMonitor: Make 'osd pool rename' idempotent

'ceph osd pool rename' takes two arguments: source pool and dest pool.
If by chance 'source pool' does not exist and 'destination pool' does,
then, in order to assure it's idempotent, we want to assume that if
'source pool' no longer exists, it is because it was already renamed.

However, while we will return success in such case, we want to make sure
to let the user know that we made such an assumption.  Mostly to warn the
user of such a thing in case of a mistake on the user's part (say, the
user didn't notice that the source pool didn't exist, while the dest did),
but also to make sure that the user is not surprised by the command
returning success if the user expected an ENOENT or EEXIST.

Fixes: #6635
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
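
The decision table described above can be sketched in Python as follows. This is an illustration only: the monitor code is C++, and the dict-based pool table, the function name, the message wording, and the handling of the case where both pools exist are assumptions.

```python
import errno


def rename_pool(pools, src, dst):
    """Rename src to dst in a {name: id} dict; return (code, message)."""
    if src in pools and dst in pools:
        return (-errno.EEXIST, f"pool '{dst}' already exists")
    if src in pools:
        pools[dst] = pools.pop(src)
        return (0, f"pool '{src}' renamed to '{dst}'")
    if dst in pools:
        # Idempotent case: succeed, but say which assumption was made.
        return (0, f"pool '{src}' does not exist; pool '{dst}' does, "
                   "assuming the rename was already done")
    return (-errno.ENOENT, f"pool '{src}' does not exist")
```

Issuing the same rename twice therefore returns 0 both times, with the second call carrying the explanatory message.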
11 years ago Merge pull request #770 from dachary/master
Sage Weil [Fri, 25 Oct 2013 23:00:35 +0000 (16:00 -0700)]
Merge pull request #770 from dachary/master

packages: ceph.spec.in is missing make as a build dependency

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago packages: ceph.spec.in is missing make as a build dependency 770/head
Loic Dachary [Fri, 25 Oct 2013 22:46:32 +0000 (00:46 +0200)]
packages: ceph.spec.in is missing make as a build dependency

On a fresh centos-6.4 install, after running yum-builddep ceph and
following the http://ceph.com/docs/next/install/building-ceph/
instructions to:

    cd ceph
    ./autogen.sh
    ./configure
    make

the build fails because make is not installed. This is probably not a
problem for most people, because few developers have not installed make.

Signed-off-by: Loic Dachary <loic@dachary.org>
11 years ago Merge pull request #769 from ceph/wip-copy-get
Gregory Farnum [Fri, 25 Oct 2013 20:57:21 +0000 (13:57 -0700)]
Merge pull request #769 from ceph/wip-copy-get

With this branch we make copy-get significantly easier to extend by applying our standard encode/decode stuff to it, instead of doing an inline encode-onto-the-payload. We also add some infrastructure for dealing with completion of RepGathers.

Reviewed-by: Sage Weil <sage@inktank.com>
11 years ago Objecter: expose the copy-get()'ed object's category 769/head
Greg Farnum [Wed, 9 Oct 2013 22:07:07 +0000 (15:07 -0700)]
Objecter: expose the copy-get()'ed object's category

In the OSD, store the category in the CopyOp using this.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago osd: add category to object_copy_data_t
Greg Farnum [Fri, 25 Oct 2013 20:41:29 +0000 (13:41 -0700)]
osd: add category to object_copy_data_t

We don't bump the encoding version -- and stick it in the middle --
since it's still brand-new. For simplicity, we encode it unconditionally
rather than trying to embed it alongside the attrs or with its own
"complete" flag in the cursor.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago OSD: add back CEPH_OSD_OP_COPY_GET, and use it in the Objecter
Greg Farnum [Wed, 9 Oct 2013 17:39:19 +0000 (10:39 -0700)]
OSD: add back CEPH_OSD_OP_COPY_GET, and use it in the Objecter

This one is encoded with version information. We are not doing anything
to control which op gets sent by the client, but after discussion with
Sam we think this op isn't accessible enough to clients (right now it's
only triggered by a client sending copy-from, which can only happen via
ceph-test-rados) to require compatibility versioning.
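
To illustrate why carrying version information in the payload makes an op
easier to extend, here is a minimal sketch of a versioned, length-prefixed
encoding. It is not Ceph's encode/decode infrastructure: CopyData, Buffer,
encode, decode and the byte layout are hypothetical and chosen only for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified payload; not Ceph's object_copy_data_t.
struct CopyData {
  std::uint64_t size = 0;
  std::string category;  // field introduced in version 2 of this sketch
};

using Buffer = std::vector<std::uint8_t>;

// Append a 64-bit value in little-endian byte order.
inline void put_u64(Buffer& b, std::uint64_t v) {
  for (int i = 0; i < 8; ++i) b.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a little-endian 64-bit value from [pos, end) and advance pos.
inline std::uint64_t get_u64(const Buffer& b, std::size_t& pos, std::size_t end) {
  if (end > b.size() || pos > end || end - pos < 8) throw std::out_of_range("short buffer");
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i) v |= static_cast<std::uint64_t>(b[pos + i]) << (8 * i);
  pos += 8;
  return v;
}

// Layout (assumed for this sketch): [version:1][body_len:8][body...]
inline Buffer encode(const CopyData& d, std::uint8_t version) {
  Buffer body;
  put_u64(body, d.size);
  if (version >= 2) {
    put_u64(body, d.category.size());
    body.insert(body.end(), d.category.begin(), d.category.end());
  }
  Buffer out;
  out.push_back(version);
  put_u64(out, body.size());
  out.insert(out.end(), body.begin(), body.end());
  return out;
}

inline CopyData decode(const Buffer& in) {
  if (in.empty()) throw std::out_of_range("empty buffer");
  std::size_t pos = 0;
  const std::uint8_t version = in[pos++];
  const std::uint64_t body_len = get_u64(in, pos, in.size());
  if (body_len > in.size() - pos) throw std::out_of_range("truncated body");
  const std::size_t end = pos + static_cast<std::size_t>(body_len);

  CopyData d;
  d.size = get_u64(in, pos, end);
  if (version >= 2) {
    const std::uint64_t n = get_u64(in, pos, end);
    if (n > end - pos) throw std::out_of_range("truncated category");
    d.category.assign(reinterpret_cast<const char*>(in.data()) + pos,
                      static_cast<std::size_t>(n));
    pos += static_cast<std::size_t>(n);
  }
  // Bytes between pos and end belong to a newer version and are ignored.
  return d;
}
```

Because the decoder knows the version and the body length, an older
payload decodes with defaults for the missing fields and a newer payload
decodes with its unknown trailing bytes skipped. An unversioned inline
encoding offers neither guarantee.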

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago OSD: rename CEPH_OSD_OP_COPY_GET -> CEPH_OSD_OP_COPY_GET_CLASSIC
Greg Farnum [Wed, 9 Oct 2013 17:08:24 +0000 (10:08 -0700)]
OSD: rename CEPH_OSD_OP_COPY_GET -> CEPH_OSD_OP_COPY_GET_CLASSIC

In order to introduce versioning of copy-get, we need to make it a
different op that has the versioning infrastructure from the start.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: copy: move the COPY_GET implementation into its own function
Greg Farnum [Fri, 25 Oct 2013 20:50:58 +0000 (13:50 -0700)]
ReplicatedPG: copy: move the COPY_GET implementation into its own function

It was getting long, isn't terribly dependent on access to do_osd_ops()
state, and will be easier to make generic as its own function.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago osd: Add a new object_copy_data_t, and use it in the OSD/Objecter
Greg Farnum [Tue, 8 Oct 2013 21:57:31 +0000 (14:57 -0700)]
osd: Add a new object_copy_data_t, and use it in the OSD/Objecter

Right now this is very primitive, but we're about to extend it to
deal with request versioning appropriately and to add some extra
fields.
Sadly we are doing a little extra copying in the Objecter as a result, but
too bad -- being able to do updates will be worth it.

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: cache: don't handle cache if the obc is blocked
Greg Farnum [Fri, 4 Oct 2013 22:54:21 +0000 (15:54 -0700)]
ReplicatedPG: cache: don't handle cache if the obc is blocked

Right now the only way that can happen is if we're in the middle of a
promote!

Signed-off-by: Greg Farnum <greg@inktank.com>
11 years ago ReplicatedPG: copy: add a C_KickBlockedObject
Greg Farnum [Fri, 4 Oct 2013 22:55:41 +0000 (15:55 -0700)]
ReplicatedPG: copy: add a C_KickBlockedObject

As the name says, you give it an obc and it kicks the block list
when finish()ed.
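
The pattern described above, a completion object that holds an obc and
kicks its blocked requests when finished, can be sketched roughly as
follows. This is not the ReplicatedPG code: ObjectContext, Completion and
KickBlockedObject are hypothetical, simplified stand-ins, and the queue of
std::function callbacks is an assumption made to keep the example
self-contained.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object context with a list of blocked requests.
struct ObjectContext {
  bool blocked = false;
  std::vector<std::function<void()>> waiting;

  // Clear the blocked flag and run everything that queued up meanwhile.
  void kick() {
    blocked = false;
    std::vector<std::function<void()>> ready;
    ready.swap(waiting);
    for (auto& op : ready) op();
  }
};

// Minimal completion interface: complete() forwards the result to finish().
class Completion {
 public:
  virtual ~Completion() = default;
  void complete(int r) { finish(r); }

 protected:
  virtual void finish(int r) = 0;
};

// Holds a reference to the object context and kicks it on completion.
class KickBlockedObject : public Completion {
 public:
  explicit KickBlockedObject(std::shared_ptr<ObjectContext> obc)
      : obc_(std::move(obc)) {
    assert(obc_ != nullptr);
  }

 protected:
  void finish(int /*r*/) override { obc_->kick(); }

 private:
  std::shared_ptr<ObjectContext> obc_;
};
```

Holding the context by shared pointer keeps it alive until the completion
fires, so the callback can be handed to another operation without the
caller tracking the object's lifetime.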

Signed-off-by: Greg Farnum <greg@inktank.com>