git.apps.os.sepia.ceph.com Git - ceph.git/log
12 years agoceph-mon: Attempt to obtain monmap from several possible sources 225/head
Joao Eduardo Luis [Fri, 19 Apr 2013 16:28:37 +0000 (17:28 +0100)]
ceph-mon: Attempt to obtain monmap from several possible sources

In order of interest/priority:

  - our latest monmap version
  - a backup monmap version created during sync start, if the store
    appears to be in a post-aborted sync state
  - a mkfs monmap version

If none of these are found, we should go ahead and try to build a
monmap from ceph.conf to join an existing cluster.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
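
The lookup order listed in the commit above can be illustrated with a small sketch. This is a Python model, not the ceph-mon C++ code: the function name `pick_monmap`, the `store` dictionary, and its keys are hypothetical and only mirror the three sources named in the message.

```python
from typing import Optional, Tuple

# Hypothetical source names, in the priority order given in the commit message.
SOURCES = ("latest", "sync_backup", "mkfs")


def pick_monmap(store: dict, post_aborted_sync: bool = False) -> Optional[Tuple[str, bytes]]:
    """Return (source, monmap) for the first usable source, or None."""
    for source in SOURCES:
        # The backup is only consulted if the store looks like a sync was aborted.
        if source == "sync_backup" and not post_aborted_sync:
            continue
        monmap = store.get(source)
        if monmap is not None:
            return source, monmap
    # Nothing found: the caller would fall back to building a monmap from ceph.conf.
    return None
```

A `None` result corresponds to the last paragraph of the message, where the monitor tries to build a monmap from ceph.conf instead.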
12 years agomon: Monitor: backup monmap prior to starting a store sync
Joao Eduardo Luis [Fri, 19 Apr 2013 16:28:06 +0000 (17:28 +0100)]
mon: Monitor: backup monmap prior to starting a store sync

If by fate we end up attempting a store sync after failing at
least one before, we might not have a monmap to read from the
store to back up.  Therefore, in that case, we shall back up the
current monmap being used by the monitor.

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: MonmapMonitor: add function to obtain latest monmap
Joao Eduardo Luis [Mon, 22 Apr 2013 15:20:37 +0000 (16:20 +0100)]
mon: MonmapMonitor: add function to obtain latest monmap

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: PaxosService: add 'exists_key/version' helper functions
Joao Eduardo Luis [Mon, 22 Apr 2013 15:13:33 +0000 (16:13 +0100)]
mon: PaxosService: add 'exists_key/version' helper functions

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agodoc: Trimmed toc depth for nicer visual appearance.
John Wilkins [Thu, 18 Apr 2013 21:23:47 +0000 (14:23 -0700)]
doc: Trimmed toc depth for nicer visual appearance.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added new PG troubleshooting use case.
John Wilkins [Thu, 18 Apr 2013 21:08:43 +0000 (14:08 -0700)]
doc: Added new PG troubleshooting use case.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated title.
John Wilkins [Thu, 18 Apr 2013 21:08:10 +0000 (14:08 -0700)]
doc: Updated title.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added PG troubleshooting to toctree.
John Wilkins [Thu, 18 Apr 2013 21:07:56 +0000 (14:07 -0700)]
doc: Added PG troubleshooting to toctree.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Bifurcating OSD and PG Troubleshooting. Updated hyperlink.
John Wilkins [Thu, 18 Apr 2013 20:30:50 +0000 (13:30 -0700)]
doc: Bifurcating OSD and PG Troubleshooting. Updated hyperlink.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Bifurcating OSD and PG Troubleshooting. Added PG troubleshooting doc.
John Wilkins [Thu, 18 Apr 2013 20:30:05 +0000 (13:30 -0700)]
doc: Bifurcating OSD and PG Troubleshooting. Added PG troubleshooting doc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Bifurcating OSD and PG Troubleshooting. Removed PG section.
John Wilkins [Thu, 18 Apr 2013 20:29:16 +0000 (13:29 -0700)]
doc: Bifurcating OSD and PG Troubleshooting. Removed PG section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorgw_bucket: Fix dump_index_check.
caleb miles [Thu, 18 Apr 2013 18:09:17 +0000 (14:09 -0400)]
rgw_bucket: Fix dump_index_check.

Signed-off-by: caleb miles <caleb.miles@inktank.com>

12 years agoMerge branch 'wip-max_size-3637' into next
Greg Farnum [Thu, 18 Apr 2013 17:39:03 +0000 (10:39 -0700)]
Merge branch 'wip-max_size-3637' into next

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: journal the projected root xattrs in add_root()
Kuan Kai Chiu [Thu, 18 Apr 2013 06:43:26 +0000 (14:43 +0800)]
mds: journal the projected root xattrs in add_root()

In EMetaBlob::add_root(), we should log the projected root xattrs
instead of original ones to reflect xattr changes.

Signed-off-by: Kuan Kai Chiu <big.chiu@bigtera.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomds: fix setting/removing xattrs on root
Kuan Kai Chiu [Thu, 18 Apr 2013 06:43:25 +0000 (14:43 +0800)]
mds: fix setting/removing xattrs on root

MDS crashes while journaling dirty root inode in handle_client_setxattr
and handle_client_removexattr. We should use journal_dirty_inode to
safely log root inode here.

Signed-off-by: Kuan Kai Chiu <big.chiu@bigtera.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agodebian/control: Fix typo in libboost version number
Gary Lowell [Thu, 18 Apr 2013 15:47:49 +0000 (08:47 -0700)]
debian/control:  Fix typo in libboost version number

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agobuild: Add new package dependencies
Gary Lowell [Wed, 17 Apr 2013 04:14:18 +0000 (21:14 -0700)]
build:  Add new package dependencies

Add libboost-system-dev (bug #4725).

Add hdparm to rpm installation requirements.  The hdparm
command is used to determine if write-caching is enabled on
the journal device.

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
12 years agodoc: Removed legacy man page index. Generates warning otherwise.
John Wilkins [Thu, 18 Apr 2013 01:34:54 +0000 (18:34 -0700)]
doc: Removed legacy man page index. Generates warning otherwise.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Clarified that admin-socket is accessed from same host.
John Wilkins [Thu, 18 Apr 2013 01:34:27 +0000 (18:34 -0700)]
doc: Clarified that admin-socket is accessed from same host.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated hyperlinks to new tshooting section.
John Wilkins [Thu, 18 Apr 2013 01:33:28 +0000 (18:33 -0700)]
doc: Updated hyperlinks to new tshooting section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed this doc. Nothing referenced it, and parent directory echoes content.
John Wilkins [Thu, 18 Apr 2013 01:32:59 +0000 (18:32 -0700)]
doc: Removed this doc. Nothing referenced it, and parent directory echoes content.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Revised top-level ops page.
John Wilkins [Thu, 18 Apr 2013 01:32:10 +0000 (18:32 -0700)]
doc: Revised top-level ops page.

Consolidated authentication into high-level operations. Added a
troubleshooting section. Collapsed toc trees to make the appearance
cleaner.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed link to nowhere. Otherwise generates a warning.
John Wilkins [Thu, 18 Apr 2013 01:30:31 +0000 (18:30 -0700)]
doc: Removed link to nowhere. Otherwise generates a warning.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed top-level tshoot page, and created new index.
John Wilkins [Thu, 18 Apr 2013 01:29:41 +0000 (18:29 -0700)]
doc: Removed top-level tshoot page, and created new index.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Excised community from OSD tshoot, made it stand alone.
John Wilkins [Thu, 18 Apr 2013 01:28:50 +0000 (18:28 -0700)]
doc: Excised community from OSD tshoot, made it stand alone.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved monitor troubleshooting to troubleshooting section.
John Wilkins [Thu, 18 Apr 2013 01:28:16 +0000 (18:28 -0700)]
doc: Moved monitor troubleshooting to troubleshooting section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved troubleshooting OSD to troubleshooting section.
John Wilkins [Thu, 18 Apr 2013 01:27:40 +0000 (18:27 -0700)]
doc: Moved troubleshooting OSD to troubleshooting section.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Added extraneous rgw settings to rgw conf.
John Wilkins [Thu, 18 Apr 2013 01:26:33 +0000 (18:26 -0700)]
doc: Added extraneous rgw settings to rgw conf.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved memory profiling from operations to troubleshooting.
John Wilkins [Thu, 18 Apr 2013 01:25:51 +0000 (18:25 -0700)]
doc: Moved memory profiling from operations to troubleshooting.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved CPU profiling from operations to troubleshooting.
John Wilkins [Thu, 18 Apr 2013 01:25:06 +0000 (18:25 -0700)]
doc: Moved CPU profiling from operations to troubleshooting.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Set toc depth to 1 level, and added troubleshooting so it appears in sidebar.
John Wilkins [Thu, 18 Apr 2013 01:24:24 +0000 (18:24 -0700)]
doc: Set toc depth to 1 level, and added troubleshooting so it appears in sidebar.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Moved journal discussion to OSD ref from Ceph config.
John Wilkins [Thu, 18 Apr 2013 01:23:46 +0000 (18:23 -0700)]
doc: Moved journal discussion to OSD ref from Ceph config.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Reordered deployment tools in toc.
John Wilkins [Thu, 18 Apr 2013 01:22:18 +0000 (18:22 -0700)]
doc: Reordered deployment tools in toc.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed logging from config index. Set depth to 1 for clean appearance.
John Wilkins [Thu, 18 Apr 2013 01:21:35 +0000 (18:21 -0700)]
doc: Removed logging from config index. Set depth to 1 for clean appearance.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed logging. Added references. Reorganized and edited.
John Wilkins [Thu, 18 Apr 2013 01:20:51 +0000 (18:20 -0700)]
doc: Removed logging. Added references. Reorganized and edited.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed. Not in toc, and otherwise generates a warning.
John Wilkins [Thu, 18 Apr 2013 01:19:34 +0000 (18:19 -0700)]
doc: Removed. Not in toc, and otherwise generates a warning.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Updated hyperlink.
John Wilkins [Thu, 18 Apr 2013 01:18:46 +0000 (18:18 -0700)]
doc: Updated hyperlink.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Removed fragmented logging info. Consolidated into one doc.
John Wilkins [Thu, 18 Apr 2013 01:18:10 +0000 (18:18 -0700)]
doc: Removed fragmented logging info. Consolidated into one doc.

Logging was variously described in the ceph configuration document,
a configuration reference, and a section in operations. Since
logging and debugging are generally used with troubleshooting,
I consolidated the docs and placed them in the troubleshooting
section. Also fixed the example and provided additional detail.

fixes: #3804

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agorbd: Only allow shrinking an image when --allow-shrink flag is passed
Wido den Hollander [Mon, 8 Apr 2013 13:18:32 +0000 (15:18 +0200)]
rbd: Only allow shrinking an image when --allow-shrink flag is passed

Signed-off-by: Wido den Hollander <wido@widodh.nl>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
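
The guard this commit describes can be sketched as follows. This is a minimal Python illustration, not the rbd tool's implementation: `check_resize` and its parameters are hypothetical names, and only the `--allow-shrink` flag comes from the commit title.

```python
def check_resize(current_size: int, new_size: int, allow_shrink: bool = False) -> None:
    """Raise ValueError if the resize would shrink the image without opt-in."""
    if new_size < current_size and not allow_shrink:
        raise ValueError("shrinking an image is only allowed with --allow-shrink")
```

Growing an image, or keeping its size, passes without the flag; only a smaller target size requires the explicit opt-in.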
12 years agoclient: disable invalidate callbacks :(
Greg Farnum [Wed, 17 Apr 2013 22:41:19 +0000 (15:41 -0700)]
client: disable invalidate callbacks :(

See #4746; it deadlocks right now.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #219 from ceph/wip-rbd-progress
Josh Durgin [Wed, 17 Apr 2013 22:37:11 +0000 (15:37 -0700)]
Merge pull request #219 from ceph/wip-rbd-progress

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
12 years agorbd: add --no-progress switch 219/head
Sage Weil [Wed, 17 Apr 2013 22:31:36 +0000 (15:31 -0700)]
rbd: add --no-progress switch

Disable progress output to stderr.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoleveldbstore: handle old versions of leveldb
Greg Farnum [Wed, 17 Apr 2013 20:21:04 +0000 (13:21 -0700)]
leveldbstore: handle old versions of leveldb

The filter_policy (bloom filter) stuff is fairly new in LevelDB's life,
and it turns out that precise's version is too old for it. Add conditional
compilation for those members in order to build and work properly.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge remote-tracking branch 'gh/wip-4521-fix' into next
Sage Weil [Wed, 17 Apr 2013 22:03:03 +0000 (15:03 -0700)]
Merge remote-tracking branch 'gh/wip-4521-fix' into next

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: change XLOCK/XLOCKDONE's next state to LOCK
Yan, Zheng [Fri, 12 Apr 2013 08:11:11 +0000 (16:11 +0800)]
mds: change XLOCK/XLOCKDONE's next state to LOCK

For simplelock and filelock, XLOCK/XLOCKDONE's next state is SYNC.
But a filelock in XLOCK/XLOCKDONE state allows Fb caps, while a filelock
in SYNC state does not. So a filelock can be stuck in XLOCK/XLOCKDONE
state forever if there are Fb caps issued.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
12 years agomds: pass proper mask to CInode::get_caps_issued
Yan, Zheng [Fri, 12 Apr 2013 08:11:09 +0000 (16:11 +0800)]
mds: pass proper mask to CInode::get_caps_issued

There is a total of 22 cap bits and file lock uses 8 cap bits.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agomon: Monitor: convert osdmap_full as well 220/head 221/head
Joao Eduardo Luis [Thu, 4 Apr 2013 17:19:02 +0000 (18:19 +0100)]
mon: Monitor: convert osdmap_full as well

Store conversion wasn't converting the osdmap_full/ versions, only the
incrementals under osdmap/ and the latest full version stashed.  This
would lead to some serious problems during OSDMonitor's update_from_paxos
when the latest stashed didn't correspond to the first available
incremental.

Fixes: #4521
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agomon: PaxosService: add helper function to check if a given version exists
Joao Eduardo Luis [Thu, 4 Apr 2013 17:17:21 +0000 (18:17 +0100)]
mon: PaxosService: add helper function to check if a given version exists

Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoosd/PG.cc: initialize PG::flushed in constructor
Danny Al-Gaaf [Tue, 16 Apr 2013 16:14:49 +0000 (18:14 +0200)]
osd/PG.cc: initialize PG::flushed in constructor

Initialize PG::flushed in constructor with false as
described in doc/dev/osd_internals/pg.rst.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
(cherry picked from commit fb840c8ff75b0c66dfeed48e8558542fe3da4c24)

12 years agoMerge pull request #215 from ceph/wip-leveldb-config
Sage Weil [Wed, 17 Apr 2013 16:49:11 +0000 (09:49 -0700)]
Merge pull request #215 from ceph/wip-leveldb-config

os: bring leveldbstore options up to date

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoFix policy handling for RESTful admin api.
caleb miles [Wed, 17 Apr 2013 15:11:21 +0000 (11:11 -0400)]
Fix policy handling for RESTful admin api.

Signed-off-by: caleb miles <caleb.miles@inktank.com>

12 years agoqa: pull qemu-iotests from ceph.com mirror
Sage Weil [Tue, 16 Apr 2013 23:39:17 +0000 (16:39 -0700)]
qa: pull qemu-iotests from ceph.com mirror

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #214 from ceph/wip-objectcacher-handler-ordered
Sage Weil [Tue, 16 Apr 2013 22:48:15 +0000 (15:48 -0700)]
Merge pull request #214 from ceph/wip-objectcacher-handler-ordered

keep write responses to clones in order

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agolibrbd: flush on diff_iterate
Sage Weil [Tue, 16 Apr 2013 22:45:41 +0000 (15:45 -0700)]
librbd: flush on diff_iterate

The diff_iterate() tests fail when caching is enabled because recent writes
aren't visible to listsnaps.  Flush from diff_iterate to ensure that they
are.  Someday, maybe, we might make diff_iterate() inspect the cache
contents to make this more efficient, but for now that is not necessary.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge branch 'next' of https://github.com/ceph/ceph into next
John Wilkins [Tue, 16 Apr 2013 20:29:15 +0000 (13:29 -0700)]
Merge branch 'next' of https://github.com/ceph/ceph into next

12 years agodoc: Cherry-picked from master to next. Uses ceph-mds package during upgrade.
John Wilkins [Tue, 16 Apr 2013 20:28:18 +0000 (13:28 -0700)]
doc: Cherry-picked from master to next. Uses ceph-mds package during upgrade.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Cherry-picked from master to next. Rewrite of CloudStack document.
John Wilkins [Tue, 16 Apr 2013 20:26:32 +0000 (13:26 -0700)]
doc: Cherry-picked from master to next. Rewrite of CloudStack document.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Cherry-picked from master to next. Updates config to use virtio.
John Wilkins [Tue, 16 Apr 2013 20:24:47 +0000 (13:24 -0700)]
doc: Cherry-picked from master to next. Updates config to use virtio.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Cherry-picked from master to next. Reorders ceph osd create.
John Wilkins [Tue, 16 Apr 2013 20:23:56 +0000 (13:23 -0700)]
doc: Cherry-picked from master to next. Reorders ceph osd create.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agodoc: Cherry picked from master to next. Adds comments on naming OSDs.
John Wilkins [Tue, 16 Apr 2013 20:22:13 +0000 (13:22 -0700)]
doc: Cherry picked from master to next. Adds comments on naming OSDs.

Signed-off-by: John Wilkins <john.wilkins@inktank.com>
12 years agoos/FileJournal: fix journal completion plug removal
Sage Weil [Tue, 16 Apr 2013 15:26:47 +0000 (08:26 -0700)]
os/FileJournal: fix journal completion plug removal

We plug completions when transitioning from a full to non-full journal
to ensure that we do not complete items before we have a stable journal
starting point that is past the committed_thru marker.  However, the order
of the header update and completion queueing means that we never remove
the plug if the journalq is empty--the seq test is always false.  The
result is very slow osd requests that only commit when we do a full sync.

This bug was masked until recently by another issue, fixed in
170d4a3d794260476ecde1e5e2ee719b7cb3ffd1.

The simple fix is to reorder the completion queuing before we update the
new header.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
12 years agoconfig: provide settings for the LevelDB stores we use 215/head
Greg Farnum [Tue, 16 Apr 2013 17:59:21 +0000 (10:59 -0700)]
config: provide settings for the LevelDB stores we use

Now that we can set up the LevelDB options internally, provide
config options on the OSD and the Monitor. We leave the OSD values
at the defaults for now as they're performance-sensitive, but we
set new values on the Monitor so that it can scale to large PGMaps.
(Previously there were issues with large PGMaps taking forever to write;
these changes to the use of compression and the default block and
write buffers counteract them.)

Since we pass these variables through, users who are interested in
doing so now can test and tune them more appropriately.

Reported-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoclient: Fix inode remove from snaprealm race
Sam Lang [Fri, 12 Apr 2013 16:08:35 +0000 (11:08 -0500)]
client: Fix inode remove from snaprealm race

This is a follow on fix to b5ce4d0.  Always remove the inode from the
snaprealm's list of inodes_with_caps before the snaprealm ref is
decremented (and the snaprealm potentially gets freed).

Fixes #4694.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agolibrbd: use initialized data for DiffIterateDiscard test
Sage Weil [Tue, 16 Apr 2013 04:49:38 +0000 (21:49 -0700)]
librbd: use initialized data for DiffIterateDiscard test

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agolibrbd: print seed for all DiffIterate tests
Sage Weil [Tue, 16 Apr 2013 04:32:03 +0000 (21:32 -0700)]
librbd: print seed for all DiffIterate tests

This will aid debugging on failures, and give better coverage.

Signed-off-by: Sage Weil <sage@inktank.com>
12 years agoMerge pull request #217 from alram/master
Sage Weil [Tue, 16 Apr 2013 03:32:46 +0000 (20:32 -0700)]
Merge pull request #217 from alram/master

Fix: use absolute path with udev

Reviewed-by: Sage Weil <sage@inktank.com>
12 years agoFix: use absolute path with udev 217/head
Alexandre Marangone [Mon, 15 Apr 2013 22:57:00 +0000 (15:57 -0700)]
Fix: use absolute path with udev

Avoids the following: udevd[61613]: failed to execute '/lib/udev/bash'
'bash -c 'while [ ! -e /dev/mapper/....

Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
12 years agoqa: add workunit for running qemu-iotests
Josh Durgin [Sat, 13 Apr 2013 00:33:45 +0000 (17:33 -0700)]
qa: add workunit for running qemu-iotests

This uses the old stand-alone qemu-iotests repo so it works with the
version of qemu in Ubuntu 12.04. The tests depend tightly on qemu
version, so to use later tests we'd need to install corresponding
versions of qemu.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoos: bring leveldbstore options up to date
Greg Farnum [Wed, 10 Apr 2013 22:58:42 +0000 (15:58 -0700)]
os: bring leveldbstore options up to date

LevelDB has a lot of options which we don't implement right now. Add
an options struct to the LevelDBStore which users can access as they
wish in order to set values different from the defaults.
This will let us set various size values, as well as turning on
caching or bloom filter read optimizations.

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agomds: output error number when failing to load an MDSTable
Greg Farnum [Fri, 12 Apr 2013 20:12:03 +0000 (13:12 -0700)]
mds: output error number when failing to load an MDSTable

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoinit-radosgw.sysv: New radosgw init file for rpm based systems
Gary Lowell [Wed, 20 Feb 2013 01:25:27 +0000 (17:25 -0800)]
init-radosgw.sysv:  New radosgw init file for rpm based systems

Added init-radosgw.sysv file for rpm based systems, added it to
the tarball list in the makefile, and updated the specfile to
install it.  Also added a dependency on ceph since it uses
utility routines from that package (on debian systems these are
packaged in ceph-common).  Incorporated review comments from
Alex. (Bug #4571)

Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Reviewed-by: Alexandre Marangone <alexandre.marangone@inktank.com>
12 years agomds: only go through the max_size change rigamarole if the client requested it
Greg Farnum [Fri, 12 Apr 2013 00:42:59 +0000 (17:42 -0700)]
mds: only go through the max_size change rigamarole if the client requested it

The previous patch was forcing a new size change even if we were
doing it as part of our regular optimistic settings; we don't much
want to do that. This is a small optimization, but Sage asked for
it and it's very easy.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agomds: Locker needs to remember requested max_size changes from clients
Greg Farnum [Fri, 12 Apr 2013 00:30:52 +0000 (17:30 -0700)]
mds: Locker needs to remember requested max_size changes from clients

Previously, if we received an MClientCaps request containing a change
in the inode's max size, and _do_cap_update() was unable to process
the request immediately (due to a locking issue), we would wait-list
the request by adding a call to check_inode_max_size() once the lock
became stable. However, we then tossed out the message without in any
way propagating the new max size which had been requested!

Handle this by extending check_inode_max_size to also accept parameters
for increasing the max size, and by storing all the parameters explicitly
in the C_MDL_CheckMaxSize Context instead of relying on defaults. That
gets us to the point where we *can* notice we need to increase the max. To
actually do so, we now pass calc_new_client_ranges() the requested max
size instead of the actual size if we're doing an update.

Notice that as a side effect of this, all clients get to see the max size
increase instead of just the requester. This should be okay, but it is
chattier than in the optimal case (where we don't get stuck on a lock).

Fixes #3637

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge pull request #213 from ceph/wip-sessionmap-4644
Sam Lang [Thu, 11 Apr 2013 16:08:04 +0000 (09:08 -0700)]
Merge pull request #213 from ceph/wip-sessionmap-4644

mds: fix session_info_t decoding

Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agoMerge pull request #212 from ceph/wip-4451
Gregory Farnum [Thu, 11 Apr 2013 15:45:06 +0000 (08:45 -0700)]
Merge pull request #212 from ceph/wip-4451

12 years agomds: Delay export on missing inodes for reconnect 212/head
Sam Lang [Tue, 9 Apr 2013 15:35:19 +0000 (10:35 -0500)]
mds: Delay export on missing inodes for reconnect

The reconnect caps sent by the client on reconnect may not have
inodes found in the inode cache until after clientreplay (when
the client creates a new file, for example). Currently, we send an
export for that cap to the client if we don't see an inode in the cache
and path_is_mine() returns false (for example, if the client didn't
send a path because the file was already unlinked).
Instead, we want to delay handling of the reconnect cap until
clientreplay completes.

This patch modifies handle_client_reconnect() so that we don't assume
the cap isn't ours if we don't have an inode for it, but instead delay
recovery for later. An export cap message is only sent if the inode exists
and the cap isn't ours (non-auth) during reconnect. If any remaining
recovered caps exist in the recovered list once the mds goes active, we
send export messages at that point.

Also, after removing the path_is_mine check,
MDCache::parallel_fetch_traverse_dir() needs to skip non-auth dirfrags.

Fixes #4451.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoclient: Unify session close handling
Sam Lang [Thu, 4 Apr 2013 20:59:56 +0000 (15:59 -0500)]
client:  Unify session close handling

If an mds failure causes client reconnect while the
client is unmounting, the client will send a session
close request to the mds even if there are outstanding
inodes in the cache waiting to receive flush_acks.   This
causes the mds to send back a session close message and
the client closes the connection, so that when the mds tries
to send flush acks back to the client, they get dropped, resulting
in the client hanging on unmount.  The pattern for this bug is:

1. mds restart
2. client sends session open request
3. client unmount sets unmounting flag and waits for flush_acks
4. mds sends session open reply
5. client sends session close request (because it's unmounting)
6. mds sends session close, client closes connection
7. mds tries to send flush_acks, but drops them because the connection
is gone

This patch unifies the session close handling so that the client
only sends a session close in unmount once all flush acks have been
received.  If the mds restarts during session close, the reconnect
logic will kick the session close waiter so that session close requests
are re-sent for session close replies not yet received.

Signed-off-by: Sam Lang <sam.lang@inktank.com>
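
The core rule of the fix above (send the session close only after all flush acks have arrived) can be modelled in a few lines. This is a toy Python sketch under stated assumptions, not the Ceph client code: the class and method names are hypothetical, and the re-send of close requests after an mds restart is not modelled.

```python
class UnmountingClient:
    """Toy model: hold back the session close until flush acks arrive."""

    def __init__(self, pending_flush_acks: int):
        self.pending_flush_acks = pending_flush_acks
        self.close_sent = False

    def try_close(self) -> bool:
        """Send the session close only when no flush acks are outstanding."""
        if self.pending_flush_acks == 0 and not self.close_sent:
            self.close_sent = True
        return self.close_sent

    def on_flush_ack(self) -> None:
        """Record one flush ack and re-check whether the close may go out."""
        self.pending_flush_acks -= 1
        self.try_close()
```

In the failing sequence listed in the message, step 5 would correspond to `try_close()` returning `False` while acks are still pending, so the connection stays open for step 7.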
12 years agoLibrbdWriteback: complete writes strictly in order 214/head
Josh Durgin [Wed, 10 Apr 2013 21:16:56 +0000 (14:16 -0700)]
LibrbdWriteback:  complete writes strictly in order

RADOS returns writes to the same object in the same order. The
ObjectCacher relies on this assumption to make sure previous writes
are complete and maintain consistency. Reads, however, may be
reordered with respect to each other. When writing to an rbd clone,
reads to the parent must be performed when the object does not exist
in the child yet. These reads may be reordered, resulting in the
original writes being reordered. This breaks the assumptions of the
ObjectCacher, causing an assert to fail.

To fix this, keep a per-object queue of outstanding writes to an
object in the LibrbdWriteback handler, and finish them in the order in
which they were sent.

Fixes: #4531
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
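
The per-object queue described in the commit above can be sketched like this. It is a simplified Python model, not the LibrbdWriteback C++ handler: `OrderedWriteTracker`, `submit`, and `complete` are hypothetical names, and each write is reduced to a transaction id plus a callback.

```python
from collections import defaultdict, deque


class OrderedWriteTracker:
    """Finish writes to each object in the order they were submitted."""

    def __init__(self):
        # object id -> queue of [tid, done, callback] in submission order
        self._queues = defaultdict(deque)

    def submit(self, oid, tid, callback):
        """Remember an outstanding write to object `oid`."""
        self._queues[oid].append([tid, False, callback])

    def complete(self, oid, tid):
        """Mark a write as done, then run callbacks for the finished prefix."""
        queue = self._queues[oid]
        for entry in queue:
            if entry[0] == tid:
                entry[1] = True
                break
        # Only the head of the queue may finish; later writes wait behind it.
        while queue and queue[0][1]:
            _, _, callback = queue.popleft()
            callback()
        if not queue:
            del self._queues[oid]
```

If the second write to an object completes first, its callback is held until the first write completes, which restores the ordering the ObjectCacher relies on.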
12 years agoOSD: make pg upgrade logging quiet
Samuel Just [Wed, 10 Apr 2013 21:13:12 +0000 (14:13 -0700)]
OSD: make pg upgrade logging quiet

Fixes: #4701
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoMerge branch 'wip_4654' into next
Samuel Just [Wed, 10 Apr 2013 21:00:13 +0000 (14:00 -0700)]
Merge branch 'wip_4654' into next

Fixes: #wip_4654
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agorbd qa/workunits: add rbd read data test
Alex Elder [Wed, 10 Apr 2013 20:44:01 +0000 (15:44 -0500)]
rbd qa/workunits: add rbd read data test

This adds a new test script for validating that data read from a
mapped rbd image is what it's expected to be.

See the content of the file for a bit more explanation.

Signed-off-by: Alex Elder <elder@inktank.com>
12 years agorgw_admin: Create keys for a new user by default.
caleb miles [Wed, 10 Apr 2013 19:00:06 +0000 (15:00 -0400)]
rgw_admin: Create keys for a new user by default.

Create a new key pair for new users or when --gen-access-key is specified.

Signed-off-by: caleb miles <caleb.miles@inktank.com>
12 years agoFileJournal: start_seq is seq+1 if journalq.empty()
Samuel Just [Tue, 9 Apr 2013 22:14:19 +0000 (15:14 -0700)]
FileJournal: start_seq is seq+1 if journalq.empty()

This is also the same as journaled_seq + 1 for writeahead
journaling, but not for parallel journaling.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoFileJournal: fix off by one error in committed_thru
Samuel Just [Tue, 9 Apr 2013 22:13:38 +0000 (15:13 -0700)]
FileJournal: fix off by one error in committed_thru

journalq.front().first is the sequence number of the entry
at journalq.front().second.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournal: commits may not include all journaled seqs
Samuel Just [Tue, 9 Apr 2013 21:53:52 +0000 (14:53 -0700)]
Journal: commits may not include all journaled seqs

At one point, a commit had to drain the FileStore op
queue.  This is no longer the case.  Consequently, the
journal may have to wait more than one commit for the
filestore to create a stable commit point at a particular
sequence.  Handling this requires two changes:

1) We cannot transition to FULL_WAIT until we receive
a commit_start on a seq >= journaled_seq.
2) We cannot remove the journal completion plug until we get
a committed_thru on a seq >= header.start_seq at least as
new as the oldest committed item in the journal.  If on
replay, the journal does not include fs_op_seq, we ignore
it, which is fine since we won't have reported those
entries committed!

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agoJournal: pass the sequence number to commit_start
Samuel Just [Tue, 9 Apr 2013 21:18:51 +0000 (14:18 -0700)]
Journal: pass the sequence number to commit_start

A subsequent patch will need to see the committing seq.

Signed-off-by: Samuel Just <sam.just@inktank.com>
12 years agomds: fix session_info_t decoding 213/head
Yan, Zheng [Fri, 5 Apr 2013 05:58:36 +0000 (13:58 +0800)]
mds: fix session_info_t decoding

commit 0bcf2ac081 changes session_info_t's format, but there is
a typo in the code that decodes the old format. We also need to
handle struct_v == 1, which had the same encoding but without
the size guards (which is all handled by DECODE_START_LEGACY_COMPAT).

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoLibrbdWriteback: removed unused and undefined method
Josh Durgin [Wed, 10 Apr 2013 19:22:02 +0000 (12:22 -0700)]
LibrbdWriteback: removed unused and undefined method

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoLibrbdWriteback: use a tid_t for tids
Josh Durgin [Wed, 10 Apr 2013 19:06:36 +0000 (12:06 -0700)]
LibrbdWriteback: use a tid_t for tids

An int could be much smaller, leading to overflow and bad behavior.
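
A small sketch of the size mismatch, assuming tid_t is a 64-bit unsigned integer (the `tid64` alias and `fits_in_int` helper are hypothetical and exist only for this illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for tid_t; the 64-bit unsigned width is an assumption.
using tid64 = uint64_t;

// True if the tid can be stored in an int without losing information.
bool fits_in_int(tid64 tid) {
  return tid <= static_cast<tid64>(std::numeric_limits<int>::max());
}
```

Any tid for which `fits_in_int` returns false would be mangled by an int variable, which is why the type is carried through unchanged.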

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoWritebackHandler: make read return nothing
Josh Durgin [Wed, 10 Apr 2013 19:03:04 +0000 (12:03 -0700)]
WritebackHandler: make read return nothing

The tid returned by reads is ignored, and would make tracking writes
internally more difficult by using the same id-space as them. Make read
void and update all implementations.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoObjectCacher: deduplicate final part of flush_set()
Josh Durgin [Mon, 1 Apr 2013 21:51:46 +0000 (14:51 -0700)]
ObjectCacher: deduplicate final part of flush_set()

Both versions of flush_set() did the same thing. Move it into a
helper called from both.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agotest_stress_watch: remove bogus asserts
Josh Durgin [Wed, 10 Apr 2013 18:35:46 +0000 (11:35 -0700)]
test_stress_watch: remove bogus asserts

There's no reason to check the duration of a watch. The notify will
time out after 30s on the OSD, but there's no guarantee the client will
see that in any bounded time. This test is really meant as a stress
test of the OSDs anyway, not of the clients, so just remove asserts
about operation duration.

Fixes: #4591
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sam Just <sam.just@inktank.com>
12 years agotest: update rbd formatted-output for progress changes
Josh Durgin [Wed, 10 Apr 2013 17:43:13 +0000 (10:43 -0700)]
test: update rbd formatted-output for progress changes

Progress output now goes to stderr instead of stdout.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
12 years agoMerge branch 'wip-journaler-4618' into next
Greg Farnum [Tue, 9 Apr 2013 23:00:41 +0000 (16:00 -0700)]
Merge branch 'wip-journaler-4618' into next

Reviewed-by: Sam Lang <sam.lang@inktank.com>
12 years agoconfig: fix osd_client_message_cap comment
Greg Farnum [Tue, 9 Apr 2013 19:11:27 +0000 (12:11 -0700)]
config: fix osd_client_message_cap comment

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agoMerge remote-tracking branch 'origin/wip-osd-throttle2' into next
Greg Farnum [Tue, 9 Apr 2013 19:11:15 +0000 (12:11 -0700)]
Merge remote-tracking branch 'origin/wip-osd-throttle2' into next

Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoFileJournal: clarify meaning of start_seq and fix initialization
Samuel Just [Tue, 9 Apr 2013 17:27:50 +0000 (10:27 -0700)]
FileJournal: clarify meaning of start_seq and fix initialization

Second guessing the first sequence number from the FileStore
was silly and broke tests which had the temerity to start at
1 instead of 2...

Fixes: #4687
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
12 years agoRevert "global: call config observers on global_init (and start logging!)"
Greg Farnum [Tue, 9 Apr 2013 01:20:53 +0000 (18:20 -0700)]
Revert "global: call config observers on global_init (and start logging!)"

This reverts commit a30917746614275baeb718e902133f06ef44fba6. This commit
includes calls that involve Mutexes, Lockers, and lockdep -- which isn't
yet set up, so things break horribly. A more subtle approach is required.

Signed-off-by: Greg Farnum <greg@inktank.com>
12 years agomon: Use _daemon version of argparse functions
Dan Mick [Mon, 8 Apr 2013 20:52:32 +0000 (13:52 -0700)]
mon: Use _daemon version of argparse functions

Allow argparse functions to fail if no argument is given by using
special versions that avoid the default CLI behavior of "cerr/exit".

Fixes: #4678
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
12 years agoceph_argparse: add _daemon versions of argparse calls
Dan Mick [Mon, 8 Apr 2013 20:49:22 +0000 (13:49 -0700)]
ceph_argparse: add _daemon versions of argparse calls

mon needs to call argparse for a couple of -- options, and the
argparse_witharg routines were attempting to cerr/exit on missing
arguments.  This is appropriate for the CLI usage, but not the daemon
usage.  Add a 'cli' flag that can be set false for the daemon usage
(and cause the parsing routine to return false instead of exit).
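
The effect of such a flag can be sketched as follows. This is not the ceph_argparse API: `parse_witharg`, its parameters, and the error text are hypothetical, and only the "print and exit for the CLI, return false for the daemon" behavior comes from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: if args[i] is `opt`, store the following
// argument in *out and advance i past both. When the argument is
// missing, a CLI caller gets an error message and exit(1), while a
// daemon caller (cli == false) just gets false back.
bool parse_witharg(const std::vector<std::string>& args, std::size_t& i,
                   const std::string& opt, std::string* out, bool cli) {
  if (i >= args.size() || args[i] != opt)
    return false;  // not this option
  if (i + 1 >= args.size()) {
    if (cli) {
      std::cerr << "Option " << opt << " requires an argument." << std::endl;
      std::exit(1);
    }
    return false;  // daemon usage: report failure, let the caller decide
  }
  *out = args[i + 1];
  i += 2;
  return true;
}
```

With `cli` set to false, the caller stays in control of error handling and can log or ignore the malformed option instead of terminating the process.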

The daemon's parsing code is due for a rewrite soon.

Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>