Sage Weil [Mon, 8 Feb 2016 16:34:11 +0000 (11:34 -0500)]
global/global_init: chown log, asok if drop privs is deferred
If we are deferring the privilege drop, then we are still root
and need to explicitly chown the log file and admin socket.
Note that this is a fragile solution: if there are other files
that we create or open for write between now and when privs are
eventually dropped, we need to explicitly handle them, too.
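The actual change lives in the C++ global_init code. As an illustration
only, the idea can be sketched in Python: while still privileged, hand
each already-created file to the target user. The chown_paths helper and
its arguments are hypothetical names for this sketch, not part of ceph.

```python
import os


def chown_paths(paths, uid, gid):
    """Give each existing path in `paths` to uid:gid; return those changed."""
    changed = []
    for path in paths:
        # Skip unset or not-yet-created files (e.g. no admin socket yet).
        if path and os.path.exists(path):
            os.chown(path, uid, gid)
            changed.append(path)
    return changed
```

Any file created later, before privileges are finally dropped, would have
to be added to the list by hand, which is the fragility noted above.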
Loic Dachary [Sun, 24 Jan 2016 10:07:58 +0000 (17:07 +0700)]
ceph-disk: use the type file for bluestore
The type file in the OSD bluestore data exists and contains the
bluestore string. ceph-disk activate should use it instead of
the "osd objectstore" configuration value. It is better in case the
configuration file changes between prepare and activate.
The fsid file cannot be used by bluestore to signify that ceph-osd
--mkfs has completed successfully because it is pre-populated by
ceph-disk. Introduce the mkfs_done file, dedicated to this, instead of
overloading an existing file.
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
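A minimal sketch of how the type and mkfs_done files described above
could be consulted. The helper names and the "filestore" fallback are
assumptions made for illustration; they are not taken from ceph-disk.

```python
import os


def read_one_line(osd_dir, name):
    """Return the stripped content of osd_dir/name, or None if missing."""
    path = os.path.join(osd_dir, name)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()


def objectstore_type(osd_dir, default="filestore"):
    # Prefer what was recorded on disk at prepare time over the current
    # configuration value. The default is an assumption of this sketch.
    return read_one_line(osd_dir, "type") or default


def mkfs_completed(osd_dir):
    # A dedicated marker: the presence of the file alone means mkfs
    # finished, so a pre-populated fsid file cannot be mistaken for it.
    return os.path.exists(os.path.join(osd_dir, "mkfs_done"))
```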
Loic Dachary [Mon, 1 Feb 2016 11:26:05 +0000 (18:26 +0700)]
tests: ceph-disk tests pid files must exist
http://tracker.ceph.com/issues/13422 made it so ceph-osd won't start
unless the pidfile can be created successfully. The default location
being the current directory, ceph-osd must explicitly be told to write
in a directory where it has write permissions.
Loic Dachary [Thu, 28 Jan 2016 04:59:10 +0000 (11:59 +0700)]
ceph-disk: bluestore deactivate / destroy
It is straightforward because it entirely relies on information
collected by ceph-disk list which has full support for bluestore.
It loops on all possible auxiliary devices (as found in Spaces.NAMES)
and does the associated deactivate / destruction which is merely about
handling dmcrypt map / unmap.
Loic Dachary [Thu, 28 Jan 2016 04:12:47 +0000 (11:12 +0700)]
ceph-disk: bluestore list
The objectstore journal and the bluestore block auxiliary device are
handled in the same way. Each occurrence of journal in the code is
replaced with a variable.
A few helpers are added to the Ptype class to factorize the most common
lookups but the code logic is unmodified with one exception: the
more_osd_info previously added a journal_uuid entry regardless. If there
was no journal_uuid file, it would be None. It is changed to only add
the {block,journal}_uuid entry if the corresponding file exists.
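The behavior change can be sketched as follows. This is not the
more_osd_info function itself: uuid_entries and its arguments are
hypothetical, and only the "add the entry only if the file exists" rule
comes from the description above.

```python
import os


def uuid_entries(osd_dir, names=("journal", "block")):
    """Map '<name>_uuid' to the file content, for the files that exist."""
    info = {}
    for name in names:
        key = name + "_uuid"
        path = os.path.join(osd_dir, key)
        if os.path.exists(path):
            with open(path) as f:
                info[key] = f.read().strip()
        # No else branch: a missing file adds no entry, not a None value.
    return info
```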
Loic Dachary [Thu, 28 Jan 2016 04:53:49 +0000 (11:53 +0700)]
ceph-disk: bluestore trigger
Copy paste the journal code and s/journal/block/
More work will be needed to support multiple auxiliary
devices (block.wal etc). But the goal is to minimize the change because
this commit is part of a series of commits focusing on refactoring
prepare, not the entire ceph-disk codebase.
Loic Dachary [Thu, 28 Jan 2016 04:48:55 +0000 (11:48 +0700)]
ceph-disk: bluestore activate
Only support the block file for now. The refactoring consists of
replacing main_activate_journal with main_activate_space and a name
argument (block, journal). More work will be needed to support multiple
auxiliary devices (block.wal etc). But the goal is to minimize the
change because this commit is part of a series of commits focusing on
refactoring prepare, not the entire ceph-disk codebase.
Loic Dachary [Thu, 28 Jan 2016 04:43:22 +0000 (11:43 +0700)]
ceph-disk: bluestore prepare
Only support the block file for now. It is handled the same as the
journal, only with a different name (block) and its own set of ptypes
depending on multipath or dmcrypt.
Loic Dachary [Tue, 19 Jan 2016 09:49:40 +0000 (16:49 +0700)]
ceph-disk: refactor prepare
The logic / code path is only modified to the extent necessary for the
refactor.
The Prepare class roughly replaces the prepare_main function but also
handles the prepare subcommand argument parsing. It creates the data and
journal objects and delegates the actual work to them via the prepare()
method.
The Prepare class assumes that preparing an OSD consists of the
following phases:
* optionally prepare auxiliary devices, such as the journal
* prepare a data directory or device
* populate the data directory with fsid etc. and optionally
symbolic links to the auxiliary devices
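The three phases listed above can be sketched as a small set of
cooperating objects. All class and method names below are hypothetical
stand-ins chosen for this illustration; the log list only records the
order in which the phases run.

```python
class SpaceSketch:
    """Stands in for an auxiliary device such as the journal."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def prepare(self):
        self.log.append("prepare " + self.name)

    def populate_data_path(self, path):
        # e.g. add a symlink from the data directory to the device
        self.log.append("link %s/%s" % (path, self.name))


class DataSketch:
    """Stands in for the data directory or device."""

    def __init__(self, path, log):
        self.path = path
        self.log = log

    def prepare(self, *spaces):
        self.log.append("prepare data " + self.path)
        self.log.append("populate " + self.path)  # fsid etc.
        for space in spaces:
            space.populate_data_path(self.path)


class PrepareSketch:
    def __init__(self, data, *spaces):
        self.data = data
        self.spaces = spaces

    def prepare(self):
        for space in self.spaces:          # phase 1: auxiliary devices
            space.prepare()
        self.data.prepare(*self.spaces)    # phases 2 and 3
```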
The PrepareDefault class is derived from Prepare and implements the
current model where there only is one auxiliary device, the journal.
The PrepareJournal class implements the *journal* functions
and is based on a generic class, PrepareSpace which handles the
allocation of an auxiliary device. The only journal specific feature is
left to the PrepareJournal class: querying the OSD to figure out if
a journal is wanted or not.
The OSD data directory is prepared via the PrepareData class. It creates
a file system if necessary (i.e. if a device) and populates the data
directory. Further preparation is then delegated to the auxiliary
devices (i.e. adding a symlink to the device for a journal).
There were some code paths related to dmcrypt / multipath devices in
the prepare functions, although that concern is orthogonal. A class tree
for Devices was created to isolate them.
Although that was the primary reason for adding a new class tree, two
other aspects have also been moved there: ptypes and partition creation.
The ptypes are organized into a data structure with a few helpers in
the hope it will be easier to maintain. All references to the *_UUID
variables have been updated.
The creation of a partition is delegated to sgdisk and a wrapper helps
reduce the code redundancy.
The ptype of a given partition depends on the type of the device (is it
dmcrypt'ed or a multipath device?). It is best implemented by
derivation so the prepare function does not need to be concerned about
how the ptype of a partition is determined.
Many functions could be refactored into a Device class and its
derivatives, but that was not done to minimize the size of the refactor.
* Device: knows how to create a partition and figure out the ptype tobe
* DevicePartition: a regular device partition
* DevicePartitionMultipath: a partition of a multipath device
* DevicePartitionCrypt: base class for luks/plain dmcrypt, can map/unmap
* DevicePartitionCryptPlain: knows how to set up dmcrypt plain
* DevicePartitionCryptLuks: knows how to set up dmcrypt luks
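To illustrate why derivation keeps the prepare code unaware of how a
ptype is chosen, here is a minimal sketch. The class names follow the
list above, but the kind attribute, the ptype_for method, and the PTYPES
table with its placeholder strings are assumptions of this sketch (the
real ptypes are GPT partition type UUIDs).

```python
# Placeholder values only, keyed by device kind and then by space name.
PTYPES = {
    kind: {name: "%s-%s-ptype" % (kind, name) for name in ("journal", "block")}
    for kind in ("regular", "mpath", "plain", "luks")
}


class DevicePartition:
    kind = "regular"

    def ptype_for(self, name):
        # Each subclass only overrides `kind`; the lookup is shared.
        return PTYPES[self.kind][name]


class DevicePartitionMultipath(DevicePartition):
    kind = "mpath"


class DevicePartitionCrypt(DevicePartition):
    """Base class for the dmcrypt variants (map/unmap would live here)."""


class DevicePartitionCryptPlain(DevicePartitionCrypt):
    kind = "plain"


class DevicePartitionCryptLuks(DevicePartitionCrypt):
    kind = "luks"


def describe(partition, name):
    # The caller never inspects the device type itself.
    return partition.ptype_for(name)
```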
The CryptHelpers class is introduced to factorize the code snippets that
were duplicated in various places but that do not really belong
because they are convenience wrappers to figure out:
* if dmcrypt should be used
* the keysize
* the dmcrypt type (plain or luks)
Loic Dachary [Tue, 19 Jan 2016 11:33:05 +0000 (18:33 +0700)]
tests: workaround ceph-disk global side effects
Because some variables are global in ceph-disk, tests that modify them
interact with each other in unpredictable ways. This will go away
eventually but requires a significant refactor. Work around it by
running one py.test per test file.
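The workaround amounts to one test process per file, so module-level
globals never leak from one test file into the next. A rough sketch,
assuming test files match test_*.py and pytest is importable; the helper
names are hypothetical and this is not the actual build integration.

```python
import glob
import os
import subprocess
import sys


def pytest_commands(test_dir, pattern="test_*.py"):
    """Build one pytest command line per test file, in sorted order."""
    files = sorted(glob.glob(os.path.join(test_dir, pattern)))
    return [[sys.executable, "-m", "pytest", path] for path in files]


def run_each(test_dir):
    # A fresh interpreter per file means fresh module globals per file.
    results = [subprocess.call(cmd) for cmd in pytest_commands(test_dir)]
    return all(code == 0 for code in results)
```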
Loic Dachary [Tue, 19 Jan 2016 09:19:03 +0000 (16:19 +0700)]
ceph-disk: make all must setup.py install
Refactor the test / virtualenv setup in the same way it was done for
ceph-detect-init.
All shell tests use ceph-helpers.sh which is modified to add ceph-disk /
ceph-detect-init virtualenv/bin to the PATH to ensure the source version
is used even if ceph is installed.
See "ceph-detect-init: make all must setup.py install"
Loic Dachary [Tue, 19 Jan 2016 09:15:43 +0000 (16:15 +0700)]
tests: fix ceph-disk unit tests
Because ceph-disk unit tests were not run as part of make check, some of
the most recent changes broke them. This is a batch fix to sanitize the
situation. Since they are now run with make check, that won't happen again.
Loic Dachary [Tue, 19 Jan 2016 08:55:52 +0000 (15:55 +0700)]
ceph-detect-init: make all must setup.py install
When make all runs in the ceph-detect-init module, it does a "setup.py
build" which is not used. Replace it with a python setup.py install in a
virtualenv so that tests can add the virtualenv/bin to their PATH and
call ceph-detect-init from sources as they would if it was installed.
Part of run-tox.sh is moved to tools/setup-virtualenv.sh so that it can
be re-used by ceph-disk and other python modules.
Sage Weil [Wed, 3 Feb 2016 15:51:01 +0000 (10:51 -0500)]
os/bluestore: change block file mkfs behavior
Previously, if path was set, we'd make a symlink. Otherwise, if
size was set, we'd create a file and resize it accordingly. This
meant that setting the size always created the block "device"
files, which is only useful for debugging, whereas we also want to
set a size that can be used by ceph-disk when creating partitions.
Instead, if path is set, make a symlink. Then/also, if size is
set, and the file/symlink points to a regular file, and that
regular file is 0 bytes, then resize it. This way, vstart.sh
(or a dev) can just touch the file and then mkfs will size it up.
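The new rule can be sketched in Python, although the real code is C++ in
os/bluestore. The setup_block function and its arguments are hypothetical
names for this illustration; a missing file is left alone here, since the
description above only covers an existing 0-byte regular file.

```python
import os
import stat


def setup_block(osd_dir, path=None, size=0, name="block"):
    block = os.path.join(osd_dir, name)
    # Step 1: if a path is configured, make the symlink.
    if path and not os.path.lexists(block):
        os.symlink(path, block)
    # Step 2: if a size is configured, grow only an empty regular file.
    if size and os.path.exists(block):
        st = os.stat(block)  # follows the symlink, if any
        if stat.S_ISREG(st.st_mode) and st.st_size == 0:
            with open(block, "r+b") as f:
                f.truncate(size)
```

With this rule, touching the block file beforehand is enough for the
later mkfs step to size it up, while a real block device or an already
sized file is never resized.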