Ken Dreyer [Wed, 6 Sep 2017 17:12:54 +0000 (11:12 -0600)]
rpm: better comments for file removals
As of 54d7a81241eac26d87e2bee513df5efd8866a586, we do more than strip
CoreOS files in the RPM. Move the comments around so they better match
the code they describe.
The purpose of this change is to make it easier to read this part of the
RPM spec file.
Sébastien Han [Mon, 4 Sep 2017 20:13:17 +0000 (22:13 +0200)]
shrink-mon: wait a little bit for the mon to be out
Monitor removal from the monmap is not immediate, so wait a little bit
and then fail if the monitor is still in the monmap.
We try twice in total, with 10-second intervals.
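A rough sketch of what such a wait could look like; the mon_to_kill variable and the exact ceph query are illustrative, not the playbook's actual task:

    - name: wait for the monitor to be removed from the monmap
      command: ceph mon dump -f json
      register: monmap
      until: mon_to_kill not in monmap.stdout
      retries: 2
      delay: 10
      changed_when: false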
Sébastien Han [Thu, 31 Aug 2017 09:22:33 +0000 (11:22 +0200)]
ceph-defaults: fix handlers for mds and rgw
The way we handle the restart for both mds and rgw is not ideal: it
tries to restart the daemon on hosts that don't run the daemon,
resulting in a service file being created (see bug description).
Now we restart each daemon precisely and in a serialized fashion.
Note: the current implementation does NOT support multiple mds or rgw on
the same node.
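An illustrative sketch of a serialized, targeted restart; the group and systemd unit names are assumptions, not the actual handler:

    - name: restart ceph mds daemons one host at a time
      service:
        name: "ceph-mds@{{ hostvars[item]['ansible_hostname'] }}"
        state: restarted
      delegate_to: "{{ item }}"
      with_items: "{{ groups['mdss'] | default([]) }}"
      run_once: true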
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1469781
Signed-off-by: Sébastien Han <seb@redhat.com>
Keith Schincke [Thu, 17 Aug 2017 17:25:20 +0000 (13:25 -0400)]
Update ceph_rgw_docker_extra_env to add bind ip
This patch adds passing RGW_CIVETWEB_IP to the docker container. This
IP defaults to the value of radosgw_civetweb_bind_ip, which itself
defaults to ipv4.default.
Without this value, the RGW container will bind to 0.0.0.0.
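A sketch of how the variables could be wired together in group_vars; the '-e' docker env syntax and the ansible_default_ipv4 source are assumptions here:

    radosgw_civetweb_bind_ip: "{{ ansible_default_ipv4.address }}"
    ceph_rgw_docker_extra_env: "-e RGW_CIVETWEB_IP={{ radosgw_civetweb_bind_ip }}"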
Sébastien Han [Thu, 31 Aug 2017 06:29:30 +0000 (08:29 +0200)]
rgw: cleanup old code and remove systemd condition
Remove the old check that predates systemd.
We only support systemd, so there is no need to condition on it. The
playbook will fail if systemd is not present.
Sébastien Han [Wed, 30 Aug 2017 08:44:18 +0000 (10:44 +0200)]
site-docker.yml.sample: delegate facts
Now we can use --limit on the container deployment too. This is useful
when deploying client nodes, e.g.:
ansible-playbook -i inventory -l clients site-docker.yml.sample
Sébastien Han [Thu, 3 Aug 2017 13:30:25 +0000 (15:30 +0200)]
common: refactor installation method
The installation process is now described as follows (see the sketch
after this list):
* you still have to choose a 'ceph_origin' installation method. The
origin can be 'repository' (add a new repository), 'distro' (use the
packages provided by your distribution's native repositories), or
'local' (only available on Red Hat systems; it installs locally built
packages). The 'local' option is not well tested, so use it carefully
* if ceph_origin == 'repository' you will have to decide what kind of
repository you want to enable:
- community: corresponds to the stable upstream/community version
- enterprise: corresponds to the stable enterprise/downstream version
(basically you are a Red Hat customer)
- dev: installs ceph from packages built out of the GitHub development
branches
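An illustrative group_vars snippet; the 'ceph_repository' variable name for the repository kind is an assumption based on the description above:

    ceph_origin: repository      # or 'distro', or 'local'
    ceph_repository: community   # or 'enterprise', or 'dev'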
Sébastien Han [Mon, 28 Aug 2017 22:16:31 +0000 (00:16 +0200)]
ceph-docker-common: fix empty array
The list cannot be evaluated properly if it contains '[]', which is the
case when using the filter "default([])". To fix this, we have to
properly merge the lists.
This fixes the issue: "list object has no element 1"
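A minimal sketch of the idea with hypothetical variable names, concatenating both lists before iterating instead of leaving a literal '[]' element behind:

    - name: iterate over the merged list
      debug:
        msg: "{{ item }}"
      with_items: "{{ (base_config_keys | default([])) + (extra_config_keys | default([])) }}"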
Sébastien Han [Mon, 28 Aug 2017 21:23:36 +0000 (23:23 +0200)]
ceph-docker-common: detect ceph version
By detecting the ceph version running in the container we can easily
apply conditions like:
ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous
We do that already, in ceph-docker-common/tasks/fetch_configs.yml.
fatal: [magna005]: FAILED! => {"failed": true, "msg": "The conditional
check 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous'
failed. The error was: error while evaluating conditional
(ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous):
'dict object' has no attribute 'dummy'\n\nThe error appears to have been
in
'/home/ubuntu/ceph-ansible/roles/ceph-docker-common/tasks/fetch_configs.yml':
line 2, column 3, but may\nbe elsewhere in the file depending on the
exact syntax problem.\n\nThe offending line appears to be:\n\n---\n-
name: register rbd bootstrap key\n ^ here\n"}
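A rough sketch of how such a detection could look; the image variables and the version-string parsing are assumptions, not the actual tasks:

    - name: detect the ceph version running in the container
      command: docker run --rm --entrypoint ceph {{ ceph_docker_image }}:{{ ceph_docker_image_tag }} --version
      register: ceph_version_out
      changed_when: false

    - name: register the release name as a fact
      set_fact:
        ceph_release: "{{ ceph_version_out.stdout.split(' ')[4] }}"  # e.g. 'luminous'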
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1486062
Signed-off-by: Sébastien Han <seb@redhat.com>
Sébastien Han [Mon, 28 Aug 2017 10:04:49 +0000 (12:04 +0200)]
ceph-docker-common: do not log inside the container
Logging inside the container is not useful since it writes to the
overlayfs partition, resulting in potential performance degradation of
the container.
If you need to check the logs, just look at journald.
Sébastien Han [Fri, 12 May 2017 13:59:52 +0000 (15:59 +0200)]
rolling_update: nicer way to set osd flags
Prior to this patch, we were applying the osd flags like this:
"
General pre tasks
Set flags
Upgrade OSDs on a host
Unset flags <-- this triggers pending scrub to start
Set flags
Upgrade OSDs on a host
Unset flags <-- this triggers pending scrub to start
.
.
.
General post tasks
"
Now instead, we apply the flags once before starting the OSD updates
and unset them once the last OSD is finished.
"
General pre tasks
Set flags and wait for any scrubs to finish
Upgrade OSDs on a host
Upgrade OSDs on a host
.
.
.
Unset flags
General post tasks
"
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1450754
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
Andrew Schoen [Thu, 24 Aug 2017 15:05:46 +0000 (10:05 -0500)]
ceph-config: when using local_action set become: false
There should be no need to use sudo when writing or using these files.
Using sudo there creates an issue when the user running
ansible-playbook does not have sudo privileges.
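An illustrative task only; the content variable and destination path are made-up examples:

    - name: write a generated file on the ansible controller without sudo
      local_action:
        module: copy
        content: "{{ generated_ceph_conf }}"
        dest: "{{ playbook_dir }}/fetch/ceph.conf"
      become: false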
Sébastien Han [Thu, 24 Aug 2017 14:19:39 +0000 (16:19 +0200)]
ceph-mon: detect ANSIBLE_ROLES_PATH if present
Some deployments can't copy infrastructure playbooks outside of the
infrastructure-playbooks directory, so they use ANSIBLE_ROLES_PATH to
overcome this. However, some roles have 'playbook_dir' hardcoded, which
results in a wrong path since the execution comes from
infrastructure-playbooks. Basically, a role triggered by a playbook
from infrastructure-playbooks believes that the roles live in
infrastructure-playbooks/roles. This commit fixes that.
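A sketch of the idea only; the fact name is illustrative and the real task in ceph-mon may differ:

    - name: prefer ANSIBLE_ROLES_PATH over the hardcoded playbook_dir guess
      set_fact:
        ceph_roles_path: "{{ lookup('env', 'ANSIBLE_ROLES_PATH') | default(playbook_dir + '/roles', true) }}"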
Sébastien Han [Thu, 24 Aug 2017 07:28:22 +0000 (09:28 +0200)]
site: delegate fact to all the hosts
Before this patch we couldn't use --limit properly to interact with
only a particular set of hosts: we basically always required the
ceph-mon role to be played in order to properly gather facts and then
build the ceph.conf.
Now, the current running host will get the facts from the machines that
are not part of the current play. This is achieved with the help of the
new delegate_facts option; for more info see:
http://docs.ansible.com/ansible/latest/playbooks_delegation.html#delegated-facts
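A simplified play sketch of the pattern; the group names are illustrative:

    - hosts: mons
      gather_facts: false
      tasks:
        - name: gather facts from every host, even outside the current --limit
          setup:
          delegate_to: "{{ item }}"
          delegate_facts: true
          with_items: "{{ groups['all'] }}"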
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1482067
Signed-off-by: Sébastien Han <seb@redhat.com>
This commit eases the use of the
infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml
playbook. We basically run it with a couple of pre-tasks and then let
the playbook run the docker roles.
It obviously expects the proper variables to be configured in order to
work.
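An assumed invocation, using the playbook path given above:

    ansible-playbook -i inventory infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml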
Andrew Schoen [Wed, 23 Aug 2017 13:59:57 +0000 (08:59 -0500)]
ceph-osd: restructure lvm_volumes variable for more flexibility
The lvm_volumes variable is now a list of dictionaries that represent
each OSD you'd like to deploy using ceph-volume. Each dictionary must
have the following keys: data, journal and data_vg. Each dictionary can
also optionally provide a journal_vg key.
The 'data' key represents the lv name used for the OSD and the 'data_vg'
key is the vg name that the given lv resides on. The 'journal' key is
either an lv, device or partition. The 'journal_vg' key is optional and
must be the vg name for the journal lv if given. This key is mainly used
for purging the journal lv when purge-cluster.yml is run.
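An illustrative example following the description above; the lv, vg and device names are made up:

    lvm_volumes:
      - data: data-lv1        # lv used for the OSD
        data_vg: vg1          # vg the data lv resides on
        journal: journal-lv1  # an lv, device or partition
        journal_vg: vg2       # optional, vg of the journal lv (used by purge-cluster.yml)
      - data: data-lv2
        data_vg: vg1
        journal: /dev/sdb1    # a partition given directly needs no journal_vg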