ceph-*-build: evaluate num jobs using total mem not free mem
there is a chance that the system is under extremely high load when
`vmstat` is called, so the "free memory" cannot reflect the memory
available when building Ceph. use "total memory" instead.
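A minimal sketch of why total memory is the steadier input. The commit reads its figure from `vmstat`; this sketch instead assumes a Linux-style `/proc/meminfo` layout, and the helper name `total_memory_mb` and the sample text are illustrative, not taken from the build scripts:

```python
def total_memory_mb(meminfo_text):
    """Return MemTotal in MB from /proc/meminfo-style text (values are in kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemTotal not found")


# Illustrative sample: MemTotal stays fixed while MemFree drops under load.
SAMPLE = "MemTotal:       65536000 kB\nMemFree:          512000 kB\n"
print(total_memory_mb(SAMPLE))
```

On a real builder the text would come from reading `/proc/meminfo` rather than from a literal string.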
David Galloway [Tue, 4 Sep 2018 17:43:41 +0000 (13:43 -0400)]
ansible: Install python-pip from epel on centos slaves
I'm not sure why we're just now seeing this. Perhaps it was manually
fixed on prado.ceph.com and a redeploy overwrote it? Regardless,
non-libvirt slaves are currently failing to join Jenkins.
Signed-off-by: David Galloway <dgallowa@redhat.com>
Sébastien Han [Tue, 28 Aug 2018 17:23:55 +0000 (10:23 -0700)]
ceph-ansible-pr-syntax-check: group_vars operator
When defaults/main.yml is touched we generate 2 group_vars files, one
for all.yml.sample and one for rhcs.yml.sample. This makes the
conditional never pass, since there is one more value in the count.
Basically, if defaults/main.yml and osd/defaults/main.yml are touched,
this results in 3 group_vars files but the count for
defaults/main.yml is 2.
In the end we want to fail if the number of defaults/main.yml files is
greater than the number of group_vars files.
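The comparison described above can be sketched roughly as follows. The function name `group_vars_in_sync` and the file paths in the example are assumptions for illustration; the real check lives in the job's shell script:

```python
def group_vars_in_sync(changed_files):
    """Pass unless more defaults/main.yml files changed than group_vars files."""
    n_defaults = sum(1 for f in changed_files if f.endswith("defaults/main.yml"))
    n_group_vars = sum(1 for f in changed_files if f.startswith("group_vars/"))
    # Using "greater than" rather than "not equal" tolerates the extra
    # rhcs.yml.sample that is generated alongside all.yml.sample.
    return n_defaults <= n_group_vars


# Hypothetical changed-file list: 2 defaults files, 3 group_vars files.
changed = [
    "roles/ceph-defaults/defaults/main.yml",
    "roles/ceph-osd/defaults/main.yml",
    "group_vars/all.yml.sample",
    "group_vars/rhcs.yml.sample",
    "group_vars/osds.yml.sample",
]
print(group_vars_in_sync(changed))
```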
Kefu Chai [Wed, 22 Aug 2018 07:27:32 +0000 (15:27 +0800)]
ceph-*-build: set jobs number according to free memory
limits the number of jobs when building ceph/ceph pull requests and deb
packages to (size of free memory in MB)/1800.
we are probably using more compile-time optimizations now, so compiling
the ceph source requires more memory. sometimes a single cc1plus takes
more than 3GB of memory. that's why we are seeing more and more OOMs
on our arm64 builders. following is a sample from omani09 -- an
arm64 builder compiling a ceph/ceph PR targeting master:
also, the overall compile performance is impacted by the I/O
subsystem, so lowering the number of jobs could actually reduce the time
the compiling processes spend competing for the I/O queue of the local
device. so we can use a conservative number to calculate an upper bound
on the number of jobs for "make" instead of using $(nproc). in this change,
$(free_memory_in_mega / 1800) is used as the upper limit of n_jobs.
on a typical arm64 builder with 48 cores and 64 GB mem, n_jobs is
now 34.
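A minimal sketch of the bound described above. The 1800 MB divisor comes from the commit message; the function name `max_jobs`, the cap at the core count, and the floor of one job are assumptions added for illustration:

```python
MB_PER_JOB = 1800  # per-job memory budget quoted in the commit message


def max_jobs(free_memory_mb, nproc):
    """Upper bound on make jobs: limited by memory, capped at nproc, at least 1."""
    by_memory = free_memory_mb // MB_PER_JOB
    return max(1, min(nproc, by_memory))


# Illustrative figure: a 48-core builder reporting 61200 MB of free memory.
print(max_jobs(61200, 48))
```

With these inputs memory is the limiting factor, so the result stays well below the 48 jobs that `$(nproc)` would have allowed.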
when building rpm packages, the number of build jobs is specified by
_smp_mflags macro, which is defined by
/usr/lib/rpm/platform/*/macros and /usr/lib/rpm/redhat/macros.
see
https://github.com/rpm-software-management/rpm/blob/master/platform.in#L53
and rhel/centos use the following patch:
https://git.centos.org/blob/rpms!redhat-rpm-config.git/eaaa6282147d0797a3733f3b91671b7a0752d448/SOURCES!redhat-rpm-config-9.1.0-ncpus-max.patch;jsessionid=xv8lqw4ipwwetge0i19ejo9t
so one cannot build rpm packages on centos/rhel with more than 16 jobs
when using redhat-rpm-config. and 16 is a safe number for us.
Andrew Schoen [Fri, 24 Aug 2018 12:54:46 +0000 (08:54 -0400)]
ceph-volume-ansible-prs: disable centos7 tests
We're having infrastructure issues with OVH and the centos7 vagrant
vm kernel panics after booting. Disabling these for now until the issue
can be resolved.
Andrew Schoen [Wed, 22 Aug 2018 19:38:47 +0000 (15:38 -0400)]
ceph-volume-ansible-prs: deploy the target branch, not the source branch
We're going to start deploying the target branch, the branch we're
merging PRs into, instead of the source branch. This way we don't have to
wait on packages as we can assume there is something already available
for the target branch. The ceph-volume code we're testing will be
rsynced to the testing nodes from the source branch.
Sébastien Han [Wed, 22 Aug 2018 09:30:53 +0000 (11:30 +0200)]
ceph-ansible-pr-syntax-check: add test_ceph_release_in_ceph_default function
This new function searches for statements like ' -
ceph_release_num[ceph_release] ' in the ceph-defaults role. These must
never be used since ceph_release is set by a role that runs after
ceph-defaults.
However, the ceph-ansible CI passes this variable when running, so it's
always defined, even before ceph-defaults runs. This is a peculiarity
of our CI. So if this is done, our CI won't complain, but general users
will end up with an error like "'dict object' has no attribute" since the
variable does not exist at the time of the ceph-defaults play.
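The search can be approximated with a simple pattern scan. This is a sketch only: the name `find_forbidden_lines` and the sample task text are hypothetical, and the real function is a shell helper in the job script:

```python
import re

# Matches ceph_release_num[ceph_release], tolerating spaces inside the brackets.
PATTERN = re.compile(r"ceph_release_num\[\s*ceph_release\s*\]")


def find_forbidden_lines(text):
    """Return (line number, stripped line) for every offending statement."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if PATTERN.search(line)
    ]


# Hypothetical snippet from a ceph-defaults task file.
sample = (
    "- name: example task\n"
    "  when:\n"
    "    - ceph_release_num[ceph_release] >= ceph_release_num.luminous\n"
)
print(find_forbidden_lines(sample))
```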
Sébastien Han [Wed, 22 Aug 2018 09:25:52 +0000 (11:25 +0200)]
ceph-ansible-pr-syntax-check: add two functions to search files
Adding 2 new functions:
* git_diff_to_head, which prints the diff for the current new code
* match_file, which searches for and prints a particular file that was
modified by the current code.
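A rough Python analogue of the two helpers, for illustration only. The originals are shell functions; the names `changed_files` and the two-argument form of `match_file`, the use of `git diff --name-only`, and the glob-style matching are all assumptions:

```python
import fnmatch
import subprocess


def changed_files(base="HEAD"):
    """List paths that differ from base; assumes git is on PATH and cwd is a repo."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def match_file(pattern, files):
    """Return the changed paths matching a shell-style pattern."""
    return [f for f in files if fnmatch.fnmatch(f, pattern)]


# Hypothetical list of changed paths.
files = ["roles/ceph-osd/defaults/main.yml", "README.rst"]
print(match_file("roles/*/defaults/main.yml", files))
```

In a checkout, `match_file(pattern, changed_files())` would combine the two steps.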
Sébastien Han [Mon, 20 Aug 2018 08:39:21 +0000 (10:39 +0200)]
ceph-ansible-pr-syntax-check: use latest ansible version
There is no reason to pin to an old ansible version. Since this is doing
syntax checking we should have the latest Ansible version to compare
against our latest ceph-ansible code.
Sébastien Han [Mon, 13 Aug 2018 12:34:52 +0000 (14:34 +0200)]
ceph-ansible-pr-syntax-check: add check for coding convention
One of our conventions in ceph-ansible is to not use capital letters in
task names. So let's hardcode it in the initial phase of the pipeline.
Also, this will let users discover what's wrong without us doing the
review and finding the problem.
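One way such a convention check might look. This sketch assumes the convention concerns the first letter of a task name (a stricter reading would reject any uppercase character), and the name `capitalized_task_names` is hypothetical:

```python
import re

# Matches YAML list items of the form "- name: something".
TASK_NAME = re.compile(r"^\s*-\s+name:\s*(.+)$")


def capitalized_task_names(yaml_text):
    """Return (line number, task name) for names that start with a capital letter."""
    offenders = []
    for number, line in enumerate(yaml_text.splitlines(), 1):
        match = TASK_NAME.match(line)
        # Strip a leading quote so quoted names are checked on their first letter.
        if match and match.group(1).lstrip("\"'")[:1].isupper():
            offenders.append((number, match.group(1)))
    return offenders


# Hypothetical task file content.
tasks = "- name: Install packages\n  yum: name=foo\n- name: install pip\n"
print(capitalized_task_names(tasks))
```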
Adding this environment variable allows tests/tox.sh in ceph/ceph-container to
know whether we are running in a nightly job.
This is needed because the ceph-container project checks whether some code
has been changed; if not, the test is not run.
Since nightly jobs will always enter this condition, we must have a
way to know that we are running a nightly so we can skip this test.
Once https://github.com/ceph/ceph/pull/22847 is merged, this commit
will remove all the custom hacks from this script, as they moved directly
into ./run-make-check.sh
Sébastien Han [Tue, 3 Jul 2018 08:26:43 +0000 (10:26 +0200)]
ceph-ansible-pr-syntax-check: use target branch to compare
We must use the target branch as a reference when we try to find the
number of commits pushed.
We use the 'ghprbTargetBranch' env variable, which comes from Jenkins's
injectedEnvVars.
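A sketch of how the target branch could drive the commit count. `ghprbTargetBranch` is the variable named above; the remote name `origin`, the helper name `commit_count_command`, and the choice of `git rev-list --count` are assumptions:

```python
import os


def commit_count_command(env=None):
    """Build the git command that counts commits on HEAD missing from the target branch."""
    env = os.environ if env is None else env
    target = env["ghprbTargetBranch"]  # injected by Jenkins for pull request builds
    return ["git", "rev-list", "--count", f"origin/{target}..HEAD"]


# Hypothetical PR opened against a stable branch.
print(commit_count_command({"ghprbTargetBranch": "stable-3.1"}))
```

Because the range is built from the target branch, a PR against a stable branch is compared with that branch rather than with master.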
Sébastien Han [Tue, 3 Jul 2018 08:25:09 +0000 (10:25 +0200)]
Revert "ceph-ansible-pr-syntax-check: use master instead of HEAD"
This reverts commit 7c971b82771ef6766206b7ef0716a8705fdbe649.
This is the wrong fix: if you push a PR on a stable branch, the
comparison will happen against master, which we don't want.
Andrew Schoen [Mon, 2 Jul 2018 21:48:00 +0000 (16:48 -0500)]
scripts: collect all logs in /var/log/ceph on ansible failures
On the jobs that use ceph-ansible to test ceph, logs are collected from
the vms on failure. However, if the cluster name was not ceph, the find
command would fail to collect any ceph-volume logs.
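The idea of collecting everything under the log directory, rather than filtering by name, can be sketched like this. The helper name `collect_logs`, the temporary directory, and the sample file names are illustrative; the real job runs a find command against /var/log/ceph on the test vms:

```python
import tempfile
from pathlib import Path


def collect_logs(log_dir):
    """Return every regular file under log_dir, whatever its name."""
    return sorted(p for p in Path(log_dir).rglob("*") if p.is_file())


# Demo against a temporary directory standing in for /var/log/ceph.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("ceph-volume.log", "test-osd.0.log"):
        Path(tmp, name).write_text("")
    collected = [p.name for p in collect_logs(tmp)]
print(collected)
```

Since no file name pattern is involved, a cluster named something other than ceph no longer changes what gets collected.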
This will carry over our custom sha1 variable when "Rebuild" is clicked.
Previously, it was possible to click "Rebuild" and leave the `sha1`
variable empty, which caused every branch/PR to get built.
Signed-off-by: David Galloway <dgallowa@redhat.com>