adami nodes have 96 processing units (nproc). With the old script logic,
we were compiling 90 build jobs on those 96 processing units (total
memory / 3000 MiB per job). This combination of 90 jobs on 96 processing
units caused many instances of memory overconsumption on adami nodes.
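
The 90 is just the old flat estimate worked through on an adami-sized
node; shown here only as an illustration, not part of the patch:

    # old logic: int(total memory / 3000 MiB per job)
    $ awk 'BEGIN { print int(270036 / 3000) }'
    90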
The new logic takes into account that jobs use more memory than we
previously assumed. With it, adami nodes will compile 67 build jobs on
96 processing units, which will hopefully avoid most of these instances
of memory overconsumption. (Total memory on adami nodes is generally
~270036 MiB, so 270036 / 4000 = 67.)
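
To sanity-check the new number on a live node with nproc > 50, the
script's pipeline can be run by hand with the new 4000 MiB divisor
(illustration only, not part of the patch):

    # prints ~67 on an adami node with ~270036 MiB total memory
    $ vmstat --stats --unit m | grep 'total memory' | awk '{print int($1/4000)}'
    67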
braggi nodes do not have this problem; they have 48 processing units.
Compiling 48 build jobs on 48 processing units has been working well on
braggi nodes, and the new logic changes nothing there or on any other
node with nproc <= 50.
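
These numbers assume the job count is also capped at nproc, which is why
braggi stays at 48 even though its memory-derived limit is higher. A
minimal standalone sketch of that cap (the exact line in the script is
not part of this hunk, and the default value here is hypothetical):

    # cap the memory-derived limit at the number of processing units
    nproc=$(nproc)
    max_build_jobs=${max_build_jobs:-67}   # e.g. the memory-derived value
    [ "$max_build_jobs" -gt "$nproc" ] && max_build_jobs=$nproc
    echo "$max_build_jobs"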
Fixes: https://tracker.ceph.com/issues/57296
Signed-off-by: Laura Flores <lflores@ibm.com>
 }
 get_nr_build_jobs() {
-    # assume each compiling job takes 2200 MiB memory on average
+    # assume each compiling job takes 3000 MiB memory on average when nproc <= 50
+    # otherwise, assume 4000 MiB when nproc > 50
+    # See https://tracker.ceph.com/issues/57296
     local nproc=$(nproc)
-    local max_build_jobs=$(vmstat --stats --unit m | \
+    if [[ $nproc -gt 50 ]]; then
+        local max_build_jobs=$(vmstat --stats --unit m | \
+                                   grep 'total memory' | \
+                                   awk '{print int($1/4000)}')
+    else
+        local max_build_jobs=$(vmstat --stats --unit m | \
                                grep 'total memory' | \
                                awk '{print int($1/3000)}')
+    fi
     if [[ $max_build_jobs -eq 0 ]]; then
         # probably the system is under high load, use a safe number
         max_build_jobs=16