thanks to Speedy Gonzales
on node1
su - xen
cd nobudget/
./newguest.bash auto
while true; do ./newguest.bash auto; sleep 1; done
with three nodes of 32 cores and 64 GiB of RAM each, we managed to get 450 idling PVH guests up. what prevented us from creating more guests was the OOM killer, which killed the lvcreate jobs and eventually the SSH sessions to the dom0. the lvm snapshots for the slack150 template we've used live on nodes 1 and 2.
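to confirm it really was the OOM killer that took out lvcreate, the dom0 kernel log is the place to look. a sketch (the sample message below is made up; the exact format varies by kernel version):

```shell
# on a live dom0 one would run:
#   dmesg -T | grep -Ei 'out of memory|killed process'
# demo of the same grep on a captured sample line (hypothetical):
sample='Out of memory: Killed process 1234 (lvcreate) total-vm:262144kB'
printf '%s\n' "$sample" | grep -Ei 'out of memory|killed process'
```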
root@pmr1:~/xen# running-guests.bash | wc -l
450
root@pmr1:~# lvs | sed 1d | wc -l
554
root@pmr1:~# xl li | sed 1,2d | wc -l
150
root@pmr2:~# lvs | sed 1d | wc -l
489
root@pmr2:~# xl li | sed 1,2d | wc -l
150
root@pmr3:~# lvs | sed 1d | wc -l
85
root@pmr3:~# xl li | sed 1,2d | wc -l
150
the console of any random guest is reachable and responds fine.
conclusion: the bottleneck is not the one we expected. there is still plenty of memory left for Xen guests.
root@pmr1:~# FREE-MEMORY
pmr1: total_memory         : 65461
pmr1: free_memory          : 33520
pmr1: sharing_freed_memory : 0
pmr1: sharing_used_memory  : 0
pmr2: total_memory         : 65461
pmr2: free_memory          : 33520
pmr2: sharing_freed_memory : 0
pmr2: sharing_used_memory  : 0
pmr3: total_memory         : 65461
pmr3: free_memory          : 33524
pmr3: sharing_freed_memory : 0
pmr3: sharing_used_memory  : 0
however, it is the dom0 that is saturated.
root@pmr1:~# FREE-MEMORY-DOM0
pmr1:         total   used   free   shared   buff/cache   available
pmr1: Mem:     1891   1768     31        0           92          73
pmr1: Swap:    7167    136   7031
pmr2:         total   used   free   shared   buff/cache   available
pmr2: Mem:     1891   1751     22        0          117         102
pmr2: Swap:    7167    143   7024
pmr3:         total   used   free   shared   buff/cache   available
pmr3: Mem:     1891   1676     43        3          171         146
pmr3: Swap:    7167     18   7149
no remediation required: we are happy for now with 450 guests up and running in total. we could of course consider raising the dom0 memory to 3 GiB, but that would be overkill.
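for the record, if we ever did raise it: dom0 memory is usually pinned with the `dom0_mem` Xen boot parameter rather than at runtime. a sketch of the grub fragment (file path and variable name are the Debian-style convention, adapt to your bootloader):

```shell
# /etc/default/grub (path assumed): give dom0 a fixed 3 GiB
# and stop it from ballooning, then regenerate grub.cfg and reboot
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=3072M,max:3072M"
```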
let’s stress 5 guests living on node3
ssh pmr-vip -l root -p 1563
ssh pmr-vip -l root -p 1566
ssh pmr-vip -l root -p 1569
ssh pmr-vip -l root -p 1572
ssh pmr-vip -l root -p 1575
install the stress utility
slackpkg update
slackpkg install rsync cyrus-sasl curl lz4 zstd xxHash nghttp2 brotli
slackpkg install kernel-headers patch gcc-11 automake autoconf-2 binutils make guile gc
sbopkg -r
sbopkg -i stress
now stress that node from all 5 guests at once
cat /proc/cpuinfo
stress --cpu 2
meanwhile, try to reach another guest through SSH
ssh pmr-vip -l root -p 1539
==> responds fine
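to make "responds fine" measurable rather than a feeling, one could time a no-op SSH round trip in a loop. a sketch (the `rtt_ms` helper is ours, not from any tool):

```shell
# wall-clock milliseconds of an arbitrary command (bash, GNU date)
rtt_ms() {
    local start end
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
# on a live node, something like:
#   while true; do rtt_ms ssh pmr-vip -l root -p 1539 true; sleep 5; done
# demo with a local stand-in for the ssh round trip:
rtt_ms sleep 0.1
```

a sudden jump in those numbers during the stress runs would show exactly when the dom0 starts to choke.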
now try again, but flooding the guest's RAM
stress --vm 2 --vm-keep
==> fails with ENOMEM
stress: info: [5376] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
stress: FAIL: [5378] (494) hogvm malloc failed: Cannot allocate memory
stress: FAIL: [5376] (394) <-- worker 5378 returned error 1
stress: WARN: [5376] (396) now reaping child worker processes
stress: FAIL: [5376] (451) failed run completed in 0s
finally, check disk i/o as well
stress --hdd 2
==> that other guest already responds quite slowly whenever it issues a disk i/o request, e.g. for bash-completion. shell access got very slow in any case; the guest is almost unusable. the hdparm bench results are not so bad, though
root@dnc1539:~# hdparm -tT /dev/xvda1

/dev/xvda1:
 Timing cached reads:   54950 MB in  2.00 seconds = 27528.01 MB/sec
 Timing buffered disk reads: 140 MB in  3.01 seconds =  46.47 MB/sec
note that the dom0 itself always stays reachable anyhow, as it has a higher scheduling weight.
remediation: TODO: I need to set up an alarm and kill those unfair guests asap
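a first sketch of that alarm, built on `xentop -b` batch output. the column positions, the 180% threshold, and the sample lines below are all assumptions (xentop's layout varies by version), so treat this as a starting point, not the finished watchdog:

```shell
#!/bin/bash
# print names of guests whose CPU(%) (4th column here) exceeds the threshold,
# skipping the header line and Domain-0
find_hogs() {
    awk '$1 != "NAME" && $1 != "Domain-0" && $4+0 > 180 { print $1 }'
}

# made-up sample standing in for `xentop -b -i 2 -d 1` output
printf '%s\n' \
    '      NAME  STATE   CPU(sec) CPU(%)     MEM(k)' \
    '  Domain-0 -----r        912   55.0    1936384' \
    '   dnc1563 -----r       3301  199.8     524288' \
    '   dnc1539 --b---         12    0.3     524288' \
    | find_hogs
```

on a live dom0 the pipeline would become something like `xentop -b -i 2 -d 1 | find_hogs | xargs -r -n1 xl destroy`, ideally with an alert (mail, syslog) fired before the destroy.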