Test: load-test/soak.js
Duration: 58 minutes (5m ramp + 48m sustain + 5m ramp-down)
VUs: 30 (21 casual / 6 active / 3 power — 70/20/10 split)
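The 70/20/10 split above might be expressed as three separate k6 `ramping-vus` scenarios sharing one stage shape. This is a sketch, not the actual soak.js: the scenario names and stage layout are assumptions.

```javascript
// Sketch of a k6 options block matching the 30-VU, 70/20/10 soak profile.
// Scenario names and the shared stage shape are assumptions about soak.js.
const stagesFor = (target) => [
  { duration: "5m", target },    // ramp up
  { duration: "48m", target },   // sustain
  { duration: "5m", target: 0 }, // ramp down
];

const options = {
  scenarios: {
    casual: { executor: "ramping-vus", startVUs: 0, stages: stagesFor(21) },
    active: { executor: "ramping-vus", startVUs: 0, stages: stagesFor(6) },
    power:  { executor: "ramping-vus", startVUs: 0, stages: stagesFor(3) },
  },
};
```

In an actual k6 script this object would be `export const options = {...}`; each scenario runs its own VU pool, so the three peaks sum to the 30 concurrent VUs reported above.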


k6 Result: Threshold Breach (exit 99)

The test ran to full completion. The threshold breach is the same artefact as Test 3 (todos-realistic.js): the p(95)<300ms read threshold is structurally unachievable from Grafana’s US load zone to the Frankfurt VPS (~130ms RTT baseline). This is a network geography issue, not server degradation. No 5xx errors were observed in service logs.
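The "structurally unachievable" claim is simple budget arithmetic (illustrative numbers taken from the observed baseline above):

```javascript
// Why p(95) < 300 ms is hard to hold from a US load zone to Frankfurt:
// even a request on a warm connection costs at least one network RTT,
// leaving little budget for TLS, queueing, and server work at the tail.
const rttMs = 130;                          // observed US -> Frankfurt baseline
const thresholdMs = 300;                    // soak.js read threshold
const serverBudgetMs = thresholdMs - rttMs; // remaining p95 budget: 170 ms
```

With only ~170 ms left after the RTT floor, normal transatlantic path variance pushes the 95th percentile over the line regardless of server health.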


VPS Memory: service-by-service (5-min cron samples)

Service             Start    End      Delta    Limit     % at end
project1_kong       221 MB   244 MB   +23 MB   512 MB    47.7%
project1_realtime   165 MB   165 MB   0 MB     512 MB    32.3%
project1_db         76 MB    75 MB    -1 MB    1024 MB   7.4%
project1_meta       70 MB    69 MB    -1 MB    256 MB    27.2%
project1_rest       16 MB    16 MB    0 MB     256 MB    6.5%
project1_auth       12 MB    13 MB    +1 MB    256 MB    5.3%
project1_storage    57 MB    57 MB    0 MB     256 MB    22.5%
project1_studio     148 MB   147 MB   -1 MB    512 MB    28.7%

VPS host RAM: fluctuated between 2166–2272 MB used out of 3819 MB total. No sustained growth.
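The start/end deltas above can be reduced mechanically from two of the cron snapshots. A sketch follows; the `docker stats` format string is real, but the sample lines and MiB units in the test are illustrative, not the actual cron output:

```javascript
// Reduce two `docker stats --no-stream --format "{{.Name}} {{.MemUsage}}"`
// snapshots to per-service memory deltas.
function snapshotToMap(text) {
  const map = {};
  for (const line of text.trim().split("\n")) {
    // Each line looks like: "project1_kong 221MiB / 512MiB"
    const [name, usage] = line.trim().split(/\s+/);
    map[name] = parseFloat(usage); // "221MiB" -> 221
  }
  return map;
}

function deltas(startText, endText) {
  const start = snapshotToMap(startText);
  const end = snapshotToMap(endText);
  return Object.fromEntries(
    Object.keys(start).map((k) => [k, end[k] - start[k]])
  );
}
```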


Key Finding: Kong Memory Growth

Kong (project1) grew steadily from 221 MB → 244 MB (+23 MB, +10.4%) across the 60-minute sampling window.

Detailed progression:

Interval            Kong mem (MB)   Δ from start
0 min (baseline)    221             —
5 min               223             +2
10 min              224             +3
15 min              225             +4
20 min              226             +5
25 min              228             +7
30 min              229             +8
35 min              232             +11
40 min              234             +13
45 min              237             +16
50 min              240             +19
55 min              242             +21
60 min              244             +23

Rate: ~2 MB per 5-minute interval = ~24 MB/hour under load.

At this rate, the 512 MB limit would be reached after roughly 12 hours of continuous load. Kong idles at ~221 MB (the pre-test baseline), and memory returned to that range after Kong was restarted post-test to restore the rate limit. The growth appears to be Kong’s nginx/Lua shared memory accumulating rate-limiting counters (raised to 10000/min for the test), active connection state, or upstream health data, all of which flush on restart.
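The ~12-hour figure follows directly from the observed rate (all numbers taken from the table above):

```javascript
const limitMB = 512;
const idleMB = 221;       // pre-test baseline
const endMB = 244;        // after 60 min under load
const rateMBperHour = 24; // ~2 MB per 5-min interval

const hoursFromIdle = (limitMB - idleMB) / rateMBperHour; // ~12.1 h from baseline to limit
const hoursFromEnd = (limitMB - endMB) / rateMBperHour;   // ~11.2 h of headroom remaining
```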

Verdict: Kong memory growth is real but slow. At realistic load (not 10000/min rate limit), the counter accumulation would be much smaller. No OOM risk for a typical 8-hour business day. Consider monitoring over 24h+ in production to establish a longer baseline.


Kong CPU Spikes

Both project1 and project2 Kong containers showed elevated CPU readings in the second half of the test (12–17% in point-in-time samples):

Interval   p1_kong CPU   p2_kong CPU
0 min      0.04%         0.04%
15 min     14.1%         0.04%
35 min     2.7%          3.2%
40 min     14.3%         17.4%
45 min     14.5%         14.2%
50 min     14.2%         16.0%
55 min     16.9%         14.7%
60 min     12.6%         12.6%

Project2 was not under load during the test, yet shows the same CPU pattern, which rules out traffic as the cause. The most likely explanation is that the ~5-second docker stats --no-stream snapshot is catching Kong’s nginx worker during a Lua health-check or timer callback (Kong health checks fire every ~10s). This is the same pattern seen in the Falco rule noise from Kong health checks: the CPU is bursty, but the bursts are short.
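The sampling artefact described above can be illustrated with a toy duty-cycle model. The 10 s period matches the health-check cadence; the 1 s burst length and 1 s sample window are assumptions for illustration, not measurements:

```javascript
// A worker busy 1 s out of every 10 s averages 10% CPU, but a 1-second
// point-in-time sample that happens to overlap the burst reads near 100%.
const periodS = 10; // health checks fire every ~10 s
const burstS = 1;   // assumed burst length

const trueAvgPct = (burstS / periodS) * 100; // true average: 10%

// CPU% that a window sample starting at startS would observe.
function sampledPct(startS, windowS = 1) {
  let busy = 0;
  for (let t = startS; t < startS + windowS; t += 0.01) {
    if (t % periodS < burstS) busy += 0.01;
  }
  return (busy / windowS) * 100;
}
```

A sample starting at t=0 lands on the burst and reads ~100%, while one at t=5 reads ~0%; neither matches the true 10% average, which is why repeated point samples of a bursty worker look erratic.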


No Memory Leaks Found

All suspected services were stable:

  • Realtime (BEAM VM): locked at 165 MB throughout — no leak
  • PostgREST: locked at 16 MB — no leak
  • Postgres DB: 75–81 MB, fluctuated slightly with no trend — no leak
  • GoTrue (auth): 11–13 MB — no leak

The primary concern going into this test (Realtime BEAM VM growing under sustained load) did not materialise.


Service Health Post-Test

All 8 project1 services at 1/1 replicas after test completion. Zero restarts.

project1_auth      1/1
project1_db        1/1
project1_kong      1/1
project1_meta      1/1
project1_realtime  1/1
project1_rest      1/1
project1_storage   1/1
project1_studio    1/1

Summary

Question                         Answer
Memory leaks in Realtime?        No — stable at 165 MB
Memory leaks in DB?              No — stable at 75–81 MB
Memory leaks in Kong?            Slow growth (~24 MB/hr under load) — not an OOM risk for runs under 12 hr
DB connection pool exhaustion?   No — DB CPU never exceeded 0.71%
Error rate during test?          0% (threshold breach was latency-only, same as Test 3)
VPS RAM pressure?                No — host used 2.1–2.3 GB of 3.8 GB throughout
Any service restarts/OOMs?       None
CX32 upgrade needed?             No — cluster handles sustained realistic load comfortably

CX32 upgrade decision: deferred indefinitely. The CX22 handles 30 VUs of sustained realistic load with headroom to spare. Revisit if concurrent user count grows beyond 50 or if a third project is added.


Action Items

  • Monitor Kong memory over a 24h window at production idle to distinguish load-driven growth from background accumulation
  • Consider adding Kong memory to a weekly cron alert (e.g. notify if >400 MB)
  • Adjust p(95)<300ms read threshold in soak.js to p(95)<500ms, or switch to a European load zone, to avoid false threshold failures from US→DE latency
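The threshold change in the last item might look like this in soak.js. The tag name `operation:read` is an assumption about how the test labels its read requests; the threshold-on-tagged-metric syntax itself is standard k6:

```javascript
// Relaxed read-latency threshold to absorb the ~130 ms US -> DE RTT floor.
// The `operation:read` tag is an assumed label, not confirmed from soak.js.
const options = {
  thresholds: {
    // was "p(95)<300", which breached purely on network geography
    "http_req_duration{operation:read}": ["p(95)<500"],
  },
};
```

Alternatively, keeping `p(95)<300` and running from a European load zone (e.g. Frankfurt) would test the same budget without the transatlantic RTT baked in.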