SSH Gateway load test
Notice
Observe the CPU and memory usage of ws-proxy in every test case.
Remember to delete the workspaces after the test, since the heartbeat keeps them active indefinitely.
How to observe
Port-forward Prometheus port 9090 to your local machine, then graph queries like:
container_memory_rss{pod="<ws-proxy-pod-name>", container="ws-proxy"}
rate(container_cpu_usage_seconds_total{pod="<ws-proxy-pod-name>", container="ws-proxy"}[5m])
Prepare
Prepare a workspace pair, A and B, as follows:
1. The workspaces need to stay alive, so edit gitpod-cli and build a temporary executable with a command that sends a heartbeat every 30 seconds.
2. Open workspace A as the target workspace, copy the file produced in step 1 into it, and run the file to keep the workspace alive.
3. Open workspace B and repeat step 2.
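The heartbeat executable has to keep running after the terminal that started it is closed. One way to do that, sketched here under assumptions, is to start it with nohup and keep its PID for later cleanup. The name ./gp-heartbeat is a placeholder for whatever file the build step produces, and start_keepalive and is_running are hypothetical helpers.

```shell
# Start a command detached from the terminal and print its PID.
# $1 = log file, remaining args = command to run
start_keepalive() {
  log=$1
  shift
  nohup "$@" >"$log" 2>&1 </dev/null &
  echo $!
}

# Succeed if the process with the given PID still exists.
is_running() {
  kill -0 "$1" 2>/dev/null
}

# Example inside workspace A or B (the binary name is a placeholder):
#   pid=$(start_keepalive /tmp/heartbeat.log ./gp-heartbeat)
#   is_running "$pid" && echo "heartbeat running as $pid"
#   kill "$pid"   # stop it before deleting the workspace
```

Keeping the PID makes it easy to confirm during a long run that the heartbeat is still alive.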
Test cases
Tested in preview environments, covering the following cases:
Many connections, to learn the cost of a single connection in terms of memory and CPU
For B, write an SSH connect script that makes 10000 connections
🟢 After 10000 connections, ws-proxy works fine and the target workspace works fine (but the sender workspace's network broke)
"wait: remote command exited without exit status or exit signal" appears after running the command; the SSH gateway may still differ from real SSH in some way. Fixed with PR
Several connections with a huge amount of data back and forth
Simply run scp with a large file (dd if=/dev/zero of=test bs=1M count=1000) from A to B and from B to A several times
🟢 scp of 1G of data works fine several times
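The transfer case can be made repeatable with a small script that generates the file and checks that a copy arrived intact. This is a sketch under assumptions, not the script used in the test: make_test_file and same_content are hypothetical helpers, and the workspace-a host alias in the comment is a placeholder for however workspace A is reached through the SSH gateway.

```shell
# Create a zero-filled file. $1 = path, $2 = size in MiB
# (bs=1M is GNU dd syntax, matching the dd command used in the test).
make_test_file() {
  dd if=/dev/zero of="$1" bs=1M count="$2" 2>/dev/null
}

# Succeed if two files have the same checksum and size.
same_content() {
  [ "$(cksum <"$1")" = "$(cksum <"$2")" ]
}

# Example from workspace B (the host alias is a placeholder):
#   make_test_file test 1000
#   scp test workspace-a:test          # B to A
#   scp workspace-a:test test.back     # A to B
#   same_content test test.back && echo "round trip OK"
```

Comparing checksums after the round trip catches truncated or corrupted transfers that a zero exit status from scp alone would not reveal.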
Several connections and see how long they can stay alive with heartbeat
For B, open several terminals connected to A and run a command like htop
🟢 7 SSH sessions running htop stayed up and stable for 20 hours
Try dropping and reopening connections to see whether ws-proxy leaks memory on such connections
🟠 A memory leak appeared and was fixed by this commit
After fix
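The drop-and-reopen case comes down to a loop that opens and closes SSH sessions one after another while ws-proxy memory is sampled before and after. The sketch below shows one way to write such a loop; it is an illustration under assumptions, not the script used in this test. run_connections and growth_per_connection are hypothetical helpers, workspace-a is a placeholder host alias, and the RSS readings in the example are made up.

```shell
# Run a command N times in sequence and report how many runs succeeded.
# $1 = number of attempts, remaining args = command to run
run_connections() {
  count=$1
  shift
  ok=0
  failed=0
  i=0
  while [ "$i" -lt "$count" ]; do
    if "$@" >/dev/null 2>&1 </dev/null; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
    fi
    i=$((i + 1))
  done
  echo "ok=$ok failed=$failed"
}

# Average memory growth per finished connection, in bytes (integer division).
# $1 = RSS before the run, $2 = RSS after the run, $3 = number of connections
growth_per_connection() {
  echo $(( ($2 - $1) / $3 ))
}

# Example from workspace B (host alias and RSS values are placeholders):
#   run_connections 10000 ssh -o BatchMode=yes -o ConnectTimeout=10 workspace-a true
#   growth_per_connection 104857600 209715200 10000
```

If the growth per finished connection stays clearly above zero across repeated runs, ws-proxy is probably holding on to memory for closed connections.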