Testing network / NAT performance

Sun Jun 12 12:58:07 PDT 2022

Over the years I have seen multiple reports that a new OpenWrt release /
kernel update / netifd change / DSA introduction caused a regression in
router network / NAT speed (masquerade NAT in most cases). I believe
most of those reports remained unresolved.

The problem is that:
1. OpenWrt doesn't have automated testing environments
2. Developers can't figure out anything from reports lacking details
3. Even experienced users don't know how to do proper debugging

I spent almost the last 2 months researching & testing masquerade NAT
performance. I thought I'd share my findings & results. Hopefully
this will get more people involved in tracing & fixing such
regressions.

*************************
* Testing method
*************************

In 99% of cases it's a bad idea to use online speed test services.
They may be too unreliable. It's better to set up a local server instead.

For actual testing you may use iperf or iperf3. If needed for some
reason, FTP, HTTP or another protocol may be an option too.

*************************
* Testing results
*************************

Network traffic is often not perfectly stable. To avoid getting false
results it may be worth doing the following:
1. Repeat the test in a few sessions
2. Reject the lowest & highest results
3. Calculate an average speed

Example of my testing:

for i in $(seq 1 5); do
	date
	iperf -t 80 -i 10 -c 192.168.99.1 | head -n -1 | sed -n 's/.* \([0-9][0-9]*\) Mbits\/sec.*/\1/p' | sort -n
	echo
	sleep 15
done

The above script lists 8 results from each iperf session, sorted in
ascending order. Later I take the middle 4 and calculate an average from
them. Then I calculate an average from all 5 sessions. It may be
overkill but it was meant to deal with some really unstable cases.
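
The averaging described above could be scripted along these lines. This
is only a sketch: middle_avg and mean are hypothetical helper names (not
part of iperf), and dropping 2 values from each end by default assumes 8
results per session as in the loop above.

```shell
#!/bin/sh
# Average the middle values of one session: read one number per line,
# sort, drop the $1 lowest and $1 highest (default 2), print the mean.
middle_avg() {
	sort -n | awk -v drop="${1:-2}" '
		{ v[NR] = $1 }
		END {
			for (i = drop + 1; i <= NR - drop; i++) { sum += v[i]; n++ }
			if (n > 0) printf "%.1f\n", sum / n
		}'
}

# Plain mean of one number per line, e.g. of the per-session averages.
mean() {
	awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}
```

Feed the 8 numbers printed for one session into middle_avg, collect the
5 per-session values and pipe them into mean.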

*************************
* Environment setup
*************************

Get some (usually 2) PCs powerful enough to easily handle the maximum
expected router traffic. Once they are set up, avoid changing anything.
A kernel update or configuration change on a PC may affect results even
if the router is the bottleneck [1]. Disable power saving: I once
noticed lower performance whenever the screen saver got activated.

Connect one PC to the WAN port and set it up to use a static IP. You may
set up a DHCP server too or just make OpenWrt use a static WAN IP &
gateway. Start an iperf / FTP / HTTP / whatever server.
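
As a rough sketch, the WAN-side PC could be prepared like this. The
interface name eth0 is an assumption; 192.168.99.1/24 is picked to match
the iperf client example and the static WAN address used elsewhere in
this text. setup_server_pc and run are hypothetical helper names, and
DRY_RUN=1 only prints the commands instead of executing them.

```shell
#!/bin/sh
# Assumed values, override from the environment if your setup differs.
WAN_PC_IF="${WAN_PC_IF:-eth0}"
SERVER_ADDR="${SERVER_ADDR:-192.168.99.1/24}"

# Execute a command, or only print it when DRY_RUN=1.
run() {
	if [ "${DRY_RUN:-0}" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Assign the static address, bring the link up and start the iperf
# server (which keeps running in the foreground).
setup_server_pc() {
	run ip addr add "$SERVER_ADDR" dev "$WAN_PC_IF"
	run ip link set "$WAN_PC_IF" up
	run iperf -s
}
```

Call setup_server_pc as root on the PC connected to the WAN port.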

Connect another PC to a LAN port and install a matching client for
generating network traffic.

*************************
* OpenWrt customizations
*************************

Depending on the setup you may need some custom configuration changes.
To avoid applying them manually after flashing every new image use
uci-defaults scripts (they run once on the first boot).

Example of my WAN setup:

mkdir -p files/etc/uci-defaults/

cat << EOF > files/etc/uci-defaults/90-nat.sh
#!/bin/sh
uci set network.wan.proto='static'
uci set network.wan.ipaddr='192.168.99.2'
uci set network.wan.netmask='255.255.255.0'
EOF

*************************
* Finding regressions
*************************

In continuous testing pick an interval (testing every day or every n-th
commit) and look for regressions.

If you notice a regression the first step is to find the first bad
commit. End users often assume that a regression was caused by a kernel
change as that is the simplest difference to notice. Always find the
exact commit.

Make sure to use git bisect [2] for finding the first bad commit.
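
A bisect session could look roughly like the sketch below.
bisect_verdict is a hypothetical helper, the 900 Mbits/sec threshold is
made up for illustration, and the commit ID is a placeholder. Pick a
threshold clearly between your known good and known bad speeds.

```shell
#!/bin/sh
# Turn a measured speed into a bisect verdict.
# $1: measured speed in Mbits/sec (integer)
# $2: minimum speed that still counts as "good"
bisect_verdict() {
	if [ "$1" -ge "$2" ]; then
		echo good
	else
		echo bad
	fi
}

# Typical session (commit ID is a placeholder):
#   git bisect start
#   git bisect bad                      # current HEAD shows the regression
#   git bisect good <last-good-commit>
#   ... build, flash, measure, then mark the tested commit:
#   git bisect "$(bisect_verdict 912 900)"
#   ... repeat until git prints the first bad commit, then:
#   git bisect reset
```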

*************************
* Stabilizing performance
*************************

Probably the most annoying problem in debugging is unstable results.
Speed changing between testing sessions / reboots / recompilations makes
the whole testing unreliable and makes it hard to find a real
regression.

Below are a few tips that may help stabilize network speeds.

1. Repeat tests and get average

    Explained above.

2. Don't change environment setup

    Explained above.

3. Use pfifo qdisc

    It should be more stable for simple traffic (e.g. iperf generated).
    Include the "tc" package and execute something like:

    tc qdisc replace dev eth0 root pfifo

    Verify with:

    tc qdisc

4. Adjust rps_cpus and xps_cpus

    On multi-CPU devices, having multiple CPUs assigned to a single
    network device may result in traffic being handled by a random CPU
    and in speeds varying across testing sessions.

5. Disable CONFIG_SMP

    This will likely reduce performance but may help find a regression
    if testing results vary a lot.

6. Organizing kernel symbols

    CPUs of home routers usually have small caches. The way kernel
    symbols get organized during compilation may significantly affect
    network performance [3]. It's especially annoying as changes
    unrelated to networking may move / reorder symbols and affect cache
    hits & misses.

    There isn't a reliable solution for that. It may help to add:
    -falign-functions=32 or
    -falign-functions=64 (depending on platform)
    using e.g. KBUILD_CFLAGS.
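
For tip 4, one possible approach is to pin all queues of a device to a
single CPU through the rps_cpus / xps_cpus sysfs files. The sketch below
assumes that pinning to one CPU is what you want. cpu_mask and
pin_queues are hypothetical helper names, and the third argument only
exists so the function can be tried against a scratch directory instead
of /sys. On some devices writing xps_cpus may not be supported.

```shell
#!/bin/sh
# Hex bitmask selecting a single CPU, e.g. CPU 1 -> "2".
cpu_mask() {
	printf '%x\n' $((1 << $1))
}

# Pin RPS and XPS of every queue of device $1 to CPU $2.
# $3 is the sysfs root (default: /sys).
pin_queues() {
	dev="$1"
	mask="$(cpu_mask "$2")"
	root="${3:-/sys}"
	for f in "$root/class/net/$dev/queues"/rx-*/rps_cpus \
		 "$root/class/net/$dev/queues"/tx-*/xps_cpus; do
		# Skip patterns that did not match any file
		[ -e "$f" ] && echo "$mask" > "$f"
	done
	return 0
}
```

For example "pin_queues eth0 1" run on the router would write mask 2 to
every queue of eth0.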

*************************
* Profiling
*************************

Profiling with "perf" [4] allows checking what consumes CPU time. It's
very useful for finding code worth optimizing & comparing CPU usage
across changes.

OpenWrt needs to be compiled with the CONFIG_KERNEL_PERF_EVENTS=y option
and the "perf" package needs to be installed.

Example of recording:
1. Start network traffic
2. On router execute: ( cd /tmp/; perf record -a -g -- sleep 60 )
3. Copy /tmp/perf.data to machine used for compiling OpenWrt

Example of reporting:
1. perf report -k build_dir/target-*/linux-*/vmlinux.debug --kallsyms build_dir/target-*/linux-*/linux-*/System.map
2. perf report -k build_dir/target-*/linux-*/vmlinux.debug --kallsyms build_dir/target-*/linux-*/linux-*/System.map --no-children
3. perf report -k build_dir/target-*/linux-*/vmlinux.debug --kallsyms build_dir/target-*/linux-*/linux-*/System.map --no-children -g none

For fancier reports Flame Graph [5] can be used:
1. perf script -k build_dir/target-*/linux-*/vmlinux.debug --kallsyms build_dir/target-*/linux-*/linux-*/System.map > out.perf
2. stackcollapse-perf.pl out.perf > out.folded
3. flamegraph.pl out.folded > out.svg

*************************
* Kernel regressions
*************************

Kernel updates are the most problematic to debug. If the first bad
OpenWrt commit is something like a kernel switch from 5.4 to 5.10 it
means millions of actual changes.

There is no reasonable way to bisect the kernel in OpenWrt. There are so
many kernel patches and so much custom code that it's impossible to
apply all of those to dozens of kernel commits during git bisect.

There are two ways to handle such cases:
1. Strip OpenWrt of 90+% of its custom patches and then try kernel bisecting
2. Use a non-OpenWrt environment like Buildroot [6]

*************************
* References
*************************

[1] https://lore.kernel.org/netdev/81e63fc9-ac8c-cb35-4572-c808ddab997d@gmail.com/T/#m161113b88568f90fb10106e0c6dc9beadd4861e2
[2] https://git-scm.com/docs/git-bisect
[3] https://lore.kernel.org/netdev/2a338e8e-3288-859c-d2e8-26c5712d3d06@nbd.name/T/#m2215fd7b363dc321e5b16d6e192168c510b8ce94
[4] https://perf.wiki.kernel.org/index.php/Main_Page
[5] https://www.brendangregg.com/flamegraphs.html
[6] https://buildroot.org/