[OpenWrt-Devel] [RFC] lantiq: SMP interrupts and ethernet driver backport from vanilla v5

Petr Cvek petrcvekcz at gmail.com
Fri Feb 1 06:36:20 EST 2019

On 01. 02. 19 at 10:51, Hauke Mehrtens wrote:
> On 1/31/19 9:30 PM, Mathias Kresin wrote:
>> 30/01/2019 11:38, Petr Cvek:


Thank you both for the answers, and sorry for the "./" in the patches; I
usually only do git patching :-D. The patches are really RFC only. For
example, I would like to know whether a separate node is the best way to
describe the SMP IRQ controller (ICU). The second set of registers could
instead just be appended to the end of the icu0 registers.

Origin of the patches: I was able to find the second ICU just by dumping
the iomem space and looking for patterns (which is enough to make a
clean implementation), but I was too lazy to write it myself (again; I
did it once for some FPGA), so I copied part of the code from this repo:


The probable address of the second ICU was confirmed by grepping the GPL
source code from my modem's vendor (TP-Link TD-W9980).


The burst flag is basically the same value as in the "etop" code path.
The warning about the 8-word burst size is again from the TP-Link source
blob (btw I think the DMA_2W_BURST flag in 4.14.9x is actually a 4-word
burst, but I haven't confirmed it yet).

> I am also interested in benchmarks with only one of the patches to know
> which patch gives us which improvement. The IRQ controller patch needs
> some improvements to get into mainline Linux kernel and I would look
> into it if it creates some improvement.

That's going to be a bit of a problem. My TD-W9980 sadly has wave300
5 GHz wifi, whose driver is broken and crashes the system, so I cannot
test over it (I've only managed to make a few scans). For 2.4 GHz wifi,
I have only 150 Mbps USB dongles, which are slow even without local
interference. My VDSL line is only 22:2, which is OK even with the
vendor's firmware.

My only option for benchmarking is Ethernet, which is (with the vanilla
OpenWrt driver) extremely slow. My backport increased the TX speed of
the modem from about 4.5 MiB/s to about 7.5 MiB/s (TCP netcat pipe, ACKs
on RX FIFO). I'm going to try iperf tomorrow. The driver works even with
an NFS rootfs over it. Sometimes a kernel warning about a timeout is
printed, but the system keeps working afterwards, and this is only the
RFC stage. Probably some spinlock is missing or an IRQ gets acked in the
wrong place. It _will_ be fixed.
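
For comparison with link rates, the two throughput figures above can be
converted to Mbit/s. This is only a small arithmetic sketch; the helper
names mib_to_mbit and gain_pct are made up for illustration and are not
part of the patches.

```sh
# Convert a rate in MiB/s to Mbit/s (1 MiB = 1048576 bytes, 1 Mbit = 10^6 bits).
mib_to_mbit() {
    awk -v v="$1" 'BEGIN { printf "%.1f\n", v * 1048576 * 8 / 1000000 }'
}

# Relative gain in percent when going from rate $1 to rate $2.
gain_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (b - a) / a * 100 }'
}

mib_to_mbit 4.5    # vanilla driver   -> 37.7
mib_to_mbit 7.5    # backported driver -> 62.9
gain_pct 4.5 7.5   # -> 67
```

So the quoted numbers correspond to roughly 38 Mbit/s before and 63
Mbit/s after, a gain of about two thirds.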

What is interesting is my observation of long frames (4000-6000 bytes)
which sometimes come from the lantiq (observed in Wireshark). My server
sends only classic ~1500-byte frames, and I know the lantiq driver is
limited to them too (also tested with "ip link set eth0 mtu 3000" ->
error).

> Are the IRQs on the 2. VPE used automatically or do you use it with irq
> balanced?
> Could you please provide the output of "cat /proc/interrupts"

The interrupts are used automatically, but the distribution is quite
random. I don't have the irqbalance package compiled.

# cat /proc/interrupts
           CPU0       CPU1
  7:      31309      31096      MIPS   7  timer
  8:        586        607      MIPS   0  IPI call
  9:       5991      10371      MIPS   1  IPI resched
 22:      14762          1       icu  22  spi_rx
 23:       2496          1       icu  23  spi_tx
 24:          0          0       icu  24  spi_err
 62:          0          0       icu  62  1e101000.usb, dwc2_hsotg:usb1
 63:          0        192       icu  63  mei_cpe
 72:      11133          0       icu  72  vrx200_rx
 73:          0      13833       icu  73  vrx200_tx
 91:          0         81       icu  91  1e106000.usb, dwc2_hsotg:usb2
112:        745          0       icu 112  asc_tx
113:          0        222       icu 113  asc_rx
114:          0          0       icu 114  asc_err
126:          0          0       icu 126  gptu
127:          0          0       icu 127  gptu
128:          0          0       icu 128  gptu
129:          0          0       icu 129  gptu
130:          0          0       icu 130  gptu
131:          0          0       icu 131  gptu
161:          0          0       icu 161  ifx_pcie_rc0
ERR:          1

(Using NAPI polling decreases interrupt generation, so the 11133 and
13833 values are not the real number of packets transmitted; the vanilla
driver generates more than twice as many interrupts as the backported
driver.)
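
To see how the ICU interrupts are split between the two VPEs without
adding up the columns by hand, something like the following could be
used. It is a minimal sketch that assumes exactly two CPU columns and
the "icu" controller name as in the listing above; the function name
icu_split is hypothetical.

```sh
# Read /proc/interrupts-style text on stdin. Print "irq cpu0 cpu1" for
# every "icu" line that has fired at least once, then a TOTAL line.
# Assumes exactly two CPU columns (CPU0 and CPU1).
icu_split() {
    awk '
        $1 ~ /^[0-9]+:$/ && $4 == "icu" {
            c0 += $2; c1 += $3
            if ($2 + $3 > 0) printf "%s %d %d\n", $5, $2, $3
        }
        END { printf "TOTAL %d %d\n", c0, c1 }'
}

# On the router:
#   icu_split < /proc/interrupts
```

For the listing above the TOTAL line would read "TOTAL 29136 14330",
i.e. CPU0 handled about twice as many ICU interrupts as CPU1.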

The affinity can be set in /proc/irq/$NUM; this can be used for testing
the improvements caused by the patch (the burst width can be written on
the fly too).
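
As a rough illustration of the affinity setting: the kernel exposes a
hexadecimal CPU bitmask in /proc/irq/$NUM/smp_affinity (bit 0 = CPU0,
bit 1 = CPU1). The helpers below are only a sketch; irq_cpu_mask and
pin_irq are hypothetical names, and the optional third argument exists
just so the function can be tried against a scratch directory.

```sh
# Hex affinity mask with only the bit for CPU number $1 set.
irq_cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# pin_irq IRQ CPU [DIR]: route IRQ to a single CPU by writing the mask
# to DIR/IRQ/smp_affinity. DIR defaults to /proc/irq (needs root).
pin_irq() {
    irq_cpu_mask "$2" > "${3:-/proc/irq}/$1/smp_affinity"
}

# Example on the router, using the IRQ numbers from the listing above:
#   pin_irq 72 0    # vrx200_rx on CPU0
#   pin_irq 73 1    # vrx200_tx on CPU1
```

Pinning both vrx200 interrupts to one CPU and then to separate CPUs
would be one way to compare runs with and without the second VPE taking
part.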

> We should probably make use for scattered DMA in
> the driver too. 

I would like to try, but I don't have any vrx200 SoC TRM datasheets and
even with different kernel versions (from vendors) the picture is not

