[OpenWrt-Devel] [PATCH RFC] kernel: drop -fno-reorder-blocks
Rafał Miłecki
zajec5 at gmail.com
Tue Apr 9 05:33:33 EDT 2019
On 09.04.2019 11:30, Rafał Miłecki wrote:
> 1) bcm53xx: BCM47094 SoC (echo 2 > rps_cpus)
>
> zImage size: 1840424 → 1871328 (+1,68%)
>
> a) gro off
> LAN to WAN: 824 Mb/s → 940 Mb/s (+14,08%)
> WAN to LAN: 935 Mb/s → 940 Mb/s (+0,53%)
>
> b) gro on
> LAN to WAN: 512 Mb/s → 534 Mb/s (+4,30%)
> WAN to LAN: 539 Mb/s → 549 Mb/s (+1,85%)
I was obviously curious why this change affects bcm53xx, so I used perf to
profile the kernel before and after the change.
I'm not sure how to interpret the results. One thing I noticed is lower CPU
usage for __softirqentry_text_start. Could that be it?
You can see the FlameGraphs at files.zajec.net/openwrt/fno-reorder-blocks/
P.S.
I used FlameGraph's difffolded.pl to compare the LAN to WAN profiles before and
after the change. It seems to highlight the same thing:
__softirqentry_text_start. I'm still unsure what it means and whether the same
improvement can be achieved some other way.
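As a rough illustration of what such a folded-stack comparison boils down to,
here is a minimal Python sketch. It assumes the usual folded format
(semicolon-separated frames followed by a sample count on each line). The
helper names and the sample stacks are made up for the example and are not
taken from the actual profiles; this is not how difffolded.pl itself is
implemented.

```python
from collections import Counter

def parse_folded(text):
    """Parse folded stacks: one 'frame;frame;... count' entry per line."""
    stacks = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated field.
        stack, _, count = line.rpartition(" ")
        stacks[stack] += int(count)
    return stacks

def inclusive_share(stacks, symbol):
    """Fraction of all samples whose stack contains `symbol` as a frame."""
    total = sum(stacks.values())
    hits = sum(n for stack, n in stacks.items() if symbol in stack.split(";"))
    return hits / total if total else 0.0

# Hypothetical stacks and counts, for illustration only.
before = parse_folded("""
swapper;cpu_idle;__softirqentry_text_start;bgmac_poll 30
swapper;cpu_idle;arch_cpu_idle 50
ksoftirqd/1;run_ksoftirqd;__softirqentry_text_start 20
""")
after = parse_folded("""
swapper;cpu_idle;__softirqentry_text_start;bgmac_poll 20
swapper;cpu_idle;arch_cpu_idle 60
ksoftirqd/1;run_ksoftirqd;__softirqentry_text_start 20
""")

symbol = "__softirqentry_text_start"
share_before = inclusive_share(before, symbol)
share_after = inclusive_share(after, symbol)
print(f"{symbol}: {share_before:.1%} -> {share_after:.1%}")
```

Dividing by the total number of samples matters here because the two runs do
not necessarily collect the same number of samples.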
**********
LAN to WAN
1) Before the patch (824 Mb/s):
+ 9,61% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 6,22% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 5,14% swapper [kernel.kallsyms] [k] l2c210_inv_range
+ 4,88% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
+ 3,93% swapper [kernel.kallsyms] [k] bcma_host_soc_read32
+ 3,43% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 3,01% swapper [kernel.kallsyms] [k] arch_cpu_idle
+ 2,81% ksoftirqd/1 [kernel.kallsyms] [k] l2c210_clean_range
+ 2,15% ksoftirqd/1 [kernel.kallsyms] [k] bgmac_start_xmit
+ 2,02% swapper [kernel.kallsyms] [k] bgmac_poll
+ 1,90% ksoftirqd/1 [kernel.kallsyms] [k] __dev_queue_xmit
+ 1,73% ksoftirqd/1 [kernel.kallsyms] [k] nf_hook_slow
+ 1,34% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip
+ 1,05% ksoftirqd/1 [kernel.kallsyms] [k] skb_pull_rcsum
2) After the patch (940 Mb/s):
+ 11,07% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 5,76% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 5,72% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
+ 5,37% swapper [kernel.kallsyms] [k] l2c210_inv_range
+ 4,34% swapper [kernel.kallsyms] [k] bcma_host_soc_read32
+ 3,65% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 3,18% ksoftirqd/1 [kernel.kallsyms] [k] l2c210_clean_range
+ 2,71% swapper [kernel.kallsyms] [k] bgmac_poll
+ 2,59% swapper [kernel.kallsyms] [k] arch_cpu_idle
+ 1,97% ksoftirqd/1 [kernel.kallsyms] [k] bgmac_start_xmit
+ 1,67% ksoftirqd/1 [kernel.kallsyms] [k] __dev_queue_xmit
+ 1,54% ksoftirqd/1 [kernel.kallsyms] [k] nf_hook_slow
+ 1,16% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv
+ 1,08% ksoftirqd/1 [kernel.kallsyms] [k] skb_pull_rcsum
+ 1,07% ksoftirqd/1 [kernel.kallsyms] [k] netif_skb_features
+ 1,04% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip
**********
WAN to LAN
1) Before the patch (935 Mb/s):
+ 10,55% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 6,01% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 5,56% swapper [kernel.kallsyms] [k] l2c210_inv_range
+ 5,55% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
+ 4,36% swapper [kernel.kallsyms] [k] bcma_host_soc_read32
+ 2,70% ksoftirqd/1 [kernel.kallsyms] [k] l2c210_clean_range
+ 2,65% swapper [kernel.kallsyms] [k] arch_cpu_idle
+ 2,43% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 2,34% ksoftirqd/1 [kernel.kallsyms] [k] __dev_queue_xmit
+ 2,19% swapper [kernel.kallsyms] [k] bgmac_poll
+ 2,08% ksoftirqd/1 [kernel.kallsyms] [k] bgmac_start_xmit
+ 1,73% ksoftirqd/1 [kernel.kallsyms] [k] nf_hook_slow
+ 1,52% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip
+ 1,45% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv
+ 1,13% ksoftirqd/1 [kernel.kallsyms] [k] skb_pull_rcsum
+ 1,11% ksoftirqd/1 [kernel.kallsyms] [k] ip_finish_output2
+ 1,06% ksoftirqd/1 [kernel.kallsyms] [k] netif_skb_features
2) After the patch (940 Mb/s):
+ 11,73% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 6,05% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
+ 5,94% swapper [kernel.kallsyms] [k] l2c210_inv_range
+ 4,79% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 4,08% swapper [kernel.kallsyms] [k] bcma_host_soc_read32
+ 3,05% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core
+ 2,98% ksoftirqd/1 [kernel.kallsyms] [k] l2c210_clean_range
+ 2,53% swapper [kernel.kallsyms] [k] bgmac_poll
+ 2,36% ksoftirqd/1 [kernel.kallsyms] [k] __dev_queue_xmit
+ 2,15% ksoftirqd/1 [kernel.kallsyms] [k] bgmac_start_xmit
+ 2,10% swapper [kernel.kallsyms] [k] arch_cpu_idle
+ 1,64% ksoftirqd/1 [kernel.kallsyms] [k] nf_hook_slow
+ 1,33% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv
+ 1,28% ksoftirqd/1 [kernel.kallsyms] [k] netif_skb_features
+ 1,27% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip
+ 1,02% swapper [kernel.kallsyms] [k] __skb_flow_dissect
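To make the listings above easier to compare, the per-symbol percentages can
be subtracted with a small script. The following Python sketch is only an
illustration: the function names are made up, and it is fed just the top four
LAN to WAN entries from above. Note that each percentage is a share of the
samples within its own run, and the two runs forwarded different amounts of
traffic, so the differences are a rough hint rather than a measurement of CPU
time saved.

```python
import re

# Matches lines such as:
#   + 9,61% swapper [kernel.kallsyms] [k] v7_dma_inv_range
LINE_RE = re.compile(
    r"^\+?\s*(\d+),(\d+)%\s+(\S+)\s+\[kernel\.kallsyms\]\s+\[k\]\s+(\S+)$"
)

def parse_report(text):
    """Map (task, symbol) to the percentage shown on each report line."""
    shares = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            whole, frac, task, symbol = match.groups()
            # The listings use a decimal comma, so rebuild the number by hand.
            shares[(task, symbol)] = float(f"{whole}.{frac}")
    return shares

def compare(before, after):
    """Return (delta, task, symbol) for entries present in both reports,
    sorted from the largest decrease to the largest increase."""
    common = before.keys() & after.keys()
    return sorted((round(after[k] - before[k], 2), k[0], k[1]) for k in common)

# Top four LAN to WAN entries, copied from the listings above.
BEFORE = """\
+ 9,61% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 6,22% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 5,14% swapper [kernel.kallsyms] [k] l2c210_inv_range
+ 4,88% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
"""
AFTER = """\
+ 11,07% swapper [kernel.kallsyms] [k] v7_dma_inv_range
+ 5,76% swapper [kernel.kallsyms] [k] __softirqentry_text_start
+ 5,72% ksoftirqd/1 [kernel.kallsyms] [k] v7_dma_clean_range
+ 5,37% swapper [kernel.kallsyms] [k] l2c210_inv_range
"""

deltas = compare(parse_report(BEFORE), parse_report(AFTER))
for delta, task, symbol in deltas:
    print(f"{delta:+.2f} pp  {task:<12} {symbol}")
```

With this input, __softirqentry_text_start is the only one of the four entries
whose share goes down (6,22% to 5,76%); the cache maintenance functions all
take a slightly larger share after the patch.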
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lan-to-wan-diff.svg
Type: image/svg+xml
Size: 167894 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/openwrt-devel/attachments/20190409/0f0868ca/attachment.svg>