[FS#3760] Wifi to WAN traffic causes eth0 to crash with a NETDEV WATCHDOG

OpenWrt Bugs openwrt-bugs at lists.openwrt.org
Thu Apr 29 05:46:10 BST 2021


THIS IS AN AUTOMATED MESSAGE, DO NOT REPLY.

A new Flyspray task has been opened.  Details are below. 

User who did this - Steven Johnson (stevenj) 

Attached to Project - OpenWrt/LEDE Project
Summary - Wifi to WAN traffic causes eth0 to crash with a NETDEV WATCHDOG
Task Type - Bug Report
Category - Base system
Status - Unconfirmed
Assigned To - 
Operating System - All
Severity - Critical
Priority - Very Low
Reported Version - openwrt-21.02
Due in Version - Undecided
Due Date - Undecided
Details - I have been testing the stock OpenWrt 21.02-rc1 release firmware image on the Newifi-D2 target (MT7621-based).

Sporadically, while using the router, I experience the WAN interface crashing with NETDEV WATCHDOG errors.

I am using the released OpenWrt 21.02-rc1 image for the Newifi-D2 target.
I have also experienced this fault with a custom router build based on MT7621 for a different target, so the fault appears to be general to MT7621 rather than specific to this target.

I have loaded 19.07.7 stock firmware on this same hardware and the fault does not occur in that firmware version.  A colleague has replicated this fault in 21.02-RC1 with the same type of target (physically different unit), but in a different environment with no shared network resources.

Other than setting up the WiFi SSID and password and enabling the 5 GHz radio, the network settings are stock.

I have worked out a 90% reliable test case to reproduce the fault:

Have a client device connected to the router over 5 GHz WiFi,
and a local iperf3 server running on the WAN side, on the wired interface.

On the server run:

iperf3 -s

//Note the server's IP address//

On the client run:

iperf3 -c <server IP> -P 50


//Notes:
The iperf3 server needs to be local, on a 1 Gbps wired link. With an Internet-based iperf3 server (or one on a slower connection) the error is much harder to trigger; it still happens, but much more randomly.
The number of connections does not need to be 50; however, the fewer the connections, the less reliably the error triggers.
The WiFi client should get about 300-400 Mbps throughput over the WiFi link when it is working correctly.//
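
The note about connection counts suggests sweeping the -P value to see where the fault stops triggering reliably. The following is a minimal sketch of that idea, not part of the original report: the helper names, the address 192.0.2.10 (a documentation placeholder for the server IP), the explicit -t duration, and the particular counts are all assumptions. It only builds the client command lines; run them by hand or through subprocess on the WiFi client.

```python
import shlex


def iperf3_client_cmd(server_ip, parallel=50, seconds=10):
    """Build the client command from the report, with an explicit test duration.

    -c selects the server, -P the number of parallel streams, -t the duration
    in seconds (10 is also iperf3's default).
    """
    return ["iperf3", "-c", server_ip, "-P", str(parallel), "-t", str(seconds)]


def sweep(server_ip, parallel_counts=(50, 25, 10, 5)):
    """Return one client command per parallel-stream count, highest first."""
    return [iperf3_client_cmd(server_ip, p) for p in parallel_counts]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute the iperf3 server's real address.
    for cmd in sweep("192.0.2.10"):
        print(shlex.join(cmd))
```

After each run, check logread on the router for a new "transmit timed out" entry before moving to the next count.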

On most runs, the iperf3 throughput drops to 0 Mbps; logread on the router then reveals:

Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.065748] ------------[ cut here ]------------
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.070403] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:448 0x8047e780
Thu Apr 29 03:58:44 2021 kern.info kernel: [  578.077443] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.084412] Modules linked in: pppoe ppp_async iptable_nat xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD xt_CT pppox ppp_generic nf_nat nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conntrack mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG slhc nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables crc_ccitt compat ledtrig_usbport nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd gpio_button_hotplug usbcore nls_base usb_common
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.147503] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.4.111 #0
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.153490] Stack : 00000000 80840000 ffffffff 8007d6e0 00000000 00000000 00000000 00000000
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.161831]         00000000 00000000 00000000 00000000 00000000 00000001 8fc0fd50 8ba64c1c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.170163]         8fc0fde8 00000000 00000000 00000000 00000038 805e1804 342e3520 3131312e
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.178486]         00000000 0000001c 00000000 0002402f 00000000 8fc0fd30 00000000 8047e780
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.186808]         00000009 00000001 00200000 00000122 00000003 80359e2c 00000004 80810004
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.195132]         ...
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.197567] Call Trace:
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.197583] [] 0x8007d6e0
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.203512] [] 0x805e1804
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.206990] [] 0x8047e780
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.210464] [] 0x80359e2c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.213941] [] 0x8000b05c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.217409] [] 0x8000b064
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.220879] [] 0x805c6f9c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.224351] [] 0x8007d8ac
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.227829] [] 0x8002bfe8
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.231302] [] 0x8047e780
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.234775] [] 0x8002c0c0
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.238257] [] 0x8047e780
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.241731] [] 0x800a9018
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.245212] [] 0x8047e484
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.248688] [] 0x800965d4
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.252161] [] 0x8009681c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.255640] [] 0x805e7d1c
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.259116] [] 0x80030768
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.262588] [] 0x802f8404
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.266064] [] 0x80006c28
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.269533]
Thu Apr 29 03:58:44 2021 kern.warn kernel: [  578.271153] ---[ end trace eabfb050c5985bce ]---
Thu Apr 29 03:58:44 2021 kern.err kernel: [  578.275799] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Thu Apr 29 03:58:44 2021 kern.info kernel: [  578.282602] mtk_soc_eth 1e100000.ethernet eth0: Link is Down
Thu Apr 29 03:58:44 2021 kern.info kernel: [  578.318310] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
Thu Apr 29 03:58:44 2021 kern.info kernel: [  578.326251] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx


The full warning and call trace appear only on the first occurrence. The error keeps recurring, but after the first occurrence the only evidence in the logs is:


Thu Apr 29 04:33:36 2021 kern.err kernel: [ 2670.043971] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out
Thu Apr 29 04:33:36 2021 kern.info kernel: [ 2670.050555] mtk_soc_eth 1e100000.ethernet eth0: Link is Down
Thu Apr 29 04:33:36 2021 kern.info kernel: [ 2670.084619] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
Thu Apr 29 04:33:36 2021 kern.info kernel: [ 2670.092556] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
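
Because only the first occurrence produces the full watchdog warning, counting the shorter "transmit timed out" lines is the more reliable way to see how often the fault recurs. The sketch below, which is an illustration and not part of the original report, scans text in the logread format shown above. The function name, the regular expression, and the returned dictionary layout are assumptions.

```python
import re

# Matches kernel lines in the logread format shown in this report, e.g.
# "... kern.err kernel: [  578.275799] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out"
KERNEL_LINE = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] (?P<msg>.*)$")


def summarize_tx_timeouts(lines, iface="eth0"):
    """Count watchdog warnings and collect the uptime of each transmit timeout."""
    watchdog_warnings = 0
    tx_timeouts = []
    for line in lines:
        match = KERNEL_LINE.search(line.strip())
        if match is None:
            continue  # not a kernel log line in the expected format
        msg = match.group("msg")
        if "NETDEV WATCHDOG: " + iface in msg:
            watchdog_warnings += 1
        elif msg.endswith(iface + ": transmit timed out"):
            tx_timeouts.append(float(match.group("uptime")))
    return {"watchdog_warnings": watchdog_warnings, "tx_timeouts": tx_timeouts}


if __name__ == "__main__":
    # Three lines copied from the log excerpts in this report.
    sample = [
        "Thu Apr 29 03:58:44 2021 kern.info kernel: [  578.077443] NETDEV WATCHDOG: eth0 (mtk_soc_eth): transmit queue 0 timed out",
        "Thu Apr 29 03:58:44 2021 kern.err kernel: [  578.275799] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out",
        "Thu Apr 29 04:33:36 2021 kern.err kernel: [ 2670.043971] mtk_soc_eth 1e100000.ethernet eth0: transmit timed out",
    ]
    print(summarize_tx_timeouts(sample))
```

To use it on a router log, save the output of logread to a file and pass the file's lines to summarize_tx_timeouts.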





More information can be found at the following URL:
https://bugs.openwrt.org/index.php?do=details&task_id=3760

You are receiving this message because you have requested it from the Flyspray bugtracking system.  If you did not expect this message or don't want to receive mails in future, you can change your notification settings at the URL shown above.


