[FS#4067] ipq40xx: Switch (ar40xx) freezes
OpenWrt Bugs
openwrt-bugs at lists.openwrt.org
Thu Oct 7 03:27:28 PDT 2021
THIS IS AN AUTOMATED MESSAGE, DO NOT REPLY.
A new Flyspray task has been opened. Details are below.
User who did this - Polynomdivision (PolynomialDivision)
Attached to Project - OpenWrt/LEDE Project
Summary - ipq40xx: Switch (ar40xx) freezes
Task Type - Bug Report
Category - Base system
Status - Unconfirmed
Assigned To -
Operating System - All
Severity - Low
Priority - Very Low
Reported Version - Trunk
Due in Version - Undecided
Due Date - Undecided
Details - The ar40xx switch on ipq40xx devices freezes from time to time. The issue is hard to reproduce because it has been seen on multiple devices on very different time scales. We run a mesh network with olsr (IPv4) and babeld (IPv6) as routing daemons. The affected devices are mainly Fritz!Box 4040 and Fritz!Box 7530. Typically, the switches freeze on setups under heavy load. For example, a church that acts as a central point of our mesh network and as its connection to the internet experiences the freeze daily, so we can easily reproduce the issue and test solutions at that location. Devices with fewer clients and just one mesh connection also freeze, but it takes longer (30 days). Some devices with almost no traffic or clients do not freeze.
As a workaround, we wrote the naywatch daemon, which checks for IPv6 link-local connectivity. It can also collect debug output. With that, we can show what a diff of the swconfig output looks like when a freeze happens; both snapshots were taken while the switch was already frozen. As you can see, port 0 (the CPU port) no longer sends any frames to the CPU (no TX counters appear in the diff). The switch still receives frames.
Port 0:
mib: Port 0 MIB counters
-RxBroad : 472793
+RxBroad : 472882
RxPause : 0
-RxMulti : 36772
+RxMulti : 36854
RxFcsErr : 0
RxAlignErr : 0
RxRunt : 0
RxFragment : 0
-Rx64Byte : 2075611
-Rx128Byte : 20964917
-Rx256Byte : 16459560
-Rx512Byte : 623700
-Rx1024Byte : 907303
-Rx1518Byte : 48003389
+Rx64Byte : 2075618
+Rx128Byte : 20964980
+Rx256Byte : 16459583
+Rx512Byte : 623752
+Rx1024Byte : 907344
+Rx1518Byte : 48003397
RxMaxByte : 21422694
RxTooLong : 0
-RxGoodByte : 107718798589
+RxGoodByte : 107718869396
RxBadByte : 0
RxOverFlow : 0
Filtered : 0
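The check described above (Rx counters on a port still advancing while its Tx counters stand still) could be automated roughly as follows. This is a minimal sketch and not part of naywatch; the function names `parse_mib` and `port_tx_stalled` are hypothetical, and the parser assumes the `Port N:` and `Name : value` line layout seen in the excerpt above.

```python
import re

# "Port 0:" section headers; "mib: Port 0 MIB counters" does not match.
PORT_RE = re.compile(r"^\s*Port (\d+):\s*$")
# Counter lines such as "RxBroad : 472793", "TxPause : 0" or "Filtered : 0".
COUNTER_RE = re.compile(r"^\s*((?:Rx|Tx)\w+|Filtered)\s*:\s*(\d+)\s*$")


def parse_mib(text):
    """Return {port: {counter_name: value}} from one swconfig snapshot."""
    ports = {}
    current = None
    for line in text.splitlines():
        m = PORT_RE.match(line)
        if m:
            current = int(m.group(1))
            ports[current] = {}
            continue
        m = COUNTER_RE.match(line)
        if m and current is not None:
            ports[current][m.group(1)] = int(m.group(2))
    return ports


def port_tx_stalled(before, after, port):
    """True if some Rx counter changed on `port` while no Tx counter did."""
    old = parse_mib(before).get(port, {})
    new = parse_mib(after).get(port, {})
    changed = {name for name in new if name in old and new[name] != old[name]}
    rx_moved = any(name.startswith("Rx") for name in changed)
    tx_moved = any(name.startswith("Tx") for name in changed)
    return rx_moved and not tx_moved
```

Here `before` and `after` would be the text of two snapshots taken some seconds apart, such as the two `.log` files compared below. A single positive result is only a hint, since a healthy but idle port can also show no Tx activity for a short interval.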
**What could be wrong:**
* switch incorrectly configured
* switch broken?
* ...
I have already been in contact with blocktrron, and to my understanding he has also experienced a freeze. There is an unpublished DSA driver implementation for the ar40xx, which also has some issues. Maybe that implementation will help to solve this issue.
**The rest of the diff:**
root@emma-core:~# diff -u 1633480443-swconfig\ dev\ switch0\ show.log 1633480463-swconfig\ dev\ switch0\ show.log
--- "1633480443-swconfig dev switch0 show.log" 2021-10-06 02:34:03.000000000 +0200
+++ "1633480463-swconfig dev switch0 show.log" 2021-10-06 02:34:23.000000000 +0200
@@ -7,22 +7,22 @@
linkdown: ???
Port 0:
mib: Port 0 MIB counters
-RxBroad : 472793
+RxBroad : 472882
RxPause : 0
-RxMulti : 36772
+RxMulti : 36854
RxFcsErr : 0
RxAlignErr : 0
RxRunt : 0
RxFragment : 0
-Rx64Byte : 2075611
-Rx128Byte : 20964917
-Rx256Byte : 16459560
-Rx512Byte : 623700
-Rx1024Byte : 907303
-Rx1518Byte : 48003389
+Rx64Byte : 2075618
+Rx128Byte : 20964980
+Rx256Byte : 16459583
+Rx512Byte : 623752
+Rx1024Byte : 907344
+Rx1518Byte : 48003397
RxMaxByte : 21422694
RxTooLong : 0
-RxGoodByte : 107718798589
+RxGoodByte : 107718869396
RxBadByte : 0
RxOverFlow : 0
Filtered : 0
@@ -51,38 +51,38 @@
link: port:0 link:up speed:1000baseT full-duplex txflow rxflow
Port 1:
mib: Port 1 MIB counters
-RxBroad : 107158
+RxBroad : 107267
RxPause : 0
-RxMulti : 1147
+RxMulti : 1173
RxFcsErr : 0
RxAlignErr : 0
RxRunt : 0
RxFragment : 0
-Rx64Byte : 1536
-Rx128Byte : 68123
-Rx256Byte : 20668
-Rx512Byte : 4912
-Rx1024Byte : 10327
-Rx1518Byte : 90128
-RxMaxByte : 9851
+Rx64Byte : 1555
+Rx128Byte : 68180
+Rx256Byte : 20673
+Rx512Byte : 4925
+Rx1024Byte : 10332
+Rx1518Byte : 90209
+RxMaxByte : 9860
RxTooLong : 0
-RxGoodByte : 166901240
+RxGoodByte : 167051151
RxBadByte : 0
RxOverFlow : 0
-Filtered : 118
-TxBroad : 856851
-TxPause : 1074
-TxMulti : 89960
+Filtered : 307
+TxBroad : 856940
+TxPause : 3442
+TxMulti : 90042
TxUnderRun : 0
-Tx64Byte : 38606
-Tx128Byte : 206906
-Tx256Byte : 56950
-Tx512Byte : 27986
-Tx1024Byte : 60718
-Tx1518Byte : 677209
+Tx64Byte : 40978
+Tx128Byte : 206965
+Tx256Byte : 56973
+Tx512Byte : 28038
+Tx1024Byte : 60759
+Tx1518Byte : 677217
TxMaxByte : 74048
TxOverSize : 0
-TxByte : 1186157213
+TxByte : 1186378970
TxCollision : 0
TxAbortCol : 0
TxMultiCol : 0
@@ -95,38 +95,38 @@
link: port:1 link:up speed:1000baseT full-duplex txflow rxflow auto
Port 2:
mib: Port 2 MIB counters
-RxBroad : 170588
+RxBroad : 170832
RxPause : 0
-RxMulti : 11717
+RxMulti : 11767
RxFcsErr : 0
RxAlignErr : 0
RxRunt : 0
RxFragment : 0
-Rx64Byte : 28452
-Rx128Byte : 6337895
-Rx256Byte : 408749
-Rx512Byte : 54975
-Rx1024Byte : 62053
-Rx1518Byte : 343977
-RxMaxByte : 130630
+Rx64Byte : 28455
+Rx128Byte : 6338150
+Rx256Byte : 408825
+Rx512Byte : 55001
+Rx1024Byte : 62066
+Rx1518Byte : 344149
+RxMaxByte : 130646
RxTooLong : 0
-RxGoodByte : 1345452080
+RxGoodByte : 1345783813
RxBadByte : 0
RxOverFlow : 0
-Filtered : 574
-TxBroad : 793647
-TxPause : 1151
-TxMulti : 79418
+Filtered : 1135
+TxBroad : 793736
+TxPause : 3519
+TxMulti : 79500
TxUnderRun : 0
-Tx64Byte : 97711
-Tx128Byte : 603920
-Tx256Byte : 173282
-Tx512Byte : 76156
-Tx1024Byte : 104686
-Tx1518Byte : 3582480
+Tx64Byte : 100079
+Tx128Byte : 603969
+Tx256Byte : 173305
+Tx512Byte : 76208
+Tx1024Byte : 104727
+Tx1518Byte : 3582488
TxMaxByte : 7391371
TxOverSize : 0
-TxByte : 16528781103
+TxByte : 16529001614
TxCollision : 0
TxAbortCol : 0
TxMultiCol : 0
@@ -139,38 +139,38 @@
link: port:2 link:up speed:1000baseT full-duplex txflow rxflow auto
Port 3:
mib: Port 3 MIB counters
-RxBroad : 159441
+RxBroad : 159671
RxPause : 0
-RxMulti : 18188
+RxMulti : 18257
RxFcsErr : 0
RxAlignErr : 0
RxRunt : 0
RxFragment : 0
-Rx64Byte : 9911
-Rx128Byte : 3051611
-Rx256Byte : 711251
-Rx512Byte : 377985
-Rx1024Byte : 459307
-Rx1518Byte : 46523339
-RxMaxByte : 21133903
+Rx64Byte : 9932
+Rx128Byte : 3052044
+Rx256Byte : 711326
+Rx512Byte : 378003
+Rx1024Byte : 459361
+Rx1518Byte : 46523507
+RxMaxByte : 21133924
RxTooLong : 0
-RxGoodByte : 100988636934
+RxGoodByte : 100989010961
RxBadByte : 0
RxOverFlow : 0
-Filtered : 36461
-TxBroad : 804786
-TxPause : 6016
-TxMulti : 72948
+Filtered : 37251
+TxBroad : 804875
+TxPause : 8384
+TxMulti : 73030
TxUnderRun : 0
-Tx64Byte : 1661972
-Tx128Byte : 18816233
-Tx256Byte : 15830078
-Tx512Byte : 276436
-Tx1024Byte : 524862
-Tx1518Byte : 3198326
+Tx64Byte : 1664343
+Tx128Byte : 18816282
+Tx256Byte : 15830101
+Tx512Byte : 276488
+Tx1024Byte : 524903
+Tx1518Byte : 3198334
TxMaxByte : 426878
TxOverSize : 0
-TxByte : 9485704082
+TxByte : 9485924799
TxCollision : 0
TxAbortCol : 0
TxMultiCol : 0
@@ -202,19 +202,19 @@
RxBadByte : 871047744
RxOverFlow : 0
Filtered : 99
-TxBroad : 909305
-TxPause : 4319
-TxMulti : 67716
+TxBroad : 909394
+TxPause : 6687
+TxMulti : 67798
TxUnderRun : 0
-Tx64Byte : 335196
-Tx128Byte : 1832541
-Tx256Byte : 520952
-Tx512Byte : 320853
-Tx1024Byte : 412422
-Tx1518Byte : 42645040
+Tx64Byte : 337564
+Tx128Byte : 1832588
+Tx256Byte : 520975
+Tx512Byte : 320905
+Tx1024Byte : 412463
+Tx1518Byte : 42645048
TxMaxByte : 13773617
TxOverSize : 0
-TxByte : 84230298783
+TxByte : 84230519096
TxCollision : 0
TxAbortCol : 0
TxMultiCol : 0
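To read the diff above more easily, the changed counters can be reduced to per-counter deltas over the 20-second interval. The following is a rough sketch under the assumption that the input is `diff -u` output in the layout shown above; `counter_deltas` is a hypothetical helper, not an existing tool. Hunks that carry no `Port N:` context line (such as the last one) are keyed with port `None`.

```python
import re

# Context lines such as " Port 1:" identify the port a hunk belongs to.
PORT_RE = re.compile(r"^\s*Port (\d+):\s*$")
# Removed/added counter lines such as "-TxPause : 1074" / "+TxPause : 3442".
CHANGE_RE = re.compile(r"^([+-])\s*(\w+)\s*:\s*(\d+)\s*$")


def counter_deltas(diff_text):
    """Return {(port, counter): new_value - old_value} for changed counters."""
    old, new = {}, {}
    port = None
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            port = None  # new hunk: port unknown until a "Port N:" line
            continue
        m = PORT_RE.match(line)
        if m:
            port = int(m.group(1))
            continue
        m = CHANGE_RE.match(line)
        if m:
            sign, name, value = m.groups()
            target = old if sign == "-" else new
            target[(port, name)] = int(value)
    # Only counters that appear on both sides of the diff have a delta.
    return {key: new[key] - old[key] for key in old if key in new}
```

Applied to the diff above, this gives a `TxPause` delta of 2368 for ports 1, 2 and 3 and for the final hunk.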
More information can be found at the following URL:
https://bugs.openwrt.org/index.php?do=details&task_id=4067
You are receiving this message because you have requested it from the Flyspray bugtracking system. If you did not expect this message or don't want to receive mails in future, you can change your notification settings at the URL shown above.