Bridge-vlan bug? (mt7621/DSA)

Thibaut hacks at slashdirt.org
Fri Aug 5 07:39:36 PDT 2022


Hi,

I’m experiencing a strange bug on a Yuncore AX820 (mt7621/mt7905/mt7975, DSA-enabled) when using a bridge-vlan setup. This bug affects at least OpenWrt 22.03.0-rc6.

I’m not sure whether this bug is related to this particular SoC or only to DSA as I was unable to test with another DSA-enabled device (I don’t have any). However this bug does not affect e.g. QCA non-DSA devices.

I’m running out of ideas on how to further debug this problem, so feel free to guide me if more information is needed.
Please CC me in replies.


== Hardware setup ==

- 1 router (any router works for the purpose of the test), serving DHCP on the LAN (the default configuration from a fresh OpenWrt install is enough to reproduce this bug - the router setup plays no part in the bug).
- 1 AX820 set up as a « dumb » AP (testcase config provided below, using a bridge-vlan), with one uplink interface (here ‘wan’) directly connected to the router
- 1 other AP, make/model irrelevant, provided it has the same dumb config as the AX820 and is also directly connected to the router

The APs use a single bridge-vlan to which their interfaces are hooked: in the full scenario multiple VLANs are assigned to the bridge, each mapped to a separate SSID. All but one of the VLANs are tagged on the uplink interface. The reduced test case config provided below uses a single untagged VLAN (id 8, for network ‘lan’) and a single SSID: that is enough to expose the bug.


== Bug description ==

The following bug happens on the untagged VLAN on the uplink interface (see testcase config below):

When a client device roams to the AX820 AP (which can be forced by issuing « wifi off » on the other AP while the client is connected to it), a « blackout » period begins, typically lasting 2-5 minutes, during which the client loses connectivity.


== Analysis ==

Running tcpdump, one obvious symptom is that the client emits DHCP requests which are received by the router, the router sends back DHCP replies (confirmed via tcpdump on router) but these replies never reach the client during the blackout period.

In fact, running tcpdump on all of the connected AP’s interfaces (wireless (wlan0), DSA slave (wan), DSA master (eth0), 8021q (vlan-lan), bridge-vlan (vbridge0)) shows no DHCP reply ever being captured during this blackout period, until one finally makes it through when the blackout ends.
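
For anyone wanting to repeat these captures, here is a minimal sketch of how they could be started on the AP. The interface names are the ones listed above; the output paths under /tmp, the DHCP filter expression and the capture_cmd helper are assumptions of this sketch, not part of the original test. The script only prints the tcpdump command lines so they can be reviewed first. The DSA master (eth0) is captured without a filter, on the assumption that a port filter cannot match frames that still carry the switch tag.

```shell
#!/bin/sh
# Interfaces named in this report; adjust to the device under test.
IFACES="wlan0 wan eth0 vlan-lan vbridge0"
# DSA master: its frames still carry the switch tag (assumption: filter would not match).
DSA_MASTER="eth0"
# Capture filter for DHCP traffic in both directions (UDP ports 67 and 68).
FILTER="udp and (port 67 or port 68)"

# Print (do not run) the capture command for one interface.
# capture_cmd is a hypothetical helper introduced for this sketch.
capture_cmd() {
    if [ "$1" = "$DSA_MASTER" ]; then
        # Unfiltered capture; decode later after stripping the tags with editcap.
        printf 'tcpdump -i %s -w /tmp/%s.pcap\n' "$1" "$1"
    else
        printf 'tcpdump -i %s -w /tmp/%s.pcap "%s"\n' "$1" "$1" "$FILTER"
    fi
}

for i in $IFACES; do
    capture_cmd "$i"
done
```

Each printed line can then be run in its own shell (or backgrounded with « & ») before forcing the client to roam.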


== Known unaffected scenarios ==

If the VLAN is configured tagged on the uplink interface (using « list ports 'wan:t' ») - and the router is set up to use tagged frames as well of course - the bug does not occur.
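
For reference, a sketch of what the bridge-vlan section of the AP looks like in this unaffected tagged variant, assuming everything else in the reduced testcase config stays unchanged (the matching router-side tagging is not shown, as it depends on the router used):

```
config bridge-vlan
	option vlan '8'
	option device 'vbridge0'
	list ports 'wan:t'
	list ports 'lan:u*'
```

The only difference from the failing testcase is the 'wan' port entry, which goes from untagged PVID ('wan:u*') to tagged ('wan:t').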

If the slaves are configured with regular ‘br-lan’ bridges (no vlans), the bug does not occur: it seems tied to using a bridge-vlan.

« Roaming » a wired device from one AP to the other (through the free ethernet port configured untagged, see testcase config below) does not trigger the bug: wireless seems to be a key part of this problem.

Finally, the exact same AP configuration used on non-DSA QCA9533-based devices works flawlessly.


== Other remarks ==

To decode DSA master interface (eth0) captures, I used editcap (from Wireshark) as follows:
$ editcap -L -T ether -C 12:4 dsamaster.cap master.cap
This removes the mtk DSA tags that libpcap cannot parse.


== Reduced testcase AP configs ==

/etc/config/network (loopback config not quoted, adjust ipaddr for each AP):

config interface 'lan'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option device 'vlan-lan'
	option ipaddr '192.168.1.2'
	option gateway '192.168.1.1'
	option dns '192.168.1.1'

config device
	option type 'bridge'
	option name 'vbridge0'
	option ipv6 '0'
	option vlan_filtering '1'
	list ports 'lan'
	list ports 'wan'

config device
	option type '8021q'
	option ifname 'vbridge0'
	option vid '8'
	option name 'vlan-lan'
	option ipv6 '0'

config bridge-vlan
	option vlan '8'
	option device 'vbridge0'
	list ports 'wan:u*'
	list ports 'lan:u*'


/etc/config/wireless (wifi-device not quoted):

config wifi-iface 'radio0_test'
	option device 'radio0'
	option mode 'ap'
	option network 'lan'
	option ssid 'test'
	option encryption 'none'



HTH
Thibaut


