[OpenWrt-Devel] mac80211: 802.11s TCP/IP Connectivity Fails After 2018-09

Jeff Kletsky lede at allycomm.com
Tue Feb 5 00:24:20 EST 2019

On master, for commits after late September 2018, an 802.11s mesh does
not appear to transport TCP/IP even though the mesh itself appears up.

(Steps to replicate at the end of this message)

The output of `iw dev mesh0 station dump` yields similar results for
working and non-working builds, along the lines of:

     mesh plink:    ESTAB
     mesh local PS mode:    ACTIVE
     mesh peer PS mode:    ACTIVE
     mesh non-peer PS mode:    ACTIVE
     authorized:    yes
     authenticated:    yes
     associated:    yes


     mesh plink:    ESTAB
     mesh local PS mode:    ACTIVE
     mesh peer PS mode:    UNKNOWN
     mesh non-peer PS mode:    ACTIVE
     authorized:    yes
     authenticated:    yes
     associated:    yes
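
For comparing builds side by side, the peer-link state can be pulled out
of the station dump. This is only a minimal sketch: the helper name
`plink_states` is hypothetical, and it assumes the usual `iw` layout in
which each peer block starts with a `Station <mac> (on <ifname>)` line.

```shell
# Print "<peer MAC> <plink state>" for each station read from stdin.
# Assumes each peer block begins with "Station <mac> (on <ifname>)".
plink_states() {
    awk '
        $1 == "Station" { mac = $2 }        # remember the current peer
        /mesh plink:/   { print mac, $NF }  # last field is the state
    '
}

# Intended use on a router (not executed here):
#   iw dev mesh0 station dump | plink_states
```

Since both working and non-working builds report ESTAB, this only
confirms that peering is up; it says nothing about whether frames are
actually carried.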

Ping to the peer(s) succeeds on builds prior to the first failing
commit, and the peer is reflected in the ARP table.  Full connectivity
can be established over the mesh to peers with UDP or TCP transport.

Both ping and arp fail to behave as expected on or after the first
failing commit.

The routing is the same for both builds.
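
The ping/ARP check described above can be reduced to reading the
neighbor-table state for a peer. A minimal sketch follows; the helper
name `neigh_state` and the address 172.16.0.2 are hypothetical, and it
assumes the usual `ip neigh` output where the state is the last field.

```shell
# Report the neighbor-table state for one peer IP, reading
# `ip neigh show` style output from stdin. Prints ABSENT if the
# address has no entry at all.
neigh_state() {
    awk -v ip="$1" '
        $1 == ip { print $NF; found = 1 }
        END      { if (!found) print "ABSENT" }
    '
}

# Intended use on a router (not executed here; address is an example):
#   ping -c 3 172.16.0.2
#   ip neigh show dev mesh0 | neigh_state 172.16.0.2
#   ip route show
```

A state such as REACHABLE or STALE would indicate that ARP resolved
over the mesh, while FAILED, INCOMPLETE, or ABSENT would match the
behavior seen on the failing builds.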

The behavior is seen on both an Archer C7 v2 (QCA9880-BR4A v2) and a
GL.iNet AR750S (QCA9887), both on the ath79 target (AR750S is WIP).

Confirmed that the newer (newest?) firmware (ver 10.2.4-1.0-00041)
works on the "old" build.

Lack of TCP/IP transport over the mesh was confirmed both with 802.11s
integral routing and with 802.11s routing disabled and replaced by
batman-adv.  A configuration using both 802.11s routing and batman-adv
on distinct meshes on the same radios has been running successfully on
four or five Archer C7 v2 units, providing both backhaul over the
5 GHz mesh and client access on 5 GHz.  (Not tested on 2.4 GHz on
either the Archer C7 or the AR750S.)

git bisect was used to identify the "first failing commit":

commit db90c243a0b9bd72fc691cd09e58a96ac2a452cf
Author: Hauke Mehrtens <redacted>
Date:   Sun Sep 23 18:02:35 2018 +0200

     mac80211: update to version based on 4.19-rc4
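
For anyone wanting to repeat the bisect, the general shape is sketched
below. The helper name `start_bisect` is hypothetical, and the good and
bad refs are left as arguments because the exact endpoints used are not
listed here. Each step requires a manual build, flash, and ping test.

```shell
# Begin a bisect between a ref known to work and a ref known to fail.
# Usage: start_bisect <good-ref> <bad-ref>
start_bisect() {
    good_ref="$1"
    bad_ref="$2"
    git bisect start "$bad_ref" "$good_ref"
}

# At each step: build, flash, bring up the mesh, ping a peer, then mark:
#   git bisect good    # peer answers ping and shows up in arp
#   git bisect bad     # mesh is ESTAB but ping and arp fail
# When git reports the first bad commit:
#   git bisect reset
```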

Note that the Candela Technologies drivers do not appear to support
mesh on either chipset.

Any suggestions as to how I can further explore this issue, or assist
others in the process?


Additional notes at

Thanks to @slh and Koen for hopping on things there!

**To Replicate:**

==> Pick your target of choice

$ ./scripts/diffconfig.sh
# CONFIG_PACKAGE_ath10k-firmware-qca9887-ct is not set
# CONFIG_PACKAGE_kmod-ath10k-ct is not set
# CONFIG_PACKAGE_kmod-hwmon-core is not set
# CONFIG_PACKAGE_wpad-basic is not set

==> Add to `/etc/config/network`

config interface 'nwi_mesh0'
     option ifname 'mesh0'
     option mtu '2304'
     option proto 'static'
     option ipaddr '172.16.0.NNN'
     option netmask ''

==> Add to / replace in `/etc/config/wireless`
       * Please confirm a valid channel for your location
       * Adjust the device path/reference as needed
       * 'n' used for support of older iPhones here

config wifi-device 'radio5'
     option type 'mac80211'
     option channel '149'
     option hwmode '11a'
     option path 'pci0000:00/0000:00:00.0'
     option htmode 'VHT80'
     option require_mode 'n'

config wifi-iface 'mesh0'
     option device 'radio5'
     option ifname 'mesh0'
     option network 'nwi_mesh0'
     option mode 'mesh'
     option mesh_id 'TestMesh'
     option mesh_fwding '1'
     option encryption 'psk2+ccmp'
     option key 'TestMeshPassPhrase'

==> Flash and configure additional unit(s), changing the last octet
of the nwi_mesh0 interface address.

==> Note that the mesh appears established from the output of
`iw dev mesh0 station dump`

==> Note that one unit cannot ping the other’s nwi_mesh0 address when
built on or after the "first bad commit", but ping (along with arp and
ip neigh) works as expected prior to that commit.

