[OpenWrt-Devel] wifi dropouts; regression since 2015 still not fixed

Luke Dashjr luke at dashjr.org
Sat May 26 13:59:45 EDT 2018

On Saturday 26 May 2018 17:11:56 you wrote:
> On Sat, May 26, 2018 at 12:34:30PM +0000, Luke Dashjr wrote:
> > Half a year ago, I went to the trouble to identify the exact cause of the
> > regression since 2015 where Openwrt/LEDE would drop off wifi every few
> > hours:
> >
> > https://bugs.openwrt.org/index.php?do=details&task_id=1180
> >
> > There's even a clear and obvious solution: just remove the broken patch.
> Thank you for bringing this up. I haven't been aware of the ticket on
> the bugtracker and can confirm this or a very similar problem on all
> ath9k devices.
> So do I get right that you can also confirm the problem exists and have
> tried significantly hard to make sure that removing patch
> 355-ath9k-limit-retries-for-powersave-response-frames.patch
> and/or completely disabling power-management resolves it?
> If so, please share in more detail what exactly you have tested, on
> which exact hardware and in which way(s) you can reproduce that the
> problem does exist for sure or doesn't exist for sure.

My personal setup for testing is as follows:

Two Nest thermostats. Once they have connected to my network, I shut off the
NestLabs application (which sets up the wifi, among the thermostats' primary
controls) and launch my own (which doesn't handle wifi). They are relevant
only as detectors of bad wifi conditions; they could presumably be replaced
with any other device that connects once and does not reconnect when the
network fails.

Buffalo WZR-600DHP router. Until 6 months ago, I ran Attitude Adjustment for
years with no issues. Because of security exploits (I forget which), I tried
upgrading to LEDE, and found the wifi would drop out after a few hours. I
tolerated this problem for a few weeks before going to the effort of an
exhaustive git bisect: at each step I built a new firmware image and ran it
until it had either been stable for a few days or (usually within a day)
dropped out.
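
For anyone unfamiliar with the procedure, the bisect amounts to a binary
search over the commit history. A minimal Python sketch, using a made-up
commit list and a stand-in test function (these are illustrative assumptions,
not the actual commits or tooling used):

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad (the same assumption git bisect makes).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # In the real procedure this call is "build, flash, wait a few days".
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history of 16 builds; "c11" stands in for the offending commit.
history = [f"c{i}" for i in range(16)]
tested = []


def drops_wifi(commit: str) -> bool:
    """Stand-in for the manual stability test; records what was tried."""
    tested.append(commit)
    return int(commit[1:]) >= 11


culprit = first_bad(history, drops_wifi)
```

Each halving costs one build plus the observation period, so the number of
firmware images to test grows only with the logarithm of the number of
commits in the range.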

Last November, after isolating the problematic commit to 
b30e092de65ca7be7cb277f934016484137d924c, and the problematic patch to (at 
the time) 305-ath9k-limit-retries-for-powersave-response-frames.patch, I 
checked out 6b6578feec74dfe1f5767c573d75ba08cc57c885 (HEAD of the lede-17.01 
branch at the time, I believe) and created the following patch:

@@ -20,7 +20,7 @@ Signed-off-by: Felix Fietkau <nbd at openwrt.org>
 -                        struct ath_buf *bf)
 +                        struct ath_buf *bf, bool ps)
-+      struct ieee80211_tx_info *info = IEEE80211_SKB_CB(bf->bf_mpdu);
++      struct ieee80211_tx_info *info = IEEE80211_SKB_CB(bf->bf_mpdu); 
 +      if (ps) {
 +              /* Clear the first rate to avoid using a sample rate for PS frames */

I ran this build on my router from November until February. In February, I 
merged it into 92ea65b36aa783f28accd01fa4850f3640d2c6b6 and upgraded my 
router, and have been running that ever since.

I can't say wifi has been 100% reliable since November, but drop-outs are 
maybe monthly instead of every few hours.

> > Yet still it's been 6 months with nothing done... not even an
> > acknowledgement that the bug report was seen by a human.
> Probably due to the bug description which makes it look like if it was
> a device-specific problem and you could only reproduce it on those
> specific Buffalo boxes...

Don't have anything else to test on :)

> > I even emailed nbd (who added the broken patch) back in November, and
> > heard nothing back from him either.
> >
> > Is Openwrt even being maintained at this point? Has everyone moved on to
> > yet another project or something?
> nah. LEDE has merged back with OpenWrt, as you can see in the git
> history, we have been very active in the recent past and a new release
> is coming up very soon (18.06). It's kinda true that currently nobody
> feels too responsible for ath9k, because most people developing code
> for new hardware have moved on to mt76, ath10k and so on.
> I agree that this is not a very desirable situation and as a community-
> driven project with most contributions being unpaid for, there is not
> much we can do about it. In that sense, you just did something valuable
> by bringing up the culprit patch responsible for all our ath9k problems
> and hopefully someone with that hardware around will find the time to
> come up with a good solution.

Somewhat off-topic, but... is there any company today that financially
supports the work/development of OpenWrt? As noted, my router is a bit out
of date, and when I upgrade, I'd prefer to support a company that supports
the free software firmware community. :)


