[PATCH v2] mac80211: fix regression in sta connection monitor

Georgi Valkov gvalkov at abv.bg
Sat Sep 26 00:41:42 EDT 2020


Hi Felix!

With your latest suggestion, it takes between 10 and 17 hours for the connection to drop, then long five minutes to reconnect.
Notice the order of code execution in the original code of ieee80211_sta_tx_notify():
probe_send_count is always cleared when ack is true. But before clearing probe_send_count, we need to check if ieee80211_is_any_nullfunc() returns true and probe_send_count > 0: if yes, we must call ieee80211_queue_work(). You cleared probe_send_count ahead of time, so the condition to run ieee80211_queue_work() will never be met.

To spare your time, I did spend one week to find the cause, then another learning every detail about the code and testing various solutions, including those you proposed. While I do not have experience with mac80211’s design, I’m quite good at preserving the exact behaviour during large scale refactoring. And in my fix I tried changing as little as possible to keep the patch small, preserving both your changes and the original design behaviour.

Good luck, and thank you for being a long time supporter of this amazing project!

Georgi

> On 2020-09-22, at 1:24 PM, Felix Fietkau <nbd at nbd.name> wrote:
> 
> When a frame was acked and probe frames were sent, the connection monitoring
> needs to be reset, otherwise it will keep probing until the connection is
> considered dead, even though frames have been acked in the mean time.
> 
> Fixes: 9abf4e49830d ("mac80211: optimize station connection monitor")
> Reported-by: Georgi Valkov <gvalkov at abv.bg>
> Signed-off-by: Felix Fietkau <nbd at nbd.name>
> ---
> v2: reset connection monitor when a frame was acked (not just for nulldata)
> 
> net/mac80211/status.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
> 
> diff --git a/net/mac80211/status.c b/net/mac80211/status.c
> index 7fe5bececfd9..cc870d1f7db6 100644
> --- a/net/mac80211/status.c
> +++ b/net/mac80211/status.c
> @@ -1120,6 +1120,8 @@ void ieee80211_tx_status_ext(struct ieee80211_hw *hw,
> 	noack_success = !!(info->flags & IEEE80211_TX_STAT_NOACK_TRANSMITTED);
> 
> 	if (pubsta) {
> +		struct ieee80211_sub_if_data *sdata = sta->sdata;
> +
> 		if (!acked && !noack_success)
> 			sta->status_stats.retry_failed++;
> 		sta->status_stats.retry_count += retry_count;
> @@ -1134,6 +1136,13 @@ void ieee80211_tx_status_ext(struct ieee80211_hw *hw,
> 				/* Track when last packet was ACKed */
> 				sta->status_stats.last_pkt_time = jiffies;
> 
> +				/* Reset connection monitor */
> +				if (sdata->vif.type == NL80211_IFTYPE_STATION &&
> +				    unlikely(sdata->u.mgd.probe_send_count > 0)) {
> +					sdata->u.mgd.probe_send_count = 0;
> +					ieee80211_queue_work(&local->hw, &sdata->work);
> +				}
> +
> 				if (info->status.is_valid_ack_signal) {
> 					sta->status_stats.last_ack_signal =
> 							 (s8)info->status.ack_signal;
> -- 
> 2.28.0
> 
> 




More information about the openwrt-devel mailing list