[OpenWrt-Devel] Autoneg problems with ar8216 on kernel 3.14 (related to ticket 17800)

Heiner Kallweit hkallweit1 at gmail.com
Mon Nov 24 03:45:59 EST 2014


On Mon, Nov 24, 2014 at 1:06 AM, Felix Fietkau <nbd at openwrt.org> wrote:

> On 2014-11-23 23:03, Weedy wrote:
> >
> > On Nov 23, 2014 4:35 PM, "Weedy" <weedy2887 at gmail.com
> > <mailto:weedy2887 at gmail.com>> wrote:
> >>
> >>
> >> On Nov 19, 2014 8:47 AM, "John Crispin" <blogic at openwrt.org
> > <mailto:blogic at openwrt.org>> wrote:
> >> >
> >> >
> >> >
> >> > On 19/11/2014 14:44, Weedy wrote:
> >> > >
> >> > > On Oct 30, 2014 5:08 PM, "Heiner Kallweit" <hkallweit1 at gmail.com
> > <mailto:hkallweit1 at gmail.com>
> >> > > <mailto:hkallweit1 at gmail.com <mailto:hkallweit1 at gmail.com>>> wrote:
> >> > >>
> >> > >> Am 30.10.2014 um 07:22 schrieb Heiner Kallweit:
> >> > >> > Am 30.10.2014 um 03:36 schrieb Florian Fainelli:
> >> > >> >> Le 28/10/2014 11:46, Heiner Kallweit a écrit :
> >> > >> >>> After a little more thinking about it and looking at the code I
> >> > > basically have two questions:
> >> > >> >>>
> >> > >> >>> 1.
> >> > >> >>> Is it actually needed to exclude certain phy's in
> >> > > ar8xxx_phy_config_aneg?
> >> > >> >>> (At least for AR8327 it doesn't get called with an addr != 0
> > anyway)
> >> > >> >>> Else we could remove ar8xxx_phy_config_aneg and directly
> register
> >> > > genphy_config_aneg as
> >> > >> >>> callback for autoneg configuration.
> >> > >> >>
> >> > >> >> Address 0 is special, since this is a MDIO broadcast address
> that
> >> > > will usually be handled by switches as "write to all front-panel
> > ports".
> >> > >> >>
> >> > >> >>>
> >> > >> >>> 2.
> >> > >> >>> Call sequence with regard to ar8216 is:
> >> > >> >>> ar8216: ar8xxx_phy_probe
> >> > >> >>> phy: phy_attach_direct
> >> > >> >>> phy: phy_init_hw
> >> > >> >>> ar8216: ar8xxx_phy_config_init
> >> > >> >>> ar8216: ar8xxx_phy_config_aneg
> >> > >> >>>
> >> > >> >>> ar8216 driver handles AR8327/AR8337 different from the other
> >> > > supported switch types.
> >> > >> >>> ar8xxx_start incl. more detailed configuration is called from
> >> > > ar8xxx_phy_probe already.
> >> > >> >>> For the other switch types it's called from
> > ar8xxx_phy_config_init.
> >> > >> >>>
> >> > >> >>> I wonder whether doing detailed configuration in the probe
> stage
> >> > > might be too early.
> >> > >> >>> phy_init_hw resets the switch anyway later.
> >> > >> >>> Doing the detailed setup in ar8xxx_phy_config_init seems to be
> >> > > more suited however
> >> > >> >>> there might be good reason why it's handled the way it is.
> >> > >> >>
> >> > >> >> I suppose that you could re-advertise auto-negotiation by
> calling
> >> > > genphy_config_advert() in the config_init routine.
> >> > >> >
> >> > >> > The actual problem is caused by BMCR_ANENABLE being cleared by
> the
> >> > > reset in phy_init_hw.
> >> > >> > As far as I can see genphy_config_advert doesn't bring back
> > this flag.
> >> > >> > What does genphy_config_aneg mainly do?
> >> > >> >   - call genphy_config_advert
> >> > >> >   - check if BMCR_ANENABLE is set
> >> > >> >   - if it's not, call genphy_restart_aneg
> >> > >> > Therefore, to bring back BMCR_ANENABLE, we have to call
> >> > > genphy_config_aneg or genphy_restart_aneg.
> >> > >> > genphy_restart_aneg most likely is sufficient, however I don't
> see
> >> > > that genphy_config_aneg
> >> > >> > can do any harm if being called with addr == 0.
> >> > >> > At least for me it works perfectly fine to call
> genphy_config_aneg
> >> > > for all addr's.
> >> > >> >
> >> > >> > Rgds, Heiner
> >> > >> >
> >> > >> >>
> >> > >> >>>
> >> > >> >>> Rgds,
> >> > >> >>> Heiner
> >> > >> >>>
> >> > >> >>>
> >> > >> >>> Am 27.10.2014 um 22:00 schrieb Heiner Kallweit:
> >> > >> >>>> With two different TPLINK routers (both with a AR8327N
> switch) I
> >> > > faced the issue that with
> >> > >> >>>> kernel 3.14 certain switch ports used 10MBit/half only.
> >> > >> >>>> Under kernel 3.10 everything was fine and the same ports used
> >> > > 1000MBit/full.
> >> > >> >>>>
> >> > >> >>>> As the ar8216 driver is the same I had a look at the common
> phy
> >> > > code in drivers/net/phy.
> >> > >> >>>> In function phy_init_hw in phy_device.c kernel 3.14 resets the
> >> > > phy whilst 3.10 does not.
> >> > >> >>>>
> >> > >> >>>> The issue turned out to be that when resetting only flag
> >> > > BMCR_RESET is set, not BMCR_ANENABLE.
> >> > >> >>>> (In ar8216.c always both flags are set when the switch is
> reset)
> >> > >> >>>> Therefore autoneg was not enabled. Also later in the boot
> > process
> >> > > it doesn't get enabled.
> >> > >> >>>> Adding BMCR_ANENABLE solved the problem and now also under
> 3.14
> >> > > all ports use 1000MBit/full.
> >> > >> >>>>
> >> > >> >>>> However I'm not sure whether it's a poper fix to add
> >> > > BMCR_ANENABLE in this generic phy function.
> >> > >> >>>> It might not be appropriate for other phy's.
> >> > >> >>>>
> >> > >> >>>> After resetting the switch later in the boot process
> >> > > ar8xxx_phy_config_aneg is called with
> >> > >> >>>> phydev->addr being 0.
> >> > >> >>>> In this case the function returns immediately. Otherwise it
> > would
> >> > > call genphy_config_aneg.
> >> > >> >>>> Calling genphy_config_aneg would also solve the problem as it
> >> > > checks for BMCR_ANENABLE
> >> > >> >>>> being set and sets it if needed.
> >> > >> >>>> I don't know the reason why genphy_config_aneg isn't called in
> >> > > case of addr 0.
> >> > >> >>>> Or is this a typo and the check actually should be addr != 0 ?
> >> > >> >>>>
> >> > >> >>>> Rgds, Heiner
> >> > >> >>>>
> >> > >>
> >> > >> The following rudimentary patch works fine for me with kernel
> > 3.14.18 on
> >> > >> TP-LINK TL-WDR4900 (mpc85xx with AR8327Nv4)
> >> > >> TP-LINK TL-WDR4300 ( ar71xx with AR8327Nv2)
> >> > >>
> >> > >> Apart from using genphy_config_aneg also for addr==0 I replaced
> >> > > msleep(1000)
> >> > >> with a polling function inspired by phy_poll_reset in phy_device.c
> >> > >> On AR8327 the reset actually takes less than 20ms. Sleeping for
> > 1000ms
> >> > >> seems to be a waste of time.
> >> > >>
> >> > >> Little more work would be needed to make it a proper patch:
> >> > >> - move ar8xxx_phy_poll_reset more to the beginning so that it is
> > defined
> >> > >>   before being used in any xxxx_hw_init function
> >> > >> - replace msleep(1000) also in the other xxxx_hw_init functions
> >> > >> - remove pr_info debug message or make it at least pr_dbg
> >> > >> - insert some comments
> >> > >> - use git format-patch output
> >> > >>
> >> > >> Rgds, Heiner
> >> > >>
> >> > >>
> >> > >> --- ar8216.c.orig       2014-10-30 21:50:19.999723156 +0100
> >> > >> +++ ar8216.c    2014-10-30 21:42:11.996481099 +0100
> >> > >> @@ -1591,6 +1591,29 @@
> >> > >>  #endif
> >> > >>
> >> > >>  static int
> >> > >> +ar8xxx_phy_poll_reset(struct mii_bus *bus, int num_phys)
> >> > >> +{
> >> > >> +       unsigned int sleep_msecs = 20;
> >> > >> +       int ret, elapsed, i;
> >> > >> +
> >> > >> +       for(elapsed = sleep_msecs; elapsed <= 600; elapsed +=
> >> > > sleep_msecs) {
> >> > >> +               msleep(sleep_msecs);
> >> > >> +               for (i = 0; i < num_phys; i++) {
> >> > >> +                       ret = mdiobus_read(bus, i, MII_BMCR);
> >> > >> +                       if (ret < 0) return ret;
> >> > >> +                       if (ret & BMCR_RESET) break;
> >> > >> +                       if (i == num_phys - 1) {
> >> > >> +                               usleep_range(1000, 2000);
> >> > >> +                               pr_info("ar8xxx_phy_poll_reset
> >> > > finished in %u ms\n",
> >> > >> +                                       elapsed);
> >> > >> +                               return 0;
> >> > >> +                       }
> >> > >> +               }
> >> > >> +       }
> >> > >> +       return -ETIMEDOUT;
> >> > >> +}
> >> > >> +
> >> > >> +static int
> >> > >>  ar8327_hw_init(struct ar8xxx_priv *priv)
> >> > >>  {
> >> > >>         struct mii_bus *bus;
> >> > >> @@ -1620,7 +1643,7 @@
> >> > >>                 mdiobus_write(bus, i, MII_BMCR, BMCR_RESET |
> >> > > BMCR_ANENABLE);
> >> > >>         }
> >> > >>
> >> > >> -       msleep(1000);
> >> > >> +       ar8xxx_phy_poll_reset(bus, AR8327_NUM_PHYS);
> >> > >>
> >> > >>         return 0;
> >> > >>  }
> >> > >> @@ -2840,15 +2863,6 @@
> >> > >>         return ret;
> >> > >>  }
> >> > >>
> >> > >> -static int
> >> > >> -ar8xxx_phy_config_aneg(struct phy_device *phydev)
> >> > >> -{
> >> > >> -       if (phydev->addr == 0)
> >> > >> -               return 0;
> >> > >> -
> >> > >> -       return genphy_config_aneg(phydev);
> >> > >> -}
> >> > >> -
> >> > >>  static const u32 ar8xxx_phy_ids[] = {
> >> > >>         0x004dd033,
> >> > >>         0x004dd034, /* AR8327 */
> >> > >> @@ -3020,7 +3034,7 @@
> >> > >>         .remove         = ar8xxx_phy_remove,
> >> > >>         .detach         = ar8xxx_phy_detach,
> >> > >>         .config_init    = ar8xxx_phy_config_init,
> >> > >> -       .config_aneg    = ar8xxx_phy_config_aneg,
> >> > >> +       .config_aneg    = genphy_config_aneg,
> >> > >>         .read_status    = ar8xxx_phy_read_status,
> >> > >>         .driver         = { .owner = THIS_MODULE },
> >> > >>  };
> >> > >> _______________________________________________
> >> > >> openwrt-devel mailing list
> >> > >> openwrt-devel at lists.openwrt.org
> > <mailto:openwrt-devel at lists.openwrt.org>
> > <mailto:openwrt-devel at lists.openwrt.org
> > <mailto:openwrt-devel at lists.openwrt.org>>
> >> > >> https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
> >> > >
> >> > > I feel like this shouldn't die off in the depths of the ML
> >> > >
> >> >
> >> > we had a call about it today, this patch and the other ar821x patches
> >> > will be handled soon
> >>
> >> You guys bumped trunk to 3.14 but didn't fix this.
> >
> > It's also broken in 43342.
> >
> > For me this manifests as the WAN port is stuck at 1000T so my 100T DSL
> > modem doesn't exist. Lucky I keep spare switches around.
> On what device is it broken for you? I did apply Heiner Kallweit's
> fixes, and it worked for me.
>
> - Felix

The problem with the non-working WAN port was reported to me also by others.
Using genphy_config_aneg with phy 0 seems to cause this.
The following fix which I sent Nov 21st is still waiting to be applied.

ar8216: Fix issue with autoneg being disabled under 3.14, revert 43332

Heiner
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.infradead.org/pipermail/openwrt-devel/attachments/20141124/1066ea7a/attachment.htm>
-------------- next part --------------
_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


More information about the openwrt-devel mailing list