[OpenWrt-Devel] Autoneg problems with ar8216 on kernel 3.14 (related to ticket 17800)

Weedy weedy2887 at gmail.com
Mon Nov 24 07:05:19 EST 2014


On Mon, Nov 24, 2014 at 3:45 AM, Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>
> On Mon, Nov 24, 2014 at 1:06 AM, Felix Fietkau <nbd at openwrt.org> wrote:
>>
>> On 2014-11-23 23:03, Weedy wrote:
>> >
>> > On Nov 23, 2014 4:35 PM, "Weedy" <weedy2887 at gmail.com
>> > <mailto:weedy2887 at gmail.com>> wrote:
>> >>
>> >>
>> >> On Nov 19, 2014 8:47 AM, "John Crispin" <blogic at openwrt.org
>> > <mailto:blogic at openwrt.org>> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On 19/11/2014 14:44, Weedy wrote:
>> >> > >
>> >> > > On Oct 30, 2014 5:08 PM, "Heiner Kallweit" <hkallweit1 at gmail.com
>> > <mailto:hkallweit1 at gmail.com>
>> >> > > <mailto:hkallweit1 at gmail.com <mailto:hkallweit1 at gmail.com>>> wrote:
>> >> > >>
>> >> > >> Am 30.10.2014 um 07:22 schrieb Heiner Kallweit:
>> >> > >> > Am 30.10.2014 um 03:36 schrieb Florian Fainelli:
>> >> > >> >> Le 28/10/2014 11:46, Heiner Kallweit a écrit :
>> >> > >> >>> After a little more thinking about it and looking at the code
>> >> > >> >>> I
>> >> > > basically have two questions:
>> >> > >> >>>
>> >> > >> >>> 1.
>> >> > >> >>> Is it actually needed to exclude certain phy's in
>> >> > > ar8xxx_phy_config_aneg?
>> >> > >> >>> (At least for AR8327 it doesn't get called with an addr != 0
>> > anyway)
>> >> > >> >>> Else we could remove ar8xxx_phy_config_aneg and directly
>> >> > >> >>> register
>> >> > > genphy_config_aneg as
>> >> > >> >>> callback for autoneg configuration.
>> >> > >> >>
>> >> > >> >> Address 0 is special, since this is a MDIO broadcast address
>> >> > >> >> that
>> >> > > will usually be handled by switches as "write to all front-panel
>> > ports".
>> >> > >> >>
>> >> > >> >>>
>> >> > >> >>> 2.
>> >> > >> >>> Call sequence with regard to ar8216 is:
>> >> > >> >>> ar8216: ar8xxx_phy_probe
>> >> > >> >>> phy: phy_attach_direct
>> >> > >> >>> phy: phy_init_hw
>> >> > >> >>> ar8216: ar8xxx_phy_config_init
>> >> > >> >>> ar8216: ar8xxx_phy_config_aneg
>> >> > >> >>>
>> >> > >> >>> ar8216 driver handles AR8327/AR8337 different from the other
>> >> > > supported switch types.
>> >> > >> >>> ar8xxx_start incl. more detailed configuration is called from
>> >> > > ar8xxx_phy_probe already.
>> >> > >> >>> For the other switch types it's called from
>> > ar8xxx_phy_config_init.
>> >> > >> >>>
>> >> > >> >>> I wonder whether doing detailed configuration in the probe
>> >> > >> >>> stage
>> >> > > might be too early.
>> >> > >> >>> phy_init_hw resets the switch anyway later.
>> >> > >> >>> Doing the detailed setup in ar8xxx_phy_config_init seems to be
>> >> > > more suited however
>> >> > >> >>> there might be good reason why it's handled the way it is.
>> >> > >> >>
>> >> > >> >> I suppose that you could re-advertise auto-negotiation by
>> >> > >> >> calling
>> >> > > genphy_config_advert() in the config_init routine.
>> >> > >> >
>> >> > >> > The actual problem is caused by BMCR_ANENABLE being cleared by
>> >> > >> > the
>> >> > > reset in phy_init_hw.
>> >> > >> > As far as I can see genphy_config_advert doesn't bring back
>> > this flag.
>> >> > >> > What does genphy_config_aneg mainly do?
>> >> > >> >   - call genphy_config_advert
>> >> > >> >   - check if BMCR_ANENABLE is set
>> >> > >> >   - if it's not, call genphy_restart_aneg
>> >> > >> > Therefore, to bring back BMCR_ANENABLE, we have to call
>> >> > > genphy_config_aneg or genphy_restart_aneg.
>> >> > >> > genphy_restart_aneg most likely is sufficient, however I don't
>> >> > >> > see
>> >> > > that genphy_config_aneg
>> >> > >> > can do any harm if being called with addr == 0.
>> >> > >> > At least for me it works perfectly fine to call
>> >> > >> > genphy_config_aneg
>> >> > > for all addr's.
>> >> > >> >
>> >> > >> > Rgds, Heiner
>> >> > >> >
>> >> > >> >>
>> >> > >> >>>
>> >> > >> >>> Rgds,
>> >> > >> >>> Heiner
>> >> > >> >>>
>> >> > >> >>>
>> >> > >> >>> Am 27.10.2014 um 22:00 schrieb Heiner Kallweit:
>> >> > >> >>>> With two different TPLINK routers (both with a AR8327N
>> >> > >> >>>> switch) I
>> >> > > faced the issue that with
>> >> > >> >>>> kernel 3.14 certain switch ports used 10MBit/half only.
>> >> > >> >>>> Under kernel 3.10 everything was fine and the same ports used
>> >> > > 1000MBit/full.
>> >> > >> >>>>
>> >> > >> >>>> As the ar8216 driver is the same I had a look at the common
>> >> > >> >>>> phy
>> >> > > code in drivers/net/phy.
>> >> > >> >>>> In function phy_init_hw in phy_device.c kernel 3.14 resets
>> >> > >> >>>> the
>> >> > > phy whilst 3.10 does not.
>> >> > >> >>>>
>> >> > >> >>>> The issue turned out to be that when resetting only flag
>> >> > > BMCR_RESET is set, not BMCR_ANENABLE.
>> >> > >> >>>> (In ar8216.c always both flags are set when the switch is
>> >> > >> >>>> reset)
>> >> > >> >>>> Therefore autoneg was not enabled. Also later in the boot
>> > process
>> >> > > it doesn't get enabled.
>> >> > >> >>>> Adding BMCR_ANENABLE solved the problem and now also under
>> >> > >> >>>> 3.14
>> >> > > all ports use 1000MBit/full.
>> >> > >> >>>>
>> >> > >> >>>> However I'm not sure whether it's a poper fix to add
>> >> > > BMCR_ANENABLE in this generic phy function.
>> >> > >> >>>> It might not be appropriate for other phy's.
>> >> > >> >>>>
>> >> > >> >>>> After resetting the switch later in the boot process
>> >> > > ar8xxx_phy_config_aneg is called with
>> >> > >> >>>> phydev->addr being 0.
>> >> > >> >>>> In this case the function returns immediately. Otherwise it
>> > would
>> >> > > call genphy_config_aneg.
>> >> > >> >>>> Calling genphy_config_aneg would also solve the problem as it
>> >> > > checks for BMCR_ANENABLE
>> >> > >> >>>> being set and sets it if needed.
>> >> > >> >>>> I don't know the reason why genphy_config_aneg isn't called
>> >> > >> >>>> in
>> >> > > case of addr 0.
>> >> > >> >>>> Or is this a typo and the check actually should be addr != 0
>> >> > >> >>>> ?
>> >> > >> >>>>
>> >> > >> >>>> Rgds, Heiner
>> >> > >> >>>>
>> >> > >>
>> >> > >> The following rudimentary patch works fine for me with kernel
>> > 3.14.18 on
>> >> > >> TP-LINK TL-WDR4900 (mpc85xx with AR8327Nv4)
>> >> > >> TP-LINK TL-WDR4300 ( ar71xx with AR8327Nv2)
>> >> > >>
>> >> > >> Apart from using genphy_config_aneg also for addr==0 I replaced
>> >> > > msleep(1000)
>> >> > >> with a polling function inspired by phy_poll_reset in phy_device.c
>> >> > >> On AR8327 the reset actually takes less than 20ms. Sleeping for
>> > 1000ms
>> >> > >> seems to be a waste of time.
>> >> > >>
>> >> > >> Little more work would be needed to make it a proper patch:
>> >> > >> - move ar8xxx_phy_poll_reset more to the beginning so that it is
>> > defined
>> >> > >>   before being used in any xxxx_hw_init function
>> >> > >> - replace msleep(1000) also in the other xxxx_hw_init functions
>> >> > >> - remove pr_info debug message or make it at least pr_dbg
>> >> > >> - insert some comments
>> >> > >> - use git format-patch output
>> >> > >>
>> >> > >> Rgds, Heiner
>> >> > >>
>> >> > >>
>> >> > >> --- ar8216.c.orig       2014-10-30 21:50:19.999723156 +0100
>> >> > >> +++ ar8216.c    2014-10-30 21:42:11.996481099 +0100
>> >> > >> @@ -1591,6 +1591,29 @@
>> >> > >>  #endif
>> >> > >>
>> >> > >>  static int
>> >> > >> +ar8xxx_phy_poll_reset(struct mii_bus *bus, int num_phys)
>> >> > >> +{
>> >> > >> +       unsigned int sleep_msecs = 20;
>> >> > >> +       int ret, elapsed, i;
>> >> > >> +
>> >> > >> +       for(elapsed = sleep_msecs; elapsed <= 600; elapsed +=
>> >> > > sleep_msecs) {
>> >> > >> +               msleep(sleep_msecs);
>> >> > >> +               for (i = 0; i < num_phys; i++) {
>> >> > >> +                       ret = mdiobus_read(bus, i, MII_BMCR);
>> >> > >> +                       if (ret < 0) return ret;
>> >> > >> +                       if (ret & BMCR_RESET) break;
>> >> > >> +                       if (i == num_phys - 1) {
>> >> > >> +                               usleep_range(1000, 2000);
>> >> > >> +                               pr_info("ar8xxx_phy_poll_reset
>> >> > > finished in %u ms\n",
>> >> > >> +                                       elapsed);
>> >> > >> +                               return 0;
>> >> > >> +                       }
>> >> > >> +               }
>> >> > >> +       }
>> >> > >> +       return -ETIMEDOUT;
>> >> > >> +}
>> >> > >> +
>> >> > >> +static int
>> >> > >>  ar8327_hw_init(struct ar8xxx_priv *priv)
>> >> > >>  {
>> >> > >>         struct mii_bus *bus;
>> >> > >> @@ -1620,7 +1643,7 @@
>> >> > >>                 mdiobus_write(bus, i, MII_BMCR, BMCR_RESET |
>> >> > > BMCR_ANENABLE);
>> >> > >>         }
>> >> > >>
>> >> > >> -       msleep(1000);
>> >> > >> +       ar8xxx_phy_poll_reset(bus, AR8327_NUM_PHYS);
>> >> > >>
>> >> > >>         return 0;
>> >> > >>  }
>> >> > >> @@ -2840,15 +2863,6 @@
>> >> > >>         return ret;
>> >> > >>  }
>> >> > >>
>> >> > >> -static int
>> >> > >> -ar8xxx_phy_config_aneg(struct phy_device *phydev)
>> >> > >> -{
>> >> > >> -       if (phydev->addr == 0)
>> >> > >> -               return 0;
>> >> > >> -
>> >> > >> -       return genphy_config_aneg(phydev);
>> >> > >> -}
>> >> > >> -
>> >> > >>  static const u32 ar8xxx_phy_ids[] = {
>> >> > >>         0x004dd033,
>> >> > >>         0x004dd034, /* AR8327 */
>> >> > >> @@ -3020,7 +3034,7 @@
>> >> > >>         .remove         = ar8xxx_phy_remove,
>> >> > >>         .detach         = ar8xxx_phy_detach,
>> >> > >>         .config_init    = ar8xxx_phy_config_init,
>> >> > >> -       .config_aneg    = ar8xxx_phy_config_aneg,
>> >> > >> +       .config_aneg    = genphy_config_aneg,
>> >> > >>         .read_status    = ar8xxx_phy_read_status,
>> >> > >>         .driver         = { .owner = THIS_MODULE },
>> >> > >>  };
>> >> > >> _______________________________________________
>> >> > >> openwrt-devel mailing list
>> >> > >> openwrt-devel at lists.openwrt.org
>> > <mailto:openwrt-devel at lists.openwrt.org>
>> > <mailto:openwrt-devel at lists.openwrt.org
>> > <mailto:openwrt-devel at lists.openwrt.org>>
>> >> > >> https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
>> >> > >
>> >> > > I feel like this shouldn't die off in the depths of the ML
>> >> > >
>> >> >
>> >> > we had a call about it today, this patch and the other ar821x patches
>> >> > will be handled soon
>> >>
>> >> You guys bumped trunk to 3.14 but didn't fix this.
>> >
>> > It's also broken in 43342.
>> >
>> > For me this manifests as the WAN port is stuck at 1000T so my 100T DSL
>> > modem doesn't exist. Lucky I keep spare switches around.
>> On what device is it broken for you? I did apply Heiner Kallweit's
>> fixes, and it worked for me.
>>
>> - Felix

TL-WDR4300
That was the topic of the first patches

>
> The problem with the non-working WAN port was reported to me also by others.
> Using genphy_config_aneg with phy 0 seems to cause this.
> The following fix which I sent Nov 21st is still waiting to be applied.

You didn't attach or paste anything.

> ar8216: Fix issue with autoneg being disabled under 3.14, revert 43332

Thanks for this.
_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


More information about the openwrt-devel mailing list