mvebu: armada 3720 cpufreq reverts

Marek Behún marek.behun at nic.cz
Wed Jun 30 10:07:55 PDT 2021


On Wed, 30 Jun 2021 17:51:24 +0200
Robert Marko <robert.marko at sartura.hr> wrote:

> On Wed, Jun 30, 2021 at 3:19 PM Marek Behún <marek.behun at nic.cz>
> wrote:
> >
> > Hello Robert,
> >
> > I am writing regarding commit
> >   mvebu: 5.10 fix DVFS caused random boot crashes
> >   https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=080a0b74e39d159eecf69c468debec42f28bf4d8
> > in OpenWRT.
> >
> > This commit reverts the one patch of a3720 cpufreq driver, but not
> > the subsequent ones.
> >
> > Your commit message says that some 1.2 GHz SOCs are unstable with
> > the fix. Did you also test this with the subsequent patches, which
> > are now in stable kernels? I guess the answer is yes, because all
> > these patches were backported to 5.10.37.  
> 
> Hi Marek,
> 
> Yes, the rest of the patches were there as well.
> >
> > I am of the opinion that a better approach would be to either
> > - disable cpufreq for 1.2 GHz variants, or
> > - fix the a3720 cpufreq driver to only scale up to 1 GHz on the
> >   1.2 GHz variant
> 
> I would prefer limiting it to 1GHz as that would not cause
> performance issues, but 1GHz models could have the same issue as well.
> This is because the voltages that are set as a minimum are from the
> testing that Pali and the Turris guys did, but it really depends on
> the SoC batch you receive.

The thing is you cannot limit it to 1 GHz in the kernel, because when
the device is booted at 1.2 GHz the dividers are {1, 2, 4, 6}, so the
available frequencies are 1200 MHz, 600 MHz, 300 MHz, 200 MHz.

If you want to limit it to 1 GHz, you need to build the flash-image.bin
with CLOCKSPRESET=CPU_1000_DDR_800 and reflash the device.

With your revert the cpufreq scaling may be stable, but the CPU clock
switches to TBG-A-P, which is 750 MHz.
The result is that you are scaling, but you are scaling between
  750 MHz, 375 MHz, 187.5 MHz, 125 MHz

which is even worse than on the 1 GHz variant, where the top
frequency with your revert is 800 MHz.
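
To make the arithmetic concrete, here is a minimal sketch that only
divides a parent clock by the dividers quoted above. The names
(DIVIDERS_1200, scaled_frequencies) are made up for illustration and
are not taken from the kernel driver; the 1200 MHz and 750 MHz parent
values are the ones mentioned in this mail.

```python
# Dividers quoted above for a device booted at 1.2 GHz.
# (Hypothetical name; not an identifier from the kernel driver.)
DIVIDERS_1200 = (1, 2, 4, 6)


def scaled_frequencies(parent_mhz, dividers=DIVIDERS_1200):
    """Return the CPU frequencies (in MHz) reachable from one parent clock."""
    return [parent_mhz / d for d in dividers]


if __name__ == "__main__":
    # Intended parent clock vs. the 750 MHz TBG-A-P parent after the revert.
    for label, parent in (("1200 MHz parent", 1200), ("750 MHz parent", 750)):
        print(label, scaled_frequencies(parent))
```

The dividers stay the same in both cases, so changing the parent clock
simply shifts the whole frequency table down.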

> >
> > Since the approach you've taken now (reverting the patch) basically
> > changes the CPU parent clock to the DDR clock, which is just wrong.
> > Worse is that you are doing this for everybody, not just for the 1.2
> > GHz variants.
> >
> > What do you think?  
> 
> I understand that it was not the best solution, but something had to
> be done as I was not able to even finish booting on multiple boards
> before crashing. It just reverted the things back to the previous
> state.
> 
> I really could not figure out a proper solution even after being in
> touch with Pali and contacting GlobalScale.
> 
> This is an issue caused by Marvell simply ignoring the issue and
> refusing to publish a fix or release the OTP and AVS docs, as they
> all have a validated voltage in the OTP somewhere.

I have sent a patch to the upstream kernel disabling cpufreq on 1.2 GHz
models. I think this is the sanest solution for now, since we
simply do not know how to scale properly on this variant.

Once the patch is accepted, would you please remove your revert?

Marek


