Marvell EBU 32-bit performance benchmarks (VFPv3-D16 vs NEON builds, Turris Omnia).

Kabuli Chana anome at
Thu Oct 21 13:55:00 PDT 2021

On 2021-10-21 14:31, Rosen Penev wrote:
> On Thu, Oct 21, 2021 at 12:19 PM Kabuli Chana <anome at> wrote:
>> For me the argument was not about whether there should be two, but
>> whether the change to vfpv3-d16 was the right choice as the one. OpenSSL
>> is of course preordained to run NEON SIMD code so no change would be
>> expected, but a benefit is seen on WG. I would suggest that is the result
>> of 16 vs. 32 FP registers being available, and nothing to do with NEON;
>> i.e. vfpv3-d16 vs. vfpv3. The result is smaller and faster code, because
>> FP code generation is more concise when less FP register rejigging is needed.
> I call BS.
> WireGuard does not use floating point. Just fixed point arithmetic,
> like most crypto. WireGuard being faster probably has to do with its
> usage of NEON assembly.
Careful now, the Brit might take you to task for a language violation. But
yes, after I replied it dawned on me that it would be odd for FP code to
find its way into a kernel module. I have never spent any time looking at
the WG code, but if it uses NEON it would make sense for it to detect
support at runtime rather than needing it enabled externally. I do not
know how rsalvaterra set things up for the build, though.
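
To illustrate what detecting NEON at runtime can look like, here is a
minimal userspace sketch that looks for "neon" on the Features line that
32-bit ARM Linux exposes in /proc/cpuinfo. It is not how WireGuard or the
kernel does it; kernel code makes its own decision internally. The function
names and the SAMPLE text are hypothetical and the sample was not captured
from a Turris Omnia.

```python
def has_cpu_feature(cpuinfo_text: str, feature: str = "neon") -> bool:
    """Return True if `feature` is listed on a 'Features' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the "Features" line lists the CPU capability flags on ARM.
        if sep and key.strip().lower() == "features":
            # Compare whole tokens so "vfp" does not match "vfpv3".
            if feature in value.split():
                return True
    return False


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read cpuinfo text; return an empty string if the file is unavailable."""
    try:
        with open(path, encoding="ascii", errors="replace") as fh:
            return fh.read()
    except OSError:
        return ""


# Illustrative sample only (assumption), shaped like an ARMv7 cpuinfo entry.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: ARMv7 Processor rev 1 (v7l)\n"
    "Features\t: half thumb fastmult vfp edsp neon vfpv3 tls vfpd32\n"
)

if __name__ == "__main__":
    print("sample has NEON:", has_cpu_feature(SAMPLE))
    print("this machine has NEON:", has_cpu_feature(read_cpuinfo()))
```

On a machine without a Features line (for example an x86 build host) the
second check simply reports False.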

But, I think that NEON as the target default is questionable, although 
others exist.
>> On 2021-10-21 09:38, Rui Salvaterra wrote:
>>> Hi, guys,
>>> So, last meeting I proposed splitting the 32-bit mvebu target in
>>> vfpv3-d16 and neon subtargets. It seems this subject comes up every
>>> couple of years, or so. This time I hope to show solid evidence on why
>>> it would be an exercise in futility, closing the matter once and for
>>> all. In order to do so, I tested the performance of openssl speed (10
>>> seconds time limit), cryptsetup benchmark and iperf3 over WireGuard.
>>> Here are the results for each build:
>>> VFPv3-D16:
>>> NEON:
>>> These are master builds from my stmvebu branch, running Linux 5.10.75.
>>> The configuration is custom, but each build differs only in the
>>> CPU_SUBTYPE (vfpv3-d16 vs neon).
>>> Cheers,
>>> Rui
>>> _______________________________________________
>>> openwrt-devel mailing list
>>> openwrt-devel at
