Marvell EBU 32-bit performance benchmarks (VFPv3-D16 vs NEON builds, Turris Omnia).

Thu Oct 21 12:16:20 PDT 2021

For me the argument was not about whether there should be 2, but whether 
the change to vfpv3-d16 was the right choice as the 1. openssl is of 
course preordained to run NEON SIMD code, so no change would be expected, 
but a benefit is seen on WG. I would suggest that is the result of 16 
vs. 32 FP registers being available, and nothing to do with NEON; i.e. 
vfpv3-d16 vs. vfpv3. The result is smaller and faster code, because less 
FP register rejigging allows more concise FP code generation.
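
As a rough illustration of the register-pressure argument (a sketch, not 
taken from the benchmarks): the hypothetical function below keeps 20 
independent double accumulators, more than the 16 D registers of 
vfpv3-d16 but within the 32 of vfpv3. The function name, the accumulator 
count and the toolchain prefix in the comments are assumptions; whether 
the compiler actually keeps the accumulators in registers depends on the 
GCC version and optimisation level.

```c
/*
 * Compare the generated assembly for each FPU setting, e.g. (toolchain
 * prefix is an assumption, adjust to the cross compiler in use):
 *
 *   arm-linux-gnueabihf-gcc -O2 -mfloat-abi=hard -mfpu=vfpv3-d16 -S regs.c
 *   arm-linux-gnueabihf-gcc -O2 -mfloat-abi=hard -mfpu=vfpv3     -S regs.c
 *   arm-linux-gnueabihf-gcc -O2 -mfloat-abi=hard -mfpu=neon      -S regs.c
 *
 * Extra loads/stores of accumulators to the stack in the d16 output
 * would be the "register rejigging" in question.
 */

#define N_ACC 20 /* more live doubles than the 16 D registers of vfpv3-d16 */

/* Hypothetical kernel: N_ACC independent running sums over the input. */
double many_accumulators(const double *x, int n)
{
    double acc[N_ACC] = {0.0};
    double total = 0.0;
    int i, k;

    for (i = 0; i < n; i++) {
        for (k = 0; k < N_ACC; k++) {
            /* Each accumulator scales the input by a different factor. */
            acc[k] += x[i] * (double)(k + 1);
        }
    }

    /* Fold the accumulators into a single result. */
    for (k = 0; k < N_ACC; k++)
        total += acc[k];

    return total;
}
```

If the vfpv3 and neon outputs look alike while the vfpv3-d16 output 
spills, that would point at the register count rather than at NEON.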

On 2021-10-21 09:38, Rui Salvaterra wrote:
> Hi, guys,
>
> So, last meeting I proposed splitting the 32-bit mvebu target into
> vfpv3-d16 and neon subtargets. It seems this subject comes up every
> couple of years, or so. This time I hope to show solid evidence on why
> it would be an exercise in futility, closing the matter once and for
> all. In order to do so, I tested the performance of openssl speed (10
> seconds time limit), cryptsetup benchmark and iperf3 over WireGuard.
> Here are the results for each build:
>
> VFPv3-D16:
> https://paste.debian.net/1216262/
>
> NEON:
> https://paste.debian.net/1216261/
>
> These are master builds from my stmvebu branch, running Linux 5.10.75.
> The configuration is custom, but each build differs only in the
> CPU_SUBTYPE (vfpv3-d16 vs neon).
>
> Cheers,
> Rui