How to load global data into NEON registers more efficiently in Go's assembler?

PHPz
Release: 2024-02-12 20:39:08

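PHP editor Baicao presents a solution for loading global data into NEON registers more efficiently in Go's assembler. Go relies on its assembler for performance-critical low-level code, and the NEON registers are the SIMD vector registers of the ARM architecture, used to speed up multimedia and signal-processing operations. This article shows how to load data into NEON registers with fewer instructions.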

Question

As an example, the arm64 assembly code defines the global data p256one:

DATA p256one<>+0x00(SB)/8, $0x0000000000000001
DATA p256one<>+0x08(SB)/8, $0xffffffff00000000
DATA p256one<>+0x10(SB)/8, $0xffffffffffffffff
DATA p256one<>+0x18(SB)/8, $0x00000000fffffffe

GLOBL p256one<>(SB), 8, $32

I need to load p256one<>(SB) into registers V0 and V1. Currently I use the following approach:

LDP p256one<>+0x00(SB), (R0, R1)
LDP p256one<>+0x10(SB), (R2, R3)
VMOV R0, V0.D[0]
VMOV R1, V0.D[1]
VMOV R2, V1.D[0]
VMOV R3, V1.D[1]

This takes six instructions in total. We know that data in memory can be loaded like this:

VLD1 (R0), [V0.B16, V1.B16]

However, it seems that global data cannot be loaded the same way.

So, is there a more efficient way to load global data into NEON registers in Go assembly code?

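Solution

Try loading the address of the global into a register first, then load from that address. This takes two instructions instead of six: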

MOVD $p256one<>(SB), R0      // R0 = address of p256one
VLD1 (R0), [V0.B16, V1.B16]  // load the 32 bytes at (R0) into V0 and V1
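
To check that the two-instruction sequence leaves V0 and V1 with the same contents as the original six instructions, the loads can be modeled at the byte level in plain Go. This is only a sketch, not a run on real hardware. It assumes little-endian memory (as on Go's arm64 port), and the names vreg, memoryImage, loadViaGPR and loadViaVLD1 are hypothetical helpers made up for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// p256one mirrors the four 64-bit words from the DATA directives,
// lowest offset first.
var p256one = [4]uint64{
	0x0000000000000001,
	0xffffffff00000000,
	0xffffffffffffffff,
	0x00000000fffffffe,
}

// vreg models one 128-bit NEON register as 16 bytes, lane B[0] first.
// (Hypothetical type, for illustration only.)
type vreg [16]byte

// d returns the 64-bit lane D[i] of the modeled register.
func (v vreg) d(i int) uint64 {
	return binary.LittleEndian.Uint64(v[8*i : 8*i+8])
}

// memoryImage lays the words out as 32 little-endian bytes,
// the way the assumed model expects them to sit in memory.
func memoryImage(words [4]uint64) [32]byte {
	var mem [32]byte
	for i, w := range words {
		binary.LittleEndian.PutUint64(mem[8*i:], w)
	}
	return mem
}

// loadViaGPR models the two LDP loads into R0..R3 followed by
// the four VMOVs into the D lanes of V0 and V1.
func loadViaGPR(mem [32]byte) (v0, v1 vreg) {
	var r [4]uint64
	for i := range r {
		r[i] = binary.LittleEndian.Uint64(mem[8*i:])
	}
	binary.LittleEndian.PutUint64(v0[0:], r[0]) // VMOV R0, V0.D[0]
	binary.LittleEndian.PutUint64(v0[8:], r[1]) // VMOV R1, V0.D[1]
	binary.LittleEndian.PutUint64(v1[0:], r[2]) // VMOV R2, V1.D[0]
	binary.LittleEndian.PutUint64(v1[8:], r[3]) // VMOV R3, V1.D[1]
	return v0, v1
}

// loadViaVLD1 models VLD1 (R0), [V0.B16, V1.B16]:
// 32 consecutive bytes copied into V0, then V1.
func loadViaVLD1(mem [32]byte) (v0, v1 vreg) {
	copy(v0[:], mem[:16])
	copy(v1[:], mem[16:])
	return v0, v1
}

func main() {
	mem := memoryImage(p256one)
	a0, a1 := loadViaGPR(mem)
	b0, b1 := loadViaVLD1(mem)
	fmt.Println("same register contents:", a0 == b0 && a1 == b1)
	fmt.Printf("V0.D = [%016x %016x]\n", b0.d(0), b0.d(1))
	fmt.Printf("V1.D = [%016x %016x]\n", b1.d(0), b1.d(1))
}
```

Under this model both paths fill the D lanes with the same four words, so the shorter sequence can replace the longer one. The extra cost is one general-purpose register to hold the address.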

source:stackoverflow.com