aarch64: Support FEAT_LSFE #201

Merged
merged 1 commit into from
Mar 8, 2025

Conversation

taiki-e
Owner

@taiki-e taiki-e commented Jan 12, 2025

Armv9.6 added atomic float instructions for binary{16,32,64} and bfloat16 as FEAT_LSFE (Large System Float Extension).

This PR optimizes AArch64 {16,32,64}-bit atomic float add/sub/max/min when FEAT_LSFE is enabled.
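For context, without a native atomic float instruction, an operation such as atomic float add is commonly built as a compare-and-swap loop over the value's bit pattern. The following is a minimal sketch of that fallback using only the standard library; the function name `fetch_add_f32` is hypothetical, and this is an illustration of the general technique rather than the code in this PR.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Atomically adds `val` to the f32 stored (as raw bits) in `cell`
/// and returns the previous value.
///
/// Hypothetical helper for illustration: a CAS-loop fallback of the kind
/// that a single-instruction atomic float add could replace.
pub fn fetch_add_f32(cell: &AtomicU32, val: f32, order: Ordering) -> f32 {
    let mut old = cell.load(Ordering::Relaxed);
    loop {
        // Compute the new value on the float interpretation of the bits.
        let new = (f32::from_bits(old) + val).to_bits();
        // Retry until no other thread changed the value in between.
        match cell.compare_exchange_weak(old, new, order, Ordering::Relaxed) {
            Ok(prev) => return f32::from_bits(prev),
            Err(actual) => old = actual,
        }
    }
}
```

With a cell initialized from `1.5f32.to_bits()`, calling `fetch_add_f32(&cell, 2.25, Ordering::SeqCst)` returns the previous value and leaves the sum stored in the cell. The loop may retry under contention, which is the cost a dedicated instruction avoids.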

LLVM's assembly support for FEAT_LSFE requires LLVM 20 (llvm/llvm-project@67ff5ba), so this PR emits the instructions with the .inst directive on LLVM 19 or older.

Run-time detection is also implemented, but for now it is used only in tests. AFAIK no CPU actually implements this feature yet, so for now the optimized path is selected only when the feature is enabled at compile time.
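The policy described above can be sketched roughly as follows. Every name here is hypothetical: `LSFE_AT_COMPILE_TIME` stands in for whatever compile-time configuration the crate uses, and `probe_lsfe` stands in for an OS-level query and simply returns `false`. This only illustrates the split between a cached run-time probe and a compile-time decision, not the PR's actual detection code.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical stand-in for a compile-time feature check.
const LSFE_AT_COMPILE_TIME: bool = false;

// Hypothetical stand-in for an OS-level feature query.
fn probe_lsfe() -> bool {
    false
}

const UNKNOWN: u8 = 0;
const ABSENT: u8 = 1;
const PRESENT: u8 = 2;
static CACHE: AtomicU8 = AtomicU8::new(UNKNOWN);

/// Run-time detection with a cached result (consulted only by tests
/// in this sketch).
pub fn lsfe_detected_at_runtime() -> bool {
    match CACHE.load(Ordering::Relaxed) {
        PRESENT => true,
        ABSENT => false,
        _ => {
            // First call: probe once and remember the answer.
            let found = probe_lsfe();
            CACHE.store(if found { PRESENT } else { ABSENT }, Ordering::Relaxed);
            found
        }
    }
}

/// The fast path is chosen from the compile-time setting alone.
pub fn use_lsfe_instructions() -> bool {
    LSFE_AT_COMPILE_TIME
}
```

Keeping the run-time probe separate from the dispatch decision means the selection logic can later consult the cached result without changing callers.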

@taiki-e taiki-e added the O-aarch64 Target: Armv8-A, Armv8-R, or later processors in AArch64 mode label Jan 12, 2025
@taiki-e taiki-e added the A-float Area: related to atomic float label Jan 12, 2025
@taiki-e taiki-e force-pushed the main branch 5 times, most recently from 53c8409 to 378f6cd Compare January 15, 2025 15:07
@taiki-e taiki-e force-pushed the main branch 8 times, most recently from 52836df to 4a9ffc4 Compare February 5, 2025 17:50
@taiki-e taiki-e force-pushed the main branch 5 times, most recently from a368389 to eeb0235 Compare February 24, 2025 12:09
@taiki-e taiki-e force-pushed the main branch 5 times, most recently from 77c5d0d to 813bf8f Compare March 7, 2025 16:12
@taiki-e taiki-e force-pushed the aarch64-lsfe branch 2 times, most recently from 74aec13 to fdd02b9 Compare March 8, 2025 13:46
@taiki-e taiki-e force-pushed the aarch64-lsfe branch 3 times, most recently from 94aeaa1 to cc03496 Compare March 8, 2025 14:20
@taiki-e taiki-e merged commit 05bef02 into main Mar 8, 2025
119 checks passed
@taiki-e taiki-e deleted the aarch64-lsfe branch March 8, 2025 17:04