[AArch64][SDAG] Lower f16->s16 FP_TO_INT_SAT to *v1f16 #154822

mrkajetanp · 2025-08-21T18:56:39Z

Conversions from f16 to s16 performed by FP_TO_INT_SAT can be done directly within FPRs, e.g. `fcvtzs h0, h0`.
Generating this form reduces the number of instructions required for correct behaviour: the H-register form saturates at the 16-bit bounds itself, which sidesteps the incorrect saturation that arises when using `fcvtzs w0, h0` for the same casts.
Add new AArch64ISD::FCVTZS_HALF and AArch64ISD::FCVTZU_HALF nodes to represent the necessary instruction sequence.

Related to #154343.
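
To make the saturation argument concrete, here is a minimal C++ sketch that models the two instruction forms and checks that they agree. It is an illustration, not code from the patch: the function names (`fcvtzs_h`, `fcvtzs_w`, `old_lowering`, `self_check`) are hypothetical, and `float` stands in for `half`, which is safe because every f16 value is exactly representable as a `float`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Model of `fcvtzs h0, h0`: truncate toward zero, saturate at the i16 bounds,
// NaN converts to 0. This matches the semantics of llvm.fptosi.sat.i16.
int16_t fcvtzs_h(float x) {
  if (std::isnan(x)) return 0;
  if (x <= -32768.0f) return INT16_MIN;
  if (x >= 32767.0f) return INT16_MAX;
  return static_cast<int16_t>(x);  // in range, so the cast is well defined
}

// Model of `fcvtzs w8, h0`: same conversion, but saturating at the i32 bounds.
int32_t fcvtzs_w(float x) {
  if (std::isnan(x)) return 0;
  if (x <= -2147483648.0f) return INT32_MIN;
  if (x >= 2147483648.0f) return INT32_MAX;
  return static_cast<int32_t>(x);
}

// Model of the previous lowering: the 32-bit convert followed by an explicit
// clamp, which is what the removed mov/cmp/csel pairs implemented.
int32_t old_lowering(float x) {
  return std::clamp<int32_t>(fcvtzs_w(x), -32768, 32767);
}

// Compare both models on a few f16-representable inputs, including the
// values where 32-bit saturation alone would give the wrong i16 answer.
void self_check() {
  const float inf = std::numeric_limits<float>::infinity();
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float samples[] = {0.5f,     -0.5f,     1234.0f,   32752.0f,
                           32768.0f, 65504.0f,  -32768.0f, -65504.0f,
                           inf,      -inf,      nan};
  for (float x : samples)
    assert(old_lowering(x) == fcvtzs_h(x));
}
```

For example, the largest finite f16 value, 65504, converts to 65504 with the W-register form and only becomes 32767 after the clamp, whereas the H-register model produces 32767 directly.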

llvmbot · 2025-08-21T18:57:12Z

@llvm/pr-subscribers-backend-aarch64

Author: Kajetan Puchalski (mrkajetanp)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/154822.diff

3 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+18)
(modified) llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll (+2-7)
(modified) llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll (+2-4)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index d168cc8d1bd06..f7cb76ed64b13 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -4912,6 +4912,24 @@ SDValue AArch64TargetLowering::LowerFP_TO_INT_SAT(SDValue Op,
   if (DstWidth < SatWidth)
     return SDValue();
 
+  if (SrcVT == MVT::f16 && SatVT == MVT::i16 && DstVT == MVT::i32) {
+    auto Opcode = (Op.getOpcode() == ISD::FP_TO_SINT_SAT)
+                      ? AArch64::FCVTZSv1f16
+                      : AArch64::FCVTZUv1f16;
+    auto Cvt = SDValue(DAG.getMachineNode(Opcode, DL, MVT::f16, SrcVal), 0);
+    auto Sign = DAG.getTargetConstant(-1, DL, MVT::i64);
+    auto Hsub = DAG.getTargetConstant(AArch64::hsub, DL, MVT::i32);
+    auto SubregToReg =
+        SDValue(DAG.getMachineNode(TargetOpcode::SUBREG_TO_REG, DL, MVT::v8f16,
+                                   Sign, Cvt, Hsub),
+                0);
+    auto Ssub = DAG.getTargetConstant(AArch64::ssub, DL, MVT::i32);
+    auto Extract = SDValue(DAG.getMachineNode(TargetOpcode::EXTRACT_SUBREG, DL,
+                                              MVT::f32, SubregToReg, Ssub),
+                           0);
+    return DAG.getBitcast(MVT::i32, Extract);
+  }
+
   SDValue NativeCvt =
       DAG.getNode(Op.getOpcode(), DL, DstVT, SrcVal, DAG.getValueType(DstVT));
   SDValue Sat;
diff --git a/llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll b/llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll
index e3aef487890f9..a5f6ac628403c 100644
--- a/llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll
+++ b/llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll
@@ -670,13 +670,8 @@ define i16 @test_signed_i16_f16(half %f) nounwind {
 ;
 ; CHECK-SD-FP16-LABEL: test_signed_i16_f16:
 ; CHECK-SD-FP16:       // %bb.0:
-; CHECK-SD-FP16-NEXT:    fcvtzs w8, h0
-; CHECK-SD-FP16-NEXT:    mov w9, #32767 // =0x7fff
-; CHECK-SD-FP16-NEXT:    cmp w8, w9
-; CHECK-SD-FP16-NEXT:    csel w8, w8, w9, lt
-; CHECK-SD-FP16-NEXT:    mov w9, #-32768 // =0xffff8000
-; CHECK-SD-FP16-NEXT:    cmn w8, #8, lsl #12 // =32768
-; CHECK-SD-FP16-NEXT:    csel w0, w8, w9, gt
+; CHECK-SD-FP16-NEXT:    fcvtzs h0, h0
+; CHECK-SD-FP16-NEXT:    fmov w0, s0
 ; CHECK-SD-FP16-NEXT:    ret
 ;
 ; CHECK-GI-CVT-LABEL: test_signed_i16_f16:
diff --git a/llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll b/llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll
index 07e49e331415e..2613f8337a918 100644
--- a/llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll
+++ b/llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll
@@ -531,10 +531,8 @@ define i16 @test_unsigned_i16_f16(half %f) nounwind {
 ;
 ; CHECK-SD-FP16-LABEL: test_unsigned_i16_f16:
 ; CHECK-SD-FP16:       // %bb.0:
-; CHECK-SD-FP16-NEXT:    fcvtzu w8, h0
-; CHECK-SD-FP16-NEXT:    mov w9, #65535 // =0xffff
-; CHECK-SD-FP16-NEXT:    cmp w8, w9
-; CHECK-SD-FP16-NEXT:    csel w0, w8, w9, lo
+; CHECK-SD-FP16-NEXT:    fcvtzu h0, h0
+; CHECK-SD-FP16-NEXT:    fmov w0, s0
 ; CHECK-SD-FP16-NEXT:    ret
 ;
 ; CHECK-GI-CVT-LABEL: test_unsigned_i16_f16:

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

mrkajetanp · 2025-08-21T19:00:45Z

It feels like there should be a better way to do this, but SDAG was really unhappy with me trying to do anything that involved having an i16 type on any operation, so this was the only way I've found (so far) of having it not crash.
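
As a rough bit-level picture of what the sequence in the hunk produces, the sketch below models reading the 16-bit conversion result back through the S sub-register (`fmov w0, s0`) and then sign-extending it in place. It assumes the usual AArch64 behaviour that a scalar write to an H register zeroes the remaining bits of the vector register; the helper names are hypothetical, and the link to the later "Explicitly SIGN_EXTEND_INREG" commit is an interpretation rather than something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Model of `fcvtzs h0, h0` followed by `fmov w0, s0`: the i16 result occupies
// bits [15:0] and the upper bits are assumed to read back as zero.
uint32_t read_s_lane(int16_t h_result) {
  return static_cast<uint32_t>(static_cast<uint16_t>(h_result));
}

// Model of an in-register sign extension from i16 inside an i32 value,
// written with the xor/subtract idiom to avoid narrowing casts.
int32_t sign_extend_inreg_i16(uint32_t w) {
  return static_cast<int32_t>((w & 0xFFFFu) ^ 0x8000u) - 0x8000;
}
```

Under this model a negative result such as -1 reads back as 0x0000FFFF, so an i32 consumer only sees -1 again after the explicit sign extension.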

Conversions from f16 to s16 performed by FP_TO_INT_SAT can be done directly within FPRs, e.g. `fcvtzs h0, h0`. Generating this form reduces the number of instructions required for correct behaviour, as it sidesteps the issues with incorrect saturation that arise when using `fcvtzs w0, h0` for the same casts.

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>

efriedma-quic

LGTM

davemgreen

Thanks. LGTM

mrkajetanp requested review from aemerson, tblah and davemgreen August 21, 2025 18:56

llvmbot added the backend:AArch64 label Aug 21, 2025

davemgreen requested a review from efriedma-quic August 21, 2025 18:58

davemgreen reviewed Aug 21, 2025

mrkajetanp requested a review from davemgreen August 26, 2025 14:05

efriedma-quic reviewed Aug 26, 2025

mrkajetanp requested a review from efriedma-quic August 27, 2025 12:13

efriedma-quic reviewed Aug 27, 2025

mrkajetanp added 4 commits August 27, 2025 22:12

Move into AArch64ISD::FCVTZ[S|U]_HALF

eed4fd2

Make new AArch64ISD nodes return f32

e0898a0

Move AArch64ISD selection into TableGen

604902c

mrkajetanp force-pushed the sdag-s16-to-int-sat branch from 5bf55ab to 604902c Compare August 27, 2025 23:13

mrkajetanp requested a review from efriedma-quic August 27, 2025 23:16

efriedma-quic approved these changes Aug 27, 2025

davemgreen reviewed Aug 28, 2025

Explicitly SIGN_EXTEND_INREG

33e86a1

mrkajetanp requested a review from davemgreen August 28, 2025 12:40

davemgreen approved these changes Aug 28, 2025

Merge branch 'main' into sdag-s16-to-int-sat

5cdfa01

mrkajetanp merged commit 6dd67f8 into llvm:main Aug 28, 2025
9 checks passed
