@RKSimon RKSimon commented Aug 26, 2025

All three instructions (GF2P8AFFINEINVQB, GF2P8AFFINEQB and GF2P8MULB) are well-defined bit-twiddling operations: they do not introduce undef/poison when given well-defined inputs.

Fixes regressions in #152107
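
To illustrate why GF2P8MULB can be treated as never creating undef/poison, here is a minimal scalar sketch of one byte lane. It assumes the instruction multiplies in GF(2^8) reduced by x^8 + x^4 + x^3 + x + 1 (0x11B), which is my reading of the ISA documentation; the function name `gf2p8mulb_lane` is hypothetical and is not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical scalar model of a single byte lane of GF2P8MULB.
// Assumption: carry-less multiply in GF(2^8), reduced by 0x11B.
constexpr std::uint8_t gf2p8mulb_lane(std::uint8_t a, std::uint8_t b) {
  std::uint8_t result = 0;
  for (int bit = 0; bit < 8; ++bit) {
    if (b & 1)
      result ^= a;                         // add (XOR) the current multiple of a
    const bool carry = (a & 0x80) != 0;    // would shifting overflow x^7?
    a = static_cast<std::uint8_t>(a << 1); // multiply a by x
    if (carry)
      a ^= 0x1B;                           // reduce modulo the field polynomial
    b >>= 1;
  }
  return result;
}
```

The function is total: every pair of defined input bytes maps to exactly one defined output byte, with no overflow flags, traps, or out-of-range cases. That is the property the PR relies on.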

…2P8AFFINEQB / GF2P8MULB handling

llvmbot commented Aug 26, 2025

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/155409.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+5)
  • (modified) llvm/test/CodeGen/X86/combine-gfni.ll (+3-6)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 19131fbd4102b..dacbda6d7eb10 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -45167,6 +45167,11 @@ bool X86TargetLowering::canCreateUndefOrPoisonForTargetNode(
   // SSE signbit extraction.
   case X86ISD::MOVMSK:
     return false;
+  // GFNI instructions.
+  case X86ISD::GF2P8AFFINEINVQB:
+  case X86ISD::GF2P8AFFINEQB:
+  case X86ISD::GF2P8MULB:
+    return false;
   case ISD::INTRINSIC_WO_CHAIN:
     switch (Op->getConstantOperandVal(0)) {
     case Intrinsic::x86_sse2_pmadd_wd:
diff --git a/llvm/test/CodeGen/X86/combine-gfni.ll b/llvm/test/CodeGen/X86/combine-gfni.ll
index 21ea17a20e0fd..b105cdf7ea895 100644
--- a/llvm/test/CodeGen/X86/combine-gfni.ll
+++ b/llvm/test/CodeGen/X86/combine-gfni.ll
@@ -24,8 +24,7 @@ define <16 x i8> @gf2p8affineqb_freeze(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %
 ; AVX512-LABEL: gf2p8affineqb_freeze:
 ; AVX512:       # %bb.0:
 ; AVX512-NEXT:    vpmovb2m %xmm2, %k1
-; AVX512-NEXT:    vgf2p8affineqb $11, %xmm1, %xmm1, %xmm1
-; AVX512-NEXT:    vmovdqu8 %xmm1, %xmm0 {%k1}
+; AVX512-NEXT:    vgf2p8affineqb $11, %xmm1, %xmm1, %xmm0 {%k1}
 ; AVX512-NEXT:    retq
   %i = icmp slt <16 x i8> %a2, zeroinitializer
   %g = call <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8> %a1, <16 x i8> %a1, i8 11)
@@ -55,8 +54,7 @@ define <16 x i8> @gf2p8affineinvqb_freeze(<16 x i8> %a0, <16 x i8> %a1, <16 x i8
 ; AVX512-LABEL: gf2p8affineinvqb_freeze:
 ; AVX512:       # %bb.0:
 ; AVX512-NEXT:    vpmovb2m %xmm2, %k1
-; AVX512-NEXT:    vgf2p8affineinvqb $11, %xmm1, %xmm1, %xmm1
-; AVX512-NEXT:    vmovdqu8 %xmm1, %xmm0 {%k1}
+; AVX512-NEXT:    vgf2p8affineinvqb $11, %xmm1, %xmm1, %xmm0 {%k1}
 ; AVX512-NEXT:    retq
   %i = icmp slt <16 x i8> %a2, zeroinitializer
   %g = call <16 x i8> @llvm.x86.vgf2p8affineinvqb.128(<16 x i8> %a1, <16 x i8> %a1, i8 11)
@@ -86,8 +84,7 @@ define <16 x i8> @gf2p8mulb_freeze(<16 x i8> %a0, <16 x i8> %a1, <16 x i8> %a2)
 ; AVX512-LABEL: gf2p8mulb_freeze:
 ; AVX512:       # %bb.0:
 ; AVX512-NEXT:    vpmovb2m %xmm2, %k1
-; AVX512-NEXT:    vgf2p8mulb %xmm1, %xmm1, %xmm1
-; AVX512-NEXT:    vmovdqu8 %xmm1, %xmm0 {%k1}
+; AVX512-NEXT:    vgf2p8mulb %xmm1, %xmm1, %xmm0 {%k1}
 ; AVX512-NEXT:    retq
   %i = icmp slt <16 x i8> %a2, zeroinitializer
   %g = call <16 x i8> @llvm.x86.vgf2p8mulb.128(<16 x i8> %a1, <16 x i8> %a1)
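
The affine tests above pass an immediate of 11 to the intrinsic. As a rough sketch of what one byte lane of GF2P8AFFINEQB computes, the model below assumes that result bit i is the parity of (matrix byte 7-i AND the source byte), XORed with bit i of the immediate. That is my reading of the published pseudocode, so treat it as an assumption; the name `gf2p8affineqb_lane` is hypothetical and not an LLVM API. GF2P8AFFINEINVQB (which first inverts the byte in GF(2^8)) is not modelled here.

```cpp
#include <cstdint>

// Hypothetical scalar model of a single byte lane of GF2P8AFFINEQB.
// Assumption: result.bit[i] = parity(matrix.byte[7 - i] & x) ^ imm.bit[i].
constexpr std::uint8_t gf2p8affineqb_lane(std::uint64_t matrix,
                                          std::uint8_t x,
                                          std::uint8_t imm) {
  std::uint8_t result = 0;
  for (int i = 0; i < 8; ++i) {
    // Select the matrix row that feeds result bit i.
    const std::uint8_t row = static_cast<std::uint8_t>(matrix >> (8 * (7 - i)));
    std::uint8_t masked = static_cast<std::uint8_t>(row & x);
    // Fold the byte down to its parity in bit 0.
    masked ^= masked >> 4;
    masked ^= masked >> 2;
    masked ^= masked >> 1;
    const std::uint8_t out_bit =
        static_cast<std::uint8_t>((masked & 1) ^ ((imm >> i) & 1));
    result |= static_cast<std::uint8_t>(out_bit << i);
  }
  return result;
}
```

Under this model the matrix 0x0102040810204080 acts as the identity, so with immediate 11 each lane becomes `x ^ 11`. As with the multiply, every defined input yields a defined output, which is why returning false from canCreateUndefOrPoisonForTargetNode is safe for these nodes.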

@phoebewang phoebewang left a comment

LGTM.

@RKSimon RKSimon merged commit 343e944 into llvm:main Aug 26, 2025
11 checks passed
@RKSimon RKSimon deleted the x86-gfni-nopoison branch August 26, 2025 14:07