Skip to content

Conversation

tonykuttai
Copy link
Contributor

@tonykuttai tonykuttai commented Aug 11, 2025

Adds support for ternary equivalent operations of the form ternary(A, X, B) and ternary(A, X, C) where X=[and(B,C)| nor(B,C)| eqv(B,C)| nand(B,C)].

The following are the patterns involved and the imm values:

Operation Immediate Value
ternary(A, and(B,C), B) 49
ternary(A, nor(B,C), B) 56
ternary(A, eqv(B,C), B) 57
ternary(A, nand(B,C), B) 62
ternary(A, and(B,C), C) 81
ternary(A, nor(B,C), C) 88
ternary(A, eqv(B,C), C) 89
ternary(A, nand(B,C), C) 94

eg. xxeval XT, XA, XB, XC, 49

  • performs XA ? and(XB, XC) : Band places the result in XT.

This is the continuation of [PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C)).

@tonykuttai tonykuttai marked this pull request as ready for review August 11, 2025 08:46
@tonykuttai tonykuttai requested a review from lei137 August 11, 2025 08:47
@llvmbot
Copy link
Member

llvmbot commented Aug 11, 2025

@llvm/pr-subscribers-backend-powerpc

Author: Tony Varghese (tonykuttai)

Changes

Adds support for ternary equivalent operations of the form ternary(A, X, B) and ternary(A, X, C) where X=[and(B,C)| nor(B,C)| eqv(B,C)| nand(B,C)].

The following are the patterns involved and the imm values:

ternary(A,  and(B,C),   B)	49
ternary(A,  nor(B,C),   B)	56
ternary(A,  eqv(B,C),   B)	57
ternary(A,  nand(B,C),  B)	62

ternary(A,  and(B,C),   C)	81
ternary(A,  nor(B,C),   C)	88
ternary(A,  eqv(B,C),   C)	89
ternary(A,  nand(B,C),  C)	94

This is the continuation of [PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C)).


Patch is 20.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/152956.diff

3 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+51)
  • (modified) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll (+17-33)
  • (modified) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll (+17-33)
diff --git a/llvm/lib/Target/PowerPC/PPCInstrP10.td b/llvm/lib/Target/PowerPC/PPCInstrP10.td
index 98dd8464c0ac8..691ee72650eaf 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrP10.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrP10.td
@@ -2231,6 +2231,13 @@ def VEqv
                                             (v4i32(bitconvert node:$a)),
                                             (v4i32(bitconvert node:$b)))))]>;
 
+// Vector NAND operation (not(and))
+def VNand
+    : PatFrags<(ops node:$a, node:$b), [(vnot(and node:$a, node:$b)),
+                                        (bitconvert(vnot(and
+                                            (v4i32(bitconvert node:$a)),
+                                            (v4i32(bitconvert node:$b)))))]>;
+
 // =============================================================================
 // XXEVAL Ternary Pattern Multiclass: XXEvalTernarySelectAnd
 // This class matches the equivalent Ternary Operation: A ? f(B,C) : AND(B,C)
@@ -2266,6 +2273,48 @@ multiclass XXEvalTernarySelectAnd<ValueType Vt> {
             Vt, (vselect Vt:$vA, (VNot Vt:$vB), (VAnd Vt:$vB, Vt:$vC)), 28>;
 }
 
+// =============================================================================
+// XXEVAL Ternary Pattern Multiclass: XXEvalTernarySelectB
+// This class matches the equivalent Ternary Operation: A ? f(B,C) : B
+// and emit the corresponding xxeval instruction with the imm value.
+//
+// The patterns implement xxeval vector select operations where:
+// - A is the selector vector
+// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
+// - B is the "false" case op (vector B)
+// =============================================================================
+multiclass XXEvalTernarySelectB<ValueType Vt>{
+  // Pattern: (A ? AND(B,C) : B) XXEVAL immediate value: 49
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VAnd Vt:$vB, Vt:$vC), Vt:$vB), 49>;
+  // Pattern: (A ? NOR(B,C) : B) XXEVAL immediate value: 56
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VNor Vt:$vB, Vt:$vC), Vt:$vB), 56>;
+  // Pattern: (A ? EQV(B,C) : B) XXEVAL immediate value: 57
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VEqv Vt:$vB, Vt:$vC), Vt:$vB), 57>;
+  // Pattern: (A ? NAND(B,C) : B) XXEVAL immediate value: 62
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VNand Vt:$vB, Vt:$vC), Vt:$vB), 62>;
+}
+
+// =============================================================================
+// XXEVAL Ternary Pattern Multiclass: XXEvalTernarySelectC
+// This class matches the equivalent Ternary Operation: A ? f(B,C) : C
+// and emit the corresponding xxeval instruction with the imm value.
+//
+// The patterns implement xxeval vector select operations where:
+// - A is the selector vector
+// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
+// - C is the "false" case op (vector C)
+// =============================================================================
+multiclass XXEvalTernarySelectC<ValueType Vt>{
+  // Pattern: (A ? AND(B,C) : C) XXEVAL immediate value: 81
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VAnd Vt:$vB, Vt:$vC), Vt:$vC), 81>;
+  // Pattern: (A ? NOR(B,C) : C) XXEVAL immediate value: 88
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VNor Vt:$vB, Vt:$vC), Vt:$vC), 88>;
+  // Pattern: (A ? EQV(B,C) : C) XXEVAL immediate value: 89
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VEqv Vt:$vB, Vt:$vC), Vt:$vC), 89>;
+  // Pattern: (A ? NAND(B,C) : C) XXEVAL immediate value: 94
+  def : XXEvalPattern<Vt, (vselect Vt:$vA, (VNand Vt:$vB, Vt:$vC), Vt:$vC), 94>;
+}
+
 let Predicates = [PrefixInstrs, HasP10Vector] in {
   let AddedComplexity = 400 in {
     def : Pat<(v4i32 (build_vector i32immNonAllOneNonZero:$A,
@@ -2377,6 +2426,8 @@ let Predicates = [PrefixInstrs, HasP10Vector] in {
     // XXEval Patterns for ternary Operations.
     foreach Ty = [v4i32, v2i64, v8i16, v16i8] in {
         defm : XXEvalTernarySelectAnd<Ty>;
+        defm : XXEvalTernarySelectB<Ty>;
+        defm : XXEvalTernarySelectC<Ty>; 
     }
 
     // Anonymous patterns to select prefixed VSX loads and stores.
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
index c366fd5f0a8c2..a51e392279d55 100644
--- a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; Test file to verify the emission of Vector Selection instructions when ternary operators are used.
+; Test file to verify the emission of Vector Evaluation instructions when ternary operators are used.
 
 ; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
 ; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
@@ -15,10 +15,9 @@ define <4 x i32> @ternary_A_and_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_and_BC_B_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 49
 ; CHECK-NEXT:    blr
 entry:
   %and = and <4 x i32> %B, %C
@@ -31,11 +30,10 @@ define <2 x i64> @ternary_A_and_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_and_BC_B_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 49
 ; CHECK-NEXT:    blr
 entry:
   %and = and <2 x i64> %B, %C
@@ -48,10 +46,9 @@ define <16 x i8> @ternary_A_and_BC_B_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_and_BC_B_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 49
 ; CHECK-NEXT:    blr
 entry:
   %and = and <16 x i8> %B, %C
@@ -64,10 +61,9 @@ define <8 x i16> @ternary_A_and_BC_B_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_and_BC_B_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 49
 ; CHECK-NEXT:    blr
 entry:
   %and = and <8 x i16> %B, %C
@@ -80,10 +76,9 @@ define <4 x i32> @ternary_A_nor_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_nor_BC_B_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 56
 ; CHECK-NEXT:    blr
 entry:
   %or = or <4 x i32> %B, %C
@@ -97,11 +92,10 @@ define <2 x i64> @ternary_A_nor_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_nor_BC_B_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 56
 ; CHECK-NEXT:    blr
 entry:
   %or = or <2 x i64> %B, %C
@@ -115,10 +109,9 @@ define <16 x i8> @ternary_A_nor_BC_B_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_nor_BC_B_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 56
 ; CHECK-NEXT:    blr
 entry:
   %or = or <16 x i8> %B, %C
@@ -132,10 +125,9 @@ define <8 x i16> @ternary_A_nor_BC_B_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_nor_BC_B_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 56
 ; CHECK-NEXT:    blr
 entry:
   %or = or <8 x i16> %B, %C
@@ -149,10 +141,9 @@ define <4 x i32> @ternary_A_eqv_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_eqv_BC_B_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 57
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <4 x i32> %B, %C
@@ -166,11 +157,10 @@ define <2 x i64> @ternary_A_eqv_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_eqv_BC_B_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 57
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <2 x i64> %B, %C
@@ -184,10 +174,9 @@ define <16 x i8> @ternary_A_eqv_BC_B_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_eqv_BC_B_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 57
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <16 x i8> %B, %C
@@ -201,10 +190,9 @@ define <8 x i16> @ternary_A_eqv_BC_B_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_eqv_BC_B_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 57
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <8 x i16> %B, %C
@@ -218,10 +206,9 @@ define <4 x i32> @ternary_A_nand_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32>
 ; CHECK-LABEL: ternary_A_nand_BC_B_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 62
 ; CHECK-NEXT:    blr
 entry:
   %and = and <4 x i32> %B, %C
@@ -235,11 +222,10 @@ define <2 x i64> @ternary_A_nand_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64>
 ; CHECK-LABEL: ternary_A_nand_BC_B_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 62
 ; CHECK-NEXT:    blr
 entry:
   %and = and <2 x i64> %B, %C
@@ -253,10 +239,9 @@ define <16 x i8> @ternary_A_nand_BC_B_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_nand_BC_B_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 62
 ; CHECK-NEXT:    blr
 entry:
   %and = and <16 x i8> %B, %C
@@ -270,10 +255,9 @@ define <8 x i16> @ternary_A_nand_BC_B_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16>
 ; CHECK-LABEL: ternary_A_nand_BC_B_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v3, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 62
 ; CHECK-NEXT:    blr
 entry:
   %and = and <8 x i16> %B, %C
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll
index f70f1d093f069..54bf6c03f8c1a 100644
--- a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; Test file to verify the emission of Vector Selection instructions when ternary operators are used.
+; Test file to verify the emission of Vector Evaluation instructions when ternary operators are used.
 
 ; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
 ; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
@@ -15,10 +15,9 @@ define <4 x i32> @ternary_A_and_BC_C_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_and_BC_C_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 81
 ; CHECK-NEXT:    blr
 entry:
   %and = and <4 x i32> %B, %C
@@ -31,11 +30,10 @@ define <2 x i64> @ternary_A_and_BC_C_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_and_BC_C_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 81
 ; CHECK-NEXT:    blr
 entry:
   %and = and <2 x i64> %B, %C
@@ -48,10 +46,9 @@ define <16 x i8> @ternary_A_and_BC_C_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_and_BC_C_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 81
 ; CHECK-NEXT:    blr
 entry:
   %and = and <16 x i8> %B, %C
@@ -64,10 +61,9 @@ define <8 x i16> @ternary_A_and_BC_C_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_and_BC_C_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxland vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 81
 ; CHECK-NEXT:    blr
 entry:
   %and = and <8 x i16> %B, %C
@@ -80,10 +76,9 @@ define <4 x i32> @ternary_A_nor_BC_C_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_nor_BC_C_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 88
 ; CHECK-NEXT:    blr
 entry:
   %or = or <4 x i32> %B, %C
@@ -97,11 +92,10 @@ define <2 x i64> @ternary_A_nor_BC_C_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_nor_BC_C_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 88
 ; CHECK-NEXT:    blr
 entry:
   %or = or <2 x i64> %B, %C
@@ -115,10 +109,9 @@ define <16 x i8> @ternary_A_nor_BC_C_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_nor_BC_C_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 88
 ; CHECK-NEXT:    blr
 entry:
   %or = or <16 x i8> %B, %C
@@ -132,10 +125,9 @@ define <8 x i16> @ternary_A_nor_BC_C_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_nor_BC_C_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxlnor vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 88
 ; CHECK-NEXT:    blr
 entry:
   %or = or <8 x i16> %B, %C
@@ -149,10 +141,9 @@ define <4 x i32> @ternary_A_eqv_BC_C_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %
 ; CHECK-LABEL: ternary_A_eqv_BC_C_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 89
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <4 x i32> %B, %C
@@ -166,11 +157,10 @@ define <2 x i64> @ternary_A_eqv_BC_C_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %
 ; CHECK-LABEL: ternary_A_eqv_BC_C_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 89
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <2 x i64> %B, %C
@@ -184,10 +174,9 @@ define <16 x i8> @ternary_A_eqv_BC_C_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_eqv_BC_C_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 89
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <16 x i8> %B, %C
@@ -201,10 +190,9 @@ define <8 x i16> @ternary_A_eqv_BC_C_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16> %
 ; CHECK-LABEL: ternary_A_eqv_BC_C_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxleqv vs0, v3, v4
 ; CHECK-NEXT:    vslh v2, v2, v5
 ; CHECK-NEXT:    vsrah v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 89
 ; CHECK-NEXT:    blr
 entry:
   %xor = xor <8 x i16> %B, %C
@@ -218,10 +206,9 @@ define <4 x i32> @ternary_A_nand_BC_C_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32>
 ; CHECK-LABEL: ternary_A_nand_BC_C_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    vslw v2, v2, v5
 ; CHECK-NEXT:    vsraw v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 94
 ; CHECK-NEXT:    blr
 entry:
   %and = and <4 x i32> %B, %C
@@ -235,11 +222,10 @@ define <2 x i64> @ternary_A_nand_BC_C_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64>
 ; CHECK-LABEL: ternary_A_nand_BC_C_2x64:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxlxor v5, v5, v5
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    xxsplti32dx v5, 1, 63
 ; CHECK-NEXT:    vsld v2, v2, v5
 ; CHECK-NEXT:    vsrad v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 94
 ; CHECK-NEXT:    blr
 entry:
   %and = and <2 x i64> %B, %C
@@ -253,10 +239,9 @@ define <16 x i8> @ternary_A_nand_BC_C_16x8(<16 x i1> %A, <16 x i8> %B, <16 x i8>
 ; CHECK-LABEL: ternary_A_nand_BC_C_16x8:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltib v5, 7
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ; CHECK-NEXT:    vslb v2, v2, v5
 ; CHECK-NEXT:    vsrab v2, v2, v5
-; CHECK-NEXT:    xxsel v2, v4, vs0, v2
+; CHECK-NEXT:    xxeval v2, v2, v3, v4, 94
 ; CHECK-NEXT:    blr
 entry:
   %and = and <16 x i8> %B, %C
@@ -270,10 +255,9 @@ define <8 x i16> @ternary_A_nand_BC_C_8x16(<8 x i1> %A, <8 x i16> %B, <8 x i16>
 ; CHECK-LABEL: ternary_A_nand_BC_C_8x16:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxspltiw v5, 983055
-; CHECK-NEXT:    xxlnand vs0, v3, v4
 ...
[truncated]

@tonykuttai
Copy link
Contributor Author

tonykuttai commented Aug 20, 2025

The following patterns:

  • ternary(A, C, B)
  • ternary(A, not(C), B)
  • ternary(A, B, C)
  • ternary(A, not(B), C)
    are not considered in this PR as the current xxsel (3 Cycle instruction) variants perform better than xxeval (4 cycle instruction) for this patterns.

//
// The patterns implement xxeval vector select operations where:
// - A is the selector vector
// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group Code Review - Change op to operand to have a more clear description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here op refers to f which is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is operation more accurate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it.

// The patterns implement xxeval vector select operations where:
// - A is the selector vector
// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
// - B is the "false" case op (vector B)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group Code Review - Same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to operand

// - A is the selector vector
// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
// - B is the "false" case op (vector B)
// =============================================================================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group Code Review - Comment clarifying on why we are not using xxeval instruction for the case ternary(A, C, B), etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment.

// The patterns implement xxeval vector select operations where:
// - A is the selector vector
// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
// - C is the "false" case op (vector C)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group Code Review - Modify op to operand.

@tonykuttai tonykuttai requested a review from Himadhith August 20, 2025 15:15
@tonykuttai tonykuttai force-pushed the tvarghese/xxevalxbc branch from 592395e to f1c8063 Compare August 20, 2025 15:30
//
// The patterns implement xxeval vector select operations where:
// - A is the selector vector
// - f(B,C) is the "true" case op on vectors B and C (AND, NOR, EQV, NAND)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is operation more accurate?

@tonykuttai tonykuttai requested a review from amy-kwan August 22, 2025 01:39
@tonykuttai
Copy link
Contributor Author

tonykuttai commented Aug 28, 2025

gentle ping @amy-kwan @RolandF77 @lei137

Copy link
Contributor

@lei137 lei137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants