[HLSL] DML shader `Zero_256_uint16_native_4` causing 1773 DML Operator Test failures

The shader [`Zero_256_uint16_native_4`](https://microsoft.visualstudio.com/WindowsAI/_git/DirectML?path=%2FProduct%2FShaders%2FGenerated%2FZero_256_uint16_native_4.hlsl&_a=contents&version=GBmaster) (Shader ID: [193008](https://microsoft.visualstudio.com/WindowsAI/_git/DirectML?path=/Product/Shaders/Generated/Windows/JitInfo.json&version=GBmaster&line=55001&lineEnd=55007&lineStartColumn=1&lineEndColumn=7&lineStyle=plain&_a=contents)) is causing 596 single-shader DML Operator Test failures.

A "single-shader DML Operator Test" is a DML Operator Test that uses exactly one shader compiled by clang-dxc, though it may use any number of fxc-compiled shaders. Because `Zero_256_uint16_native_4` is the only clang-dxc-compiled shader in each of these tests, it is the sole cause of these test failures.
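
To make the definition concrete, here is a minimal sketch of how tests could be classified as single-shader. The test names, the second shader names, and the data layout are all hypothetical and are not taken from the DML test harness; only the rule (exactly one clang-dxc-compiled shader, any number of fxc-compiled ones) comes from the definition above.

```python
# Hypothetical data: test name -> list of (shader name, compiler) pairs.
# The structure and most names are illustrative, not from the DML test harness.
def is_single_shader_test(shaders):
    """Return True when exactly one shader in the test was compiled by clang-dxc."""
    clang_count = sum(1 for _, compiler in shaders if compiler == "clang-dxc")
    return clang_count == 1

tests = {
    # One clang-dxc shader plus one fxc shader: counts as single-shader.
    "TestA": [("Zero_256_uint16_native_4", "clang-dxc"), ("SomeOtherShader", "fxc")],
    # Two clang-dxc shaders: not single-shader, so the culprit is ambiguous.
    "TestB": [("Zero_256_uint16_native_4", "clang-dxc"), ("AnotherShader", "clang-dxc")],
}

single_shader_tests = [name for name, shaders in tests.items()
                       if is_single_shader_test(shaders)]
```

Only tests like `TestA` allow a failure to be attributed to one specific clang-dxc-compiled shader.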

Furthermore, there are a total of 1773 failing DML Operator Tests using the `Zero_256_uint16_native_4` shader.

Some failing single-shader DML Operator Tests:
- OperatorTests::ConvolutionDefault#5
- OperatorTests::ConvolutionDefaultWithReduction#27 
- OperatorTests::ConvolutionDepthwise#9 
- OperatorTests::ConvolutionBasicGemm#1
- OperatorTests::ConvolutionBasicGemmIZ4#1 
- OperatorTests::LayoutTransformedConvolutionDefault#metadataSet0#12

Single-shader test results on machines:
- clang-dml01 (AMD): Fail
- clang-dml02 (NVIDIA): Fail
- clang-dml03 (Intel): Fail
- local (WARP): Fail

Reproduction:
```
❯ .\TE.exe .\DirectML.Test.OperatorTests.dll /logOutput:low /p:DisableMetacommands=1 /name:"OperatorTests::ConvolutionDefault#5"
Test Authoring and Execution Framework v10.72 for x64

StartGroup: OperatorTests::ConvolutionDefault#5
Error: Output Tensor #0:
Error: Tensor Sizes: 4,3,1,5,1
Error: Tensor Data Type: float16
Error: Index: 0004 @00000160 [0,0,0,4,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0008 @00000125 [0,1,0,3,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0009 @00000165 [0,1,0,4,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0013 @00000130 [0,2,0,3,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0014 @00000170 [0,2,0,4,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0019 @00000161 [1,0,0,4,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0023 @00000126 [1,1,0,3,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0024 @00000166 [1,1,0,4,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: Index: 0028 @00000131 [1,2,0,3,0].  Ref: 0.0000000000 (0x0000).  DML: -nan (0xFFFF).  Abs: nan.  Rel: nan%.  Ulp: 65535
Error: 24 / 60 (40.000000%) of elements were found to be above tolerance.
Error: Max absolute delta: 0.000488.  Allowed absolute tolerance: 0.002000.
Error: Max relative delta: 0.077340%.  Allowed relative tolerance: 0.040000%.
Error: Max ULP delta: 65535.  Allowed tolerance: 4 ULPs (float16).
Error: Verify: Fail [File: C:\__w\1\s\DirectML\SharedToolingLib\External\Test\TaefHelper\TaefHelper.cpp, Function: TaefHelper::Fail, Line: 133]
EndGroup: OperatorTests::ConvolutionDefault#5 [Failed]

Summary of Non-passing Tests:
    OperatorTests::ConvolutionDefault#5 [Failed]

Summary: Total=1, Passed=0, Failed=1, Blocked=0, Not Run=0, Skipped=0
```
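
As a reading aid for the log above, the following sketch decodes the reported bit patterns and element indices. It assumes the hex values in parentheses are raw IEEE 754 half-precision bit patterns and that the tensor is indexed in row-major order over the sizes `4,3,1,5,1`. The helper names are illustrative. How the test harness actually computes ULP distance is not stated in the log, so the raw bit-pattern difference below is only an observation that happens to match the reported `65535`.

```python
import math
import struct

def float16_from_bits(bits):
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def unravel(index, sizes):
    """Convert a flat row-major element index into per-dimension coordinates."""
    coords = []
    for size in reversed(sizes):
        index, coord = divmod(index, size)
        coords.append(coord)
    return coords[::-1]

sizes = [4, 3, 1, 5, 1]          # "Tensor Sizes" from the log
element_count = math.prod(sizes)  # total number of output elements

dml_value = float16_from_bits(0xFFFF)  # value produced by DML
ref_value = float16_from_bits(0x0000)  # reference value
sign_bit = 0xFFFF >> 15                # top bit of the DML pattern

# Assumption: distance taken as the difference of the raw bit patterns.
bit_distance = 0xFFFF - 0x0000

print(element_count, dml_value, ref_value, sign_bit, bit_distance)
print(unravel(8, sizes), unravel(19, sizes))
```

The all-ones pattern `0xFFFF` has every exponent bit set and a non-zero mantissa, so it decodes as a NaN with the sign bit set, whereas the reference expects all-zero bits.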

Note: The test can also be run on WARP by changing the GPU adapter index: add the argument `/p:GpuAdapterIndex=N`, where `N` is the index of the Microsoft Basic Render Driver. Remove the argument `/logOutput:low` to see which GPU was selected, along with other potentially helpful information.
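
For scripting the reproduction across adapters, a small helper along these lines could assemble the argument list. This is a sketch: the function name and its parameters are hypothetical, it uses only the arguments that appear in this report, and the adapter index in the usage line is a placeholder, since the actual index of the Microsoft Basic Render Driver depends on the machine.

```python
def build_te_command(test_name, gpu_adapter_index=None, low_log_output=True):
    """Build the TE.exe argument list for one DML Operator Test (hypothetical helper)."""
    args = [r".\TE.exe", r".\DirectML.Test.OperatorTests.dll"]
    if low_log_output:
        # Omit this argument to see which GPU was selected.
        args.append("/logOutput:low")
    args.append("/p:DisableMetacommands=1")
    if gpu_adapter_index is not None:
        # N must be the index of the desired adapter on the local machine.
        args.append(f"/p:GpuAdapterIndex={gpu_adapter_index}")
    # No shell quoting is needed when the list is passed directly to a process API.
    args.append(f"/name:{test_name}")
    return args

# Placeholder index 1; verbose logging enabled to confirm the selected adapter.
command = build_te_command("OperatorTests::ConvolutionDefault#5",
                           gpu_adapter_index=1, low_log_output=False)
```

The resulting list can be passed to `subprocess.run(command)` from the directory containing `TE.exe`.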

The latest version of DML built with clang-dxc compiled shaders can be obtained from an [internal ClangDML Azure pipeline](https://microsoft.visualstudio.com/WindowsAI/_build?definitionId=171676&_a=summary) via the published `x64-win-redist-release-hlsl-clang` pipeline artifact.

The latest validated DXIL shader binary can be obtained from another [internal ClangDML Azure pipeline](https://dev.azure.com/microsoft/WindowsAI/_build?definitionId=167323&_a=summary) via the published `ValidatedShaders` pipeline artifact.
