Implement the msad4 HLSL Function #99137

Open
12 tasks
Tracked by #99235
farzonl opened this issue Jul 16, 2024 · 1 comment
Labels
backend:DirectX backend:SPIR-V bot:HLSL HLSL HLSL Language Support metabug Issue to collect references to a group of similar or related issues.

Comments

@farzonl
Member

farzonl commented Jul 16, 2024

  • Implement msad4 clang builtin
  • Link msad4 clang builtin with hlsl_intrinsics.h
  • Add sema checks for msad4 to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for msad4 to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/msad4.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/msad4-errors.hlsl
  • Create the int_dx_msad4 intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_msad4 to 53 in DXIL.td
  • Create the msad4.ll and msad4_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_msad4 intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp create the msad4 lowering and map it to int_spv_msad4 in SPIRVInstructionSelector::selectIntrinsic.
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/msad4.ll

DirectX

DXIL Opcode | DXIL OpName | Shader Model | Shader Stages
53 | Bfi | 6.0 | ()

SPIR-V

SAbs:

Description:

SAbs

Result is x if x ≥ 0; otherwise result is -x, where x is
interpreted as a signed integer.

Result Type and the type of x must both be integer scalar or integer
vector types. Result Type and operand types must have the same number
of components with the same component width. Results are computed per
component.

Number | Operand 1 | Operand 2 | Operand 3 | Operand 4
5 | <id> x | | |

Test Case(s)

Example 1

//dxc msad4_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export uint4 fn(uint p1, uint2 p2, uint4 p3) {
    return msad4(p1, p2, p3);
}

HLSL:

Compares a 4-byte reference value and an 8-byte source value and accumulates a vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of a different byte alignment between the reference value and the source value.

uint4 result = msad4(uint reference, uint2 source, uint4 accum);

Parameters

reference

[in] The reference array of 4 bytes in one uint value.

source

[in] The source array of 8 bytes in one uint2 value (two uint components).

accum

[in] A vector of 4 values. msad4 adds this vector to the masked sum of absolute differences of the different byte alignments between the reference value and the source value.

Return Value

A vector of 4 sums. Each sum corresponds to the masked sum of absolute differences of different byte alignments between the reference value and the source value. msad4 doesn't include a difference in the sum if that difference is masked (that is, the reference byte is 0).

Remarks

To use the msad4 intrinsic in your shader code, call the ID3D11Device::CheckFeatureSupport method with D3D11_FEATURE_D3D11_OPTIONS to verify that the Direct3D device supports the SAD4ShaderInstructions feature option. The msad4 intrinsic requires a WDDM 1.2 display driver, and all WDDM 1.2 display drivers must support msad4. If your app creates a rendering device with feature level 11.0 or 11.1 and the compilation target is shader model 5 or later, the HLSL source code can use the msad4 intrinsic.

Return values are only accurate up to 65535. If you call the msad4 intrinsic with inputs that might result in return values greater than 65535, msad4 produces undefined results.
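
The 65535 limit translates into a bound on how many msad4 calls can be chained through one accumulator. A single call adds at most 4 × 255 = 1020 to each component (four bytes, each differing by at most 255), so a worst-case bound can be sketched as follows. The helper names are hypothetical and are not part of any HLSL or LLVM header; this is a CPU-side illustration in C++, not shader code.

```cpp
#include <cassert>
#include <cstdint>

// Largest sum the documentation describes as accurate.
constexpr std::uint32_t kMaxAccurateSum = 65535;
// Worst case added to one component by one call: 4 bytes, each differing by up to 255.
constexpr std::uint32_t kMaxPerCall = 4 * 255;

// Hypothetical helper: number of chained calls that cannot push a component
// past kMaxAccurateSum when it starts at initialAccum.
constexpr std::uint32_t maxSafeCalls(std::uint32_t initialAccum) {
  return initialAccum > kMaxAccurateSum
             ? 0
             : (kMaxAccurateSum - initialAccum) / kMaxPerCall;
}

// Hypothetical debug check for a planned accumulation loop.
inline void requireAccurate(std::uint32_t initialAccum, std::uint32_t calls) {
  assert(calls <= maxSafeCalls(initialAccum) && "msad4 sum may exceed 65535");
}

// Starting from a zero accumulator, 64 calls are always within range.
static_assert(maxSafeCalls(0) == 64, "64 * 1020 = 65280 <= 65535");
```

This bound is conservative: real inputs rarely differ by 255 in every byte, so longer loops may still stay within range for a given data set.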

Minimum Shader Model

This function is supported in the following shader models.

Shader Model | Supported
Shader model 5 or later | yes

Examples

Here is an example result calculation for msad4:

reference = 0xA100B2C3;
source.x = 0xD7B0C372
source.y = 0x4F57C2A3
accum = {1,2,3,4}
result.x alignment source: 0xD7B0C372
result.x = accum.x + |0xD7 - 0xA1| + 0 (masked) + |0xC3 - 0xB2| + |0x72 - 0xC3| = 1 + 54 + 0 + 17 + 81 = 153
result.y alignment source: 0xA3D7B0C3
result.y = accum.y + |0xA3 - 0xA1| + 0 (masked) + |0xB0 - 0xB2| + |0xC3 - 0xC3| = 2 + 2 + 0 + 2 + 0 = 6
result.z alignment source: 0xC2A3D7B0
result.z = accum.z + |0xC2 - 0xA1| + 0 (masked) + |0xD7 - 0xB2| + |0xB0 - 0xC3| = 3 + 33 + 0 + 37 + 19 = 92
result.w alignment source: 0x57C2A3D7
result.w = accum.w + |0x57 - 0xA1| + 0 (masked) + |0xA3 - 0xB2| + |0xD7 - 0xC3| = 4 + 74 + 0 + 15 + 20 = 113
result = {153,6,92,113}
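
The calculation above can be reproduced with a small CPU-side model, which may be handy as a reference when writing the codegen tests. This is a sketch in C++ under the semantics described in this section. The names `UInt4`, `maskedSad`, `msad4Model`, and `checkWorkedExample` are hypothetical and do not exist in clang, DXC, or any HLSL header. The model makes no attempt to imitate the undefined behavior for sums above 65535.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HLSL's uint4.
using UInt4 = std::array<std::uint32_t, 4>;

// Masked sum of absolute differences over the four bytes of ref and src.
// A reference byte equal to 0 masks that byte position out of the sum.
inline std::uint32_t maskedSad(std::uint32_t ref, std::uint32_t src) {
  std::uint32_t sum = 0;
  for (int byte = 0; byte < 4; ++byte) {
    const std::uint32_t r = (ref >> (8 * byte)) & 0xFFu;
    const std::uint32_t s = (src >> (8 * byte)) & 0xFFu;
    if (r != 0)
      sum += (r > s) ? (r - s) : (s - r);
  }
  return sum;
}

// srcX models source.x (bytes 0..3) and srcY models source.y (bytes 4..7).
// Component `lane` compares ref against the source shifted right by `lane` bytes.
inline UInt4 msad4Model(std::uint32_t ref, std::uint32_t srcX,
                        std::uint32_t srcY, UInt4 accum) {
  const std::uint64_t window =
      (static_cast<std::uint64_t>(srcY) << 32) | srcX;
  UInt4 result = accum;
  for (int lane = 0; lane < 4; ++lane) {
    const auto aligned = static_cast<std::uint32_t>(window >> (8 * lane));
    result[lane] += maskedSad(ref, aligned);
  }
  return result;
}

// Checks the model against the worked example from the documentation.
inline void checkWorkedExample() {
  const UInt4 result =
      msad4Model(0xA100B2C3u, 0xD7B0C372u, 0x4F57C2A3u, UInt4{{1, 2, 3, 4}});
  assert((result == UInt4{{153, 6, 92, 113}}));
}
```

The 64-bit `window` is one way to express the four "alignment source" values listed above; an expansion built from shifts and bit-field inserts on 32-bit values would be equivalent.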

Here is an example of how you can use msad4 to search for a reference pattern within a buffer:

uint4 accum = {0,0,0,0};
for(uint i=0;i<REF_SIZE;i++)
    accum = msad4(
        buf_ref[i], 
        uint2(buf_src[DTid.x+i], buf_src[DTid.x+i+1]), 
        accum);
buf_accum[DTid.x] = accum;
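
To make the search loop concrete: after the loop, component k of accum holds the masked sum of absolute differences between the reference pattern and the source bytes starting at byte offset 4 × DTid.x + k, where byte 0 of each uint is its least significant byte (this follows from the alignment pattern in the calculation example). The C++ sketch below emulates one thread on the CPU. `msad4Model` and `searchAt` are hypothetical names, and `index` plays the role of DTid.x.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using UInt4 = std::array<std::uint32_t, 4>;

// Hypothetical CPU model of msad4: for each of the four byte alignments of the
// 8-byte source, add the masked sum of absolute byte differences to accum.
inline UInt4 msad4Model(std::uint32_t ref, std::uint32_t srcX,
                        std::uint32_t srcY, UInt4 accum) {
  const std::uint64_t window =
      (static_cast<std::uint64_t>(srcY) << 32) | srcX;
  for (int lane = 0; lane < 4; ++lane) {
    const auto aligned = static_cast<std::uint32_t>(window >> (8 * lane));
    for (int byte = 0; byte < 4; ++byte) {
      const std::uint32_t r = (ref >> (8 * byte)) & 0xFFu;
      const std::uint32_t s = (aligned >> (8 * byte)) & 0xFFu;
      if (r != 0)  // a zero reference byte is masked out
        accum[lane] += (r > s) ? (r - s) : (s - r);
    }
  }
  return accum;
}

// Emulates the loop body for a single thread. bufRef.size() stands in for
// REF_SIZE. The loop reads bufSrc[index + i + 1], so the source needs one
// extra element past the last compared word.
inline UInt4 searchAt(const std::vector<std::uint32_t>& bufRef,
                      const std::vector<std::uint32_t>& bufSrc,
                      std::size_t index) {
  assert(index + bufRef.size() < bufSrc.size());
  UInt4 accum = {{0, 0, 0, 0}};
  for (std::size_t i = 0; i < bufRef.size(); ++i)
    accum = msad4Model(bufRef[i], bufSrc[index + i], bufSrc[index + i + 1],
                       accum);
  return accum;
}
```

For instance, with the reference bytes 11 22 33 44 55 66 77 88 placed at byte offset 2 of the source, `searchAt(ref, src, 0)` returns 0 in component 2 and nonzero sums in the other components, so a zero component marks an exact match at that alignment.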

Requirements

Requirement Value
Minimum supported client
Windows 8 [desktop apps | UWP apps]
Minimum supported server
Windows Server 2012 [desktop apps | UWP apps]

See also

Intrinsic Functions

@farzonl farzonl added backend:DirectX backend:SPIR-V bot:HLSL HLSL HLSL Language Support metabug Issue to collect references to a group of similar or related issues. labels Jul 16, 2024
@damyanp damyanp moved this to Ready in HLSL Support Oct 30, 2024
@damyanp damyanp moved this from Ready to Planning in HLSL Support Oct 30, 2024
@pow2clk
Contributor

pow2clk commented Nov 19, 2024

Expanded on DXIL and SPIRV. We should look at the expansion, determine the dependencies and other possible complications.

@pow2clk pow2clk moved this from Planning to Designing in HLSL Support Nov 19, 2024
Projects
Status: Designing