Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIR-V] Invalid spir-v generated when using nointerpolation keyword #5270

Closed
jmccarthy634 opened this issue Jun 6, 2023 · 34 comments · Fixed by #5384
Closed

[SPIR-V] Invalid spir-v generated when using nointerpolation keyword #5270

jmccarthy634 opened this issue Jun 6, 2023 · 34 comments · Fixed by #5384
Assignees
Labels
bug Bug, regression, crash spirv Work related to SPIR-V

Comments

@jmccarthy634
Copy link

Hello, when compiling the following drastically reduced HLSL example to spir-v with dxc.exe -spirv -fspv-target-env=vulkan1.1 -fvk-use-dx-layout -O3 -T ps_6_0 -E psMain nointerpolationbug.hlsl:

struct PSInput
{
	nointerpolation int level : LEVEL;
};

int psMain(PSInput input) : SV_Target0
{
	return input.level;
}

I get the following error:

fatal error: generated SPIR-V is invalid: Expected Result Type to be a composite type
  %18 = OpCompositeConstruct %int %17 %12 %12

This occurs on the current latest nightly build against commit e0a2965, it does not occur using an earlier build against commit 740dd09.

@jmccarthy634
Copy link
Author

jmccarthy634 commented Jun 7, 2023

@ShchchowAMD this appears to have started happening with commit cd3b40b

Going through more of our shader builds and it definitely seems to affect everywhere we use nointerpolation, with results varying from something like the above to straight up crashing inside the optimizer in spvtools.

@ShchchowAMD
Copy link
Contributor

Thanks for reporting this issue.

Ref to comment in the commit: #4638 (comment)

For latest KHR extension, following SV_BaryCentrics description in wiki (https://github.com/microsoft/DirectXShaderCompiler/wiki/SV_Barycentrics):

For pixel shaders to perform attribute interpolation using the system-generated barycentric weights, the attributes values at the vertices must be provided. These can be accessed by declaring the attribute as nointerpolation, and using new intrinsics to read the values at the 3 vertices.

So I think an inputs decorated with 'nointerpolation' should be treated as it has an extra dimension, indicating which vertices it mapped to. For the error reported here, its type is treated as an int[3] now.

@s-perron @Keenuts
I think this is the case we discussed before, like, whether 'nointerpolation' inputs could be used outside GetAttributeAtVertex() in a function body?

I'm not sure whether above case is legal in KHR extension, like the case in discussion before:

float4 main(nointerpolation float4 col : COLOR) : SV_Target {
     return GetAttributeAtVertex(col, 2) + col;
}

If we could make sure it should only be used in such intrinsics, I'll add error message in compile result info.

@s-perron
Copy link
Collaborator

Yes, we did discuss this before. This case cannot be handled in Vulkan because the spir-v cannot correctly represent it. The spir-v validator message is not useful. If we can, we should try to add an error message before we generate the spir-v.

@ShchchowAMD Are you able to do that?

@ShchchowAMD
Copy link
Contributor

Yea. I'll make a new commit for this.

@handsomematt
Copy link

handsomematt commented Jun 12, 2023

Also been getting issues due to inappropriate usage of nointerpolation but from an LLVM assert Error: assert(new_array_type != nullptr && "Can't copy an array to a non-array.") without it spitting out any errors about invalid spir-v, making it initially difficult to narrow down the troublesome code.

// dxc.exe -T ps_6_0 -E MainPs -spirv
// Internal compiler error: LLVM Assert

struct PS_INPUT
{
	nointerpolation uint nView : TEXCOORD12;
};

Texture2D g_tBlueNoise : register( t0 );

uint2 TextureDimensions2D( Texture2D tex, uint nMipLevel )
{
	uint2 nDim;
	uint nLevels;
	tex.GetDimensions( nMipLevel, nDim.x, nDim.y, nLevels );
	return nDim;
}

float4 MainPs( PS_INPUT i ) : SV_Target0
{
	uint nView = i.nView;
	TextureDimensions2D( g_tBlueNoise, 0 );
	return float4( 1.0, 1.0, 1.0, 1.0 );
}

@danginsburg
Copy link
Contributor

danginsburg commented Jun 13, 2023

Yea. I'll make a new commit for this.

@handsomematt @s-perron - I might not be fully understanding the discussion here, but can you confirm you will be fixing the regression with the case @handsomematt mentions above. We have this in our shaders too, and this has worked for a very long time, so I presume this is just a regression due to your change?

I presume nointerpolation on a PS input should just map to the SPIR-V Flat decoration as it always has - should be fully represenable in SPIR-V?

@s-perron
Copy link
Collaborator

I cannot find what HLSL is supposed to do if you access a nointerpolation input without using GetAttributeAtVertex. Is there anything that specifies what it does?

This is all that I could find.

https://github.com/microsoft/DirectXShaderCompiler/wiki/SV_Barycentrics#per-vertex-attributes

@danginsburg
Copy link
Contributor

I cannot find what HLSL is supposed to do if you access a nointerpolation input without using GetAttributeAtVertex. Is there anything that specifies what it does?

This is all that I could find.

https://github.com/microsoft/DirectXShaderCompiler/wiki/SV_Barycentrics#per-vertex-attributes

The case that regressed is a PS input, not VS input. It should be a Flat interpolation modifier. I don’t think there is any ambiguity about that? It used to work.

@ShchchowAMD
Copy link
Contributor

@danginsburg From the spec, GetAttributeAtVertex is used in PS. We now support this intrinsic so nointerpolation input will be treated as an array. Its index will represent VertexID, indicating where this value comes from / which vertex it is representing now. We are assuming it won't be used outside this intrinsics.

@handsomematt I'll check this. Another question for this example is just as first comment to @danginsburg here.

@danginsburg
Copy link
Contributor

Ok. I am confused. The SV_Barycentric proposal changes the meaning of nointerpolation PS inputs for all shader models? Shouldn’t it be opt in somehow? This has been supported since SM5 and has been in the spir-v backend for a long time https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/SPIR-V.rst#id66

@handsomematt
Copy link

Just for further clarity here's the expected output of this code before cd3b40b and the error we're getting after, the assert I posted earlier is an additional error with the same root problem.

struct PS_INPUT
{
	nointerpolation uint nView : TEXCOORD12;
};

uint MainPs( PS_INPUT i ) : SV_Target0
{
	return i.nView;
}

// Before cd3b40b
// dxc.exe -T ps_6_0 -E MainPs -spirv
// ...
//               OpName %MainPs "MainPs"
//               OpDecorate %in_var_TEXCOORD12 Flat
//               OpDecorate %in_var_TEXCOORD12 Location 0
// ...

// After cd3b40b
// dxc.exe -T ps_6_0 -E MainPs -spirv
// fatal error: generated SPIR-V is invalid: Expected Result Type to be a composite type
//  %17 = OpCompositeConstruct %uint %16 %11 %11 

@s-perron
Copy link
Collaborator

I'll review the specs, and get back to you.

@s-perron s-perron added bug Bug, regression, crash spirv Work related to SPIR-V labels Jun 13, 2023
@danginsburg
Copy link
Contributor

Here are some observations I have from poking at it:

  • ps_6_0 in DXIL does not allow GetAttributeAtVertex, it gives this error:
error: Opcode AttributeAtVertex not valid in shader model ps_6_0.
note: at '%4 = call float @dx.op.attributeAtVertex.f32(i32 137, i32 0, i32 0, i8 3, i8 1)' in block '#0' of function 'PSMain'.
Validation failed.

I would propose, therefore, that for the purposes of SPIR-V:

  • ps_6_0 should not allow barycentrics and should continue to generate the same code for naked interpolation qualifiers
  • Vulkan barycentrics should only been enabled at ps_6_1 AND the inclusion of barycentrics in the shader code

That seems to be what DX12 is doing from what I can tell.

@s-perron
Copy link
Collaborator

I have a better idea of what is going on. @ShchchowAMD sorry for the bad advice earlier. I was not familiar enough with how nointerpolation is handled.

TL/DR: When a nointerpolation input is used in a PS without GetAttributeAtVertex, we should load the value at the provoking vertex. That is, treat it as if it was GetAttributeAtVertex(<var>, 0);. The other change it to change the type to an array only if GetAttributeAtVertex is used somewhere in the file.

@ShchchowAMD Let me know if this seems reasonable to implement. I'm still not very familiar with how DXC is structured.

This will be a long explanation because I want to document how we ended up here, and why it is done this way.

nointerpolation before barycentrics

Before the barycentrics were implemented, a variable marked with nointerpolation was simply decorated as flat, and it was used by simply loading it. In Vulkan, this means that the value of that input will be the value from the provoking vertex.

The HLSL specification does not define the nointerpolation keyword precisely. All it says is "Do not interpolate the outputs of a vertex shader before passing them to a pixel shader.". They probably end up using the value from the provoking vertex, but that is not specificed.

Regardless of what the HLSL spec says, the syntax is legal in HLSL, and the SPIR-V backend implemented a particular behaviour is the only reasonable implementation (my opinion).

nointerpolation after barycentrics

When barycentrics were added, the pixel shader potentially needs access the uninterpolated values for each vertex. In HLSL this is done by using the GetAttributeAtVertex builtin function. In SPIR-V, this is done by adding the PerVertexKHR decoration and turning the value into an array of values.

At this point, we had a decision to make: What is the type of a nointerpolation input, and how should it be accessed?

The first implementation tried to identify if the input is used in GetAttributeAtVertex. If it was used exclusively in GetAttributeAtVertex, then it was turned into an array. If it was never used in GetAttributeAtVertex, it was left as a scalar. In other cases, an error was emitted. This design was rejected because it was too intrusive. A front-end should not have to look at the uses of a variable to determine its type.

When considering a new design, I suggested that we always turn the input into an array. This would eliminate the need to look at the uses to determine the type of the variable. I made a mistake when I suggested that we consider a use without GetAttributeAtVertex as invalid. I was under the impression that the HLSL was expecting the interpolated value to be used, and it was not possible to get both the uninterpolated and interpolated value in the PS.

Now I realize that the uses without GetAttributeAtVertex should return the value at the provoking vertex based on the behaviour before barycentrics were added. This can be implemented by loading the element of the array that corresponds to the provoking vertex.

The extra capability

In general, we do not want to add a capability to the shader when it is not strictly needed. This will needlessly limit the HW on which the shader can run.

As I mentioned above, I want to keep the code generation in DXC simple. The code generation should not have to trace uses to get the type. It can become arbitrarily complicated when the input can be passed as a parameter to other functions.

The would be to only use make the input an array if and only the GetAttributeAtVertex is used somewhere. I think this is a simple check. Note that an input will be changed into an array even if GetAttributeAtVertex is used on other inputs.

@danginsburg
Copy link
Contributor

danginsburg commented Jun 13, 2023

I think I agree but with respect to this statement I think it's a little confusing:

That is, treat it as if it was GetAttributeAtVertex(, 0);. The other change it to change the type to an array only if GetAttributeAtVertex is used somewhere in the file.

I would say "That is, treat it as if it was GetAttributeAtVertex(<var>, 0) but do not require barycentrics in the generated SPIR-V". In the DXIL backend with ps_6_1+, if you explicitly use GetAttributeAtVertex(<var>, 0) it will require barycentrics, whereas it will not require barycentrics if you use the naked qualifier. I think I'm just restating what you said in the details but when I first read the TLDR I was confused.

@ShchchowAMD
Copy link
Contributor

@danginsburg @s-perron
Sorry that I don't have an idea a nointerpolated inputs could be represented as above example, apologize for the issues I made and make others confused.

Just a simple repeat to make sure I understand right, with a simple example here:

col + GetAttributeAtVertex(col, 1) ====> implicitly conv to (in SPIV level) ====> GetAttributeAtVertex(col, 0) + GetAttributeAtVertex(col, 1)

I think this would make sense to me. both for type meaning and implementation ways.
BTW, this would conv all the nointerpolation decorated inputs into an array. I'm not sure whether this would introduce new issues.

PS. The reason why using type redeclare in current impl is that in DXC HLSL-SPV trans level, it would use two variables to represent an input. One for stage IO, another for param var in 'main' block, loading data from related stage IO, I just want to make impl more simple. If above understanding is right, it would reduce workload as we don't need to check and modify result type in all statements in any blocks, related to nointerpolated inputs.

@danginsburg
Copy link
Contributor

BTW, this would conv all the nointerpolation decorated inputs into an array. I'm not sure whether this would introduce new issues.

This should definitely not be done. You should not generate SPIR-V that requires SPV_KHR_fragment_shader_barycentric and it shouldn't require additional PS inputs. It should just generate the code it used to generate:

OpDecorate %in_var_TEXCOORD12 Flat
OpDecorate %in_var_TEXCOORD12 Location 0

@s-perron
Copy link
Collaborator

I'll try to clarify my TL/DR above. I was suggesting the following pseudo code:

  1. If the compilation unit does not have the GetAttributeAtVertex function
    1. Declare the nointerpolation variables as scalars with the flat decoration.
  2. Otherwise
    1. Declare the nointerpolation variables as array with the flat and pervertex decorations. (Are both needed?)
    2. Implicitly convert col + GetAttributeAtVertex(col, 1) to GetAttributeAtVertex(col, 0) + GetAttributeAtVertex(col, 1) in the spir-v backend.

If generating the code both ways leads to a lot of if-statements in DXC, we could write a spirv-opt pass that removes the PerVertexKHR decoration when the only vertex used is the provoking vertex. I could write the spirv-opt pass.

@ShchchowAMD
Copy link
Contributor

Thanks, now I know the meanings. I'll have a try locally, it may take some time and I'll update ASAP.

@MarijnS95
Copy link
Contributor

MarijnS95 commented Jun 14, 2023

I've recently also upgraded DXC, and started running into compiler issues with a vertex shader containing nointerpolation, identical in nature to #5270 (comment): multiple assertions (I am compiling DXC with assertions enabled so that we can catch otherwise-obscure issues like #5272 early), and reverting cd3b40b solves them all.

The following two asserts occur inside spirv-opt, and commenting them out to "reproduce" a "regular" release without asserts eventually leads to a simple segfault (can provide the backtrace if needed).

external/SPIRV-Tools/source/opt/pass.cpp:150: uint32_t spvtools::opt::Pass::GenerateCopy(spvtools::opt::Instruction *, uint32_t, spvtools::opt::Instruction *): Assertion `false && "Don't know how to copy this type.  Code is likely illegal."' failed.
external/SPIRV-Tools/source/opt/def_use_manager.cpp:56: void spvtools::opt::analysis::DefUseManager::AnalyzeInstUse(spvtools::opt::Instruction *): Assertion `def && "Definition is not registered."' failed.

On the following shader:

struct VsOut {
    nointerpolation uint someIndex : TEXCOORD1;
};

struct BindingsOffset {
    uint someData;
};

[[vk::binding(0, 0)]] ByteAddressBuffer g_ByteAddressBuffer[10];
ConstantBuffer<BindingsOffset> g_bindingsOffset : register(b0, space0);

VsOut main(uint vertexIndex : SV_VertexID) {
    ByteAddressBuffer buf = g_ByteAddressBuffer[0];

    VsOut vOut = (VsOut)0;
    vOut.someIndex = g_bindingsOffset.someData;

    return vOut;
}

Compiled with:

dxc -E main -T vs_6_6 repro.hlsl -spirv

If I remove nointerpolationfrom the output, this compiles fine. Likewise, but strangely, if I remove ByteAddressBuffer buf = g_ByteAddressBuffer[0]; (whose value is unused), it also compiles fine 😕.


Apologies for not really catching up on the context above, but we're not doing anything with GetAttributeAtVertex(). Hopefully this additional context helps you get to the bottom of it though (and/or solve a different but related issue?).

@ShchchowAMD
Copy link
Contributor

@MarijnS95 Thanks for sharing this test case. I'm working locally to fix above cases.
I'll update here once I make a new PR.

For the first issue: If I remove nointerpolationfrom the output, this compiles fine. I think the root cause is that type expanding now depend on attrib 'nointerpolation', which is wrong, I'll try to fix it.
For the second one: if I remove ByteAddressBuffer buf = g_ByteAddressBuffer[0]; (whose value is unused), it also compiles fine, it may take some time to find the root cause, I'll update with you later.

BTW, thanks very much for providing a nice test case, It may take me some time to refine current codes.

@s-perron
Copy link
Collaborator

@ShchchowAMD Any update?

@ShchchowAMD
Copy link
Contributor

I've moved type redecl to intrinsic process part and now usage of old pass or new intrinsic separately would work fine now.

Still working with spirv instructions' type replacement (it may be called in other instructions) and adding a GetAttributeAtVertex(v, 0) for mixed usage, also the special case for Boolean input now.

This may introduce two interfaces (may be refined later):

  1. adding new spirv instructions in front of an existed instruction (finished)
  2. replacing an existed one in spv builder 'instructions' array (working).

I'll try my best to finish it next week ASAP.

@s-perron
Copy link
Collaborator

Great! Thanks.

@s-perron
Copy link
Collaborator

@ShchchowAMD Sorry, but I will be on you a lot about this. People keep reporting new issues related to this. This is a serious regression. If we cannot get the fix soon, should we revert your change? Then you can put it back in when you have to fix ready.

@danginsburg
Copy link
Contributor

Yeah, we are stuck not being able to take newer version of DXC until this gets fixed. I don't want to make extra work, but if it's not going to be addressed soon I agree it should be reverted. I wouldn't want this to go into an official release. It's broken for most of our pixel shaders, and I imagine many other people's as well.

@ShchchowAMD
Copy link
Contributor

Hi perron,
I'll update a fix today or tomorrow ASAP. I'm refining current new codes.
Apologize again for the regression.

@Goshido
Copy link

Goshido commented Jul 1, 2023

Are there any autotests to prevent such breaking changes in the future?

@jmccarthy634
Copy link
Author

Thanks everyone for your responses, I would also agree that this change should probably be reverted - even if a fix is forthcoming, it seems like it might be a better idea to instead have some test coverage (as suggested by @Goshido above) alongside a fixed version of the original change instead?

@ShchchowAMD
Copy link
Contributor

@s-perron Please revert my previous commit. Current fix spent more time than expected.

I'll add more test coverages before making a new PR.
For some corner cases, I'll raise a new issue ticket for discussion.

Sorry for this regression again.

@s-perron
Copy link
Collaborator

s-perron commented Jul 3, 2023

@Keenuts @cassiebeckley I'm off today can one of you take care of the revert? If not I'll do it tomorrow.

@Keenuts
Copy link
Collaborator

Keenuts commented Jul 3, 2023

Sure, but everybody is OOO for the next 2 days (except me in Europe 🙃) so might not get the approvals for revert before Wednesday.

s-perron pushed a commit that referenced this issue Jul 4, 2023
…entric (SV_Barycentrics)" (#5384)

Reverts #4638
This PR causes multiples issues (#5326, #5342, #5270). Reverting, will
need more testing.

Fixed #5270
@s-perron s-perron reopened this Jul 4, 2023
@s-perron s-perron closed this as completed Jul 4, 2023
@Goshido
Copy link

Goshido commented Jul 4, 2023

DirectX Shader Compiler v1.7.2212.10216 2096d1b8054d1de864924ef0c5009f533597b9d4 does not have the issue anymore.

Thanks 👍

@jmccarthy634
Copy link
Author

Thanks all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug, regression, crash spirv Work related to SPIR-V
Projects
None yet
8 participants