Add a proposal for how to explicitly specify struct layouts #171

187 changes: 187 additions & 0 deletions proposals/NNNN-explicit-layout-struct.md
@@ -0,0 +1,187 @@
<!-- {% raw %} -->

# Types with explicit layouts in DXIL and SPIR-V

* Proposal: [NNNN](NNNN-explicit-layout-struct.md)
* Author(s): [bogner](https://github.com/bogner)
* Status: **Design In Progress**

## Introduction

This introduces the `dx.Layout` and `spirv.Layout` target extension types,
which can be used to represent HLSL structures that need explicit layout
information in LLVM IR.

## Motivation

Some HLSL types have a layout that isn't practical to derive from the module's
DataLayout. This includes all kinds of `cbuffer`, especially those that use
`packoffset`. It also applies to structs that use the Vulkan `[[vk::offset()]]`
extension, and possibly to objects with specific alignment specified on subobjects.

We need to be able to represent these types in IR so that we can generate
correct code in the backends.

## Proposed solution

We should implement a target type that includes a struct type, total size, and
offsets for each member of the struct. This type can then be used in other
target types, such as CBuffer or StructuredBuffer definitions, or even in other
layout types. We need target types for DirectX and for SPIR-V, but the types
can and should mirror each other.

```
target("[dx|spirv].Layout", %struct_type, <size>, [offset...])
```
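
As a rough illustration of what such a type carries, the following Python sketch models the three pieces of information (member sizes standing in for the struct type, total size, and per-member offsets) and checks them for consistency. The `Layout` class, its field names, and the rule that members must not overlap are assumptions made for this sketch; the proposal itself does not specify any validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical model of target("dx.Layout", %struct, size, offsets...)."""

    member_sizes: tuple  # byte size of each struct member, in declaration order
    size: int            # total size of the laid-out object, in bytes
    offsets: tuple       # byte offset of each member, in declaration order

    def validate(self):
        """Return True if the layout is self-consistent, else raise ValueError."""
        if len(self.offsets) != len(self.member_sizes):
            raise ValueError("need exactly one offset per member")
        # Sort by offset so that reordered members are still checked correctly.
        spans = sorted(
            (off, off + sz) for off, sz in zip(self.offsets, self.member_sizes)
        )
        previous_end = 0
        for start, end in spans:
            # Assumption of this sketch: members may not overlap.
            if start < previous_end:
                raise ValueError("members overlap")
            previous_end = end
        if previous_end > self.size:
            raise ValueError("members extend past the total size")
        return True


# { <3 x float>, float } with size 16 and offsets 0, 12.
print(Layout((12, 4), 16, (0, 12)).validate())
# Two floats placed out of declaration order at offsets 36 and 16.
print(Layout((4, 4), 40, (36, 16)).validate())
```

Sorting by offset rather than by declaration order keeps the check usable for layouts where `packoffset` reorders fields.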

### Examples

In the examples below we generally use "dx.Layout", since the "spirv.Layout"
variants would be identical.

While these aren't necessarily needed for types that don't have explicit layout
rules, some examples of "standard" layout objects represented this way are
helpful:

```llvm
; struct simple1 {
; float x;
; float y;
; };
%__hlsl_simple1 = type { float, float }
target("dx.Layout", %__hlsl_simple1, 8, 0, 4)

; struct simple2 {
; float3 x;
; float y;
; };
%__hlsl_simple2 = type { <3 x float>, float }
target("dx.Layout", %__hlsl_simple2, 16, 0, 12)

; struct nested {
; simple2 s2;
; simple1 s1;
; };
%__hlsl_nested = type { target("dx.Layout", %__hlsl_simple2, 16, 0, 12),
                        target("dx.Layout", %__hlsl_simple1, 8, 0, 4) }
target("dx.Layout", %__hlsl_nested, 24, 0, 16)
```

Objects whose layout in cbuffers differs from their layout in structs:

```llvm
; struct array_struct {
; float x[4];
; float y;
; };
%__hlsl_array_struct = type { [4 x float], float }
target("dx.Layout", %__hlsl_array_struct, 20, 0, 16)

; cbuffer array_cbuf1 {
; float x[4];
; float y;
; };
target("dx.Layout", %__hlsl_array_struct, 56, 0, 52)

; cbuffer array_cbuf2 {
; array_struct s;
; };
target("dx.Layout", %__hlsl_array_struct, 56, 0, 52)

; struct nested2 {
; simple1 s1;
; simple2 s2;
; };
%__hlsl_nested2 = type { target("dx.Layout", %__hlsl_simple1, 8, 0, 4),
target("dx.Layout", %__hlsl_simple2, 16, 0, 12) }
target("dx.Layout", %__hlsl_nested2, 24, 0, 8)

; cbuffer nested_cbuf {
; simple1 s1;
; simple2 s2;
; };
target("dx.Layout", %__hlsl_nested2, 32, 0, 16)
```
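
The cbuffer numbers above can be reproduced with a small Python sketch of the packing rules. This is a simplified model written for illustration, not the compiler's implementation: it assumes 16-byte rows, that a vector may not straddle a row, that arrays and structs start on a new row, that every array element occupies its own row, and that the last array element is not padded. The function name and the tuple encoding of members are hypothetical.

```python
ROW = 16  # bytes in one cbuffer row (four 32-bit components)


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def cbuffer_layout(members, start=0):
    """Return (offsets, end) for members packed from byte offset `start`.

    Each member is one of (hypothetical encoding):
      ("vec", n)            scalar or vector of n 32-bit components
      ("array", n, count)   array of `count` vectors with n components each
      ("struct", members)   nested struct, members encoded the same way
    """
    offsets = []
    cursor = start
    for member in members:
        kind = member[0]
        if kind == "vec":
            size = 4 * member[1]
            # Assumed rule: a vector may not straddle a 16-byte row.
            if cursor // ROW != (cursor + size - 1) // ROW:
                cursor = align_up(cursor, ROW)
            offsets.append(cursor)
            cursor += size
        elif kind == "array":
            _, components, count = member
            cursor = align_up(cursor, ROW)
            offsets.append(cursor)
            # Every element starts a new row; the last one is not padded.
            cursor += ROW * (count - 1) + 4 * components
        elif kind == "struct":
            cursor = align_up(cursor, ROW)
            offsets.append(cursor)
            _, cursor = cbuffer_layout(member[1], cursor)
        else:
            raise ValueError(f"unknown member kind: {kind}")
    return offsets, cursor


simple1 = [("vec", 1), ("vec", 1)]
simple2 = [("vec", 3), ("vec", 1)]

# cbuffer array_cbuf1 { float x[4]; float y; }
print(cbuffer_layout([("array", 1, 4), ("vec", 1)]))
# cbuffer nested_cbuf { simple1 s1; simple2 s2; }
print(cbuffer_layout([("struct", simple1), ("struct", simple2)]))
```

Under these assumptions the two calls give offsets `0, 52` with size 56 and offsets `0, 16` with size 32, matching the `array_cbuf1` and `nested_cbuf` layouts above.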

Simple usage of packoffset:

```llvm
; cbuffer packoffset1 {
; float x : packoffset(c1.x);
; float y : packoffset(c2.y);
; };
target("dx.Layout", { float, float }, 40, 16, 36)
```
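
The byte offsets in this example follow from the register arithmetic: each `c` register is 16 bytes and each component is 4 bytes. A minimal Python sketch of that conversion is shown below; the helper name `packoffset_to_bytes` is hypothetical, and the sketch only handles the `cN` and `cN.x/y/z/w` spellings.

```python
import re


def packoffset_to_bytes(spec):
    """Convert a packoffset operand such as "c1.x" to a byte offset."""
    match = re.fullmatch(r"c(\d+)(?:\.([xyzw]))?", spec)
    if match is None:
        raise ValueError(f"unsupported packoffset operand: {spec!r}")
    register = int(match.group(1))
    # A missing component means the start of the register.
    component = "xyzw".index(match.group(2) or "x")
    return 16 * register + 4 * component


x_offset = packoffset_to_bytes("c1.x")  # 16 * 1 + 4 * 0
y_offset = packoffset_to_bytes("c2.y")  # 16 * 2 + 4 * 1
# The total size is the end of the furthest 4-byte float member.
total_size = max(x_offset, y_offset) + 4
print(x_offset, y_offset, total_size)
```

This gives offsets 16 and 36 and a total size of 40, which are the values in the layout type above.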

packoffset that only specifies the first field:
> note: This emits a warning in DXC. Do we really want to support it?

```llvm
; cbuffer packoffset1 {
; float x : packoffset(c1.x);
; float y;
; };
target("dx.Layout", { float, float }, 24, 16, 20)
```

packoffset that only specifies a later field:
> note: This behaves differently between DXIL and SPIR-V in DXC, and the DXIL
> behaviour is very surprising. Do we want to allow this?

```llvm
; cbuffer packoffset1 {
; float x;
; float y : packoffset(c1.x);
; };
target("dx.Layout", { float, float }, 24, 20, 16)
target("spirv.Layout", { float, float }, 20, 0, 16)
```

packoffset that reorders fields:
> note: This fails to compile for SPIR-V in DXC. Is this worth handling?

Collaborator

From my perspective, I think it is intuitive for users to have the order of the members in the struct match their order in memory. We'd have to see what kind of usage there is.

On the other hand, this is not a fundamental problem for SPIR-V. In https://docs.vulkan.org/spec/latest/chapters/interfaces.html#interfaces-resources-layout, there is an explicit note:

> The numeric order of Offset decorations does not need to follow member declaration order.

We would not have to jump through too many hoops to implement this. The deciding factor should be what is best for users.

Collaborator Author

Both DXC and FXC support this and it seems to be fairly well defined in the language. I think we'll want to treat the current situation as a bug in DXC's SPIR-V backend and support it in clang.


```llvm
; cbuffer packoffset1 {
; float x : packoffset(c2.y);
; float y : packoffset(c1.x);
; };
target("dx.Layout", { float, float }, 40, 36, 16)
```

Use of `[[vk::offset()]]`:

```llvm
; struct vkoffset1 {
; float2 a;
; [[vk::offset(8)]] float2 b;
; };
%__hlsl_vkoffset1 = type { <2 x float>, <2 x float> }
target("spirv.Layout", %__hlsl_vkoffset1, 16, 0, 8)

; struct complex {
; float r;
; float i;
; };
; struct vkoffset2 {
; float2 a;
; [[vk::offset(8)]] complex b;
; };
%__hlsl_vkoffset2 = type { <2 x float>, { float, float } }
target("spirv.Layout", %__hlsl_vkoffset2, 16, 0, 8)
```

## Open questions
Collaborator

I think there are other open questions. How and when do you convert from the spirv.Layout or dx.Layout types to the type without the offsets, and how could that interact with optimizations? Consider a structured buffer access like https://godbolt.org/z/z49rEWe58. I'm guessing this would change by replacing %struct.T with target("dx.Layout", 16, 0, 8) in the dx.RawBuffer type.

But we still have the question of what the type of the store should be. How should the GEP look?

At first the store becomes:

  %2 = getelementptr inbounds nuw %struct.T, ptr %1, i32 0, i32 0
  store <2 x float> zeroinitializer, ptr %2, align 8
  %3 = getelementptr inbounds nuw %struct.T, ptr %1, i32 0, i32 1
  store <2 x float> splat (float 1.000000e+00), ptr %3, align 8

Note that the GEPs currently act on the struct type. We cannot change the GEPs to use the target extension type, because they are not allowed in GEPs. We could leave it implicit in some ways, but then the optimizer will assume it knows the layout and optimize accordingly:

  store <2 x float> zeroinitializer, ptr %2, align 8
  %3 = getelementptr inbounds nuw i8, ptr %2, i32 8
  store <2 x float> splat (float 1.000000e+00), ptr %3, align 8

Note that the optimizer modified the GEPs assuming it knows the offsets, even though it does not. How will you stop the optimizer from making assumptions about the layout of the struct, since we are representing it differently than LLVM IR usually expects?

This is less of a problem for cbuffers because we expect all access to the cbuffer to be through an intrinsic. There are a few options for structured buffers:

I have a couple of ideas on how to fix this, but only one that seems reasonable: add an intrinsic that does a GEP of the dx.Layout type. This should hide everything from the optimizer.

We might also need an intrinsic that will do a memcpy between a type with a layout and a type that does not. Consider this example: https://godbolt.org/z/rh5dvd3E7. The memcpy copies the buffer contents to a variable for Foo which expects the struct without the layout information.

Collaborator

The memcpy type intrinsic could correspond to OpCopyLogical in SPIR-V.

Collaborator

@s-perron s-perron Feb 11, 2025

We could also expand the memcpy to copy the individual members one at a time, if we want to expose more to the optimizer. If we use an intrinsic, copy propagation will not work well.

Collaborator

Would it be hard to allow target types into a GEP instruction? Because I feel this would be the best option vs adding another intrinsic:

  • we keep the GEP semantic, we are just saying "don't assume anything about the offset computation we do, let the backend handle it"

Collaborator Author

Adding target type handling to the GEP instruction sounds nicer than needing a parallel set of operations, but that's definitely a larger change to LLVM. I'll add some notes to the open questions to capture these ideas - we'll need to answer this in some satisfying way in order to handle vk::offset properly.

Just to be pedantic, memcpy is a std::bit_cast/OpBitcast equivalent, while OpCopyLogical is more of a std::copy: a magical member-by-member logical copy.

Y'all already know all that, it's just that if the meaning of memcpy starts getting overloaded like that, it will lead to horrible head-scratching for outside/new contributors.

Collaborator

If we try to use a target type in the existing GEP instruction, I believe there will be many places where we have to add special cases. Any code that tries to optimize a GEP will have to have a special case for the target extension type, even if it does nothing with it. That defeats the purpose of using an opaque type.

Collaborator Author

I've added some text to try to capture the questions here.

Collaborator

It looks like I indeed missed an important aspect (thanks DevSH).
If we use the target type in the GEP and then do a load/store, we are saying we do a memcpy. If the layout is different, that would be wrong.
Maybe in such a case we should emit an intrinsic that carries the semantics of OpCopyLogical rather than a memcpy?

(This still doesn't solve the issue of whether GEP should be allowed to use target types.)


- Should we also add a `target("dx.CBufArray", <type>, <size>)` type, rather
than having the CBuffer logic need to be aware of special array element
spacing rules?
- Should reordering fields actually be allowed here?

## Acknowledgments

Collaborator

Another open issue: how will member functions work? This is related to how the `this` pointer will be handled in member functions.

This proposal is expanded from comments in [llvm/wg-hlsl#94] and follow up
conversations.

[llvm/wg-hlsl#94]: https://github.com/llvm/wg-hlsl/pull/94

<!-- {% endraw %} -->