Metal: Replace unreachable control flow with exit block branches (on macOS <15) #618

Merged
maleadt merged 3 commits into master from tb/metal_unreachable_exit
Aug 23, 2024

Conversation

maleadt
Member

@maleadt maleadt commented Aug 23, 2024

Addresses JuliaGPU/Metal.jl#370. The MWE from that issue:

define void @kernel(i32 %threads_per_grid, i32 %thread_position_in_grid) {
  %.fca.1.0.1.0.extract = load i8, i8 addrspace(1)* null, align 1
  %.unpack9.unpack = load i64, i64 addrspace(1)* null, align 8
  %.not3 = icmp eq i8 %.fca.1.0.1.0.extract, 0
  br i1 %.not3, label %L238.us.peel, label %L28

L238.us.peel:
  %.not.us.peel = icmp eq i32 %thread_position_in_grid, 0
  br i1 %.not.us.peel, label %L254, label %L30.us

L30.us:
  %.not1.us = icmp eq i32 %threads_per_grid, 0
  br i1 %.not1.us, label %L186.us, label %L42

L186.us:
  %.not7.us = icmp ult i64 0, %.unpack9.unpack
  br i1 %.not7.us, label %L30.us, label %L254

L28:
  %.not149 = icmp ult i64 %.unpack9.unpack, 2
  br i1 %.not149, label %L254, label %L42

L42:
  call void @llvm.trap()
  unreachable

L254:
  ret void
}

The pass from this PR transforms that IR to the following (only the affected blocks are shown):

L42:                                              ; preds = %L28, %L30.us
  br label %L254

L254:                                             ; preds = %L42, %L28, %L186.us, %L238.us.peel
  ret void
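
To make the rewrite above concrete, here is a minimal sketch on a toy model in Python. It is not the implementation from this PR (which lives in src/metal.jl): the block representation (a dict of instruction strings) and the function name `replace_unreachable` are assumptions made for illustration only. It covers just the simple case where the exit block contains nothing but `ret void`.

```python
def replace_unreachable(blocks):
    """Rewrite every block ending in `unreachable` into a branch to the exit block.

    `blocks` maps a block name to its list of instruction strings (a toy
    stand-in for real LLVM IR). Assumes exactly one block consists solely
    of `ret void`; that block is reused as the branch target.
    """
    exits = [name for name, insts in blocks.items() if insts == ["ret void"]]
    if len(exits) != 1:
        raise ValueError("expected exactly one block containing only `ret void`")
    exit_name = exits[0]

    rewritten = {}
    for name, insts in blocks.items():
        if insts and insts[-1] == "unreachable":
            # Drop the whole body (including the trap call), as in the
            # output shown above, and branch to the exit block instead.
            rewritten[name] = [f"br label %{exit_name}"]
        else:
            rewritten[name] = list(insts)
    return rewritten


if __name__ == "__main__":
    toy = {
        "L28": ["br i1 %.not149, label %L254, label %L42"],
        "L42": ["call void @llvm.trap()", "unreachable"],
        "L254": ["ret void"],
    }
    for name, insts in replace_unreachable(toy).items():
        print(f"{name}:")
        for inst in insts:
            print(f"  {inst}")
```

Blocks that do not end in `unreachable` are copied unchanged, so the predecessors of `L42` keep branching to it and simply fall through to the exit.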

When the exit block contains more than just a ret instruction, we split the block so that we don't have to care about the SSA values used within it (otherwise, branching directly to the block could leave SSA values used there undefined):

L42:
  call void @llvm.trap()
  unreachable

L254:
  %val = add i32 %thread_position_in_grid, 1
  ret void

... becomes

L254:                                             ; preds = %L28, %L186.us, %L238.us.peel
  %val = add i32 %thread_position_in_grid, 1
  br label %ret

ret:                                              ; preds = %L42, %L254
  ret void

When the function returns something, we insert a phi that takes an undef from the unreachable block:

L42:
  call void @llvm.trap()
  unreachable

L254:
  %val = phi i1 [ %.not.us.peel, %L238.us.peel ], [ %.not7.us, %L186.us ], [ %.not149, %L28 ]
  ret i1 %val

... becomes

L42:                                              ; preds = %L28, %L30.us
  br label %ret

L254:                                             ; preds = %L28, %L186.us, %L238.us.peel
  %val = phi i1 [ %.not.us.peel, %L238.us.peel ], [ %.not7.us, %L186.us ], [ %.not149, %L28 ]
  br label %ret

ret:                                              ; preds = %L42, %L254
  %1 = phi i1 [ %val, %L254 ], [ undef, %L42 ]
  ret i1 %1
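
The split-and-phi behaviour described above can be sketched on a toy model as well. Again, this is an illustrative Python sketch under stated assumptions, not the pass from this PR: the dict-of-strings block representation, the function name `split_exit_and_rewrite`, the `%merged` value name, and the naive parsing of `ret <type> <value>` are all hypothetical. Unlike the real pass, which can reuse an exit block that only contains returns, this sketch always splits.

```python
def split_exit_and_rewrite(blocks, exit_name, new_name="ret"):
    """Split the `ret` out of the exit block and redirect unreachable blocks to it.

    `blocks` maps a block name to its list of instruction strings. The last
    instruction of `blocks[exit_name]` must be `ret void` or `ret <type> <value>`
    with a single-token type (toy parsing only).
    """
    ret = blocks[exit_name][-1]
    if not ret.startswith("ret "):
        raise ValueError("exit block must end in a ret instruction")

    dead = [n for n, body in blocks.items() if body and body[-1] == "unreachable"]

    out = {}
    for name, body in blocks.items():
        if name in dead:
            # The unreachable block now branches to the new return block.
            out[name] = [f"br label %{new_name}"]
        elif name == exit_name:
            # Keep everything but the ret, so SSA values defined here are
            # only used on paths that actually pass through this block.
            out[name] = body[:-1] + [f"br label %{new_name}"]
        else:
            out[name] = list(body)

    if ret == "ret void":
        out[new_name] = ["ret void"]
    else:
        _, ty, val = ret.split(maxsplit=2)
        # Real value from the old exit block, undef from each unreachable block.
        incoming = [f"[ {val}, %{exit_name} ]"] + [f"[ undef, %{d} ]" for d in dead]
        out[new_name] = [
            f"%merged = phi {ty} " + ", ".join(incoming),
            f"ret {ty} %merged",
        ]
    return out


if __name__ == "__main__":
    toy = {
        "L42": ["call void @llvm.trap()", "unreachable"],
        "L254": ["%val = add i32 %thread_position_in_grid, 1", "ret i32 %val"],
    }
    for name, insts in split_exit_and_rewrite(toy, "L254").items():
        print(f"{name}:")
        for inst in insts:
            print(f"  {inst}")
```

Putting the `ret` in its own block means the only value that has to be valid on the edge from the unreachable block is the phi's incoming operand, which is why an `undef` suffices there.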

@maleadt maleadt added the metal Stuff about the Apple metal back-end. label Aug 23, 2024
@maleadt maleadt force-pushed the tb/metal_unreachable_exit branch from 70ab32f to ca31a94 Compare August 23, 2024 09:10

codecov bot commented Aug 23, 2024

Codecov Report

Attention: Patch coverage is 0% with 51 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (dd6a8eb) to head (62c3eb9).
Report is 5 commits behind head on master.

Files          Patch %   Lines
src/metal.jl   0.00%     51 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master    #618   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files          24      24           
  Lines        3045    3096   +51     
======================================
- Misses       3045    3096   +51     


We can re-use the existing exit block if it only contains returns.
@maleadt maleadt merged commit 7997c66 into master Aug 23, 2024
1 check was pending
@maleadt maleadt deleted the tb/metal_unreachable_exit branch August 23, 2024 10:24