Compilation failure on 1.11 #370

Closed
maleadt opened this issue Jun 18, 2024 · 4 comments · Fixed by #408
Labels
upstream Out of our hands

Comments

@maleadt
Member

maleadt commented Jun 18, 2024

As frequently seen on CI:

gpuarrays/base                                (2) |         failed at 2024-06-18T06:24:37.874
TaskFailedException
    nested task error: InterruptException:
    Stacktrace:
     [1] try_yieldto(undo::typeof(identity))
       @ Base ./task.jl:944
     [2] throwto
       @ ./task.jl:956 [inlined]
     [3] (::var"#34#43"{Dict{String, DateTime}, Task, var"#recycle_worker#42"})()
       @ Main /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/build/default-macmini-aarch64-2-0/julialang/metal-dot-jl/test/runtests.jl:284
Testing finished in 6 minutes, 36 seconds, 592 milliseconds
Worker 2 failed running test gpuarrays/base:
Some tests did not pass: 91 passed, 0 failed, 2 errored, 0 broken.
gpuarrays/base: Error During Test at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-aarch64-2.0/depots/5cd495a2-4a16-4674-ae02-c839447744bb/packages/GPUArrays/HjWFN/test/testsuite/base.jl:342
  Test threw exception
  Expression: compare(view, AT, a, view(i, 2:2))
  Compilation to native code failed; see below for details.
  If you think this is a bug, please file an issue and attach /private/var/tmp/agent-tempdirs/default-macmini-aarch64-2.0/tmp/jl_T2TyvmFyF8.metallib

I reduced this to:

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-n8:16:32"
target triple = "air64-apple-macosx13.3.1"

declare void @llvm.trap()

define void @kernel(i32 %threads_per_grid, i32 %thread_position_in_grid) {
  %.fca.1.0.1.0.extract = load i8, i8 addrspace(1)* null, align 1
  %.unpack9.unpack = load i64, i64 addrspace(1)* null, align 8
  %.not3 = icmp eq i8 %.fca.1.0.1.0.extract, 0
  br i1 %.not3, label %L238.us.peel, label %L28

L238.us.peel:
  %.not.us.peel = icmp eq i32 %thread_position_in_grid, 0
  br i1 %.not.us.peel, label %L254, label %L30.us

L30.us:
  %.not1.us = icmp eq i32 %threads_per_grid, 0
  br i1 %.not1.us, label %L186.us, label %L42

L186.us:
  %.not7.us = icmp ult i64 0, %.unpack9.unpack
  br i1 %.not7.us, label %L30.us, label %L254

L28:
  %.not149 = icmp ult i64 %.unpack9.unpack, 2
  br i1 %.not149, label %L254, label %L42

L42:
  call void @llvm.trap()
  unreachable

L254:
  ret void
}

!air.kernel = !{!0}
!air.version = !{!7}

!0 = !{void (i32, i32)* @kernel, !1, !2}
!1 = !{}
!2 = !{!5, !6}
!5 = !{i32 2, !"air.threads_per_grid", !"air.arg_type_name", !"uint"}
!6 = !{i32 3, !"air.thread_position_in_grid", !"air.arg_type_name", !"uint"}
!7 = !{i32 2, i32 5, i32 0}

It's not obvious what's wrong here. The generated crash reports just point to a crash in the back-end, specifically during a machine function pass (llvm::MachineFunctionPass::runOnFunction).

Original and reduced bitcode & libraries: libraries.zip

@maleadt maleadt added the bug label Jun 18, 2024
@maleadt
Member Author

maleadt commented Jun 18, 2024

This crashes on macOS 14.5 and the macOS 15 beta, as well as with the offline compiler from the Xcode 16 beta. I've reported it to Apple.

@maleadt
Member Author

maleadt commented Aug 6, 2024

This is fixed on the latest macOS 15 beta. We can work around the issue by replacing our trap + unreachable with a branch to the exit block. Rewrites like that are not entirely trivial, so I'm inclined to just urge Julia 1.11 users to upgrade to macOS 15...
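
For illustration, here is a minimal sketch of what that workaround could look like when applied by hand to the reduced kernel from the first comment. It is not the change that was eventually merged, and it has not been verified against Apple's compiler; it only shows the shape of the rewrite. Block L42 no longer calls `@llvm.trap()` and instead branches to the existing exit block L254. The target datalayout, target triple, and `!air.*` metadata are assumed to stay exactly as in the reduction, and the `declare void @llvm.trap()` line is no longer needed.

```llvm
; Sketch only: the reduced kernel with the trap block rewritten by hand.
; Everything except block L42 is copied unchanged from the reduction.
define void @kernel(i32 %threads_per_grid, i32 %thread_position_in_grid) {
  %.fca.1.0.1.0.extract = load i8, i8 addrspace(1)* null, align 1
  %.unpack9.unpack = load i64, i64 addrspace(1)* null, align 8
  %.not3 = icmp eq i8 %.fca.1.0.1.0.extract, 0
  br i1 %.not3, label %L238.us.peel, label %L28

L238.us.peel:
  %.not.us.peel = icmp eq i32 %thread_position_in_grid, 0
  br i1 %.not.us.peel, label %L254, label %L30.us

L30.us:
  %.not1.us = icmp eq i32 %threads_per_grid, 0
  br i1 %.not1.us, label %L186.us, label %L42

L186.us:
  %.not7.us = icmp ult i64 0, %.unpack9.unpack
  br i1 %.not7.us, label %L30.us, label %L254

L28:
  %.not149 = icmp ult i64 %.unpack9.unpack, 2
  br i1 %.not149, label %L254, label %L42

L42:
  ; Previously: call void @llvm.trap() followed by unreachable.
  ; Now: fall through to the shared exit block instead.
  br label %L254

L254:
  ret void
}
```

Note that this changes behavior: the error path now returns silently instead of aborting the kernel. In this reduction the exit block has no phi nodes, so adding a predecessor is straightforward; in real kernels the exit block may need new phi entries or a non-void return value, which is part of why the rewrite is not trivial in general.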

@maleadt maleadt added the upstream Out of our hands label Aug 6, 2024
@christiangnrd
Member

I'm fine with that considering the upgrade patterns of macOS users, but how will we deal with CI? Will we upgrade all machines, or just one that 1.11 always uses? I guess if we'll be supporting 1.10 on macOS 13+, we should keep at least one machine on v13.

@maleadt
Member Author

maleadt commented Aug 23, 2024

I opted for implementing the suggested fix.
