Custom Kernel outputs zeros? #819

avik-pal · 2025-02-27T21:34:48Z

using Reactant, CUDA

Reactant.set_default_backend("cpu")

function square_kernel!(x, y)
    i = threadIdx().x
    x[i] = y[i] * y[i]
    # We don't yet auto lower this via polygeist
    # sync_threads()
    return nothing
end

function square!(x, y)
    @cuda blocks = 1 threads = length(x) square_kernel!(x, y)
    return nothing
end

function square(x)
    y = similar(x)
    square!(y, x)
    return y
end

x = Reactant.to_rarray(collect(1:1:64) ./ 64)

@code_hlo square(x)

@jit square(x)

64-element ConcretePJRTArray{Float64, 1, 1, Reactant.Sharding.ShardInfo{Reactant.Sharding.NoSharding, Nothing}}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0


mofeing · 2025-03-02T08:19:22Z

Hmm, just to rule out synchronization problems: does the result change if you wait a couple of seconds before printing?

avik-pal · 2025-03-02T14:11:11Z

No, it doesn't.

mofeing · 2025-03-02T16:28:11Z

Hmm, what happens if you initialize `y` with ones instead of `similar`? I suspect that side effects are not properly tracked, so the kernel might be getting optimized away.
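For concreteness, the experiment suggested here could look roughly like the sketch below. It reuses the kernel from the original report and only changes how the output buffer is created. The name `square_ones_init` is made up for illustration, and using `one.(x)` to build a traced array of ones is an assumption about what traces cleanly under Reactant, not something confirmed in this thread.

```julia
using Reactant, CUDA

Reactant.set_default_backend("cpu")

# Same kernel as in the report: each thread squares one element of y into x.
function square_kernel!(x, y)
    i = threadIdx().x
    x[i] = y[i] * y[i]
    return nothing
end

function square!(x, y)
    @cuda blocks = 1 threads = length(x) square_kernel!(x, y)
    return nothing
end

# Hypothetical variant of `square`: the output buffer starts as all ones
# instead of `similar(x)`, so an untouched buffer is distinguishable from
# a buffer the kernel actually wrote to.
function square_ones_init(x)
    y = one.(x)  # assumption: broadcasting `one` is traceable here
    square!(y, x)
    return y
end

x = Reactant.to_rarray(collect(1:1:64) ./ 64)

# Copy the result back to a plain Array for inspection.
result = Array(@jit square_ones_init(x))
display(result)
```

If `result` comes back as all ones, the kernel's writes never reached the returned array, which would be consistent with the launch being dropped. If it holds the squared inputs, the kernel ran and the zeros in the report more likely come from how the `similar` buffer is handled.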
