Regression 0.7.2 -> 0.7.3 #127
Not never, just very slow. Reproducer, note the 10x smaller
import SparseArrays: SparseMatrixCSC, mul!, sprand; import SparseArrays
import Polyester: @batch
function mymul!(C::AbstractMatrix, A::AbstractMatrix, B::SparseMatrixCSC, α::Number, β::Number)
    @batch for j in axes(B, 2)
        C[:, j] .*= β
        C[:, j] .+= A * (α.*B[:, j])
    end
    return C
end
N, K, M = 1_000, 2_000, 3_000;
A = rand(N, K); B = sprand(K, M, 0.05); C = similar(A, (N, M));
@time mymul!(C, A, B, true, false)
I get
julia> @time mymul!(C, A, B, true, false);
190.354090 seconds (2.71 M allocations: 308.822 MiB, 0.04% gc time, 1.81% compilation time)
vs
julia> @time mymul!(C, A, B, true, false);
1.105294 seconds (2.61 M allocations: 195.908 MiB, 3.82% gc time, 93.76% compilation time)
using Polyester 0.7.3 vs 0.7.2, respectively. Second run:
julia> @time mymul!(C, A, B, true, false);
0.050903 seconds (21.00 k allocations: 33.591 MiB)
vs
julia> @time mymul!(C, A, B, true, false);
108.885509 seconds (54.00 k allocations: 138.446 MiB, 0.01% gc time)
So one issue we can see is that we are allocating several times the memory. So, the next step is |
0.7.2 calls |
0.7.2
julia> using StrideArraysCore
julia> PtrArray(A) isa DenseArray
true
0.7.3
julia> PtrArray(A) isa DenseArray
false
Because it isn't a |
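The comment above is cut off, so the thread does not say here what follows from the type change. One plausible mechanism, stated as an assumption rather than a confirmed diagnosis, is that methods specialized on `DenseArray` stop applying once a wrapper no longer subtypes it, and a slower generic method is selected instead. A minimal sketch of that dispatch effect, using a hypothetical `kernel_path` function and a plain `SubArray` in place of `PtrArray`:

```julia
# Hypothetical stand-in for a routine with a specialized fast path.
# This is NOT Polyester or StrideArraysCore code; it only illustrates dispatch.
kernel_path(::DenseArray) = :dense_fast_path
kernel_path(::AbstractArray) = :generic_fallback

A = rand(4, 4)
V = view(A, [1, 3], :)  # SubArray indexed by a vector: an AbstractArray, not a DenseArray

@show A isa DenseArray
@show V isa DenseArray
@show kernel_path(A)
@show kernel_path(V)
```

The same data is reachable through both `A` and `V`, yet only `A` selects the specialized method. If something similar happens to `PtrArray` between the two versions, checking which method is hit (for example with `@which`) on each version would be a reasonable next diagnostic.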
I made a little package that speeds up (dense) x (sparse) matmul (https://github.com/RomeoV/ThreadedDenseSparseMul.jl) and happened to be on Polyester 0.5 during development. When updating (and bisecting) the Polyester version to the latest, I noticed that Polyester 0.7.2 -> 0.7.3 broke my package functionality, such that the loop below maxes out my CPU but never finishes.
The loop is simply
You can test it out with
Again, this computation doesn't finish with Polyester version >= 0.7.3 (but finishes very quickly with < 0.7.3), and when we Ctrl-C we always find the CPU to currently eval
I saw that 0.7.3 changed a few dependencies around, so probably the problem got introduced somewhere there. For now, I'll just restrict my package [compat], but maybe this points at a problem deeper down... Any insights?
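For anyone wanting the same workaround, the restriction could look roughly like the following `Project.toml` fragment. This is a sketch, not the entry actually used in ThreadedDenseSparseMul.jl: the upper bound follows from the regression described here, while the lower bound of 0.5 is an assumption based on the version mentioned as used during development.

```toml
[compat]
# Hyphen range: allows Polyester versions from 0.5.0 up to and including 0.7.2.
# Lower bound is an assumption; adjust to whatever versions were actually tested.
Polyester = "0.5 - 0.7.2"
```

The hyphen range excludes 0.7.3 and later, so the resolver will not pick the affected releases until the bound is lifted.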
versioninfo()
julia> versioninfo()
Julia Version 1.10.0-rc1
Commit 5aaa9485436 (2023-11-03 07:44 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
  Threads: 23 on 16 virtual cores