Block diagonal matrix factorization in linear time #575

Open
ctessum opened this issue Feb 11, 2025 · 8 comments
ctessum commented Feb 11, 2025

Is your feature request related to a problem? Please describe.

I'm trying to use IMEX to solve a reaction-advection system, where the reaction is the stiff part and the advection is the non-stiff part.
The issue is described in detail here.

Describe the solution you’d like

The state of the system is an S × N matrix, where S is the number of chemical species and N is the number of grid cells. While solving the stiff part, each column of the state matrix is independent of all the other columns (because the reactions happen within an individual grid cell), so the Jacobian is a block-diagonal matrix.
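To make that structure concrete, here is a minimal sketch (not part of the benchmark below) showing that solving each S × S diagonal block independently gives the same answer as solving the full block-diagonal system; the `+ S * I` shift is just to keep the random blocks well conditioned:

```julia
using LinearAlgebra

S, N = 3, 4                               # species per cell, number of cells
blks = [rand(S, S) + S * I for _ in 1:N]  # well-conditioned diagonal blocks
A = cat(blks...; dims=(1, 2))             # assemble the full block-diagonal matrix
B = rand(S * N)

x_full = A \ B                            # one global dense solve

x_block = similar(B)                      # N independent small solves: the O(N) path
for i in 1:N
    rows = (i - 1) * S + 1 : i * S
    x_block[rows] = blks[i] \ B[rows]
end

x_block ≈ x_full                          # true: the columns really are independent
```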

Because all the grid cells are independent, the solve time should theoretically scale linearly with the number of grid cells. However, the linear solve part doesn't scale linearly:

using LinearSolve
using BlockBandedMatrices
using ProgressLogging
using LinearAlgebra
using Plots

ngrid = Int.(round.((10.0 .^ collect(0.5:0.2:4)) ./ 2)) .* 2  # even sizes from ~4 to 10^4
ts = []
@progress for N in ngrid
    B = rand(3 * N)                  # right-hand side
    b = repeat([3], N)               # block sizes: N blocks of size 3
    x = rand(N * 3, N * 3)
    A = BlockBandedMatrix{eltype(x)}(x, b, b, (0, 0))  # block bandwidths (0, 0): block diagonal
    prob = LinearProblem(A, B)
    t = @elapsed solve(prob, LUFactorization())
    push!(ts, t)
end

plot(ngrid, ts, xaxis=:log, yaxis=:log, xlabel="N", legend=:topleft,
    label="Block Diagonal Matrix", ylabel="Time (s)")

plot!(ngrid, ts[3] .* ngrid ./ ngrid[3],
    label="Linear Scaling")

[Figure: log-log plot of solve time vs. N for the BlockBandedMatrix solve, compared against a linear-scaling reference line]

The red line is linear scaling, but as you can see the linear solve with the block banded matrix is more like quadratic.

So I'd like it to scale linearly, and in general to be as fast as possible in this case.

Describe alternatives you’ve considered

If we use a BandedMatrix instead of a BlockBandedMatrix, we do get linear scaling, which is the green line in the figure above:

using BandedMatrices

ts = []
@progress for N in ngrid
    B = rand(3 * N)
    b = repeat([3], N)
    x = rand(N*3, N*3)
    A = BandedMatrix{eltype(x)}(x, (2, 2))  # bandwidths (2, 2) cover the 3×3 blocks
    prob = LinearProblem(A, B)
    t = @elapsed solve(prob, LUFactorization())
    push!(ts, t)
end

plot!(ngrid, ts, label="Banded Matrix")

However, the extra zeros in the banded matrix mean that roughly 2× as many computations are performed compared to what is strictly necessary. This may be significant when the number of chemical species S is large.
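That rough factor can be sketched by counting stored entries (a storage comparison only, ignoring factorization fill-in): a banded matrix wide enough to cover S × S diagonal blocks needs bandwidths (S-1, S-1), i.e. 2S-1 stored diagonals, versus S entries per row for the blocks themselves, so the ratio (2S-1)/S approaches 2 as S grows:

```julia
S, N = 3, 100
block_entries  = N * S^2                    # N dense S×S diagonal blocks
banded_entries = (2 * (S - 1) + 1) * S * N  # 2S-1 stored diagonals of length S*N
banded_entries / block_entries              # (2S-1)/S ≈ 1.67 for S = 3, → 2 as S grows
```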

Additional context

I'm told there is a GPU kernel for this here.

@oscardssmith
Contributor

You likely want to use https://github.com/JuliaArrays/BlockDiagonals.jl. That said, in the future, the Julia GitHub is for bugs in Julia, not for performance advice about the ecosystem. For that, you should post on Discourse or the Julia Slack.

@ctessum
Author

ctessum commented Feb 11, 2025

Thank you for your response, and I appreciate your work in maintaining this repository. I understand that there are different ways to interpret the purpose of GitHub issues, but I am attempting to request a feature, and as such I chose the "feature request" issue template provided by the repository. I began by creating a post on Discourse, which is here, and the ensuing discussion resulted in @ChrisRackauckas suggesting that I open an issue here, which I have now done.

Thanks for suggesting BlockDiagonals.jl. I tried it out, and this is the result I get:

using LinearSolve
using BlockDiagonals
using ProgressLogging
using LinearAlgebra
using Plots

ngrid = Int.(round.((10.0 .^ collect(0.5:0.2:3.0)) ./ 2)) .* 2
ts = []
@progress for N in ngrid
    B = rand(3 * N)                              # right-hand side
    A = BlockDiagonal([rand(3, 3) for i in 1:N]) # N independent 3×3 blocks
    prob = LinearProblem(A, B)
    t = @elapsed solve(prob, LUFactorization())
    push!(ts, t)
end


plot(ngrid, ts, xaxis=:log, yaxis=:log, xlabel="N", legend=:topleft,
    label="Block Diagonal Matrix", ylabel="Time (s)")

plot!(ngrid, ts[3] .* ngrid ./ ngrid[3],
    label="Linear Scaling")

[Figure: log-log plot of solve time vs. N for the BlockDiagonal solve, compared against a linear-scaling reference line]

@oscardssmith
Contributor

oscardssmith commented Feb 11, 2025

Sorry for closing. I somehow misread and thought that this issue was on the Julia repository rather than the LinearSolve one. This is very much in the right place.

@oscardssmith oscardssmith reopened this Feb 11, 2025
@ctessum
Author

ctessum commented Feb 11, 2025

No problem!

@ChrisRackauckas
Member

It looks like BlockDiagonals.jl does not overload LU: https://github.com/JuliaArrays/BlockDiagonals.jl/blob/master/src/linalg.jl#L175-L188

It wouldn't be too hard to take that code and write an `lu` and `ldiv!` for that type though, looping through the blocks the same way that is currently done in that dispatch.
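A minimal sketch of that idea, assuming BlockDiagonals.jl's exported `BlockDiagonal` type and `blocks` accessor; the names `BlockDiagonalLU`, `blockdiag_lu`, and `blockdiag_ldiv!` are hypothetical, not package API, and a real implementation would overload `LinearAlgebra.lu` and `LinearAlgebra.ldiv!` with the same loops:

```julia
using LinearAlgebra
using BlockDiagonals

struct BlockDiagonalLU{F<:LU}
    factors::Vector{F}
end

# Factor each diagonal block independently -- O(N) in the number of blocks.
blockdiag_lu(B::BlockDiagonal) = BlockDiagonalLU([lu(blk) for blk in blocks(B)])

# Solve in place, one block at a time, walking down the right-hand side.
function blockdiag_ldiv!(F::BlockDiagonalLU, b::AbstractVector)
    offset = 0
    for f in F.factors
        n = size(f, 1)
        ldiv!(f, view(b, offset+1:offset+n))
        offset += n
    end
    return b
end
```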

@oscardssmith
Contributor

It could even multithread the blocks for extra speed.
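A sketch of what the threaded factorization could look like, assuming the blocks are held as a plain vector of matrices (the name `threaded_block_lu` is hypothetical); each iteration touches only its own block, so no synchronization is needed:

```julia
using LinearAlgebra

# Factor a vector of independent diagonal blocks in parallel.
function threaded_block_lu(blks::Vector{<:AbstractMatrix})
    factors = Vector{LU}(undef, length(blks))
    Threads.@threads for i in eachindex(blks)
        factors[i] = lu(blks[i])   # each block is factored on its own thread
    end
    return factors
end
```

The block-wise `ldiv!` loop can be threaded the same way, since each right-hand-side segment belongs to exactly one block.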

@ChrisRackauckas
Member

yes indeed

@ctessum
Author

ctessum commented Feb 12, 2025

There is also this package: https://github.com/mipals/BlockDiagonalMatrices.jl

At the bottom of the README it says it is written in a way that makes parallelization easier.
