Detect and throw an error when multithreading with Boxed variables #141

Open
wants to merge 15 commits into
base: master

MasonProtter
Member

@MasonProtter MasonProtter commented Mar 11, 2025

Closes #132, Closes #133

Before:

julia> let
           v = tmap(1:10) do i
               A = i
               sleep(rand())
               A
           end
           A = 1 # oops, now everything is a race condition!
           v
       end
10-element Vector{Int64}:
 9
 2
 2
 4
 2
 4
 9
 2
 9
 2

This PR:

julia> let
           v = tmap(1:10) do i
               A = i
               sleep(rand())
               A
           end
           A = 1 # oops, now everything is a race condition!
           v
       end
ERROR: Attempted to capture and modify outer local variable(s) A, which would be not only slow, but could also cause a race condition. Consider marking these variables as local inside their respective closure, or redesigning your code to avoid the race condition.

If these variables are inside a @one_by_one or @only_one block, consider using a mutable Ref instead of re-binding the variable.

This error can be bypassed with the @allow_boxed_captures macro.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] throw_if_boxed_captures(f::Function)
    @ OhMyThreads.Implementation ~/Dropbox/Julia/OhMyThreads/src/implementation.jl:314
  [3] throw_if_boxed_captures(f::Function)
    @ OhMyThreads.Implementation ~/Dropbox/Julia/OhMyThreads/src/implementation.jl:322
  [4] throw_if_boxed_captures
    @ ~/Dropbox/Julia/OhMyThreads/src/implementation.jl:329 [inlined]
  [5] _tmapreduce(f::Function, op::Function, Arrs::Tuple{…}, ::Type{…}, scheduler::DynamicScheduler{…}, mapreduce_kwargs::@NamedTuple{})
...
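
For reference, here is a minimal sketch of the first fix the error message suggests (marking the variable as `local`). It uses plain `Threads.@spawn` instead of `tmap` so that it runs without OhMyThreads; the function name `fixed_with_local` is made up for illustration and is not part of the PR.

```julia
# Hypothetical example, not code from this PR.
function fixed_with_local()
    tasks = map(1:10) do i
        Threads.@spawn begin
            local A = i            # A is now private to this task's closure
            sleep(0.01 * rand())
            A
        end
    end
    A = 1                          # a separate variable; the tasks never see it
    fetch.(tasks)
end

result = fixed_with_local()
```

Because each task owns its own `A`, the later `A = 1` cannot race with the tasks, and the result should simply be the inputs in order.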

This new behaviour can be opted out of with the new @allow_boxed_captures macro:

julia> @allow_boxed_captures let
           v = tmap(1:10) do i
               A = i
               sleep(rand())
               A
           end
           A = 1 # oops, now everything is a race condition!
           v
       end
10-element Vector{Int64}:
 4
 2
 8
 4
 3
 6
 6
 2
 6
 6

@MasonProtter
Member Author

Okay, turns out we use a lot of boxed variables in the test suite. In particular with the @one_by_one and similar macros.

@carstenbauer what do you think about this change? Is it too disruptive? We'd need to change things like

-       x = 0
-       y = 0
+       x = Ref(0)
+       y = Ref(0)
        sao = SingleAccessOnly()
        try
            @tasks for i in 1:10
                @set ntasks = 10

-                y += 1 # not safe (race condition)
+                y[] += 1 # not safe (race condition)
                @one_by_one begin
-                    x += 1 # parallel-safe because inside of one_by_one region
+                    x[] += 1 # parallel-safe because inside of one_by_one region 
                    acquire(sao) do
                        sleep(0.01)
                    end
                end
            end
-            @test x == 10
+            @test x[] == 10
        catch ErrorException
            @test false
        end
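
As a self-contained sketch of the same pattern outside the test suite: the variable is bound once to a `Ref` and only its contents are mutated, so the closure never re-binds it and no box is needed. A `ReentrantLock` stands in for `@one_by_one` here (that substitution is an assumption made so the snippet runs without OhMyThreads), and `count_with_ref` is a hypothetical name.

```julia
# Hypothetical example, not code from this PR.
function count_with_ref()
    x = Ref(0)                 # bound once, never reassigned
    lk = ReentrantLock()       # stand-in for the @one_by_one region
    @sync for i in 1:10
        Threads.@spawn begin
            lock(lk) do
                x[] += 1       # mutates the Ref's contents under the lock
            end
        end
    end
    x[]
end

total = count_with_ref()
```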

@carstenbauer
Member

Hm, telling the user to use Refs feels unfortunate to me. But so is having lots of boxed variables...

@MasonProtter
Member Author

Yeah :(

@MasonProtter
Member Author

MasonProtter commented Mar 15, 2025

Okay, so I went through and made all the tests pass. This actually was enough to convince me that this really is a good idea. You see, we did a lot of stuff like

@testset "outer" begin
    x = 0
    @testset "inner" begin
        @tasks for a in b
             something_with(x)
        end
    end
    test_func = () -> begin
        x = 0
        @tasks for a in b
             something_else_with(x)
        end
         x
    end
end 

Even that is enough to sneakily hit closure boxing, because x was being shared between the outer x = 0 and the x = 0 in test_func. This is such a thorny and problematic thing, and it's so easy to hit accidentally, that I'm starting to be in favour of tagging a breaking version change to do this.
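
To make the boxing concrete, here is a rough sketch of how one can see whether a closure captured a boxed variable, by looking for `Core.Box` among the closure type's field types. The helper `has_boxed_capture` and the three example functions are hypothetical and only illustrate the idea; they are not the `throw_if_boxed_captures` implementation from this PR.

```julia
# Hypothetical helper: true if any captured variable is stored in a Core.Box.
has_boxed_capture(f) = any(T -> T === Core.Box, fieldtypes(typeof(f)))

function shared_binding()
    x = 0
    f = () -> begin
        x = 1          # assigns to the outer x, so the capture is boxed
        x
    end
    return f
end

function read_only_capture()
    x = 0
    g = () -> x + 1    # x is assigned once and only read, so no box
    return g
end

function local_binding()
    x = 0
    h = () -> begin
        local x = 1    # a fresh x private to the closure; nothing is captured
        x
    end
    return h
end

boxed_shared = has_boxed_capture(shared_binding())
boxed_read_only = has_boxed_capture(read_only_capture())
boxed_local = has_boxed_capture(local_binding())
```

Only the first closure should report a box, which matches the situation in the test suite where two `x = 0` assignments ended up sharing one binding.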


[deps]
BangBang = "198e06fe-97b7-11e9-32a5-e1d131e6ad66"
ChunkSplitters = "ae650224-84b6-46f8-82ea-d812ca08434e"
ScopedValues = "7e506255-f358-4e82-b7e4-beb19740aa63"
Member Author

This is needed because we currently support the LTS v1.10, and ScopedValues were only added to Base in v1.11

This can be removed if we ever drop v1.10

Comment on lines +287 to +291
macro allow_boxed_captures(ex)
    quote
        @with allowing_boxed_captures => true $(esc(ex))
    end
end
Member Author

I decided to go with a scoped value here as the way to disable the error behaviour. This means that any outer code can simply be annotated with

@allow_boxed_captures

and anything inside it will ignore the error checks. I think I prefer this design to using a kwarg per method of tmapreduce / tmap / @tasks, etc., but I'm open to having my mind changed.

This also makes me wonder if some other stuff should be scoped values in the future 🤔
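
A small sketch of why a scoped value fits this use: the setting is visible to everything called inside the scope, including tasks spawned there, and reverts automatically on exit. The names `ALLOW`, `inner_check`, and `outer` are hypothetical, and the snippet assumes Julia 1.11 or newer for `Base.ScopedValues` (on v1.10 the ScopedValues package would be loaded instead).

```julia
using Base.ScopedValues  # assumption: Julia >= 1.11

# Hypothetical flag, playing the role of allowing_boxed_captures.
const ALLOW = ScopedValue(false)

inner_check() = ALLOW[] ? "check skipped" : "check enforced"
outer() = inner_check()          # no kwarg has to be threaded through

outside = outer()

inside = with(ALLOW => true) do
    outer()
end

# Tasks spawned inside the scope inherit the scoped value.
in_task = with(ALLOW => true) do
    fetch(Threads.@spawn outer())
end

after = outer()                  # the default is back once the scope exits
```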

Successfully merging this pull request may close these issues.

Detect and throw an error whenever we get a closure with a Core.Box
tmap and thread-local variables
3 participants