Annotate iterate(::AbstractArray) with inbounds #40397
Does this also fix #39354? |
I believe we deliberately did not do this, to guard against potentially-buggy AbstractArray subtypes. |
I feared that might be the case... But I personally think that protecting against programming errors in low-level constructs like array implementations is too wide a contract. There are so many other things that result in undefined behavior in Julia. Also, I'd prefer stricter definitions of how the API treats programming errors, rather than each API trying to read the user's mind and ending up with slightly different boundaries. If we want a better failure mode for misimplementations, I think we need a trait system so that the implementer can express confidence in their implementation. That said, an alternative easier solution may be to use |
Yeah, there's some really long issues about this... searching for "inbounds next iteration" is rather hopeless, but I found this: Here's the crux of the problem:
More concretely: when inboundsing The issue in my mind isn't so much trusting buggy array implementations — that ship has long sailed — but rather trusting |
What if we made a function |
This is not really possible in general, unless you know the type of the array and how it implements the iteration state, right? For example, it is a legal implementation for I believe it's needless to say, but allow me to reemphasize that designing an abstract interface and programming against it is a mathematical activity. Expressing the invariants and letting the compiler rely on them is crucial for writing generic high-performance code. I think we need a principled approach, at least for a very basic control flow like a structured loop and very basic containers like array types. |
Right, I'd be 100% on board with this if it were only about folks ginning up phony states or bad array implementations. But it's not. It's quite likely that if you're manually iterating over two arrays at the same time, that they'd be the same type. For example:
julia> function f(x, y)
           i1 = iterate(x)
           i2 = iterate(y)
           r = 0
           while i1 !== nothing && i2 !== nothing
               v1, s1 = i1
               v2, s2 = i2
               r = v1 + v2
               i1 = iterate(y, s1) # oops
               i2 = iterate(x, s2)
           end
           r
       end
f (generic function with 2 methods)
julia> f((rand(2))', (rand(3))')
ERROR: BoundsError: attempt to access 2-element Vector{Float64} at index [3]
It seems not great that the above typo could be a segfault (ok, not likely to segfault on a 1-past access, but offsets could make this more explodey). |
I suppose, though, that this isn't terribly unlike the mistake of changing the length of a vector while iterating it or changing indices of a view behind its back. This is just why we've not done it yet. |
My conclusion from 3 years ago was: there's always |
Thanks, this makes sense. I wasn't thinking about this kind of programming error.
I was writing exactly this comment before simplifying my previous comment :) And yes, this is my point. If you don't satisfy the precondition before calling an API, you'll get undefined behavior. Checking every precondition for every non- Of course, if we want to preserve user-friendliness of
(1) Add a function
    it = iterator(xs)  # default to `iterator(xs) = xs`
    y = iterate(it)
    while y !== nothing
        x, s = y
        $loop_body(x)
        y = iterate(it, s)
    end
As a bonus, wrapping an iterable in a custom type is easier. I think I saw StefanKarpinski mentioning it somewhere.
(2)
    function iterate(iter::AbstractType, state)
        checkstate(iter, state)
        unsafe_iterate(iter, state)
    end
    # MethodError takes the full argument tuple, not just the state
    unsafe_iterate(iter::AbstractType, state) = throw(MethodError(unsafe_iterate, (iter, state)))
For (3) use |
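To make option (2) concrete, here is a minimal sketch of the check-then-delegate split on a toy container. The names `Wrapped`, `checkstate`, and `unsafe_iterate` are hypothetical and are not part of Base; the validity rule for the state (an integer between 1 and length + 1) is an assumption chosen for this toy type only.

```julia
# Hypothetical toy container; not a Base type.
struct Wrapped{T}
    data::Vector{T}
end

Base.length(w::Wrapped) = length(w.data)
Base.eltype(::Type{Wrapped{T}}) where {T} = T

# Hypothetical hook: reject any state this iterator could not have produced.
function checkstate(w::Wrapped, i::Int)
    1 <= i <= length(w.data) + 1 ||
        throw(ArgumentError("state $i is not valid for a Wrapped of length $(length(w.data))"))
    return nothing
end

# Hypothetical fast path: trusts the state, so the bounds check can be skipped.
function unsafe_iterate(w::Wrapped, i::Int)
    i > length(w.data) && return nothing
    x = @inbounds w.data[i]
    return (x, i + 1)
end

# Public protocol: the initial state is known to be valid, later states are checked.
Base.iterate(w::Wrapped) = unsafe_iterate(w, 1)
function Base.iterate(w::Wrapped, state)
    checkstate(w, state)
    return unsafe_iterate(w, state)
end
```

With this split, a `for` loop over `Wrapped([1, 2, 3])` goes through the checked entry point, while code that has already established the precondition could call `unsafe_iterate` directly.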
I just realized that it's possible to protect against this kind of mistake by putting the array itself in the state
function iterate(X::AbstractArray, state=(X, eachindex(X),))
    @_inline_meta
    A = first(state)  # state includes A
    y = iterate(tail(state)...)
    y === nothing && return nothing
    @inbounds A[y[1]], (A, state[2], tail(y)...)
end |
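The idea of carrying the array inside the state can be tried out without touching Base by using a wrapper type. The sketch below is an illustration under that assumption: `SelfStated` is a hypothetical name, and the method body mirrors the proposal above rather than any code that was merged.

```julia
# Hypothetical wrapper; `SelfStated` is an illustrative name, not a Base type.
struct SelfStated{A<:AbstractArray}
    parent::A
end

Base.length(s::SelfStated) = length(s.parent)
Base.eltype(::Type{SelfStated{A}}) where {A} = eltype(A)

function Base.iterate(s::SelfStated, state=(s.parent, eachindex(s.parent)))
    A = first(state)                    # the array travels with the state
    y = iterate(Base.tail(state)...)    # advance the index iterator
    y === nothing && return nothing
    i = y[1]
    # `i` came from the indices stored next to `A`, so it is in bounds for `A`
    return (@inbounds(A[i]), (A, state[2], Base.tail(y)...))
end

x = SelfStated([1.0, 2.0])
y = SelfStated([10.0, 20.0, 30.0])
_, sy = iterate(y)
# Passing y's state to x reads from y's array, never out of bounds on x's.
next = iterate(x, sy)
```

This only guards against mixing up states that were produced by `iterate` itself; a state tuple built by hand with mismatched indices is still trusted.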
This is too unsafe |
This PR adds inbounds and inline annotations to iterate on AbstractArray, as already done for Array.
Before this PR, the effect of the missing inbounds annotation can be observed with this benchmark:
(Aside: Interestingly, f($(view(xs, 1:length(xs)))) showed no performance difference, even though the bounds check is not eliminated when looking at the LLVM IR.)
The slow case f1!($([0.0]), $(view(xs, 1:length(xs)))) is fixed by this PR.
Since
must hold (not throw) for any abstract array xs, I think this use of @inbounds is correct. The use of @inbounds is justified by the information locally available inside the iterate method.