Merge pull request #408 from JuliaDiff/ox/skewdoc
Fix skew-symmetric comment
oxinabox authored Jul 23, 2021
2 parents b0e3f32 + 7863913 commit d68a5b8
Showing 1 changed file with 19 additions and 21 deletions.
40 changes: 19 additions & 21 deletions docs/src/opting_out_of_rules.md
@@ -10,44 +10,44 @@ This is done with the [`@opt_out`](@ref) macro.

Consider a `rrule` for `sum` (the following is simplified from the one in [ChainRules.jl](https://github.com/JuliaDiff/ChainRules.jl/blob/master/src/rulesets/Base/mapreduce.jl) itself)
```julia
function rrule(::typeof(sum), x::AbstractArray{<:Number}; dims=:)
function rrule(::typeof(sum), x::AbstractArray{<:Number})
    y = sum(x; dims=dims)
    project = ProjectTo(x)
    function sum_pullback(ȳ)
        # broadcasting the two works out the size no-matter `dims`
        # project makes sure we stay in the same vector subspace as `x`
        # no putting in off-diagonal entries in Diagonal etc
        x̄ = project(broadcast(last∘tuple, x, ȳ))
        x̄ = project(fill(ȳ, size(x)))
        return (NoTangent(), x̄)
    end
    return y, sum_pullback
end
```

That is a fairly reasonable `rrule` for the vast majority of cases.

You might have a custom array type for which you could write a faster rule.
For example, the pullback for summing a [`SkewSymmetric` (anti-symmetric)](https://en.wikipedia.org/wiki/Skew-symmetric_matrix) matrix can be optimized to basically be `Diagonal(fill(ȳ, size(x,1)))`.
To do that, you can indeed write another more specific [`rrule`](@ref).
But another case is where the AD system itself would generate a more optimized case.
In that case you would write a faster, more specific `rrule`.
But sometimes ADing the (faster, more specific) primal for your custom array type would yield the faster pullback without you having to write a `rrule` by hand.

For example, the [`NamedDimsArray`](https://github.com/invenia/NamedDims.jl) is a thin wrapper around some other array type.
Its sum method is basically just to call `sum` on its parent.
It is entirely conceivable[^1] that the AD system can do better than our `rrule` here.
For example by avoiding the overhead of [`project`ing](@ref ProjectTo).
Consider summing a [`SkewSymmetric` (anti-symmetric)](https://en.wikipedia.org/wiki/Skew-symmetric_matrix) matrix.
A skew-symmetric matrix has structural zeros on the diagonal, and each off-diagonal entry is paired with its negation.
Thus the sum is always going to be zero.
As such, the author of that matrix type would probably have overloaded `sum(x::SkewSymmetric{T}) where T = zero(T)`.
ADing this would result in the tangent computed for `x` being `ZeroTangent()`, and it would be very fast since AD can see that `x` is never used on the right-hand side.
In contrast, the generic method for `AbstractArray` defined above would have to allocate the fill array and then compute the skew projection,
only to find out that the output would be projected to `SkewSymmetric(zeros(T))` anyway (slower, and a less useful type).

To opt out of using the generic `rrule` and allow the AD system to do its own thing, we use the
[`@opt_out`](@ref) macro to say not to use it for the sum of `NamedDimsArray`s.
[`@opt_out`](@ref) macro to say not to use it for the sum of `SkewSymmetric`.

```julia
@opt_out rrule(::typeof(sum), ::NamedDimsArray)
@opt_out rrule(::typeof(sum), ::SkewSymmetric)
```
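
As a rough, self-contained sketch of the effect (not taken from the commit): the `SkewSymmetric` type below is a hypothetical stand-in defined only for this example, and the generic `rrule` is the simplified one from the text rather than the one shipped in ChainRules.jl. Under those assumptions, after `@opt_out` the call to `rrule` for the opted-out signature returns `nothing`, which is the signal for an AD system to differentiate the primal itself.

```julia
using ChainRulesCore

# Hypothetical stand-in for a skew-symmetric matrix type (illustration only).
# Entries above the diagonal come from `data`; those below are their negation.
struct SkewSymmetric{T<:Number,M<:AbstractMatrix{T}} <: AbstractMatrix{T}
    data::M
end
Base.size(S::SkewSymmetric) = size(S.data)
function Base.getindex(S::SkewSymmetric{T}, i::Int, j::Int) where {T}
    i == j && return zero(T)
    return i < j ? S.data[i, j] : -S.data[j, i]
end

# Specialised primal: the sum of a skew-symmetric matrix is always zero.
Base.sum(::SkewSymmetric{T}) where {T} = zero(T)

# Generic rule, simplified as in the text (no `dims` support).
function ChainRulesCore.rrule(::typeof(sum), x::AbstractArray{<:Number})
    y = sum(x)
    project = ProjectTo(x)
    sum_pullback(ȳ) = (NoTangent(), project(fill(ȳ, size(x))))
    return y, sum_pullback
end

# Opt out of the generic rule for the hypothetical type.
@opt_out ChainRulesCore.rrule(::typeof(sum), ::SkewSymmetric)

S = SkewSymmetric([0.0 2.0; 0.0 0.0])

# No rule applies to `S` any more, so AD is left to handle `sum(S)` itself.
@assert rrule(sum, S) === nothing

# Other arrays still hit the generic rule.
y, pullback = rrule(sum, [1.0, 2.0, 3.0])
```

The element type is constrained to `T<:Number` so that the opted-out signature is strictly more specific than the generic `AbstractArray{<:Number}` method; without that bound the two methods could be ambiguous.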

We could even opt-out for all 1 arg functions.
Perhaps we might never want to use rules for `SkewSymmetric`, because we have determined that it is always better to leave it to the AD, unless a very specific rule has been written[^1].
We could then opt out for all 1-arg functions.
```@julia
@opt_out rrule(::Any, ::NamedDimsArray)
@opt_out rrule(::Any, ::SkewSymmetric)
```
Though this is likely to cause some method-ambiguities.
This is likely to cause some method ambiguities if we do it for more, but we can resolve those.


Similar can be done with `@opt_out frule`.
It can also be done passing in a [`RuleConfig`](@ref config).
@@ -93,6 +93,4 @@ If for a given signature there is a more specific method in the `no_rrule`/`no_f
You can, likely by looking at the primal method table, work out which method you would have hit if the rule had not been defined,
and then `invoke` it.



[^1]: It is also possible that this is not the case. Benchmark your real use cases.
[^1]: This seems unlikely, but it is possible: there is a lot of structure that can be taken advantage of for some matrix types.
