Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum discriminants used as indices are subject to bounds checks #36955

Closed
abonander opened this issue Oct 4, 2016 · 7 comments
Closed

Enum discriminants used as indices are subject to bounds checks #36955

abonander opened this issue Oct 4, 2016 · 7 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@abonander
Copy link
Contributor

abonander commented Oct 4, 2016

Case Study

All enum discriminants are statically known, but the optimizer appears not to be able to introspect their max values when they are cast to primitives and used as indices, probably because at the LLVM level, they only ever are primitives. LLVM is never told their maximum value. This means that when using enum discriminants to index arrays/vectors whose lengths are also statically known, bounds checks are emitted even when one would reasonably expect them to be elided.

Minimal example (Playground):

#![feature(test)]

extern crate test;

use test::black_box;

#[repr(usize)]
enum Index {
    Zero = 0,
    One = 1,
}

#[inline(never)]
fn idx_cast(vals: &[i32; 2], idx: Index) {
    black_box(vals[idx as usize]);
}

fn main() {
    let vals = black_box([0, 1]);
    idx_cast(&vals, black_box(Index::Zero));
}

If you look at the emitted assembly (in release mode), you can see that idx_cast does a bounds-check.

If you add ::std::intrinsics::assume(idx as usize < 2), the bounds-check is elided and LLVM appears to use the discriminant as the index directly: https://is.gd/DdX7dd

This also works even if enum Index is not #[repr(usize)]; LLVM simply zero-extends the discriminant: https://is.gd/2JrJty

So as a possible optimization at codegen, whenever an enum is cast to a primitive, we could (or maybe even should) emit @llvm.assume( [discriminant value] <= [maximum discriminant value] ).

Converting via explicit match also allows elision of the bounds check without reintroducing a branch, but has differing behavior between enums with 2 variants and enums of more variants:

  • 2 variant enums get a use 1 if discriminant is 1, otherwise 0 treatment: https://is.gd/VS7JZK
  • Enums with > 2 variants directly use the discriminant (or zero-extend it if it's a different width) like the assume solution: https://is.gd/d8dEtY

I assume LLVM does the former in the different-width case because it uses fewer cycles than the latter, since they are otherwise functionally equivalent. However, when the enum is #[repr(usize)], the 2-variant version is using 3 extra, unnecessary instructions. But that seems like an LLVM problem more than a codegen problem.

@bluss bluss added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Oct 4, 2016
@bluss
Copy link
Member

bluss commented Oct 4, 2016

Some more cases. No safe code workaround uses as few instructions as using assume. https://play.rust-lang.org/?gist=f3d9f44dd1b92fc947939b2a8b633b47&version=nightly&backtrace=0

@abonander
Copy link
Contributor Author

abonander commented Oct 4, 2016

@bluss With 3 or more variants, the matching conversion is equivalent (which is nice in the meantime for people who are squeamish about unsafe or unstable APIs). However, LLVM seems to wrongly special-case the 2-variant matches even when width extension isn't necessary. Still, I doubt those three extra instructions are as expensive as a branch, which would definitely get worked out by prediction at runtime in hot loops but that still has up-front cost before the prediction tables are filled and after they are cleared.

@abonander
Copy link
Contributor Author

I don't consider myself a compiler/optimizer person, but I'm willing to bet the 2-variant case is causing LLVM to assume C/C++ boolean semantics where any nonzero value is equivalent to true and should be converted to 1 for consistency.

The question then becomes, how do we get around LLVM making this assumption without removing the heuristic entirely (which I'm sure a nonnegligible number of programs rely on for correctness).

@arielb1
Copy link
Contributor

arielb1 commented Oct 4, 2016

@abonander

The problem is that SimplifyCfg transforms switches to ifs, losing the range metadata, before it eliminates unneeded switches.

Also, SimplifyCfg does these optimizations before we do mem2reg, so even if it did things in the correct order that wouldn't work well.

@pmarcelll
Copy link
Contributor

Resolving #23898 would help fixing this, right?

@arielb1
Copy link
Contributor

arielb1 commented Oct 4, 2016

It won't, and this issue has a PR open that fixes it.

@apasel422 apasel422 changed the title Enum discrimnants used as indices are subject to bounds checks Enum discriminants used as indices are subject to bounds checks Oct 5, 2016
@bors bors closed this as completed in 45fe3a1 Oct 6, 2016
@bluss
Copy link
Member

bluss commented Oct 7, 2016

Thanks, amazingly quickly fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

4 participants