From 9506661d53469f17456af628fe7b1fc12cb3e275 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Sat, 21 Apr 2018 16:06:35 +0300 Subject: [PATCH 1/6] RFC for optimise(size) attribute --- text/0000-optimise-attr.md | 120 +++++++++++++++++++++++++++++++++++++ 1 file changed, 120 insertions(+) create mode 100644 text/0000-optimise-attr.md diff --git a/text/0000-optimise-attr.md b/text/0000-optimise-attr.md new file mode 100644 index 00000000000..2c9372ba6dc --- /dev/null +++ b/text/0000-optimise-attr.md @@ -0,0 +1,120 @@ +- Feature Name: optimise_attr +- Start Date: 2018-03-26 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC introduces the `#[optimise]` attribute, specifically its `#[optimise(size)]` variant for +controlling optimisation level on a per-item basis. + +# Motivation +[motivation]: #motivation + +Currently, rustc has only a small number of optimisation options that apply globally to the +crate. With LTO and RLIB-only crates these options become applicable to a whole-program, which +reduces the ability to control optimisation even further. + +For applications such as embedded, it is critical, that they satisfy the size constraints. This +means, that code must consciously pick one or the other optimisation level. However, since +optimisation level is increasingly applied program-wide, options like `-Copt-level=3` or +`-Copt-level=s` are less and less useful – it is no longer feasible (and never was feasible with +cargo) to use the former one for code where performance matters and the latter everywhere else. + +With a C toolchain this is fairly easy to achieve by compiling the relevant objects with different +options. In Rust ecosystem, however, where this concept does not exist, an alternate solution is +necessary. 
+ +With `#[optimise(size)]` it is possible to annotate separate functions, so that they are optimised +for size in a project otherwise optimised for speed (which is the default for `cargo --release`). + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Sometimes, optimisations are a tradeoff between execution time and the code size. Some +optimisations, such as loop unrolling increase code size many times on average (compared to +original function size). + +```rust +#[optimise(size)] +fn banana() { + // code +} +``` + +Will instruct rustc to consider this tradeoff more carefully and avoid optimising in a way that +would result in larger code rather than a smaller one. It may also have effect on what instructions +are selected to appear in the final binary. + +Note that `#[optimise(size)]` is a hint, rather than a hard requirement and compiler may still, +while optimising, take decisions that increase function size compared to an entirely unoptimised +result. + +Using this attribute is recommended when inspection of generated code reveals unnecessarily large +function or functions, but use of `-O` is still preferable over `-C opt-level=s` or `-C +opt-level=z`. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +The `#[optimise(size)]` attribute applied to a function definition will instruct the optimisation +engine to avoid applying optimisations that could result in a size increase and machine code +generator to generate code that’s smaller rather than larger. + +Note that the `#[optimise(size)]` attribute is just a hint and is not guaranteed to result in any +different or smaller code. + +Since `#[optimise(size)]` instructs optimisations to behave in a certain way, this means that this +attribute has no effect when no optimisations are run (such as is the case when `-Copt-level=0`). 
+Interaction of this attribute with the `-Copt-level=s` and `-Copt-level=z` flags is not specified +and is left up to implementation to decide. + +# Drawbacks +[drawbacks]: #drawbacks + +* Not all of the alternative codegen backends may be able to express such a request, hence the +“this is an optimisation hint” note on the `#[optimise(size)]` attribute. + * As a fallback, this attribute may be implemented in terms of more specific optimisation hints + (such as `inline(never)`, the future `unroll(never)` etc). + +# Rationale and alternatives +[alternatives]: #alternatives + +Proposed is a very semantic solution (describes the desired result, instead of behaviour) to the +problem of needing to sometimes inhibit some of the trade-off optimisations such as loop unrolling. + +Alternative, of course, would be to add attributes controlling such optimisations, such as +`#[unroll(no)]` on top of a a loop statement. There’s already precedent for this in the `#[inline]` +annotations. + +The author would like to argue that we should eventually have *both*, the `#[optimise(size)]` for +people who look at generated code and decide that it is too large, and the targetted attributes for +people who know *why* the code is too large. + +Furthermore, currently `optimise(size)` is able to do more than any possible combination of +targetted attributes would be able to such as influencing the instruction selection or switch +codegen strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of +all the targetted optimisation knobs we might have in the future. + +--- + +Alternative: `optimize` (American English) instead of `optimise`… or both? + +# Prior art +[prior-art]: #prior-art + +* LLVM: `optsize`, `optnone`, `minsize` function attributes (exposed in Clang in some way); +* GCC: `__attribute__((optimize))` function attribute which allows setting the optimisation level +and using certain(?) 
`-f` flags for each function; +* IAR: Optimisations have a checkbox for “No size constraints”, which allows compiler to go out of +its way to optimise without considering the size tradeoff. Can only be applied on a +per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targetting +embedded use-cases. + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Should we support such an attribute at module-level? Crate-level? + * If yes, should we also implement `optimise(always)`? `optimise(level=x)`? + * Left for future discussion, but should make sure such extension is possible. From b1b24aaef838a6d78568ba2e9682f47ad89447ad Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Thu, 3 May 2018 16:14:57 +0300 Subject: [PATCH 2/6] s/optimise/optimize --- text/0000-optimise-attr.md | 34 +++++++++++++++------------------- 1 file changed, 15 insertions(+), 19 deletions(-) diff --git a/text/0000-optimise-attr.md b/text/0000-optimise-attr.md index 2c9372ba6dc..964bcff8d97 100644 --- a/text/0000-optimise-attr.md +++ b/text/0000-optimise-attr.md @@ -1,4 +1,4 @@ -- Feature Name: optimise_attr +- Feature Name: optimize_attr - Start Date: 2018-03-26 - RFC PR: (leave this empty) - Rust Issue: (leave this empty) @@ -6,7 +6,7 @@ # Summary [summary]: #summary -This RFC introduces the `#[optimise]` attribute, specifically its `#[optimise(size)]` variant for +This RFC introduces the `#[optimize]` attribute, specifically its `#[optimize(size)]` variant for controlling optimisation level on a per-item basis. # Motivation @@ -26,8 +26,8 @@ With a C toolchain this is fairly easy to achieve by compiling the relevant obje options. In Rust ecosystem, however, where this concept does not exist, an alternate solution is necessary. -With `#[optimise(size)]` it is possible to annotate separate functions, so that they are optimised -for size in a project otherwise optimised for speed (which is the default for `cargo --release`). 
+With `#[optimize(size)]` it is possible to annotate separate functions, so that they are optimized +for size in a project otherwise optimized for speed (which is the default for `cargo --release`). # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -37,7 +37,7 @@ optimisations, such as loop unrolling increase code size many times on average ( original function size). ```rust -#[optimise(size)] +#[optimize(size)] fn banana() { // code } @@ -47,8 +47,8 @@ Will instruct rustc to consider this tradeoff more carefully and avoid optimisin would result in larger code rather than a smaller one. It may also have effect on what instructions are selected to appear in the final binary. -Note that `#[optimise(size)]` is a hint, rather than a hard requirement and compiler may still, -while optimising, take decisions that increase function size compared to an entirely unoptimised +Note that `#[optimize(size)]` is a hint, rather than a hard requirement and compiler may still, +while optimising, take decisions that increase function size compared to an entirely unoptimized result. Using this attribute is recommended when inspection of generated code reveals unnecessarily large @@ -58,14 +58,14 @@ opt-level=z`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `#[optimise(size)]` attribute applied to a function definition will instruct the optimisation +The `#[optimize(size)]` attribute applied to a function definition will instruct the optimisation engine to avoid applying optimisations that could result in a size increase and machine code generator to generate code that’s smaller rather than larger. -Note that the `#[optimise(size)]` attribute is just a hint and is not guaranteed to result in any +Note that the `#[optimize(size)]` attribute is just a hint and is not guaranteed to result in any different or smaller code. 
-Since `#[optimise(size)]` instructs optimisations to behave in a certain way, this means that this +Since `#[optimize(size)]` instructs optimisations to behave in a certain way, this means that this attribute has no effect when no optimisations are run (such as is the case when `-Copt-level=0`). Interaction of this attribute with the `-Copt-level=s` and `-Copt-level=z` flags is not specified and is left up to implementation to decide. @@ -74,7 +74,7 @@ and is left up to implementation to decide. [drawbacks]: #drawbacks * Not all of the alternative codegen backends may be able to express such a request, hence the -“this is an optimisation hint” note on the `#[optimise(size)]` attribute. +“this is an optimisation hint” note on the `#[optimize(size)]` attribute. * As a fallback, this attribute may be implemented in terms of more specific optimisation hints (such as `inline(never)`, the future `unroll(never)` etc). @@ -88,19 +88,15 @@ Alternative, of course, would be to add attributes controlling such optimisation `#[unroll(no)]` on top of a a loop statement. There’s already precedent for this in the `#[inline]` annotations. -The author would like to argue that we should eventually have *both*, the `#[optimise(size)]` for +The author would like to argue that we should eventually have *both*, the `#[optimize(size)]` for people who look at generated code and decide that it is too large, and the targetted attributes for people who know *why* the code is too large. -Furthermore, currently `optimise(size)` is able to do more than any possible combination of +Furthermore, currently `optimize(size)` is able to do more than any possible combination of targetted attributes would be able to such as influencing the instruction selection or switch codegen strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of all the targetted optimisation knobs we might have in the future. 
---- - -Alternative: `optimize` (American English) instead of `optimise`… or both? - # Prior art [prior-art]: #prior-art @@ -108,7 +104,7 @@ Alternative: `optimize` (American English) instead of `optimise`… or both? * GCC: `__attribute__((optimize))` function attribute which allows setting the optimisation level and using certain(?) `-f` flags for each function; * IAR: Optimisations have a checkbox for “No size constraints”, which allows compiler to go out of -its way to optimise without considering the size tradeoff. Can only be applied on a +its way to optimize without considering the size tradeoff. Can only be applied on a per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targetting embedded use-cases. @@ -116,5 +112,5 @@ embedded use-cases. [unresolved]: #unresolved-questions * Should we support such an attribute at module-level? Crate-level? - * If yes, should we also implement `optimise(always)`? `optimise(level=x)`? + * If yes, should we also implement `optimize(always)`? `optimize(level=x)`? * Left for future discussion, but should make sure such extension is possible. From cef2ebc337f9091d58518c315b3e5f1c111bcc91 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Fri, 13 Jul 2018 22:03:13 +0300 Subject: [PATCH 3/6] Adjust text to add optimize(speed) --- text/0000-optimise-attr.md | 121 +++++++++++++++++++++++++------------ 1 file changed, 83 insertions(+), 38 deletions(-) diff --git a/text/0000-optimise-attr.md b/text/0000-optimise-attr.md index 964bcff8d97..3f2ec0e50f1 100644 --- a/text/0000-optimise-attr.md +++ b/text/0000-optimise-attr.md @@ -6,8 +6,8 @@ # Summary [summary]: #summary -This RFC introduces the `#[optimize]` attribute, specifically its `#[optimize(size)]` variant for -controlling optimisation level on a per-item basis. +This RFC introduces the `#[optimize]` attribute for controlling optimisation level on a per-item +basis. # Motivation [motivation]: #motivation @@ -17,24 +17,26 @@ crate. 
With LTO and RLIB-only crates these options become applicable to a whole- reduces the ability to control optimisation even further. For applications such as embedded, it is critical, that they satisfy the size constraints. This -means, that code must consciously pick one or the other optimisation level. However, since -optimisation level is increasingly applied program-wide, options like `-Copt-level=3` or -`-Copt-level=s` are less and less useful – it is no longer feasible (and never was feasible with -cargo) to use the former one for code where performance matters and the latter everywhere else. +means, that code must consciously pick one or the other optimisation level. Absence of a method to +selectively optimise different parts of a program in different ways precludes users from utilising +the hardware they have to the greatest degree. -With a C toolchain this is fairly easy to achieve by compiling the relevant objects with different -options. In Rust ecosystem, however, where this concept does not exist, an alternate solution is -necessary. +With a C toolchain selective optimisation is fairly easy to achieve by compiling the relevant +codegen units (objects) with different options. In Rust ecosystem, where the concept of such units +does not exist, an alternate solution is necessary. -With `#[optimize(size)]` it is possible to annotate separate functions, so that they are optimized -for size in a project otherwise optimized for speed (which is the default for `cargo --release`). +With the `#[optimize]` attribute it is possible to annotate the optimisation level of separate +items, so that they are optimized differently from the global optimisation option. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Sometimes, optimisations are a tradeoff between execution time and the code size. Some +## `#[optimize(size)]` + +Sometimes, optimisations are a trade-off between execution time and the code size. 
Some optimisations, such as loop unrolling increase code size many times on average (compared to -original function size). +original function size) for marginal performance benefits. In case such optimisation is not +desirable… ```rust #[optimize(size)] @@ -43,7 +45,7 @@ fn banana() { } ``` -Will instruct rustc to consider this tradeoff more carefully and avoid optimising in a way that +…will instruct rustc to consider this trade-off more carefully and avoid optimising in a way that would result in larger code rather than a smaller one. It may also have effect on what instructions are selected to appear in the final binary. @@ -55,26 +57,66 @@ Using this attribute is recommended when inspection of generated code reveals un function or functions, but use of `-O` is still preferable over `-C opt-level=s` or `-C opt-level=z`. +## `#[optimize(speed)]` + +Conversely, when one of the global optimisation options for code size is used (`-Copt-level=s` or +`-Copt-level=z`), profiling might reveal some functions that are unnecessarily “hot”. In that case, +those functions may be annotated with the `#[optimize(speed)]` to make the compiler make its best +effort to produce faster code. + +```rust +#[optimize(speed)] +fn banana() { + // code +} +``` + +Much like with `#[optimize(size)]`, the `speed` counterpart is also a hint and will likely not +yield the same results as using the global optimisation option for speed. + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `#[optimize(size)]` attribute applied to a function definition will instruct the optimisation -engine to avoid applying optimisations that could result in a size increase and machine code -generator to generate code that’s smaller rather than larger. 
+The `#[optimize(size)]` attribute applied to an item will instruct the optimisation pipeline to +avoid applying optimisations that could result in a size increase and machine code generator to +generate code that’s smaller rather than faster. + +The `#[optimize(speed)]` attribute applied to an item will instruct the optimisation pipeline to +apply optimisations that are likely to yield performance wins and machine code generator to +generate code that’s faster rather than smaller. + +The `#[optimize]` attributes are just a hint to the compiler and are not guaranteed to result in +any different code. + +If an `#[optimize]` attribute is applied to some grouping item (such as `mod` or a crate), it +propagates transitively to all items defined within the grouping item. + +It is an error to specify multiple incompatible `#[optimize]` options to a single item at once. +A more explicit `#[optimize]` attribute overrides a propagated attribute. + +`#[optimize(speed)]` is a no-op when a global optimisation for speed option is set (i.e. `-C +opt-level=1-3`). Similarly `#[optimize(size)]` is a no-op when a global optimisation for size +option is set (i.e. `-C opt-level=s/z`). `#[optimize]` attributes are no-op when no optimizations +are done globally (i.e. `-C opt-level=0`). In all other cases the *exact* interaction of the +`#[optimize]` attribute with the global optimization level is not specified and is left up to +implementation to decide. + +# Implementation approach + +For the LLVM backend, these attributes may be implemented in a following manner: -Note that the `#[optimize(size)]` attribute is just a hint and is not guaranteed to result in any -different or smaller code. +`#[optimize(size)]` – explicit function attributes exist at LLVM level. Items with +`optimize(size)` would simply apply the LLVM attributes to the functions. 
-Since `#[optimize(size)]` instructs optimisations to behave in a certain way, this means that this -attribute has no effect when no optimisations are run (such as is the case when `-Copt-level=0`). -Interaction of this attribute with the `-Copt-level=s` and `-Copt-level=z` flags is not specified -and is left up to implementation to decide. +`#[optimize(speed)]` in conjunction with `-C opt-level=s/z` – use a global optimisation level of +`-C opt-level=2/3` and apply the equivalent LLVM function attribute (`optsize`, `minsize`) to all +items which do not have an `#[optimize(speed)]` attribute. # Drawbacks [drawbacks]: #drawbacks * Not all of the alternative codegen backends may be able to express such a request, hence the -“this is an optimisation hint” note on the `#[optimize(size)]` attribute. +“this is a hint” note on the `#[optimize]` attribute. * As a fallback, this attribute may be implemented in terms of more specific optimisation hints (such as `inline(never)`, the future `unroll(never)` etc). @@ -85,17 +127,17 @@ Proposed is a very semantic solution (describes the desired result, instead of b problem of needing to sometimes inhibit some of the trade-off optimisations such as loop unrolling. Alternative, of course, would be to add attributes controlling such optimisations, such as -`#[unroll(no)]` on top of a a loop statement. There’s already precedent for this in the `#[inline]` +`#[unroll(no)]` on top of a loop statement. There’s already precedent for this in the `#[inline]` annotations. -The author would like to argue that we should eventually have *both*, the `#[optimize(size)]` for -people who look at generated code and decide that it is too large, and the targetted attributes for -people who know *why* the code is too large. 
+The author would like to argue that we should eventually have *both*, the `#[optimize]` for +people who look at generated code but are not willing to dig for exact reasons, and the targeted +attributes for people who know *why* the code is not satisfactory. -Furthermore, currently `optimize(size)` is able to do more than any possible combination of -targetted attributes would be able to such as influencing the instruction selection or switch -codegen strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of -all the targetted optimisation knobs we might have in the future. +Furthermore, currently `optimize` is able to do more than any possible combination of targeted +attributes would be able to such as influencing the instruction selection or switch codegen +strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of all the +targeted optimisation knobs we might have in the future. # Prior art [prior-art]: #prior-art @@ -103,14 +145,17 @@ all the targetted optimisation knobs we might have in the future. * LLVM: `optsize`, `optnone`, `minsize` function attributes (exposed in Clang in some way); * GCC: `__attribute__((optimize))` function attribute which allows setting the optimisation level and using certain(?) `-f` flags for each function; -* IAR: Optimisations have a checkbox for “No size constraints”, which allows compiler to go out of -its way to optimize without considering the size tradeoff. Can only be applied on a -per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targetting +* IAR: Optimisations have a check box for “No size constraints”, which allows compiler to go out of +its way to optimize without considering the size trade-off. Can only be applied on a +per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targeting embedded use-cases. 
# Unresolved questions [unresolved]: #unresolved-questions -* Should we support such an attribute at module-level? Crate-level? - * If yes, should we also implement `optimize(always)`? `optimize(level=x)`? - * Left for future discussion, but should make sure such extension is possible. +* Should we also implement `optimize(always)`? `optimize(level=x)`? + * Left for future discussion, but should make sure such extension is possible. +* Should there be any way to specify what global optimisation for speed level is used in + conjunction with the optimisation for speed option (e.g. `-Copt-level=s3` could be equivalent to + `-Copt-level=3` and `#[optimize(size)]` on the crate item); + * This may matter for users of `#[optimize(speed)]`. From 1de6b6df78ae19eb83360b8f90f7556688f3c562 Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Tue, 25 Sep 2018 21:45:15 +0300 Subject: [PATCH 4/6] Change the wording to accomodate expressions Additionally, clarify propagation of the attribute. --- text/0000-optimise-attr.md | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/text/0000-optimise-attr.md b/text/0000-optimise-attr.md index 3f2ec0e50f1..e8c7dd22e4f 100644 --- a/text/0000-optimise-attr.md +++ b/text/0000-optimise-attr.md @@ -77,22 +77,27 @@ yield the same results as using the global optimisation option for speed. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `#[optimize(size)]` attribute applied to an item will instruct the optimisation pipeline to -avoid applying optimisations that could result in a size increase and machine code generator to -generate code that’s smaller rather than faster. +The `#[optimize(size)]` attribute applied to an item or expression will instruct the optimisation +pipeline to avoid applying optimisations that could result in a size increase and machine code +generator to generate code that’s smaller rather than faster. 
-The `#[optimize(speed)]` attribute applied to an item will instruct the optimisation pipeline to -apply optimisations that are likely to yield performance wins and machine code generator to -generate code that’s faster rather than smaller. +The `#[optimize(speed)]` attribute applied to an item or expression will instruct the optimisation +pipeline to apply optimisations that are likely to yield performance wins and machine code +generator to generate code that’s faster rather than smaller. The `#[optimize]` attributes are just a hint to the compiler and are not guaranteed to result in any different code. If an `#[optimize]` attribute is applied to some grouping item (such as `mod` or a crate), it -propagates transitively to all items defined within the grouping item. +propagates transitively to all items defined within the grouping item. Note, that a function is +also a “grouping” item for the purposes of this RFC, and `#[optimize]` attribute applied to a +function will propagate to other functions or closures defined within the body of the function. -It is an error to specify multiple incompatible `#[optimize]` options to a single item at once. -A more explicit `#[optimize]` attribute overrides a propagated attribute. +`#[optimize]` attribute may also be applied to a closure expression using the currently unstable +`stmt_expr_attributes` feature. + +It is an error to specify multiple incompatible `#[optimize]` options to a single item or +expression at once. A more explicit `#[optimize]` attribute overrides a propagated attribute. `#[optimize(speed)]` is a no-op when a global optimisation for speed option is set (i.e. `-C opt-level=1-3`). Similarly `#[optimize(size)]` is a no-op when a global optimisation for size @@ -101,6 +106,12 @@ are done globally (i.e. `-C opt-level=0`). In all other cases the *exact* intera `#[optimize]` attribute with the global optimization level is not specified and is left up to implementation to decide. 
+`#[optimize]` attribute applied to non function-like items (such as `struct`) or non function-like +expressions (i.e. not closures) is considered “unused” as of this RFC and should fire the +`unused_attribute` lint (unless the same attribute was used for a function-like item or expression, +via e.g. propagation). Some future RFC may assign some behaviour to this attribute with respect to +such definitions. + # Implementation approach For the LLVM backend, these attributes may be implemented in a following manner: @@ -159,3 +170,4 @@ embedded use-cases. conjunction with the optimisation for speed option (e.g. `-Copt-level=s3` could be equivalent to `-Copt-level=3` and `#[optimize(size)]` on the crate item); * This may matter for users of `#[optimize(speed)]`. +* Are the propagation and `unused_attr` approaches right? From f34ddbb7cd5243dfec5a036f963d6fd1c12fd87a Mon Sep 17 00:00:00 2001 From: Simonas Kazlauskas Date: Tue, 25 Sep 2018 22:13:14 +0300 Subject: [PATCH 5/6] s to z --- ...optimise-attr.md => 0000-optimize-attr.md} | 56 +++++++++---------- 1 file changed, 28 insertions(+), 28 deletions(-) rename text/{0000-optimise-attr.md => 0000-optimize-attr.md} (83%) diff --git a/text/0000-optimise-attr.md b/text/0000-optimize-attr.md similarity index 83% rename from text/0000-optimise-attr.md rename to text/0000-optimize-attr.md index e8c7dd22e4f..c76daebb101 100644 --- a/text/0000-optimise-attr.md +++ b/text/0000-optimize-attr.md @@ -6,36 +6,36 @@ # Summary [summary]: #summary -This RFC introduces the `#[optimize]` attribute for controlling optimisation level on a per-item +This RFC introduces the `#[optimize]` attribute for controlling optimization level on a per-item basis. # Motivation [motivation]: #motivation -Currently, rustc has only a small number of optimisation options that apply globally to the +Currently, rustc has only a small number of optimization options that apply globally to the crate. 
With LTO and RLIB-only crates these options become applicable to a whole-program, which -reduces the ability to control optimisation even further. +reduces the ability to control optimization even further. For applications such as embedded, it is critical, that they satisfy the size constraints. This -means, that code must consciously pick one or the other optimisation level. Absence of a method to -selectively optimise different parts of a program in different ways precludes users from utilising +means, that code must consciously pick one or the other optimization level. Absence of a method to +selectively optimize different parts of a program in different ways precludes users from utilising the hardware they have to the greatest degree. -With a C toolchain selective optimisation is fairly easy to achieve by compiling the relevant +With a C toolchain selective optimization is fairly easy to achieve by compiling the relevant codegen units (objects) with different options. In Rust ecosystem, where the concept of such units does not exist, an alternate solution is necessary. -With the `#[optimize]` attribute it is possible to annotate the optimisation level of separate -items, so that they are optimized differently from the global optimisation option. +With the `#[optimize]` attribute it is possible to annotate the optimization level of separate +items, so that they are optimized differently from the global optimization option. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation ## `#[optimize(size)]` -Sometimes, optimisations are a trade-off between execution time and the code size. Some -optimisations, such as loop unrolling increase code size many times on average (compared to -original function size) for marginal performance benefits. In case such optimisation is not +Sometimes, optimizations are a trade-off between execution time and the code size. 
Some +optimizations, such as loop unrolling increase code size many times on average (compared to +original function size) for marginal performance benefits. In case such optimization is not desirable… ```rust @@ -59,7 +59,7 @@ opt-level=z`. ## `#[optimize(speed)]` -Conversely, when one of the global optimisation options for code size is used (`-Copt-level=s` or +Conversely, when one of the global optimization options for code size is used (`-Copt-level=s` or `-Copt-level=z`), profiling might reveal some functions that are unnecessarily “hot”. In that case, those functions may be annotated with the `#[optimize(speed)]` to make the compiler make its best effort to produce faster code. @@ -72,17 +72,17 @@ fn banana() { ``` Much like with `#[optimize(size)]`, the `speed` counterpart is also a hint and will likely not -yield the same results as using the global optimisation option for speed. +yield the same results as using the global optimization option for speed. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `#[optimize(size)]` attribute applied to an item or expression will instruct the optimisation -pipeline to avoid applying optimisations that could result in a size increase and machine code +The `#[optimize(size)]` attribute applied to an item or expression will instruct the optimization +pipeline to avoid applying optimizations that could result in a size increase and machine code generator to generate code that’s smaller rather than faster. -The `#[optimize(speed)]` attribute applied to an item or expression will instruct the optimisation -pipeline to apply optimisations that are likely to yield performance wins and machine code +The `#[optimize(speed)]` attribute applied to an item or expression will instruct the optimization +pipeline to apply optimizations that are likely to yield performance wins and machine code generator to generate code that’s faster rather than smaller. 
The `#[optimize]` attributes are just a hint to the compiler and are not guaranteed to result in @@ -99,8 +99,8 @@ function will propagate to other functions or closures defined within the body o It is an error to specify multiple incompatible `#[optimize]` options to a single item or expression at once. A more explicit `#[optimize]` attribute overrides a propagated attribute. -`#[optimize(speed)]` is a no-op when a global optimisation for speed option is set (i.e. `-C -opt-level=1-3`). Similarly `#[optimize(size)]` is a no-op when a global optimisation for size +`#[optimize(speed)]` is a no-op when a global optimization for speed option is set (i.e. `-C +opt-level=1-3`). Similarly `#[optimize(size)]` is a no-op when a global optimization for size option is set (i.e. `-C opt-level=s/z`). `#[optimize]` attributes are no-op when no optimizations are done globally (i.e. `-C opt-level=0`). In all other cases the *exact* interaction of the `#[optimize]` attribute with the global optimization level is not specified and is left up to @@ -119,7 +119,7 @@ For the LLVM backend, these attributes may be implemented in a following manner: `#[optimize(size)]` – explicit function attributes exist at LLVM level. Items with `optimize(size)` would simply apply the LLVM attributes to the functions. -`#[optimize(speed)]` in conjunction with `-C opt-level=s/z` – use a global optimisation level of +`#[optimize(speed)]` in conjunction with `-C opt-level=s/z` – use a global optimization level of `-C opt-level=2/3` and apply the equivalent LLVM function attribute (`optsize`, `minsize`) to all items which do not have an `#[optimize(speed)]` attribute. @@ -128,16 +128,16 @@ items which do not have an `#[optimize(speed)]` attribute. * Not all of the alternative codegen backends may be able to express such a request, hence the “this is a hint” note on the `#[optimize]` attribute. 
- * As a fallback, this attribute may be implemented in terms of more specific optimisation hints + * As a fallback, this attribute may be implemented in terms of more specific optimization hints (such as `inline(never)`, the future `unroll(never)` etc). # Rationale and alternatives [alternatives]: #alternatives Proposed is a very semantic solution (describes the desired result, instead of behaviour) to the -problem of needing to sometimes inhibit some of the trade-off optimisations such as loop unrolling. +problem of needing to sometimes inhibit some of the trade-off optimizations such as loop unrolling. -Alternative, of course, would be to add attributes controlling such optimisations, such as +Alternative, of course, would be to add attributes controlling such optimizations, such as `#[unroll(no)]` on top of a loop statement. There’s already precedent for this in the `#[inline]` annotations. @@ -148,15 +148,15 @@ attributes for people who know *why* the code is not satisfactory. Furthermore, currently `optimize` is able to do more than any possible combination of targeted attributes would be able to such as influencing the instruction selection or switch codegen strategy (jump table, if chain, etc.) This makes the attribute useful even in presence of all the -targeted optimisation knobs we might have in the future. +targeted optimization knobs we might have in the future. # Prior art [prior-art]: #prior-art * LLVM: `optsize`, `optnone`, `minsize` function attributes (exposed in Clang in some way); -* GCC: `__attribute__((optimize))` function attribute which allows setting the optimisation level +* GCC: `__attribute__((optimize))` function attribute which allows setting the optimization level and using certain(?) 
`-f` flags for each function; -* IAR: Optimisations have a check box for “No size constraints”, which allows compiler to go out of +* IAR: Optimizations have a check box for “No size constraints”, which allows compiler to go out of its way to optimize without considering the size trade-off. Can only be applied on a per-compilation-unit basis. Enabled by default, as is appropriate for a compiler targeting embedded use-cases. @@ -166,8 +166,8 @@ embedded use-cases. * Should we also implement `optimize(always)`? `optimize(level=x)`? * Left for future discussion, but should make sure such extension is possible. -* Should there be any way to specify what global optimisation for speed level is used in - conjunction with the optimisation for speed option (e.g. `-Copt-level=s3` could be equivalent to +* Should there be any way to specify what global optimization for speed level is used in + conjunction with the optimization for speed option (e.g. `-Copt-level=s3` could be equivalent to `-Copt-level=3` and `#[optimize(size)]` on the crate item); * This may matter for users of `#[optimize(speed)]`. * Are the propagation and `unused_attr` approaches right? 
From ce58d2751c2015003c9bd1913d48afc55033bf7f Mon Sep 17 00:00:00 2001
From: Mazdak Farrokhzad
Date: Sun, 7 Oct 2018 04:54:56 +0200
Subject: [PATCH 6/6] RFC 2412

---
 text/{0000-optimize-attr.md => 2412-optimize-attr.md} | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
 rename text/{0000-optimize-attr.md => 2412-optimize-attr.md} (97%)

diff --git a/text/0000-optimize-attr.md b/text/2412-optimize-attr.md
similarity index 97%
rename from text/0000-optimize-attr.md
rename to text/2412-optimize-attr.md
index c76daebb101..6936c42d295 100644
--- a/text/0000-optimize-attr.md
+++ b/text/2412-optimize-attr.md
@@ -1,7 +1,7 @@
-- Feature Name: optimize_attr
+- Feature Name: `optimize_attr`
 - Start Date: 2018-03-26
-- RFC PR: (leave this empty)
-- Rust Issue: (leave this empty)
+- RFC PR: [rust-lang/rfcs#2412](https://github.com/rust-lang/rfcs/pull/2412)
+- Rust Issue: [rust-lang/rust#54882](https://github.com/rust-lang/rust/issues/54882)
 
 # Summary
 [summary]: #summary