Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What mechanism should we use? #1

Closed
yuri91 opened this issue Nov 30, 2020 · 10 comments
Closed

What mechanism should we use? #1

yuri91 opened this issue Nov 30, 2020 · 10 comments

Comments

@yuri91
Copy link
Collaborator

yuri91 commented Nov 30, 2020

The main issue to solve for this proposal is how exactly we want to implement the branch hinting in the language.

As far as I know the 3 main possibilities are:

1 - A new set of instructions corresponding to if, br_if, br_table,... but with hinting. They would be under a common prefix like the saturating truncation instructions, and support an immediate indicating the predicted branch.

2- A single br_hint instruction, with an immediate for the predicted branch. The only allowed instruction after it are branching instructions, and possibly another br_hint (for multiple hints to
br_table)

3- Using a custom section to encode the hinting information.

As I recall, @binji and @tlively in the last meeting about this proposal mentioned that they would like to choose a solution that can benefit other present or future proposals similar to this. Could you elaborate on this?

My POC in V8 is actually a simple implementation of possibility 1, with just the if instruction implemented. It felt pretty easy and natural to add a new set of prefixed instructions, but I don't know about other engines.

Implementing possibility 2 seems a bit more complicated because there is nothing like it already (as far as I know), but I think that it is a superior solution spec-wise because it introduces only one new opcode, and it allows to elegantly support multiple likely/unlikely branches for br_table.

Possibility 3 was mentioned by @rossberg, but not investigated much.
An issue that I see with this approach is that if any intermediate tool does not support the custom section, it will be dropped or worse it will not be updated as needed and become garbage.
This could be fine in principle since the branch hinting is a no-op from the language point of view, but it could produce a big performance degradation.
What I fear is that, if this feature is not "really" in the language, there may be little pressure to implement it in the various parts of the tooling stack.

Any thoughts?

@tlively
Copy link
Member

tlively commented Nov 30, 2020

From a design and usability perspective, I don't have strong feelings about what solution would be best. I can't think of any conditional control flow we might add in the future that would be as pervasive and as amenable to hinting as things like if and br_if, so I'm not sure our solution needs to be particularly general or scalable. Adding new hinted versions of if and br_if (or even just br_if) might be the simplest thing to do. @yuri91 do you have an idea of how beneficial hinted br_table would be?

I don't exactly remember, but I think my previous comments were about the mechanism we use to specify these instructions in the formal specification. However we handle their semantically-insignificant performance effects will set a precedent that will be used for similar instructions (e.g. prefetches) in the future.

@yuri91
Copy link
Collaborator Author

yuri91 commented Dec 1, 2020

For my use case if and br_if are plenty enough. I had the impression after reading WebAssembly/design#1363 that the expectation was to potentially add hinting to all branching instructions present and future, but that doesn't need to be the case necessarily.

I thought a bit about br_table in particular, and I think that it can be handled at the compiler level:

If the compiler knows that one or more branches are likely/unlikely, it can lower the switch instruction into something like:

if(likely_cond) {
...
} else if(unlikely_cond){
...
}else{
  switch(..);
}

So I would say that handling hinting for br_table in wasm itself is probably redundant.

@dschuff
Copy link
Member

dschuff commented Dec 1, 2020

One argument for custom sections is that it could be generalized for other kinds of hinting; e.g. awhile back someone experimented with hints about which functions are hot or cold, used by engines to determine which tier to use. It would also be easy to strip them out for cases where size is much more important than speed. They could also be dropped by tools that don't understand them, but as you mentioned there's also a potential downside if a tool updates the code but not the hints.

@rossberg
Copy link
Member

rossberg commented Dec 2, 2020

Pros for a custom section:

  • more flexibility (you can put any form of annotation in there, in any format)
  • better backwards compatibility (implementations don't break if they do not understand it)
  • more choice (non-performance-oriented implementations can completely ignore it)
  • quicker to evolve (no need to wait for all implementations to understand it)
  • more flexibility to evolve (even leaves open the possibility of removing a hint feature again if it turns out to be no longer needed)
  • general separation of concerns (as optimisations do not affect meaning)

I don't understand the argument brought up as a potential downside. If a tool encounters performance hints that it does not understand, then

  • with a custom section, the tool still works, just breaks the annotation;
  • with an extended instruction set, the tool completely breaks.

Of course, you would expect a decent tool to handle hints properly in both cases, but custom sections enable graceful degradation. So it seems that they are preferable even in this regard?

The main downside to a custom section is that it requires slightly more encoding context, but that seems minor. For initial experimentation, I would definitely suggest going with a custom section, in order to be maximally independent of the rest of the ecosystem.

@carlopi
Copy link

carlopi commented Dec 2, 2020

One doubt I have generally about custom sections, are they subject to validation or not?
Example with branch-hinting, if there is an hint that make no sense, it should be mandated to skip the whole custom section or they can be used also with partially broken information?

And another similar question:
how to ensure a custom section can not be interpreted in two different ways?
Eg. some engine interpreting custom section X as "these branch target are unlikely" and another with "just use the interpreter/baseline compiler on a given function"


Depending on the answers, it's not really true that custom section can be any kind of garbage and still have things works for any combination of tooling pipeline / engine. (It might even have to be mandated that if a tool encounters a custom section it do not understands it will have to delete it, since it might now be invalidated, but now in a way every tools has to understand the concept of custom sections, and we are back with other problems).

And to mitigate some concerns about branch-hinting and tooling: substituting branch hinting with nops / stripping them completely it's always possible to do in an easy way, and I imagine it would be easy to implement also with other hints, so it's just a matter of having an additional strip-hints-from-code tool that takes a list of hints to remove.


Given the concerns a semi-serious proposal could be:
implementing hints as nops in the code - basically a branch hint could be encoded as { i32.const X; drop;}.
This is valid WebAssembly that any tool / engine is required to handle now, but a given engine might know that, instead of just eliminating this unnecessary couple of instructions, it can interpret as branch-hint. (it can be slightly more general with additional nop around to code what kind of hints it is, eg. {i32.const 7; drop} encodes "a branch hint is coming" and then {i32.const X; drop;} encodes the direction of the hint)

@rossberg
Copy link
Member

rossberg commented Dec 2, 2020

@carlopi, yes, engines are required to accept a module even with custom sections are unknown or malformed. Of course, an engine is still free to emit diagnostics in that case. Similar with other tools.

In general, that is as it's supposed to be. Keep in mind that the sole purpose of validation is to protect the integrity of the engine, with as little work as possible. It is not meant to verify the general correctness of tools producing Wasm code! If a tool creates broken performance hints, that's not a risk to the engine, so does not matter to the engine.

In practice, it's of course not all black and white, but generally speaking, the less all(!) engines are forced to validate the better.

Your proposal is kind of cute, btw. However, note that it does not have (and cannot have) stricter validation requirements than a custom section. An engine obviously has to allow and ignore malformed hints (such as unknown values for X), otherwise it would be rejecting valid code. So it would not actually address your concern, AFAICS.

@yuri91
Copy link
Collaborator Author

yuri91 commented Dec 3, 2020

I have given some thought to the custom section idea and I may have changed my mind.

One thing that I didn't think about initially is that it would allow us to ship one version of the code with the custom section for hints to all browsers, even if they don't support it yet.

I also looked at the closest thing that exists right now (compilationHints, only used by V8 AFAIK), and I think that adding something similar for branch hinting probably isn't very complicated.

@yuri91
Copy link
Collaborator Author

yuri91 commented Dec 3, 2020

This is a sketch of how the section could be:

First, an integer specifying the number of functions with hinting.

Then, a list of entries each describing the hints for a function.

An entry contains the index of the function in question, an integer representing the number of hints, and a list of hints.

For encoding the hint entry, there are several possibilities:

  • a pair of (byte_offset, hint_direction). The downside I see is that a byte offset is wasteful size-wise and just adds more validation to be done.
  • a pair of (instruction_index, hint_direction), with instruction_index only counting if/br_if instructions.
  • just hint_direction, and require to list all if/br_if instruction, potentially with a hint_direction of "no hint". This may or may not be shorter than the sparse version with the index, depending on how many branches are hinted in the function.

I will try to experiment with this approach in V8 a bit.

@yuri91
Copy link
Collaborator Author

yuri91 commented Dec 9, 2020

I played a bit with V8 and I am convinced to go for the custom section route now.

My POC is not more complex than the previous version with the new instruction. It is arguably simpler.

What I have right now, if someone wants to play with it, is here: https://github.com/yuri91/v8/tree/custom_section.

I also wrote a small tool to add the custom hint section to a wasm module, mostly for test purposes. Right now it just hints that all if and br_if are likely false. (here https://github.com/yuri91/custom_hint_section/tree/master).

There are a few things left to decide:

  • how strict should the validation be.
  • Where to put the section: streaming compilers probably want it before the code section. Not sure if there are other constraints.
  • How much future-proof the encoding should be.

@yuri91
Copy link
Collaborator Author

yuri91 commented Feb 15, 2021

I am closing this since I think that after the last meeting in which this was discussed there seemed to be consensus around proceeding with the custom section.

The Overview has an updated description of the current format of the section, and I also updated the spec text.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants