Syntactic requirements of the hinting section #2

lars-t-hansen · 2021-02-16T07:55:32Z

This was covered somewhat in #1, but it bears discussing again: Is it enforcable to create additional (essentially syntactic) requirements on the branch hinting section, given that the section is optional? For general engine simplicity and uniformity across implementations, my preference would be something like these:

function sections are sorted in increasing function index order
function indices are not repeated
hints are sorted in increasing offset order within each function section
hint offsets are not repeated
every offset in the hint section references a branch instruction

Failing validation would be desirable if the rules are violated, but is probably not possible. But by creating the rules, we make it more explicit when tools can discard the section and when engines can ignore it, and processing the section is simpler and faster.

(It's fine to do without these rules, but having them seems better than not having them. Put the complexity in the tool, not in the VM.)

yuri91 · 2021-02-16T09:08:24Z

The first 4 requirements seem straightforward to put in the spec. For example in the custom name section there are similar requirements for the name maps (https://webassembly.github.io/spec/core/appendix/custom.html#name-maps):

An indirect name map assigns names to a two-dimensional index space, where secondary indices are grouped by primary indices. It consists of a vector of primary index/name map pairs in order of increasing index value, where each name map in turn maps secondary indices to names. Each primary index must be unique, and likewise each secondary index per individual name map.

For the last one I wonder what the interaction with streaming compilation would be.
Currently we say that the custom branch hints section should appear before the code section, so the hints are available during the parsing of the code section.

But suppose that the first N hints are "correct", and then the N+1 hint has a wrong byte offset. We discover this only after some code has already compiled, and the hints section has already been validated. What should happen exactly?

My idea would be to allow any byte offset, but to ignore the ones that don't point to a valid branch instruction. This has also the side effect of being forward compatible with the expansion of valid hintable instructions.

Is this solution undesirable from an engine perspective?

lars-t-hansen · 2021-02-16T10:55:11Z

The last condition is not too hard to validate, but it must happen a little later. Our streaming compiler at least has access to the full bytecode for a function before it starts compiling the function. If we wanted to, and if there is a hints section for a function, we could traverse the hints section in one pass at the start of the compilation for the function and for each offset look up the bytecode in the function and verify that it is a branch. This would happen on the compilation thread, so it would parallelize nicely.

(In general, we perform bytecode validation in the streaming compiler as we read bytecodes during compilation - there's no separate validation pass - and bailing out on error is not an issue. But I think the hints are best validated up front, if we are going to validate them. Which we won't be able to, if it's a custom section.)

I worry most about differences between engines and tools resulting from a too-flexible format, not about implementation efficiency, though that counts too. Rules, even ones that don't let us reject a module, will help us avoid that because they will make the producers produce unambiguous output. Tools like wabt could warn or even error out on malformed data.

See #2

yuri91 · 2021-03-15T09:16:40Z

@lars-t-hansen does #10 sufficiently address this issue?

lars-t-hansen · 2021-03-15T09:21:36Z

Yeah, looks OK.

yuri91 · 2021-03-15T09:24:38Z

Thanks, I am closing this issue then.

See #2

yuri91 mentioned this issue Mar 1, 2021

Proposal motivation #5

Closed

yuri91 added a commit that referenced this issue Mar 8, 2021

Add syntactic requirements to the branchHints section.

8816e07

See #2

yuri91 mentioned this issue Mar 12, 2021

Add syntactic requirements to the branchHints section. #10

Merged

yuri91 added a commit that referenced this issue Mar 15, 2021

Add syntactic requirements to the branchHints section.

c0e8301

See #2

yuri91 added a commit that referenced this issue Mar 15, 2021

Add syntactic requirements to the branchHints section. (#10)

d954054

See #2

yuri91 closed this as completed Mar 15, 2021

yuri91 added a commit that referenced this issue Nov 3, 2021

Add syntactic requirements to the branchHints section. (#10)

0658931

See #2

yuri91 added a commit that referenced this issue Feb 8, 2022

Add syntactic requirements to the branchHints section. (#10)

b95bdc4

See #2

yuri91 added a commit that referenced this issue Feb 21, 2022

Add syntactic requirements to the branchHints section. (#10)

36a0550

See #2

yuri91 added a commit that referenced this issue Nov 1, 2022

Add syntactic requirements to the branchHints section. (#10)

2862b0d

See #2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Syntactic requirements of the hinting section #2

Syntactic requirements of the hinting section #2

lars-t-hansen commented Feb 16, 2021

yuri91 commented Feb 16, 2021 •

edited

Loading

lars-t-hansen commented Feb 16, 2021

yuri91 commented Mar 15, 2021

lars-t-hansen commented Mar 15, 2021

yuri91 commented Mar 15, 2021

Syntactic requirements of the hinting section #2

Syntactic requirements of the hinting section #2

Comments

lars-t-hansen commented Feb 16, 2021

yuri91 commented Feb 16, 2021 • edited Loading

lars-t-hansen commented Feb 16, 2021

yuri91 commented Mar 15, 2021

lars-t-hansen commented Mar 15, 2021

yuri91 commented Mar 15, 2021

yuri91 commented Feb 16, 2021 •

edited

Loading