ATN optimization: removing of redundant epsilon transitions and related basic states #3676
base: dev
Conversation
It looks like the behavior of the JavaScript runtime differs from the other ones, because some tests are failing. I'll investigate.
Force-pushed from 07b1071 to 9e7d74e
It was definitely a bug in the JavaScript runtime. I've fixed it in a separate commit.
Force-pushed from 23c9591 to af06ddf
Now all tests are green; @parrt, it's ready for review :) @mike-lischke, maybe it's interesting for you as well.
Hiya. Thanks for looking into this but I'm just not sure we should be messing around with an alg/data structure that took me forever to understand and now I have to remember everything to see if this is OK. Shouldn't I be working on fixing bugs in the IDE? I only have a certain amount of time so I'm just trying to optimize.
Actually, I was thinking about removal of useless states since I had the first graphical visualization in my VS Code extension years ago. Also I have seen a number of unused ATN states, but that might be caused by some optimization in the antlr4ts runtime. Just check the ATN for states that have no outgoing transitions. @parrt I don't think there's a significant risk by removing useless intermediate states. We have our test suites in place which confirm everything is still working as it should. @KvanTTT Regarding loop start and end states: I use these special states to handle loops in my sentence generator, so I'd prefer them to stay.
```java
transitions.remove(i);
recalculateEpsilonOnlyTransitions();
if (epsilonOnlyTransitions != null && epsilonOnlyTransitions != e.isEpsilon()) {
    System.err.format(Locale.getDefault(), "ATN state %d has both epsilon and non-epsilon transitions.\n", stateNumber);
```
Why is it useful to print this information? It's a pretty common situation, I'd say, no?
Should the `set` method be consistent with the `add` method, which has such a check? https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/atn/ATNState.java#L166 I think yes; that's why I added the check to the `set` method.
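For context, a minimal sketch of the kind of invariant being discussed: a state either has only epsilon outgoing transitions or only non-epsilon ones, and adding a transition of the other kind trips the check. The `SimpleState` type and its fields here are hypothetical illustrations, not the real `ATNState` API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of an ATN-like state that tracks whether all outgoing
// transitions are epsilon transitions. Names are illustrative only.
class SimpleState {
    final int stateNumber;
    final List<Boolean> transitionIsEpsilon = new ArrayList<>();
    Boolean epsilonOnlyTransitions = null; // unknown until the first transition

    SimpleState(int stateNumber) { this.stateNumber = stateNumber; }

    // Returns true if the new transition keeps the state consistent
    // (all-epsilon or all-non-epsilon); false if it mixes kinds.
    boolean addTransition(boolean isEpsilon) {
        boolean consistent = epsilonOnlyTransitions == null
                || epsilonOnlyTransitions == isEpsilon;
        if (epsilonOnlyTransitions == null) {
            epsilonOnlyTransitions = isEpsilon;
        }
        transitionIsEpsilon.add(isEpsilon);
        return consistent;
    }
}
```

Keeping the same check in both the adding and the replacing path is the consistency question raised above.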
I checked the patch to see if I can find any obvious problems, but did not find any. Reviewing the code functionality is beyond me, however. I guess Ter has to take the burden to do that.
tool/src/org/antlr/v4/automata/optimization/ATNOptimizerHelper.java
tool/src/org/antlr/v4/automata/optimization/RedundantEpsilonRemover.java
I consider this a quite important optimization since it affects all generated parsers. Also, I hope I wrote quite understandable code that is easy to review. Is it possible to review the PR step by step? There is no need to merge it ASAP.
Interesting. I haven't investigated this.
Yes, it looks like removing basic states is safe because they aren't used in parsing decisions. Moreover, we already have similar epsilon-removing optimizations in
My implementation resolves all these problems. Also, I didn't change the code of the runtimes except the Java interpreter and one bug fix in JS.
Yes, it should be enough.
I didn't mention start states; they shouldn't be removed. But only
Overall this seems fine and a good idea. However, I am not sure whether the closure algorithm relies on this implicitly in some way. At first glance it doesn't seem like it does. Getting this in for 4.11 seems appropriate, which gives us a while.
Do you mean basic states or end states? Basic states are used only for helper purposes, for merging or branching transitions. End states are used in closure algorithms; at least it's considered during deserialization: https://github.com/antlr/antlr4/blob/dev/runtime/Java/src/org/antlr/v4/runtime/atn/ATNDeserializer.java#L402 I think it can be fixed because start states contain all the necessary info, but it requires changes in the runtimes. I've decided to postpone it at least until the current PR is merged.
From a theory perspective, this seems fine and should be safe.
No. Out of curiosity, how did you generate the diagrams? The
I don't understand: is only the first one safe, or should the second also be safe to remove?
Yes. Also, it's possible just to use the following code in an arbitrary place:

```java
DOTGenerator gen = new DOTGenerator(g);
String dot = gen.getDOT(atn.states.get(2), g.getRuleNames(), false);
```

The string can be pasted into VS Code and rendered instantly using a Graphviz plugin.
Hiya! What problem are we trying to solve here? The risk/reward here is unclear and I have 1 billion other open source projects that require my time. (I'm waiting for a video chat to start regarding my decision tree visualization tool with a colleague in Romania.) I can't possibly accept any change to the ATN structure without a thorough investigation. There could be other tools that rely on that structure etc. And you are asking me to risk screwing up one of the most complicated algorithms I've ever had to work on (ALL(*)). If this were 2013, I would have all of that in my memory and could quickly evaluate whether this is worth the risk. (Actually I'd rely on @sharwell to do the thinking there ha ha.) Seems like I should be working on the plugin, which is what I'm focusing on at the moment. I'm not sure this is the best way to spend my limited time capital. If you guys see this as solving a real problem, then I'm happy to spend the time investigating. I think though at this point we are chasing a metric; remember that chasing a low blood pressure metric causes death after a while haha
I get really scared when I think it could even possibly impact things like #1398
My fork does optimization of the ATN at runtime instead of at generation time (applies to antlr4ts as well): This allows the serialized ATN to retain its accurate reflection of the behavior described by the grammar, but the runtime to still avoid unnecessary transitions during closure. Graphs produced by
Thanks for your input @sharwell, as always. Guys, if it makes Sam nervous, it definitely makes me nervous! haha.
Improving the quality and performance of ANTLR, especially considering the current PR also fixes some potential bugs that I've described above. For instance, for
@mike-lischke is one of the developers of such tools. He said he didn't find any obvious problem.
Not all changes have a big effect; quality comes from a lot of small improvement steps. Also, a 30% size improvement is not small (in the new Kotlin compiler we fight for every second on big projects).
We have several tests that cover the case here and run fast: https://github.com/antlr/antlr4/tree/dev/runtime-testsuite/resources/org/antlr/v4/test/runtime/descriptors/Performance. Also, I can write more benchmark tests if necessary, and we can try to test them on our large set of grammars.
But it's not the complete solution; there is no need to encode useless states and transitions into the ATN.
Maybe it makes sense to publish RC versions for testing on real users. BTW, I can help with repository maintenance. I'm working on the Kotlin compiler, and that project is much more complicated than ANTLR, at least for me. I don't completely understand all ANTLR algorithms and details, but it took a week to study the ATN structure and write the new optimizer. As a next step I'm going to investigate ALL(*) more deeply. I just wanted to consolidate our efforts, but it looks like you have very limited time, so it might make sense to develop my own ANTLR fork.
Why don't we just put the optimization implemented in this pull request behind a command line flag? Something like
Note though, I have only checked the structure of the patch, not the functionality.
I agree; when the tests indicate that the behaviour is still what is expected (and that's their entire purpose), then why not use the patch? If all fails, it can be reverted easily.
Good idea, if we can make the release automatic. Currently there's still too much manual work involved.
@sharwell While we are talking: do you plan to update antlr4ts to use the changed ATN serialization introduced in 4.10? Would you consider getting antlr4ts into the main ANTLR4 repo as just another target, so that we can use just one tool to generate code for all targets?
Actually, the optimizer also relinks several incoming transitions if the outgoing one is epsilon (the old TailEpsilonRemover just cut all incoming transitions except one; it's not correct, but it doesn't lead to an error because these are actually redundant links from the end state, and there is no info about incoming transitions in the old implementation). Epsilon transitions also may have labels (
Yes, on second thought I realized this and hence removed that comment.
hi guys, thanks for the comments and all of your continued efforts! I appreciate that this reduces the size of the lexer but I don't think I've ever heard a complaint about .class file size. I would be much more interested in knowing how it affects lexing performance across a variety of problems and across targets. If I'm not mistaken, we've gotten massive improvements in performance just by optimizing the runtime libraries and a little bit of the code generation; e.g., work done by @mike-lischke and @jcking on C++ (soon Go). Seems like that's easier and safer than messing with the critical secret sauce of ANTLR. What I'm hearing is that you want me to accept a change to the secret sauce written by someone who admits they do not understand the algorithm fully. On the other side, you have the two authors recommending caution. We should all know our limitations and I currently do not have the algorithm in my head and all of the subtle details that we obsessed over for years. I just don't see how the risk/reward equation suggests integrating this. I don't know about you, but I don't have confidence that we've covered everything in our unit tests. I'm not saying this PR is incorrect, but it would take me a lot of work to evaluate and I don't know if it's worth the effort given other things I'd like to work on. I'd rather focus on getting a typescript target, improving Go target, improving the plug-in, working on maybe plugins for other tools. I don't understand this obsession with the ATN. Regarding potentially wasted effort, @KvanTTT, I believe you will recall me suggesting that you coordinate with me before you spend a week on something I might not accept. I hope you will agree that my leadership over literally 35 years on this project has objectively been good. I don't always make the right decisions but I hope you will admit that I exercise excellent judgment, given my plurality of successful open source projects across languages and machine learning.
Let's leave this potentially valuable optimization open for when the algorithm has been reloaded in my mind from cold storage. I've only had a few months to restart my ANTLR efforts after ignoring it for six or seven years.
It intentionally preserves extra information, which is valuable in cases where tools either manipulate the ATN at runtime for various reasons or use the ATN information to produce graphs representing the rule structure/relations.
The antlr4ts target is based on my optimized Java fork, so it's not compatible with the shared code generation approach. The antlr4ts-tool package contains the matching code generator for this target.
My optimized forks (Java, C#, and TypeScript) have a stronger backwards compatibility requirement than the main project. They will only be upgraded if the changes are meaningful in the context of that specific target and can be performed with strict adherence to backwards compatibility. Changes which impact compatibility (e.g. any change which would require an author to regenerate their grammar in order to execute against the new runtime) will be reverted during the update.
Ok, it makes sense to write some benchmark tests.
Maybe, but optimizing the ATN also has an advantage: it's applicable to all targets, not just a concrete one.
Not obsession, just step-by-step improvement of the tool.
I don't regret the wasted time; I'd still have implemented it anyway out of curiosity. By pointing out the time period I just emphasized the fact that the structure of the ATN is not so complicated (ok, maybe I don't know about all the details). Also, it's not so easy and meaningful to describe the optimization without touching the code. After I had touched it, it was too late to start with an issue.
So be it.
I started implementing the optimization because of bulky graphs in VSCode. We can estimate runtime behavior with performance tests (using JMH, for instance?).
@KvanTTT sounds great. Thanks for your understanding. (BTW, "bulky graphs in VSCode" actually does sound like a good motivation.)
@KvanTTT I was wondering about the new 15-state ATN (in your opening comment): why are states 11 and 12 not removed and replaced with a single transition for the non-terminals C and D? Overall, I think tightening up the ATN might help in both performance and readability. But the biggest factor in performance, at least in the parser and from what I've seen in grammars-v4, is whether the grammar has ambiguities or doesn't use Kleene operators.
Because
Generally, it looks like the ATN can be converted to a DFA within one rule (no epsilon transitions and no duplicate transitions). It should improve performance, but it might require a lot of changes in the runtimes. Also, it might significantly increase the number of states (but only for a small number of cases, I think).
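The per-rule determinization idea above is essentially classic subset construction. A minimal sketch on a toy epsilon-free NFA (integer states, char labels, all names here illustrative rather than ANTLR's actual ATN types):

```java
import java.util.*;

// Toy subset construction: converts an epsilon-free NFA into a DFA whose
// states are sets of NFA states. Illustrates the idea only; the real ATN
// has many state kinds and determinizing it would need runtime changes.
class SubsetConstruction {
    // nfa.get(s).get(c) = set of target states reachable from s on label c
    static Map<Set<Integer>, Map<Character, Set<Integer>>> determinize(
            Map<Integer, Map<Character, Set<Integer>>> nfa, int start) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new HashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        work.push(Set.of(start));
        while (!work.isEmpty()) {
            Set<Integer> current = work.pop();
            if (dfa.containsKey(current)) continue; // already determinized
            Map<Character, Set<Integer>> moves = new HashMap<>();
            for (int s : current) {
                for (Map.Entry<Character, Set<Integer>> e :
                        nfa.getOrDefault(s, Map.of()).entrySet()) {
                    moves.computeIfAbsent(e.getKey(), k -> new HashSet<>())
                         .addAll(e.getValue());
                }
            }
            dfa.put(current, moves);
            for (Set<Integer> target : moves.values()) work.push(target);
        }
        return dfa;
    }
}
```

As noted above, the DFA can blow up in state count in the worst case, which is part of the trade-off being discussed.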
This sometimes produces a higher performing grammar, but sometimes does not. It can also have weird/unexpected side effects like changing the lookahead distance and recovery characteristics for certain syntax errors.
Let's move the discussion about inlining to the separate issue: #3685
I've completed some benchmarks on some grammars (from grammars-v4) and inputs (from the examples subdirs). I used the JMH library for JVM benchmarking (with warming up).
Old ATN
New ATN
Info about the lexer-only grammar is not very relevant because it's very fast anyway, and it looks like an error. Here is the full testing code: GrammarBenchmarks.java. I've concluded that the optimized ATN works up to
Nice! It's a win--minor but everything counts. I still think you can minimize more states by eliminating all epsilon closures for an even simpler NFA.
It's not so minor. In the Kotlin compiler that I currently work on, 1% is already a win :)
I was thinking about it, but it could lead to a lot of changes in the runtimes (I doubt @parrt will even review it, especially considering the situation with this safer request). Currently there are a lot of states of different kinds; they could be unified. In theory, it looks like every rule could be transformed into a DFA or almost a DFA.
heh, great! Nice work, @KvanTTT. Can you tell me how much input you ran through the various grammars? I couldn't see anything obvious in the code looking on my phone before I race off to work. It's very possible that a reduction in the size of the ATN gives such a speed improvement for small input, but remember that the speed comes from the DFA cache, which has to warm up. (This is the reason we are interested in preloading it.) The ATN size literally does not matter anymore once we have the DFA warmed up, so I'd rather focus on warmup or whatever. On the other hand, as you pointed out, there is motivation to reduce the size of ATNs for incorporation into development environments. I would have to relearn how all of this stuff works in great detail, however, before I could accept the change, as we have discussed.
All samples from
I included a warmup iteration (for DFA cache building); this time is not included in the result. By the way, the warmup time is higher, but not by much (+10%). Moreover, as I understand, warmup time is also important, especially for C++.
Not sure it's completely true. At least the lexer calls
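JMH is the right tool for rigorous JVM benchmarking; as a crude illustration of the warmup-then-measure idea mentioned above (run warmup iterations so JIT compilation and, for ANTLR, the DFA cache settle, then time only the measured iterations), a sketch with hypothetical names, not a JMH substitute:

```java
// Crude warmup-then-measure harness. Warmup runs are executed but excluded
// from the reported time; JMH does this far more rigorously (forks,
// statistical analysis, dead-code elimination protection).
class MiniBench {
    static long averageNanos(Runnable task, int warmupIters, int measureIters) {
        for (int i = 0; i < warmupIters; i++) {
            task.run(); // warmup: excluded from timing
        }
        long start = System.nanoTime();
        for (int i = 0; i < measureIters; i++) {
            task.run(); // measured iterations
        }
        return (System.nanoTime() - start) / measureIters; // avg ns per call
    }
}
```

A harness like this can show the warmup-versus-steady-state split, but trusting its absolute numbers would be a mistake; that is what JMH exists for.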
Force-pushed from af06ddf to 63b9867
Here is the table of ATN states number comparison for Java grammar:
Excellent!
Also, it should decrease memory usage, since the deserialized ATN also takes less memory; see #3765
* Introduced `RedundantEpsilonRemover` that performs the following ATN optimizations (described below):
  * Removing of single incoming epsilon transition
  * Removing of single outgoing epsilon transition with several incoming transitions
* Removed all optimizations from `ParserATNFactory` since they are useless, not fully correct, and are performed in the separate optimization step
* Introduced `ATNOptimizerHelper` that does the following:
  * Calculates incoming transitions that are used in ATN optimizers
  * Tracks replacement of states that are being removed during optimization. Old states are replaced by the new ones in the final step (`updateAstNodes`)
  * Compresses the array of ATN states (removes null items after previous optimization steps, `compressStates`)
* Fixed the previous `ATNOptimizer` and renamed it to `SetMerger`. Now it considers incoming transitions and `ATNOptimizerHelper` accurately tracks replacements.

The implemented optimizations decrease the ATN size, especially for lexers, and should improve performance of generated parsers because of the decreased number of method calls. Also, they don't affect runtime code except the interpreter part (which is buggy anyway).

Signed-off-by: Ivan Kochurkin <[email protected]>
Force-pushed from 63b9867 to 6e90d17
Removing of single incoming epsilon transition
Input
Output
Removing of single outgoing epsilon transition with several incoming transitions
Input
Output
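The two transformations can be sketched on a toy graph. The following is an illustrative model only (integer states, `null` label meaning epsilon), not the real `RedundantEpsilonRemover` code: if a state has exactly one outgoing transition and it is a lone epsilon, the state is bypassed by relinking all of its incoming transitions to the epsilon's target.

```java
import java.util.ArrayList;
import java.util.List;

// Toy epsilon-bypass over a graph of (from, label, to) edges, where a null
// label denotes an epsilon transition. Illustrative names, not ANTLR's API.
class EpsilonBypass {
    record Edge(int from, String label, int to) {}

    // If state s has exactly one outgoing edge and it is epsilon to t,
    // relink every incoming edge of s to t and drop s; otherwise no-op.
    static List<Edge> bypass(List<Edge> edges, int s) {
        List<Edge> out = edges.stream().filter(e -> e.from() == s).toList();
        if (out.size() != 1 || out.get(0).label() != null) {
            return edges; // not a removable state; leave the graph unchanged
        }
        int target = out.get(0).to();
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from() == s) continue;                       // drop the epsilon edge
            if (e.to() == s) {
                result.add(new Edge(e.from(), e.label(), target)); // relink incoming
            } else {
                result.add(e);
            }
        }
        return result;
    }
}
```

For example, with edges 0 -a-> 2, 1 -b-> 2, and 2 -eps-> 3, bypassing state 2 yields 0 -a-> 3 and 1 -b-> 3, matching the second optimization's input/output shape above.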
Example
Old ATN
Number of states: 23
New ATN
Number of states: 15
Plans
It looks like `BlockEndState` (8, 14) and `LoopEndState` (10) can also be safely removed in the same way, since they are not used in decisions. But it might require small changes in the runtimes.