Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warnings ACTION_SHOULD_BE_PLACED_AFTER_PREDICATES, PREDICATE_AT_THE_BEGINNING_OF_LEXER_RULE_DEGRADES_PERFORMANCE and misc fixes #3626

Draft
wants to merge 4 commits into
base: dev
Choose a base branch
from

Conversation

KvanTTT
Copy link
Member

@KvanTTT KvanTTT commented Apr 5, 2022

Do not squash

@KvanTTT KvanTTT marked this pull request as draft April 6, 2022 15:04
@KvanTTT KvanTTT marked this pull request as ready for review April 7, 2022 14:56
@parrt
Copy link
Member

parrt commented Apr 7, 2022

Hmm...having second thoughts about adding this guys, @KvanTTT and @kaby76. Actions are never executed in lexer before predicates if I remember. The rule has to match lexically and preds before it's committed to match THEN actions would exec I think, right?

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 7, 2022

Actions are never executed in lexer before predicates if I remember. The rule has to match lexically and preds before it's committed to match THEN actions would exec I think, right?

Correct. It's also actual for subrules and closures. Moreover, warning should be thrown in closures if actions and predicates are used inside it independence of order. The full example for clarity:

lexer grammar T;
@members {String s = "";

// Action
void a(String x) {
    s += x;
}

// Predicate
boolean p() {
    return !s.equals("");
}}

// No warnings
OR: 'or1' {p()}? {a("or1");} | 'or2' {p()}? {a("or2");};
NO_PREDICATE: 'no_predicate' {a("no_predicate");};
PREDICATE: 'predicate' {p()}? {a("predicate");};
ACTIONS: 'actions' {a("action1");} {a("action2");};
CLOSURE: ('closure' {a("closure");})+;
OPTIONAL: 'optional' ('_' {a("optional");})?;
USE_FRAGMENT: 'f1_' FRAGMENT {a("fragment");};
RECURSIVE : '/*' {p()}? (RECURSIVE | .)*?  {a("recursive");}'*/';

// Warnings
OR_W: 'or1' {a("or1_w");} {p()}? | 'or2' {a("or2_w");} {p()}?;
PREDICATE_W: 'predicate_w' {a("predicate_w");} {p()}?;
CLOSURE_W1: ('closure_w1' {a("closure_w1");} {p()}?)+;
CLOSURE_W2: ('closure_w2' {p()}? {a("closure_w2");})+;
OPTIONAL_W: 'optional_w' ('_' {a("optional_w");})? {p()}?;
USE_FRAGMENT_W: 'f2_' {a("fragment_w");} FRAGMENT;
HIERARCHY_W: 'hierarchy_w' {a("hierarchy_w");} ('_1' {p()}? | '_2');
RECURSIVE_W : '/*' {a("recursive_w");} (RECURSIVE_W | .)*? {p()}? '*/';

fragment FRAGMENT: 'fragment' {p()}?;

WS: [ \r\n]+;

@KvanTTT KvanTTT changed the title ACTION_SHOULD_BE_PLACED_AFTER_PREDICATES warning and misc fixes Warnings ACTION_SHOULD_BE_PLACED_AFTER_PREDICATES, PREDICATE_AT_THE_BEGINNING_OF_LEXER_RULE_DEGRADES_PERFORMANCE and misc fixes Apr 9, 2022
@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 9, 2022

I've simplified code and added a new warning PREDICATE_AT_THE_BEGINNING_OF_LEXER_RULE_DEGRADES_PERFORMANCE, see #3633. Full test sample:

lexer grammar T;

@members {int pCount = 0;
int aCount = 0;

boolean p(String s) {
    System.out.println("predicate(" + s + ", " + pCount++ + "); ");
    return true;
}

void a(String s) {
    System.out.println("action(" + s + ", " + aCount++ + "); ");
}}

// No warnings
SIMPLE_OK: 'SIMPLE_OK' {p("SIMPLE_OK")}?;
PREDICATES_OK: 'PREDICATES_' {p("PREDICATES_OK_1")}? {p("PREDICATES_OK_2")}?;
HIERARCHY_OK: ('HIERARCHY_OK_1' {p("HIERARCHY_OK_1")}? | 'HIERARCHY_OK_2' {p("HIERARCHY_OK_2")}?) '_';
USE_FRAGMENT_OK: 'USE_' FRAGMENT {p("USE_FRAGMENT_OK")}? '_FRAGMENT_OK_';
USE_FRAGMENT_2_OK: FRAGMENT_2 {p("USE_FRAGMENT_2_OK")}? '_';
RECURSIVE_OK: '/*' {p("RECURSIVE_OK")}? (RECURSIVE_OK | .)*? '*/';
MUTUALLY_RECURSIVE: MUTUALLY_RECURSIVE '_'; // Prevent endless recursion

// Warnings
SIMPLE: {p("SIMPLE")}? 'SIMPLE_';
OR: {p("OR_1")}? 'OR_1_' | {p("OR_2")}? 'OR_2_';
CLOSURE: ({p("CLOSURE")}? 'CLOSURE_')+;
ACTION: {a("ACTION");} {p("ACTION")}? 'ACTION'; // Warning since actions are executed after all predicates
PREDICATES: {p("PREDICATES_1")}? {p("PREDICATES_2")}? 'PREDICATES_'; // Two warnings
OPTIONAL: ({p("OPTIONAL")}?)? 'OPTIONAL_';
HIERARCHY: ('HIERARCHY_1' | {p("HIERARCHY_2")}? 'HIERARCHY_2') '_';
USE_FRAGMENT: FRAGMENT {p("USE_FRAGMENT")}? '_'; // Warn about predicate in fragment but not about predicate in rule
USE_FRAGMENT_2: FRAGMENT {p("USE_FRAGMENT_2")}? '_';
RECURSIVE: {p("RECURSIVE")}? '/*' (RECURSIVE | .)*? '*/';
USE_RULE: {p("USE_RULE")}? SUB_RULE;
SUB_RULE: {p("SUB_RULE")}? 'SUB_RULE_';

fragment FRAGMENT: {p("FRAGMENT")}? 'FRAGMENT';
fragment FRAGMENT_2: 'FRAGMENT_2' {p("FRAGMENT_2")}?;

@parrt
Copy link
Member

parrt commented Apr 9, 2022

I'm going to need some time to think through this very carefully so I will push past 4.10.

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 9, 2022

Makes sense. Also I'd recommend using the following action and predicate to check how and when they are executed:

int pCount = 0;
int aCount = 0;

boolean p(String s) {
    System.out.println("predicate(" + s + ", " + pCount++ + "); ");
    return true;
}

void a(String s) {
    System.out.println("action(" + s + ", " + aCount++ + "); ");
}

For instance, for the following grammar

A: {p("A")}? 'A';
B: 'B';

and input AB, the following output is printed:

predicate(A, 0); 
predicate(A, 1); 

It points out the predicate was called two times.

Also, take a look at the output of the grammar A: {a("A");} {p("A")}? 'A'; and input A. It prints predicate(A, 0); action(A, 0);. Hence the predicate was called before the action not how it's written.

@parrt
Copy link
Member

parrt commented Apr 10, 2022

If I remember correctly, the first action is special in that it is considered an initialization action to you can define variables. or, am I thinking about the previous version of antlr?

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 10, 2022

The first action in the lexer rule is executed after rule matching and after all predicates as we've discussed before. But the first predicate in the lexer rule is executed during matching of every token of lexer that significantly affect performance.

Maybe you mean Rule Attribute Definitions that allow variables initialization, but my checks are actual only for lexer unlike these attributes that are actual only for parser rules.

@parrt
Copy link
Member

parrt commented Apr 10, 2022

Ok, cool. We should definitely look at this carefully after release. too bad that the build system is all screwy. Even circleci is now apparently not running our jobs. github actions changed their API and @ericvergnaud had to rebuild things... not sure what's up with circleci. that was working last night.

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 10, 2022

too bad that the build system is all screwy. Even circleci is now apparently not running our jobs. github actions changed their API and @ericvergnaud had to rebuild things... not sure what's up with circleci. that was working last night.

Yes, build system configuring is complicated and tedious. I've encountered with it on the previous work.

@parrt
Copy link
Member

parrt commented Apr 10, 2022

After I gave them some money they started building again on circleci :) Looks like everything is green over there https://app.circleci.com/pipelines/github/antlr/antlr4 just running the last test

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 10, 2022

Maybe it makes sense to request for extended free plan since ANTLR is valuable free and open-source project.

@parrt
Copy link
Member

parrt commented Apr 10, 2022

a good idea. once 4.10 is out i can ask.

@KvanTTT
Copy link
Member Author

KvanTTT commented Apr 24, 2022

It looks like it should be implemented on the ATN level, not on Grammar AST and it should be even more easily (I thought ANTLR changes the order of actions and predicates before ATN constructing but it's not true). I'm changing status to the draft...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants