Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential out-of-bounds access of yy_nxt #627

Closed
szhorvat opened this issue Feb 23, 2024 · 7 comments
Closed

Potential out-of-bounds access of yy_nxt #627

szhorvat opened this issue Feb 23, 2024 · 7 comments

Comments

@szhorvat
Copy link

szhorvat commented Feb 23, 2024

When I use the -Cf option with a certain scanner, I get the following warning from UndefinedBehaviourSanitizer:

runtime error: index 208 out of bounds for type 'const flex_int16_t[128]' (aka 'const short[128]')

The warning is triggered on a line like this:

while ( (yy_current_state = yy_nxt[yy_current_state][ YY_SC_TO_UI(*yy_cp) ]) > 0 )

It seems that even though yy_nxt is defined as flex_int16_t yy_nxt[][128], it is accessed as yy_nxt[i][208], with an index beyond 127. This does not actually lead to a crash, as yy_nxt[i+1] also happens to contain data, so yy_nxt[i][208] is effectively equivalent to yy_nxt[i+1][80].

Is this quirky way of indexing intentional, or is it a bug in the code that Flex generates?


Flex version is 2.6.4. The scanner sources are at https://github.com/igraph/igraph/blob/f5a2860995dc85643b4d63f47ff0aa26515dc1f8/src/io/gml-lexer.l

This happens only when I use the -Cf option, and pass specific data to the scanner. It happens to perform very badly on this data without using this option.

@Mightyjo
Copy link
Contributor

Mightyjo commented Feb 23, 2024

Great question. Flex scanners can do a bunch of things that irritate Clang. This is one of the weirder ones.

Flex allocates an extra row in several of its internal tables so it can use some pointer math shortcuts without clobbering memory. The scanners switch between treating yy_nxt as a 1D and 2D array without much concern. It's functionally correct and even safe thanks to the extra table padding, but it makes strict type checking mad. Took me ages to figure out why it's written this way.

@szhorvat
Copy link
Author

So, in short, not a bug? I did not notice any incorrect behaviour. I just need to make sure that I can safely use -Cf despite this UBSan warning.

@westes
Copy link
Owner

westes commented Feb 23, 2024

Compiler warnings are not errors.

@szhorvat
Copy link
Author

This is not a compiler warning. It is an issue detected at runtime by UndefinedBehaviourSanitizer.

@westes
Copy link
Owner

westes commented Feb 23, 2024

Flex doesn't always follow Strunk & White. Generic warnings that code isn't pretty are just that. We know. At some point we may clean this kind of thing up, but we'd need a ton of test cases to ensure that flex will still do what flex has always done.

If you have an example where flex behavior is incorrect, please post a test case.

@szhorvat
Copy link
Author

@westes Could you please address the specific situation I describe? I am concerned that you may have misunderstood my question as being a request to live up to some arbitrary code quality standards. It is not that. Flex generated code is accessing a statically allocated array in the following way:

int arr[2][2] = {
    {1,2},
    {3,4}
};

arr[0][3]; // notice the index 3 going above the upper bound of 2

I have only seen this in one very specific situation even though I regularly test this code with UndefinedBehaviourSanitizer, which at least makes this behaviour suspicious.

My question is simply: Are you aware of doing this type of non-standard indexing intentionally within Flex?

@westes
Copy link
Owner

westes commented Feb 24, 2024

Please reread @Mightyjo's comment very carefully. He addresses your concern precisely and thoroughly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants