topdown: fix re-wrapping of ndb_cache errors #5543

@srenatus srenatus commented Jan 5, 2023

Errors in the "cache hit" code path for NDBCache calls come from the iterators that are called with the function call results retrieved from the cache. These errors include the sentinel "early exit" errors, which would ordinarily be suppressed further up the call stack.

However, since the iter()-returned errors had been wrapped into topdown.Halt{}, they were not picked up by the suppression mechanisms.

Now, those iter-derived errors are no longer wrapped into Halt, and the code section that suggests that they should be has gotten a new comment explaining what's going on there.

To reduce our blind spots in testing, the Rego YAML tests are now also run with NDBCache enabled. No further assertions are added, since those are tested elsewhere, but this run would have caught the problem. Hence it seems worthwhile to spend the extra 3s in tests.
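
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It assumes the suppression step recognizes the sentinel through a direct type assertion; the description does not say how the check is implemented, so treat that as an assumption. The names `earlyExit`, `halt`, and `suppress` are hypothetical stand-ins, not the actual topdown identifiers.

```go
package main

import "fmt"

// earlyExit is a hypothetical stand-in for the sentinel "early exit" error.
type earlyExit struct{}

func (*earlyExit) Error() string { return "early exit" }

// halt is a hypothetical stand-in for a Halt-style wrapper error.
type halt struct{ Err error }

func (h halt) Error() string { return h.Err.Error() }

// suppress swallows the sentinel, but (by assumption) only recognizes it
// through a direct type assertion. A wrapped sentinel passes through.
func suppress(err error) error {
	if _, ok := err.(*earlyExit); ok {
		return nil
	}
	return err
}

func main() {
	sentinel := &earlyExit{}

	// The bare sentinel is recognized and suppressed.
	fmt.Println(suppress(sentinel) == nil) // true

	// Once wrapped, the same sentinel escapes suppression.
	fmt.Println(suppress(halt{Err: sentinel}) == nil) // false
}
```

Under that assumption, the wrapped sentinel surfaces as an ordinary error to the caller, which matches the symptom this PR addresses.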

Signed-off-by: Stephan Renatus <[email protected]>
@srenatus srenatus marked this pull request as ready for review January 5, 2023 14:34
@srenatus srenatus left a comment

reviewer notes

@@ -1696,7 +1696,7 @@ func (e evalBuiltin) eval(iter unifyIterator) error {

 	operands := make([]*ast.Term, len(e.terms))

-	for i := 0; i < len(e.terms); i++ {
+	for i := range e.terms {
Contributor Author

style update, immaterial

}

	if err != nil {
		return Halt{Err: err}
Contributor Author

The Halt-wrapping is what we'd like to get rid of. The rest of this section is just a small restructuring that uses direct

return iter()

calls instead of

err = iter()

followed by a plain

return err

below. This only became possible because the Halt-wrapping is gone.
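
A simplified sketch of the before/after control flow described in this comment. The functions `evalBefore` and `evalAfter`, the `halt` type, and the `errStop` sentinel are hypothetical; the real `eval` method has more going on, so this only shows the shape of the change.

```go
package main

import (
	"errors"
	"fmt"
)

// halt is a hypothetical stand-in for a Halt-style wrapper error.
type halt struct{ Err error }

func (h halt) Error() string { return h.Err.Error() }

// errStop is a hypothetical sentinel returned by the iterator.
var errStop = errors.New("early exit")

// evalBefore funnels every error through one wrapping step, so the
// iterator's error on a cache hit is wrapped as well.
func evalBefore(cached bool, builtin, iter func() error) error {
	var err error
	if cached {
		err = iter()
	} else {
		err = builtin()
	}
	if err != nil {
		return halt{Err: err}
	}
	return nil
}

// evalAfter returns the iterator's error untouched on a cache hit and
// wraps only errors coming from the builtin call itself.
func evalAfter(cached bool, builtin, iter func() error) error {
	if cached {
		return iter()
	}
	if err := builtin(); err != nil {
		return halt{Err: err}
	}
	return nil
}

func main() {
	iter := func() error { return errStop }
	builtin := func() error { return nil }

	fmt.Println(evalBefore(true, builtin, iter) == errStop) // false: wrapped
	fmt.Println(evalAfter(true, builtin, iter) == errStop)  // true: passed through
}
```

With the early return in place, there is no shared `err` variable left to inspect afterwards, which is why the plain `return err` at the end can go.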

Contributor

Looks like a solid refactoring to me!

@@ -34,7 +35,19 @@ func TestOPARego(t *testing.T) {
 	}
 }

-func testRun(t *testing.T, tc cases.TestCase) {
+func TestRegoWithNDBCache(t *testing.T) {
Contributor Author

There was one (!!) test case in our ~1700 cases that would have broken if we did this before. But still worthwhile, we'll now run all yaml tests with NDBCache enabled. Since those tests are our safety net for anything more complex going on in topdown, it's worth it, I think.

This is the case -- but it depends on your machine's IP setup, I think:

- data:
  modules:
  - |
    package test
    # one of these should be the case on any system
    p {
      net.lookup_ip_addr("localhost") == {"127.0.0.1"}
    }
    p {
      net.lookup_ip_addr("localhost") == {"127.0.0.1", "::1"}
    }
    p {
      net.lookup_ip_addr("localhost") == {"::1"}
    }
  note: net.lookup_ip_addr/localhost
  query: data.test.p = x
  want_result:
  - x: true

Contributor

I'm surprised that it's just one testcase that blows up. 🤔 It's better than none though!

Contributor Author

I guess that's because by their very nature, non-deterministic builtins are difficult to test? 😅

want_result:
- x:
    p: true
    q: true
Contributor Author

This is the test case that @istalker2 came up with to reproduce this with NDBCache enabled.

Contributor

Interesting that this one is what breaks everything. 🤔

@srenatus srenatus requested a review from philipaconrad January 5, 2023 14:35
@philipaconrad philipaconrad left a comment

This PR looks good to me. Thank you for adding the extra round of tests! 😄

type opt func(*Query) *Query

func testRun(t *testing.T, tc cases.TestCase, opts ...opt) {
Contributor

Nice refactoring to allow parameterizing the test runner. 👍
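
For readers unfamiliar with the pattern, the `opt` type above is a functional option: each option takes the query and returns a (possibly modified) query, so one runner can serve both the plain and the NDBCache-enabled test. A minimal sketch, where the `Query` struct, its `WithNDBCache` method, and `buildQuery` are hypothetical stand-ins for illustration and not the real topdown types:

```go
package main

import "fmt"

// Query is a hypothetical stand-in for the query type the options modify.
type Query struct {
	ndbCache map[string]string
}

// WithNDBCache is a hypothetical builder-style setter.
func (q *Query) WithNDBCache(c map[string]string) *Query {
	q.ndbCache = c
	return q
}

// opt mirrors the signature from the diff: take a query, return a query.
type opt func(*Query) *Query

// buildQuery applies each option in order; with no options the query
// keeps its defaults.
func buildQuery(opts ...opt) *Query {
	q := &Query{}
	for _, o := range opts {
		q = o(q)
	}
	return q
}

func main() {
	plain := buildQuery()
	cached := buildQuery(func(q *Query) *Query {
		return q.WithNDBCache(map[string]string{})
	})
	fmt.Println(plain.ndbCache != nil, cached.ndbCache != nil) // false true
}
```

Existing callers that pass no options keep their behavior, since the variadic parameter is simply empty for them.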

@philipaconrad philipaconrad merged commit 9036b00 into open-policy-agent:main Jan 5, 2023
@srenatus srenatus deleted the sr/topdown/escaped-early-exit-error branch January 5, 2023 17:15