
intermittent failure in hubspot integration test #992

Closed
adamsachs opened this issue Jul 29, 2022 · 0 comments · Fixed by #1091
Labels
bug Something isn't working

Comments

@adamsachs (Contributor) commented Jul 29, 2022

Bug Description

As noted in this comment:

unsafe CI checks succeeded after 2 failed attempts, without any code changes, so there's something fishy going on with the unsafe CI test.

In this case, both failures were on the same assertion in the hubspot erasure task test -- the assertion that checks the results of the access request used to verify the erasure seed data. The same assertion failure occurred on this workflow run yesterday.

I've also noted this nondeterministic behavior locally when executing pytest tests/ops/integration_tests/saas/test_hubspot_task.py within my server shell.

Steps to Reproduce

Given what we know at this point, this seems to occur "randomly" when running external integration tests. Once we narrow down the issue further, we'll probably be able to provide more precise repro steps/scenarios :)

After ensuring you have vault access or the correct hubspot credentials in your local env, you can try executing pytest tests/ops/integration_tests/saas/test_hubspot_task.py within your local server shell (see the sketch below for one way to run it repeatedly). That being said, I've only seen the failure once locally after executing the test ~5 times.
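As a convenience, here's a minimal sketch (not part of the repo) for re-running that test in a loop until the intermittent failure shows up. It assumes pytest is on the path and the hubspot credentials are already set in the environment, per the note above.

```python
# Hypothetical helper, not part of fidesops: re-run the flaky test until it
# fails or we give up, so the intermittent failure can be captured locally.
import subprocess
import sys

TEST_PATH = "tests/ops/integration_tests/saas/test_hubspot_task.py"
MAX_ATTEMPTS = 20  # arbitrary cap

for attempt in range(1, MAX_ATTEMPTS + 1):
    result = subprocess.run(["pytest", TEST_PATH, "-x"])
    if result.returncode != 0:
        print(f"Reproduced the failure on attempt {attempt}")
        sys.exit(result.returncode)

print(f"No failure in {MAX_ATTEMPTS} attempts")
```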

In CI, based on what we know at this point, triggering the unsafe CI checks action enough times should eventually reproduce the error, if it is truly occurring at random.

Expected behavior

Unsafe CI checks should reliably pass

Environment

This nondeterministic behavior seems to occur both:

  • in CI (unsafe PR checks)
  • locally

Additional context

Here are some very rough thoughts based on some initial investigation:

The test failing here just asserts that the initial test data is seeded in the remote system. The fixture responsible for that seeding explicitly confirms the data is present in the remote system, and the fixture isn't what fails. What fails is the subsequent access request we execute, which again confirms that same data is there in the remote system. I can't see why the access request would return nondeterministic results when the fixture has already confirmed that the remote data is there.

The two main ideas I have are:

  • the check that the fixture does for the remote data executes a slightly different request against the remote system than the one the access request ultimately makes under the hood. At first glance everything looks consistent to me, and there aren't many variables in play (a rough sketch of what I mean is below, after this list).
  • something more internal to the access request, e.g. the graph traversal isn't identifying the right nodes. I noticed that this recent PR made some high-touch changes that could maybe be impacting things here? That's a total shot in the dark, though.
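To make idea (1) concrete, here's a hypothetical sketch of how the seeding fixture could confirm the record via the exact same lookup the access request later performs (and optionally poll briefly, in case the record takes a moment to become searchable). The helper and the lookup name are placeholders made up for illustration, not actual fidesops code.

```python
# Hypothetical sketch for idea (1); none of these names exist in fidesops.
import time
from typing import Any, Callable, Optional


def wait_for_remote_record(
    lookup: Callable[[], Optional[Any]],
    timeout_seconds: float = 60.0,
    poll_interval_seconds: float = 2.0,
) -> Any:
    """Poll `lookup` until it returns a record or the timeout elapses.

    `lookup` should be the same call the access request makes under the hood,
    so the fixture's confirmation and the access request can't diverge.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        record = lookup()
        if record:
            return record
        time.sleep(poll_interval_seconds)
    raise TimeoutError("Seeded record never became visible in the remote system")


# Example usage inside the seeding fixture (search_contact is a placeholder
# for whatever client call the hubspot connector actually issues):
# wait_for_remote_record(lambda: search_contact(seeded_email))
```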
@adamsachs adamsachs added the bug Something isn't working label Jul 29, 2022
@adamsachs adamsachs mentioned this issue Jul 29, 2022
@sanders41 sanders41 mentioned this issue Aug 2, 2022
@Kelsey-Ethyca Kelsey-Ethyca linked a pull request Aug 23, 2022 that will close this issue