[tests][dask] reduce number of collisions tests #4501
Seems like it's still taking too long (53 minutes) logs
@jameslamb would you support adding |
We can set the number of runs depending on the architecture the tests are run on:
from platform import machine
...
n_runs = 1000 if machine() == 'x86_64' else 25
for _ in range(n_runs):
    ... |
@StrikerRUS do you think it should still be run that many times in the other jobs? I think I may have got too excited with the "should never collide" idea. |
Haha yeah I personally would prefer to just run it, say, 25 times in each job (like you currently have in the PR) and not introduce a difference in different test environments. I think that's more than enough to catch issues, given the current level of activity in this repo and the number of concurrent CI jobs running on each commit.
I at least would support you pushing a commit to this PR right now so we can see those results in logs! I'd want to see how verbose that output is to make a decision of whether or not to merge such a change, but it would at least help us to understand where time is being spent. |
Hmm I thought changing Line 210 in 5fe27d5
Does that job run a different call to |
OK, please forget my suggestion. |
Ha ok, thanks for looking into it! I think that's strong evidence that #4498 was not the cause of timeouts on the QEMU builds. Could you please remove the changes to |
thanks for the investigation!
I just observed another QEMU timeout on #4504 (https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=10669&view=logs&j=c2f9361f-3c13-57db-3206-cee89820d5e3).
If you have ideas from these timings on how we could reduce the runtime for those jobs without sacrificing too much test coverage, we'd welcome them!
Some jobs just seem to take longer in all the tests, for example here:
2 minutes
14 minutes |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
The recent change to how open ports are searched for in the Dask interface (#4498) removed the collisions. To make sure none ever occurred, the test was run 1,000 times, which is too much for the QEMU CI job (#4498 (comment)).
This PR reduces the number of times the collision test runs to 25 (as suggested in #4498 (comment)).
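
For readers unfamiliar with this kind of test, here is a minimal sketch of what a repeated port-collision check might look like. It is not the LightGBM test itself: `find_n_open_ports` and `test_no_port_collisions` are hypothetical names, and the helper uses a plain-socket approach (bind to port 0 and let the OS choose) that is assumed here for illustration only.

```python
import socket
from typing import List


def find_n_open_ports(n: int) -> List[int]:
    """Hypothetical helper: ask the OS for n distinct free local ports.

    All sockets are kept open until every port has been assigned, so the
    OS cannot hand out the same port twice within one call.
    """
    sockets = []
    try:
        for _ in range(n):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
            sockets.append(s)
        return [s.getsockname()[1] for s in sockets]
    finally:
        for s in sockets:
            s.close()


def test_no_port_collisions(n_runs: int = 25, n_ports: int = 5) -> None:
    """Repeat the search n_runs times and check each result has no duplicates."""
    for _ in range(n_runs):
        ports = find_n_open_ports(n_ports)
        assert len(set(ports)) == n_ports
```

The default of 25 runs mirrors the count chosen in this PR; the loop body is cheap on its own, so the run count mainly matters on slow emulated environments such as the QEMU job.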