[Test] Relax test_slurm_scaling for Rocky #6707

Merged

Conversation

gmarciani
Contributor

Description of changes

Relax test_slurm_scaling for Rocky by increasing the accepted time to replace static nodes from 5 minutes to 6 minutes. In 3.13.0 we observed an increase in the bootstrap time of Rocky nodes.

Tests

ONGOING test_slurm_scaling on Rocky

References

  • Link to impacted open issues.
  • Link to related PRs in other packages (i.e. cookbook, node).
  • Link to documentation useful to understand the changes.

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@gmarciani added the skip-changelog-update, 3.x, and Test labels Mar 12, 2025
@gmarciani marked this pull request as ready for review March 12, 2025 22:02
@gmarciani requested review from a team as code owners March 12, 2025 22:02
@gmarciani enabled auto-merge (rebase) March 12, 2025 22:08
_wait_for_node_reset(scheduler_commands, static_nodes, dynamic_nodes)
# TOFIX We observe in 3.13.0 an increase in the bootstrap time for Rocky and RHEL.
# We must address it and restore the default wait time to 300s.
stop_max_delay_secs = 360 if os.startswith("rocky") else 300
Contributor

Why not add them too?
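
For illustration only, the per-OS timeout selection in the diff above could be sketched as a small helper. This is a minimal sketch, not the code merged in this PR: the helper name `stop_max_delay_secs_for`, the constant names, and the idea of passing a tuple of OS prefixes are all assumptions made here, and it assumes the OS is available as a plain string such as "rocky8". It relies on `str.startswith` accepting a tuple of prefixes, so covering another OS family would only mean adding a prefix.

```python
# Hypothetical helper: every name below is illustrative and not taken from the PR.
DEFAULT_STOP_MAX_DELAY_SECS = 300  # default wait time mentioned in the diff comment
RELAXED_STOP_MAX_DELAY_SECS = 360  # relaxed wait time introduced for Rocky

# Assumption: OS names are plain strings whose prefix identifies the family.
SLOW_BOOTSTRAP_OS_PREFIXES = ("rocky",)


def stop_max_delay_secs_for(os_name, slow_prefixes=SLOW_BOOTSTRAP_OS_PREFIXES):
    """Return the accepted node replacement time, in seconds, for an OS name."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    if os_name.startswith(tuple(slow_prefixes)):
        return RELAXED_STOP_MAX_DELAY_SECS
    return DEFAULT_STOP_MAX_DELAY_SECS
```

With this shape, the reviewer's suggestion would amount to calling the helper with `slow_prefixes=("rocky", "rhel")`, assuming RHEL OS names begin with "rhel".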

@gmarciani merged commit ebf4200 into aws:develop Mar 13, 2025
36 checks passed
Labels
3.x, skip-changelog-update, Test
2 participants