Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record message_ix version in scenario data & GDX #765

Merged
merged 23 commits into from
Jan 11, 2024
Merged

Record message_ix version in scenario data & GDX #765

merged 23 commits into from
Jan 11, 2024

Conversation

khaeru
Copy link
Member

@khaeru khaeru commented Dec 11, 2023

Closes #747.

Housekeeping:

  • Close Split tutorial tests to separate CI job #739. This appears to bring the job time down to:
    • 4:40–10 minutes for the tests —so the first indication of success/failure comes in under 5 minutes, which is more fit for purpose.
    • 8–12 minutes for the tutorials.
  • Update older tests. The above change caused some of these to begin to fail, which makes me suspect they rely on state leaking from the tutorial tests or are otherwise fragile. The updates should ensure the tests are complete.
    • test_equation_NEW_CAPACITY_CONSTRAINT.py::test_new_capacity_up: fix typo in the file name. Clone the subject scenario to a unique URL, so that GDX I/O files do not collide even if the two test cases are run in parallel.
    • test_integration.py, test_macro.py: similarly clone to unique URLs.
    • Expand some flaky marks from Handling flaky tests #731 that were previously scoped to macOS only, to now apply to all platforms:
      • test_feature_bound_activity_shares.py::test_add_bound_activity_up_all_modes
      • test_feature_price_emission.py::test_custom_type_variable_periodlength
      • test_integration::test_multi_db_run.
  • Advertise and test Python 3.12 support.
  • Update copyright year to 2024.

How to review

Read the diff and note that the CI checks all pass.

PR checklist

@khaeru khaeru added the enh New features & functionality label Dec 11, 2023
@khaeru khaeru added this to the 3.8 milestone Dec 11, 2023
@khaeru khaeru self-assigned this Dec 11, 2023
Copy link

codecov bot commented Dec 11, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f7be336) 95.2% compared to head (134b1cb) 95.2%.

Additional details and impacted files
@@          Coverage Diff          @@
##            main    #765   +/-   ##
=====================================
  Coverage   95.2%   95.2%           
=====================================
  Files         46      46           
  Lines       4312    4335   +23     
=====================================
+ Hits        4107    4130   +23     
  Misses       205     205           
Files Coverage Δ
message_ix/macro.py 96.7% <100.0%> (ø)
message_ix/models.py 99.0% <100.0%> (ø)
message_ix/report/__init__.py 100.0% <ø> (ø)
message_ix/testing/__init__.py 99.6% <100.0%> (+<0.1%) ⬆️
message_ix/tests/conftest.py 100.0% <100.0%> (ø)
message_ix/tests/test_cli.py 100.0% <100.0%> (ø)
message_ix/tests/test_core.py 100.0% <100.0%> (ø)
..._ix/tests/test_equation_NEW_CAPACITY_CONSTRAINT.py 100.0% <100.0%> (ø)
...age_ix/tests/test_feature_bound_activity_shares.py 100.0% <100.0%> (ø)
message_ix/tests/test_feature_price_emission.py 100.0% <ø> (ø)
... and 5 more

@khaeru khaeru force-pushed the issue/747 branch 2 times, most recently from f930098 to 3f3419d Compare December 11, 2023 23:21
@khaeru
Copy link
Member Author

khaeru commented Dec 12, 2023

@glatterf42 FYI, looking at the recent commits on the branch, I am finding that for some apparently flaky tests, it helps to .clone(scenario=request.node.name) first.

I think this is because we are using pytest-xdist: where this results in 2+ tests with identical model name and scenario names being run in parallel, then strange effects can occur; for instance, one test reads and sees the result from the GDX file of the other test. (This occurs because message_ix tries to heed users' desire to keep GDX I/O files all in the same directory, rather than separate, temporary directories per-run as the base ixmp GAMSModel class does.)

request.node.name is from the Pytest request fixture, and gives a string that is unique to each test—including individual cases of parametrized tests. It's the easiest way to get such a unique label.

We could later (no urgency) try this approach to address other apparently flaky tests.

@glatterf42
Copy link
Member

Very useful catch, would be great to remove some flaky markers :)

@glatterf42
Copy link
Member

Just rebasing onto current main.

@khaeru khaeru force-pushed the issue/747 branch 4 times, most recently from f56ca4e to 3f378b2 Compare January 10, 2024 12:07
@khaeru khaeru requested a review from glatterf42 January 10, 2024 16:09
Copy link
Member

@glatterf42 glatterf42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks :)

@glatterf42
Copy link
Member

Rebased after dependabot update to main: are we just seeing flakiness in these tests?

@glatterf42
Copy link
Member

Looks like flakiness; are we merging regardless and clean up/apply markers later if needed?

@khaeru
Copy link
Member Author

khaeru commented Jan 11, 2024

are we merging regardless and clean up/apply markers later if needed?

I would say the latter.

I have adapted several tests in this PR to ensure they use unique model/scenario names; this means there is no chance that another scenario being run in another pytest-xdist worker thread will try to access the same scenario in the database or in a GDX file. However, this is different from doing a full sweep through the test suite or adapting fixtures to be sure that no test has this fragility. We can do that in a subsequent PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enh New features & functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Record message_ix version in scenario data & GDX I/O Split tutorial tests to separate CI job
2 participants