Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Fix performance regression #102

Merged
merged 3 commits into from
Oct 2, 2020

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented Oct 2, 2020

Container.is_initialised is a costly operation, loading the config JSON every time.
In 1d7c389, the config is now called on every call to loose_prefix_len, leading to a massive performance degradation.

This PR makes sure the is_initialised test is called only if the config has not already been loaded into memory.

Before:

tests/test_benchmark.py::test_loose_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0022  0.0022  0.3835  0.3835  disk-objectstore/disk_objectstore/container.py:824(get_objects_content)
1001    0.0055  0.0000  0.3758  0.0004  disk-objectstore/disk_objectstore/container.py:472(_get_objects_stream_meta_generator)
1000    0.0029  0.0000  0.3058  0.0003  disk-objectstore/disk_objectstore/container.py:213(_get_loose_path_from_hashkey)
3000    0.0019  0.0000  0.2959  0.0001  disk-objectstore/disk_objectstore/container.py:380(loose_prefix_len)
3000    0.0068  0.0000  0.2940  0.0001  disk-objectstore/disk_objectstore/container.py:371(_get_repository_config)
3000    0.0315  0.0000  0.2873  0.0001  disk-objectstore/disk_objectstore/container.py:270(is_initialised)

After:

tests/test_benchmark.py::test_loose_read
ncalls  tottime percall cumtime percall filename:lineno(function)
1       0.0014  0.0014  0.0793  0.0793  disk-objectstore/disk_objectstore/container.py:824(get_objects_content)
1001    0.0034  0.0000  0.0733  0.0001  disk-objectstore/disk_objectstore/container.py:472(_get_objects_stream_meta_generator)
1000    0.0240  0.0000  0.0240  0.0000  ~:0(<built-in method io.open>)
...
3000    0.0012  0.0000  0.0017  0.0000  disk-objectstore/disk_objectstore/container.py:380(loose_prefix_len)

chrisjsewell referenced this pull request Oct 2, 2020
Allows for the association of a container with an existing DB, or to uniquely refer to it.

🐛 This also fixes a bug, whereby config values were cached, but the cache was not cleared when re-initialising the container.
To reduce the risk of such a problem, now only the whole configuration dictionary is cached, rather than each single config value.
@codecov
Copy link

codecov bot commented Oct 2, 2020

Codecov Report

Merging #102 into develop will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##           develop      #102   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            7         7           
  Lines         1502      1502           
=========================================
  Hits          1502      1502           
Impacted Files Coverage Δ
disk_objectstore/container.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d7c389...e62aabd. Read the comment docs.

If the code coverage is uploaded multiple times for a single commit, then the coverage is simply overwritten.Thereofre, a single test run should be chosen.
@chrisjsewell
Copy link
Member Author

chrisjsewell commented Oct 2, 2020

I've also fixed the issue with codecov initially reporting lower test converage, caused by the fact that it was being uploaded (& overwritten) for every matrix run (and just by luck the windows tests are loaded last, which have the full test coverage)
See also codecov/codecov-action#40

@giovannipizzi
Copy link
Member

Thanks!
The fix is right, however the fix on the coverage is wrong.
Indeed, as you see now coverage went down because there is a very small fraction of tests that are only run on linux or on mac.

Are you sure that it does not merge?
In my experience, it does!
It's just that it will prepare a comment as soon as the first build finishes, so at the beginning it seems it didn't use the windows ones, but if you wait at the end it will update the comment and merge the results.

You can see this in the tests for commit 1d7c389 that all pass with 100% coverage.

So I suggest you remove the fix for the coverage and then we merge the actual bug fix

@chrisjsewell
Copy link
Member Author

Are you sure that it does not merge?

Ah good, I was lead to believe otherwise lol

@chrisjsewell chrisjsewell merged commit 1b84d6b into aiidateam:develop Oct 2, 2020
@chrisjsewell chrisjsewell deleted the fix/repo-config branch October 2, 2020 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants