Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-40521: Make dict free lists per-interpreter #20645

Merged
merged 1 commit into from
Jun 23, 2020
Merged

bpo-40521: Make dict free lists per-interpreter #20645

merged 1 commit into from
Jun 23, 2020

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Jun 5, 2020

Each interpreter now has its own dict free list:

  • Move dict free lists into PyInterpreterState.
  • Move PyDict_MAXFREELIST define to pycore_interp.h
  • Add _Py_dict_state structure.
  • Add tstate parameter to _PyDict_ClearFreeList() and _PyDict_Fini().
  • _PyLong_Fini() and _PyFloat_Fini() are now called after
    _PyDict_Fini().
  • Remove "#ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS".

https://bugs.python.org/issue40521

@vstinner
Copy link
Member Author

vstinner commented Jun 5, 2020

After tuple (commit 69ac6e5) and list (commit 88ec919) and others types (see bpo-40521), it's now the turn of dict to get its free lists moved to PyInterpreterState! FYI when I measured the tuple PR overhead, it was around 1 ns (and create a tuple of 1 item is around 20 ns): see PR #20247.

cc @methane @pablogsal

@vstinner
Copy link
Member Author

vstinner commented Jun 5, 2020

With this PR, all free lists cleared by gc.collect() are per-interpreter.

@vstinner
Copy link
Member Author

vstinner commented Jun 5, 2020

Before anyone asks, here is a microbenchmark on PyDict_New() to measure the overhead of this PR:

Mean +- std dev: [ref] 10.9 ns +- 0.9 ns -> [patch] 12.8 ns +- 1.0 ns: 1.18x slower (+18%)

The PR adds +1.9 ns per PyDict_New() call.

I used bench_dict.patch which is attached to https://bugs.python.org/issue40521.

UPDATE: please notice that the std dev is around 1 ns which is close to the overhead. These microbenchmark are so small that it is really hard to get a precise measure of timings.

@methane
Copy link
Member

methane commented Jun 5, 2020

@vstinner Would you benchmark without freelist too?
If benefit of the freelist becomes too small, removing it is better than making it per-interpreter.

@vstinner
Copy link
Member Author

vstinner commented Jun 5, 2020

@vstinner Would you benchmark without freelist too?

Here you have:

vstinner@apu$ python3 -m pyperf compare_to ref.json patch.json no_freelist.json 
Mean +- std dev: [ref] 10.9 ns +- 0.9 ns -> [patch] 12.8 ns +- 1.0 ns: 1.18x slower (+18%)
Mean +- std dev: [ref] 10.9 ns +- 0.9 ns -> [no_freelist] 18.0 ns +- 1.0 ns: 1.65x slower (+65%)

If benefit of the freelist becomes too small, removing it is better than making it per-interpreter.

Disabling the free list adds +7.1 ns per PyDict_New() call: 1.65x slower. It doesn't sound reasonable to remove free_list in dictobject.c.

@vstinner
Copy link
Member Author

vstinner commented Jun 8, 2020

TODO: implement commit bcb1983 for dict free lists. See https://bugs.python.org/issue40887 for the context.

Each interpreter now has its own dict free list:

* Move dict free lists into PyInterpreterState.
* Move PyDict_MAXFREELIST define to pycore_interp.h
* Add _Py_dict_state structure.
* Add tstate parameter to _PyDict_ClearFreeList() and _PyDict_Fini().
* In debug mode, ensure that the dict free lists are not used after
  _PyDict_Fini() is called.
* Remove "#ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS".
@vstinner
Copy link
Member Author

@methane: So, what do you think?

I rebased my PR on top of the master branch.

TODO: implement commit bcb1983 for dict free lists. See https://bugs.python.org/issue40887 for the context.

Done. In debug mode, trying to use dict free lists after they are finalized now fails with an assertion error.

Copy link
Member

@methane methane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion in the ML didn't go anywhere.
I'm OK to merge this.

@vstinner vstinner merged commit b4e85ca into python:master Jun 23, 2020
@vstinner vstinner deleted the dict_free_list branch June 23, 2020 09:33
@vstinner
Copy link
Member Author

I'm OK to merge this.

Thanks @methane for the review! I'm still interested to attempt to reduce the overhead of this change. See my PR #20767 for example. But so I failed to find a method which has a significant impact on performance.

Again, measuring performance of a function taking around 10 ns is really hard. The machine code changes depending on LTO and/or PGO build modes, compiler flags, etc. And sometimes it's faster, sometimes it's slower...

fasih pushed a commit to fasih/cpython that referenced this pull request Jun 29, 2020
Each interpreter now has its own dict free list:

* Move dict free lists into PyInterpreterState.
* Move PyDict_MAXFREELIST define to pycore_interp.h
* Add _Py_dict_state structure.
* Add tstate parameter to _PyDict_ClearFreeList() and _PyDict_Fini().
* In debug mode, ensure that the dict free lists are not used after
  _PyDict_Fini() is called.
* Remove "#ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS".
arun-mani-j pushed a commit to arun-mani-j/cpython that referenced this pull request Jul 21, 2020
Each interpreter now has its own dict free list:

* Move dict free lists into PyInterpreterState.
* Move PyDict_MAXFREELIST define to pycore_interp.h
* Add _Py_dict_state structure.
* Add tstate parameter to _PyDict_ClearFreeList() and _PyDict_Fini().
* In debug mode, ensure that the dict free lists are not used after
  _PyDict_Fini() is called.
* Remove "#ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants