Look into string sharing for deep-frozen code #218
Comments
FYI: In our mmap mechanism, we observe a ~10% slowdown in some
Update: some numbers generated by this simple test case: https://github.com/oraluben/pyperformance/blob/cd7050101e0c32c0e3fa0359785a4cc28530e3ed/pyperformance/data-files/benchmarks/bm_module_load_attr/run_benchmark.py
before:
after (strings in mapped memory are re-interned):
I liked @markshannon's idea and got it working on Linux. It reduced the size of the generated files by around 1.2 Mb, since it shares the constants etc. Here is the branch (https://github.com/kumaraditya303/cpython/tree/deepfreeze). Next I'll see how to get it working on Windows. (EDIT: It should be Mb, not MB.)
Status update: @gvanrossum I have got it working on Linux and Windows, so should I create a PR?
@oraluben the line numbers in that link will break when
Sounds great! Let's make it a draft PR so I can try it out.
@gvanrossum Created PR here: python/cpython#30572
I see you already submitted a PR and created an issue (https://bugs.python.org/issue46430) to intern strings. We need another canonicalization: "small integers" should use their cached equivalent.
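A minimal sketch of the small-integer canonicalization mentioned above. It assumes CPython's small-int cache covers -5 through 256 inclusive; the helper names `uses_small_int_cache` and `partition_constants` are hypothetical and are not taken from the deep-freeze generator.

```python
# Assumption: CPython keeps preallocated int objects for -5 through 256.
SMALL_INT_MIN, SMALL_INT_MAX = -5, 256


def uses_small_int_cache(value):
    """Hypothetical helper: True if a constant could refer to a cached small int."""
    # bool is a subclass of int, so compare the exact type.
    return type(value) is int and SMALL_INT_MIN <= value <= SMALL_INT_MAX


def partition_constants(consts):
    """Split constants into those the cache can serve and those needing their own object."""
    cached = [c for c in consts if uses_small_int_cache(c)]
    own = [c for c in consts if not uses_small_int_cache(c)]
    return cached, own


cached, own = partition_constants([0, 1, 256, 257, -5, -6, True, "x"])
print("served by cache:", cached)
print("need own object:", own)

# On CPython, ints in the cached range built at run time are the same object.
x, y = int("100"), int("100")
print("same object:", x is y)
```

A generator following this idea would emit a reference to the cached object for the first group instead of a fresh static object.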
That was much simpler. See python/cpython#30715
Since most of what is mentioned in the issue is done, could the issue be updated with a checklist of all the items, marking which ones are already done?
Why don't you summarize what you think still needs to be done?
Thanks for the list! Go ahead with the tuple singleton. The other PR is still in flux, so we have to wait.
I created a demo in python/cpython#31154 to test the benchmark.
That's an amazing speed improvement. Are you sure those numbers are right?
I am sure that I used the same local VM, the same compiler commands, and the same benchmarks (using pyperformance) :). I also benchmarked some module imports separately. It looks like one reason for the pyperformance speedup is that module imports became faster? (I am not sure.)
Thanks! I don't have things set up to run benchmarks at the moment, sorry. It's definitely plausible to me that this optimization makes importing frozen modules a lot faster. Note though that in your tests, the modules you're testing are almost certainly already imported, so really you're just testing the path where the module is already in sys.modules. But it's surprising that it makes the high-level benchmarks faster. For example,
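The sys.modules caveat in the comment above can be sketched as follows. This is only an illustration of the measurement pitfall, not the benchmark used in this thread; `timed_import` is a hypothetical helper, and `colorsys` is an arbitrary small stdlib module picked for the demo.

```python
import importlib
import sys
import time


def timed_import(name, force_reload=False):
    """Import `name` and return (module, elapsed seconds).

    With force_reload=True the module is first dropped from sys.modules,
    so the import system has to find and execute it again.
    """
    if force_reload:
        sys.modules.pop(name, None)
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


# First call does a real import; second call only hits the sys.modules cache.
_, fresh = timed_import("colorsys", force_reload=True)
cached_mod, warm = timed_import("colorsys")
print(f"fresh import: {fresh:.6f}s, sys.modules hit: {warm:.6f}s")
```

A benchmark that imports an already-loaded module mostly measures the second path, so it says little about how fast the module's code object is obtained.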
Without deepfreeze, compile()+marshal() merged constants and interned names in pyc files.
I don’t believe it, I will need to run the benchmark myself.
main (8726067) vs python/cpython#31154 (067ad2)
All of the ideas in the to-do list are implemented by now.
Great. Even the “weak linkage” idea?
No, I was talking about this to-do list: #218 (comment)
Okay, thanks so much! There is one new request: a flag one can pass to the Makefile or configure script to disable (deep)freezing (except freezing the importlib bootstrap), for rare platforms that need to squeeze space.
Closing this, declaring victory. Thanks @kumaraditya303! We can open a bpo issue for the flag to disable.
I don't have much experience with the automake/autoconfigure thing, so I would defer it to the person who wants to implement it for their rare platform.
@kumaraditya303 All I was asking for was a command-line flag for freeze_modules.py to skip all categories except the importlib bootstrap.
When code objects are created by the compiler or by marshal, certain strings are interned using `PyUnicode_InternInPlace()`. But code objects obtained from the deep-freeze process do not do this.
While strings are deduped within a module, certain strings (e.g. dunders, 'name', 'self') occur frequently in different modules, and thus the deep-frozen code objects will use up slightly more space, and certain operations will be slightly slower (e.g. comparison of two strings that are both interned is done by pointer comparison).
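As a rough illustration of the interning behavior described above, here is a sketch that uses the Python-level `sys.intern()` rather than the C function `PyUnicode_InternInPlace()`. The strings are built at run time so that the compiler cannot merge them beforehand.

```python
import sys

# Build two equal strings at run time; they start out as separate objects
# in typical CPython builds.
a = "".join(["__na", "me__"])
b = "".join(["__na", "me__"])

print("equal by value:", a == b)
print("same object before interning:", a is b)

# sys.intern() returns the single canonical object for a given value.
ia = sys.intern(a)
ib = sys.intern(b)

# Both names now refer to one object, so equality can be decided by identity.
print("same object after interning:", ia is ib)
```

Strings that are never interned keep their own storage and fall back to a content comparison, which is the extra space and time cost the issue describes.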
We could do a number of different things (or combine several):
- Add `PyUnicode_InternInPlace()` calls to the "get toplevel" function in each deepfrozen .c file. This would still waste the space though.
- Give strings that look like identifiers (see `all_name_chars()` in codeobject.c) an external name with "weak linkage" so that the linker can dedupe them. (Props to @lpereira for this one.)
- For strings that are also in the array of `_Py_Identifier`s, replace the string with a reference into that array, for pure savings. (@ericsnowcurrently has more details; IIRC this array isn't in "main" yet.)
[I am not planning to attack this any time soon, so if somebody wants to tackle this, go ahead and assign to yourself.]
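To get a feel for which strings the options above would share, the sketch below compiles two toy modules and reports the identifier-like strings that occur in both. It is not the deep-freeze tool: `looks_like_name` is a hypothetical approximation of `all_name_chars()` (ASCII letters, digits and underscore), and the module sources are made up for the demo.

```python
import string
import types

_NAME_CHARS = frozenset(string.ascii_letters + string.digits + "_")


def looks_like_name(s):
    """Hypothetical stand-in for all_name_chars(): only ASCII letters, digits, '_'."""
    return all(ch in _NAME_CHARS for ch in s)


def collect_strings(code, found=None):
    """Recursively gather name-like strings from a code object and its nested code."""
    if found is None:
        found = set()
    for s in code.co_names + code.co_varnames:
        if looks_like_name(s):
            found.add(s)
    for const in code.co_consts:
        if isinstance(const, str) and looks_like_name(const):
            found.add(const)
        elif isinstance(const, types.CodeType):
            collect_strings(const, found)
    return found


def shared_strings(sources):
    """Return the strings that occur in more than one module's code."""
    counts = {}
    for name, src in sources.items():
        code = compile(src, name, "exec")
        for s in collect_strings(code):
            counts[s] = counts.get(s, 0) + 1
    return {s for s, n in counts.items() if n > 1}


modules = {
    "mod_a": "class A:\n    def get(self):\n        return self.name\n",
    "mod_b": "class B:\n    def show(self):\n        print(self.name)\n",
}
print(sorted(shared_strings(modules)))
```

Each string in the result is a candidate for a single shared object, whether that is achieved by interning at load time, by weak linkage, or by pointing into a common array.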