cache dlsym lookup #2

Closed
wants to merge 1 commit into from

Conversation

@vtjnash
Contributor

vtjnash commented Feb 20, 2013

I didn't test whether this actually makes a significant difference in speed, but I wanted to see whether and how this would work. (I'm not certain it is actually an improvement.)

(Note: there should probably be a type assertion that func is actually a Symbol; otherwise we cache the result of looking up a dynamic value.)
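A minimal sketch of the caching idea in this PR: memoize the resolver (dlsym against libpython in the real code) in a Dict so each symbol is looked up at most once. This uses the modern `Ptr{Cvoid}` spelling of this era's `Ptr{Void}`; `cached_lookup` and `func_cache` are illustrative names, not the PR's actual identifiers.

```julia
# Cache of previously resolved function pointers, keyed by symbol name.
const func_cache = Dict{Symbol,Ptr{Cvoid}}()

# Call `lookup` (dlsym in the real code) only on a cache miss, so each
# symbol is resolved at most once; later calls hit the Dict.
function cached_lookup(lookup, name::Symbol)
    get!(() -> lookup(name), func_cache, name)
end
```

Usage against a real library would look like `cached_lookup(n -> dlsym(handle, n), :Py_IsInitialized)`.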

function libpython_name(python::String)
    lib = replace(readall(`$python -c "import distutils.sysconfig; print distutils.sysconfig.get_config_var('LDLIBRARY')"`),
                  r"\.so(\.[0-9\.]+)?\s*$|\.dll\s*$", "", 1)
    @osx_only if lib[1] != '/'; lib = "/System/Library/Frameworks/"*chomp(lib) end
@vtjnash
Contributor Author

This fixes a separate issue: on a Mac, Python seems to always return a relative path that should be prefixed by /System/Library/Frameworks.

@stevengj
Member

@vtjnash, sorry I didn't notice this earlier. How much of a performance difference does this make, e.g. on a simple benchmark like @time ccall(pyfunc(:Py_IsInitialized), Int32, ())?

@stevengj
Member

Hmm, looks like it makes a big difference:

julia> @time for i = 1:10^6; ccall(pyfunc(:Py_IsInitialized), Int32, ()); end
elapsed time: 0.373013973236084 seconds

julia> f = pyfunc(:Py_IsInitialized)
Ptr{Void} @0x00007fe7edb60e60
julia> @time for i = 1:10^6; ccall(f::Ptr{Void}, Int32, ()); end
elapsed time: 0.0026140213012695312 seconds

A factor of 100 may be worth caching for...

@vtjnash
Contributor Author

vtjnash commented Feb 22, 2013

I figured; a dictionary lookup is not as fast as a constant reference. (As a side note, the type assertion on the first argument of the ccall is unnecessary -- it should generate nearly the same code, but accept any Ptr{T}.)

I tried merging this forward, but it was tricky: this doesn't check that the symbol it receives is actually constant, and just blindly caches the value regardless, which can cause some surprises.
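The aside above about ccall's first argument can be sketched as follows. The assertion is not needed for correctness because ccall accepts any pointer-valued first argument; `@cfunction` stands in here for a dlsym'd libpython symbol, and `Ptr{Cvoid}` is the modern spelling of this era's `Ptr{Void}`.

```julia
is_initialized() = Int32(1)                         # stand-in for a C function
const fptr = @cfunction(is_initialized, Int32, ())  # yields a Ptr{Cvoid}

call_with_assert() = ccall(fptr::Ptr{Cvoid}, Int32, ())
call_plain()       = ccall(fptr, Int32, ())         # same call, no assertion
```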

@stevengj
Member

In all the places where the argument to pyfunc is not a constant, it should be possible to rearrange the code slightly to get a constant expression. This has to be done manually, of course; I don't think any macro magic could go back up the call tree of pyisinstance, for example, but it is straightforward to change pyisinstance(o, :Foo) to pyisinstance(o, @pyfunc :Foo) and make pyisinstance accept a PyPtr (= Ptr{Void}) argument.
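A hedged sketch of that call-site rewrite: a hypothetical @pyfunc macro that requires a literal symbol (avoiding the non-constant caching pitfall raised earlier) and memoizes the lookup, with pyisinstance accepting a pre-resolved PyPtr. `resolve` is a placeholder for dlsym(libpython, name), and the bodies here are illustrative stubs, not PyCall's actual code.

```julia
const PyPtr = Ptr{Cvoid}                  # modern spelling of Ptr{Void}
const pyfunc_cache = Dict{Symbol,PyPtr}()

resolve(name::Symbol) = PyPtr(UInt(hash(name)))   # placeholder for dlsym

macro pyfunc(name)
    # Only literal symbols are cacheable; reject runtime values outright.
    name isa QuoteNode || error("@pyfunc requires a literal symbol")
    :(get!(() -> resolve($name), pyfunc_cache, $name))
end

# pyisinstance now takes the resolved pointer rather than a Symbol:
pyisinstance(o, t::PyPtr) = t != C_NULL   # placeholder body

# Call sites change from pyisinstance(o, :Foo) to:
#   pyisinstance(o, @pyfunc :Foo)
```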

@vtjnash
Contributor Author

vtjnash commented Feb 23, 2013

That sounds even better. I wasn't certain whether it needed to be a runtime value somewhere, so I was trying to just propagate the macro calls outward (almost as a primitive form of inlining).

@stevengj
Member

Your type annotation global $z::Ptr{Void} doesn't seem to do anything. It seems like you have to attach the annotation $z::Ptr{Void} to every usage of $z. See also my comment at the end of Julia issue #964.
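A sketch of that annotation point, in modern spelling (`Ptr{Cvoid}` for `Ptr{Void}`): a plain non-const global is typed Any by the compiler, so a declaration on the global itself narrows nothing, and the assertion has to be attached where the value is used. `@cfunction` again stands in for a dlsym result; the names are hypothetical.

```julia
answer() = Int32(42)
cached = C_NULL                    # non-const global cache slot, filled lazily

function call_cached()
    global cached
    if cached == C_NULL
        cached = @cfunction(answer, Int32, ())   # resolve once, on first use
    end
    ccall(cached::Ptr{Cvoid}, Int32, ())         # annotate at the use site
end
```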

@vtjnash
Contributor Author

vtjnash commented Mar 12, 2013

Were you able to characterize the performance impact of this approach in actual code?

@stevengj
Member

No, I didn't try anything except for the synthetic benchmark above. (I suspect that in most realistic circumstances the overhead of the dlsym is negligible compared to the Python overhead, but it is a clean enough change that I figured it doesn't hurt.)
