Updating code to match llama.cpp tag b4689 #93
Conversation
Some concerns: Resolved
Can we enable the pipeline to see if it builds normally on other architectures? I only have the ability to test on Mac.
Hey, thank you very much for this! Unfortunately I won't have the time until the weekend to look at it, but I'll try to approve the pipelines until then. Don't worry if they don't run yet, though; the workflows are quite brittle.
So far everything looks great! I'm now trying to fix the GitHub workflows. I previously didn't compile the shared libraries with curl support, because I didn't find an easy way to statically link libcurl. The other solution is to dynamically link it, but this requires users to have libcurl installed, which I wanted to avoid (particularly for Windows users this might cause problems). For now, we can dynamically link it, though, and find a solution later.
The libcurl option is mostly for my use cases. If it's hard, we can remove it from the workflow altogether.
@kherud were you able to check the workflow and whether it's working?
Hey @vaiju1981, I did, and I fixed the libcurl problems for Linux/Windows, but now there are other problems. I'll try to continue as soon as possible. Can you see the workflow results here?
I have updated the tests and moved the code to match the latest llama.cpp version. Can you enable the workflow? I think the only issue is with the Windows build (but I don't have a Windows machine to test on).
@kherud can you try now? I have updated the code and could verify on Mac and Unix; I don't have access to Windows, so I can't test there.
I am able to get the Ubuntu and macOS builds to pass, but Windows is failing. I think the issue might be with architecture identification; the rest passed.
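For context, Windows JVMs typically report `os.arch` as `amd64` while some other toolchains use `x86_64`, so the OS/arch mapping is a plausible place for this to break. A hypothetical Java sketch of the kind of detection involved (the class name and directory layout are illustrative, not the binding's actual code):

```java
// Hypothetical sketch of platform detection for picking a native library
// directory; not the binding's actual code.
public final class Platform {
    public static String libraryDir() {
        String os = System.getProperty("os.name").toLowerCase();
        String arch = System.getProperty("os.arch").toLowerCase();

        String osName = os.contains("win") ? "windows"
                      : os.contains("mac") ? "macos"
                      : "linux";

        // HotSpot reports x64 as "amd64" on Windows/Linux but "x86_64"
        // elsewhere; normalize both to one name. Apple Silicon is "aarch64".
        String archName = (arch.equals("amd64") || arch.equals("x86_64")) ? "x86_64" : arch;

        return osName + "/" + archName;
    }
}
```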
I'm currently looking into it and it seems like in the new llama.cpp version there are additional shared libraries "ggml-base.dll" and "ggml-cpu.dll" which are missing in the Java binding and probably cause the UnsatisfiedLinkError. There are multiple solutions to this:
I think we should go with option 1 for now and see how things turn out. I'll try to implement this and report back.
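To illustrate why the extra libraries matter: on Windows, the loader must be able to resolve `ggml-base.dll` and `ggml-cpu.dll` when the JNI library is loaded, so a binding that extracts and loads them explicitly has to do so in dependency order. A hedged Java sketch; the exact set of libraries, their order, and the extraction directory are assumptions that depend on the build:

```java
import java.nio.file.Path;

// Illustrative sketch: load dependency libraries before the JNI library so
// the OS loader can resolve them. Names, order, and directory are assumptions.
public final class NativeLoader {
    static void loadAll(Path dir) { // dir must be an absolute path
        for (String name : new String[]{"ggml-base", "ggml-cpu", "ggml", "llama", "jllama"}) {
            // mapLibraryName turns "ggml" into "ggml.dll" / "libggml.so" / "libggml.dylib"
            System.load(dir.resolve(System.mapLibraryName(name)).toString());
        }
    }
}
```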
Hi @kherud, I like option 1; it looks like the cleanest one. One thing we can do is have two builds for Windows: one without CUDA and one with CUDA. That way, if a user has a GPU and wants CUDA support, they can have that dependency. (This would work for other platforms like Unix as well.)
Yes, I agree. The binding doesn't offer Windows CUDA builds yet, and I've always tried to avoid it since providing pre-built libraries really is a pain, as you've seen, but maybe in the future. I'm also disabling curl for now since I can't figure out how to statically link it (it's tricky due to dependencies on system libraries). Users can always manually download models or compile the bindings themselves. I think we should get a basic version for all major platforms working for now and finally merge the pull request.
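With curl support off, manually downloading a model is easy enough from Java itself; a minimal sketch using the JDK's built-in HTTP client (the URL and file name are placeholders):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public final class ModelDownload {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; any direct link to a GGUF file works the same way.
        URI uri = URI.create("https://example.com/model.gguf");
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        // Stream the response body straight to disk instead of buffering it.
        client.send(request, HttpResponse.BodyHandlers.ofFile(Path.of("model.gguf")));
    }
}
```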
Amazing work @kherud, looks like the Windows build passed.
@vaiju1981 Sadly not, it's only working in a CMake debug configuration (which has much worse performance). The library now loads without an UnsatisfiedLinkError, but a segmentation fault happens while loading it. It only happens in release mode (where the compiler heavily optimizes the code). So far I don't really have a clue about the problem. I tried running address and undefined-behavior sanitizers on Linux, but they didn't report any problems.
src/main/cpp/jllama.cpp (outdated)
```diff
@@ -291,8 +326,12 @@ JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *vm, void *reserved)
         goto error;
     }

+    printf("loaded JNI symbols\n"); fflush(stdout);
+
     llama_backend_init();
```
After inserting some debug statements, the problem seems to appear when initializing the llama backend here.
Is this the same error as #83? If so, Windows support is broken on previous versions too, so a build with the latest would be appreciated regardless.
@toystorynova Yes, good spot, this is likely the same issue. The weird thing is that it works correctly when I build it on my Windows machine. I think I traced it down to this statement being called (via […]). It's likely a race condition or multi-threading issue that leads to […]
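If the crash really is tied to running `llama_backend_init()` while the library is being loaded, one conceivable workaround is to move it out of `JNI_OnLoad` and trigger it lazily from Java instead. A purely hypothetical sketch, not the binding's actual API; `backendInit()` is an assumed native method that would simply call `llama_backend_init()`:

```java
// Hypothetical workaround sketch: defer backend initialization until after
// the native library is fully loaded. backendInit() is an assumed native
// method wrapping llama_backend_init().
public final class LlamaBackend {
    private static boolean initialized;

    private static native void backendInit();

    static synchronized void ensureInitialized() {
        if (!initialized) {
            backendInit(); // runs outside JNI_OnLoad, after loading completes
            initialized = true;
        }
    }
}
```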
What is the impact of debug vs. release for Windows? Is it that debug will run n× slower than release mode, or does it need more memory? Since the debug mode worked for Windows, I'm wondering: if the impact of debug vs. release isn't high, we could go with the debug option for Windows.
Yeah, on the one hand it's unusably slow, I think; on the other hand, we didn't really solve the underlying issue and it might surface again later. I'll look for more insight today. If I can't find anything, we can release a debug build for now. It's a better option than releasing no library at all, I guess.
It seems to work now 🎉 Thank you again for the continued effort! It's ready to merge. The next steps are:
This PR copies the code from #92