
updating code to match llamacpp tag b4689 #93

Merged · 52 commits into kherud:master from vaiju1981:b4689 · Mar 9, 2025

Conversation

@vaiju1981 commented Feb 12, 2025

This PR copies the code from #92

@vaiju1981 mentioned this pull request Feb 12, 2025
@vaiju1981 (Author) commented Feb 12, 2025

Some concerns, all resolved (a usage sketch follows below):

  1. Without setting setNPredict, completion sometimes just hangs and does not return. - not anymore
  2. With newer models, especially non-llama models, a prompt like "What is 2+2?" might return a repeated answer such as "2+2=4. 2+2=4. 2+2=4. ..." -- this is caused by not applying the proper chat template.
  3. Sometimes close() does not release the lock and just hangs, so the calling code needed a System.exit -- this was resolved by removing the premature release of the task id when stream was set to true.
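
For reference, here is a minimal Java sketch of how the first two concerns surface on the caller side. It assumes the binding's existing LlamaModel / ModelParameters / InferenceParameters API; exact setter names may differ between versions, and the model path is only a placeholder:

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class CompletionSketch {
    public static void main(String[] args) {
        // Placeholder path; point this at a local GGUF model.
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/model.gguf");

        try (LlamaModel model = new LlamaModel(modelParams)) {
            InferenceParameters inferParams = new InferenceParameters("What is 2+2?")
                    // Concern 1: cap the number of generated tokens so the
                    // completion cannot run (or hang) indefinitely.
                    .setNPredict(64)
                    // Concern 2: without the proper chat template, non-llama models
                    // may repeat "2+2=4." endlessly; a stop string is a crude mitigation.
                    .setStopStrings("\n\n");
            System.out.println(model.complete(inferParams));
        } // Concern 3: close() must release the native task without hanging.
    }
}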

@vaiju1981 (Author) commented:

@kherud

Can we enable the pipeline to see whether it builds normally on other architectures? I only have the ability to test on Mac.

@kherud (Owner) commented Feb 13, 2025

Hey, thank you very much for this! Unfortunately I won't have the time until the weekend to look at it, but I'll try to approve the pipelines until then. Don't worry if they don't run yet, though, the workflows are quite brittle.

@vaiju1981 mentioned this pull request Feb 14, 2025
@kherud (Owner) commented Feb 16, 2025

So far everything looks great! I'm now trying to fix the GitHub workflows. I previously didn't compile the shared libraries with curl support because I didn't find an easy way to statically link libcurl. The other solution is to dynamically link it, but this requires users to have libcurl installed, which I wanted to avoid (particularly for Windows users this might cause problems). For now, we can dynamically link it, though, and find a solution later.

@vaiju1981 (Author) commented:

The libcurl option is mostly for my use cases. If it's hard, we can remove it from the workflow altogether.

@vaiju1981 (Author) commented:

@kherud were you able to check the workflow, and is it working?

@kherud (Owner) commented Feb 21, 2025

Hey @vaiju1981 I did and I fixed the libcurl problems for Linux/Windows, but now there are other problems. I'll try to continue as soon as possible. Can you see the workflow results here?

@vaiju1981 (Author) commented:

I have updated the tests and moved the code to match the latest llama.cpp version. Can you enable the workflow? I think the only issue is with the Windows build (but I don't have a Windows machine to test on).

@vaiju1981 (Author) commented:

@kherud can you try now? I have updated the code and could verify it on Mac and Unix; I don't have access to Windows, so I can't test there.

@vaiju1981 (Author) commented Mar 6, 2025

I am able to get the Ubuntu and macOS builds to pass, but Windows is failing. I think the issue might be with architecture identification:

Windows error: java.lang.UnsatisfiedLinkError: No native library found for os.name=Windows, os.arch=x86_64

The rest passed.
Run for the build: https://github.com/vaiju1981/java-llama.cpp/actions/runs/13691068423
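
For context, this error is thrown by the native-library lookup, which maps os.name/os.arch to a bundled resource. A simplified, self-contained sketch of that pattern (not the binding's actual LlamaLoader code; the resource layout here is an assumption) shows where the message originates:

// Simplified illustration of OS/arch-based native library lookup.
// The real LlamaLoader is more involved; paths and names here are assumptions.
public class LoaderSketch {
    static void loadNativeLibrary() {
        String os = System.getProperty("os.name");   // e.g. "Windows 11"
        String arch = System.getProperty("os.arch"); // e.g. "amd64", often normalized to "x86_64"
        String name = System.mapLibraryName("jllama"); // "jllama.dll" on Windows
        String resource = "/" + os.split(" ")[0] + "/" + arch + "/" + name;

        java.io.InputStream in = LoaderSketch.class.getResourceAsStream(resource);
        if (in == null) {
            // If no bundled library matches this OS/arch combination (e.g. the
            // Windows x86_64 artifact was not packaged), this error is raised.
            throw new UnsatisfiedLinkError(
                    "No native library found for os.name=" + os + ", os.arch=" + arch);
        }
        // ... otherwise extract to a temporary file and call System.load(...) ...
    }

    public static void main(String[] args) {
        loadNativeLibrary();
    }
}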

@kherud (Owner) commented Mar 8, 2025

I'm currently looking into it and it seems like in the new llama.cpp version there are additional shared libraries "ggml-base.dll" and "ggml-cpu.dll" which are missing in the Java binding and probably cause the UnsatisfiedLinkError. There are multiple solutions to this:

  • One option would be to try statically compiling all dependencies into "jllama.dll", so we just have to load the single library. I think this would cause the least headache for us, but I avoided this in the past because I wanted users of the binding to be able to easily swap their "llama.dll" for an individually compiled version (e.g. with GPU support). We would lose this advantage, but users could still always compile the java-llama.cpp project to get a custom version.
  • We could adapt LlamaLoader.java to also load the missing libraries. I'm not a fan of this solution, though, since "ggml-cpu.dll" implies that the required libraries depend on the specific options used for compilation (e.g. is there also something like "ggml-cuda.dll"?). This would make the library loader complex and brittle (a rough sketch of this option follows below).

I think we should go with option 1 for now and look how things turn out. I'll try to implement this and report back.
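
For illustration, option 2 would roughly amount to loading the extra libraries in dependency order before jllama, along these lines (library names are taken from the comment above; whether this order and set is sufficient for every build configuration is exactly the open question):

public class ExplicitLoadSketch {
    // Sketch of option 2: load the new ggml libraries explicitly before jllama.
    // The required set depends on compile options (a CUDA build might add more),
    // which is what makes this approach brittle.
    static void loadAll() {
        String[] nativeDeps = {"ggml-base", "ggml-cpu", "ggml", "llama", "jllama"};
        for (String lib : nativeDeps) {
            System.loadLibrary(lib); // resolves e.g. ggml-base.dll or libggml-base.so
        }
    }
}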

@vaiju1981 (Author) commented:

Hi @kherud, I like option 1; it looks like the cleanest one.

One thing we can do is have two builds for Windows, one without CUDA and one with CUDA. That way, if users have a GPU and want CUDA support, they can use that dependency. (This would work for other platforms like Unix as well.)

@kherud (Owner) commented Mar 8, 2025

Yes, I agree. The binding doesn't offer Windows CUDA builds yet, and I have always tried to avoid it since providing pre-built libraries really is a pain, as you've seen, but maybe in the future.

I'm also disabling curl for now since I can't figure out how to statically link it (it's tricky due to dependencies on system libraries). Users can always manually download models or compile the bindings themselves.

I think we should get a basic version for all major platforms working for now and finally merge the pull request.

@vaiju1981 (Author) commented:

Amazing work @kherud, it looks like the Windows build passed.

@kherud (Owner) commented Mar 8, 2025

@vaiju1981 Sadly not, it's only working in a CMake debug configuration (which has much worse performance). The library is now loaded without an UnsatisfiedLinkError, but a segmentation fault happens when loading it. It only happens in release mode (where the compiler heavily optimizes the code). So far I don't really have a clue about the problem. I tried running the address and undefined-behavior sanitizers on Linux, but they didn't report any problems.

@@ -291,8 +326,12 @@ JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *vm, void *reserved)
goto error;
}

printf("loaded JNI symbols\n"); fflush(stdout);

llama_backend_init();
@kherud (Owner) commented:


After inserting some debug statements, the problem seems to appear when initializing the llama backend here.

@toystorynova commented Mar 8, 2025

Is this the same error as #83? If so, Windows support is broken on previous versions too, so a build against the latest version would be appreciated regardless.

@kherud (Owner) commented Mar 9, 2025

@toystorynova Yes, good spot, this is likely the same issue. The weird thing is that it works correctly when I build it on my Windows machine. I think I traced it down to this statement being called (via JNI_OnLoad -> llama_backend_init() -> ggml_init() -> ggml_critical_section_start()):

https://github.com/ggml-org/llama.cpp/blob/0fd7ca7a210bd4abc995cd728491043491dbdef7/ggml/src/ggml-threading.cpp#L7

It's likely a race condition or multi-threading issue that leads to ggml_critical_section_mutex being uninitialized.

@vaiju1981 (Author) commented:

What is the impact of debug vs. release for Windows? Is it that debug runs some factor slower than release mode, or needs more memory? Since the debug mode worked for Windows, I'm wondering whether we can go with the debug option for Windows if the impact of debug vs. release is not high.

@kherud (Owner) commented Mar 9, 2025

Yeah, on the one hand it's unusably slow, I think; on the other hand, we didn't really solve the underlying issue and it might surface again later. I'll look for more insight today. If I can't find anything, we can release a debug build for now. It's a better option than releasing no library at all, I guess.

@kherud (Owner) commented Mar 9, 2025

It seems to work now 🎉 thank you again for the continued effort! It's ready to merge.

The next steps are:

  • I'll also merge "Expose json schema to grammar conversion method" (#94); it seems like a reasonable change and shouldn't cause any problems
  • I think we should release a new major version (i.e. 4.0.0)
  • Hopefully the release workflow will work; it pre-compiles more shared libraries than the CI workflow does

@kherud kherud merged commit fd1b062 into kherud:master Mar 9, 2025
4 checks passed
@vaiju1981 deleted the b4689 branch March 9, 2025 15:53