Releases: ggml-org/llama.cpp

b4877

12 Mar 10:38
363f8c5
sycl : variable sg_size support for mmvq kernels (#12336)

b4876

12 Mar 09:57
34c961b
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (#12315)

When fattn-wmma was ported over to warp64, various bits that also touch fattn-vec were converted to a selectable warp size. However, the fattn-vec kernels do not work with 64-wide warps for now, so we need to avoid launching them with warp64 parameters.
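
In other words, the fix keeps the fattn-vec launch configuration on 32-wide warps even on devices that report a 64-wide warp (e.g. some AMD GPUs under HIP). A minimal sketch of that idea, using hypothetical names rather than the actual llama.cpp code:

    // Illustrative sketch only; names are hypothetical, not llama.cpp's code.
    // fattn-wmma may follow the device warp size (32 or 64), but the fattn-vec
    // kernels currently assume 32-wide warps, so their launch parameters are
    // computed with warp_size = 32 regardless of what the device reports.
    constexpr int FATTN_VEC_WARP_SIZE = 32;

    struct launch_cfg {
        int nwarps;            // warps per thread block
        int threads_per_block; // nwarps * warp size used for the launch
    };

    static launch_cfg fattn_vec_launch_cfg(int device_warp_size, int nwarps) {
        (void) device_warp_size;                         // may be 64 under HIP
        return { nwarps, nwarps * FATTN_VEC_WARP_SIZE }; // pinned to 32-wide warps
    }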

b4875

12 Mar 09:26
7841fc7
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)

* llama : Add Gemma 3 text-only support

* fix python coding style

* fix compile on ubuntu

* python: fix style

* fix ubuntu compile

* fix build on ubuntu (again)

* fix ubuntu build, finally

* clip : Experimental support for Gemma 3 vision (#12344)

* clip : Experimental support for Gemma 3 vision

* fix build

* PRId64

b4874

12 Mar 06:44
bf69cfe
vulkan: fix bug in coopmat1 mul_mat_id (#12316)

* tests: run mul_mat_id with a larger N

* vulkan: fix bug in coopmat1 mul_mat_id

b4873

11 Mar 20:00
10f2e81
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows …

b4872

11 Mar 14:59
ba76543
Compare
Choose a tag to compare
ggml-backend : fix backend search path (#12330)

* Fix backend search path

* replace .native() with '/'

* reverted .native()

b4871

11 Mar 12:36
6ab2e47
metal : Cache the Metal library at the device context level (#12265)
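
This change reuses one compiled Metal library per device context instead of rebuilding it for each new context. A rough sketch of the caching pattern, with hypothetical names (not the actual backend code, which stores an id<MTLLibrary>):

    // Illustrative sketch only; names are hypothetical.
    struct metal_device_context {
        void * mtl_device  = nullptr; // the Metal device handle
        void * mtl_library = nullptr; // compiled library, built once and cached
    };

    // Return the cached library, compiling it only on the first request.
    static void * metal_get_library(metal_device_context & ctx) {
        if (ctx.mtl_library == nullptr) {
            ctx.mtl_library = /* compile or load the Metal library here */ nullptr;
        }
        return ctx.mtl_library;
    }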

b4870

11 Mar 09:08
96e1280
clip : bring back GPU support (#12322)

* clip : bring back GPU support

* use n_gpu_layers param

* fix double free

* ggml_backend_init_by_type

* clean up

b4869

10 Mar 20:10
2c9f833
mat vec double buffer (#12188)

b4868

10 Mar 18:10
2513645
musa: support new arch mp_31 and update doc (#12296)

Signed-off-by: Xiaodong Ye <[email protected]>