Feature Request: Add llama.cpp args missing in llamafile. #715

Open · 4 tasks done
fastzombies opened this issue Mar 14, 2025 · 2 comments
Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

As of 0.9.1, several options listed in the help output are not recognized:

--seed
--min-p
--top-k
--top-p
--samplers

Additionally, some args are only partially supported:

  • --prompt is unknown but -p works.
  • --file is unknown but -f works.

I assume the help text is forked from llama.cpp or is somehow generated from it.

Motivation

Models like QwQ-32B come with sampler settings recommended by Qwen, and I would like to be able to use those recommended settings to stop the model from looping at the end of a reply.
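For reference, the kind of invocation this request would enable might look like the following. This is a sketch only: the flag names are taken from the unsupported list above, the model filename is a hypothetical placeholder, and the sampler values are illustrative rather than Qwen's official recommendations.

```shell
# Hypothetical llamafile invocation once the llama.cpp-style sampler
# flags are supported. Filename and values are illustrative placeholders.
./llamafile -m QwQ-32B.Q4_K_M.gguf \
  --seed 42 \
  --top-k 40 \
  --top-p 0.95 \
  --min-p 0.05 \
  -p "Why is the sky blue?"
```

Today, passing `--top-k`, `--top-p`, `--min-p`, `--samplers`, or `--seed` like this is rejected as an unknown argument.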

Possible Implementation

I noted that #692 reported --repeat-penalty as unsupported in the 0.8 stream, but it is fixed as of 0.9.1, so I assume these args just need to be added to some internal list of recognized options. I think that issue can be closed.

There may be more, but these are the args I have found that are not supported at all.

fastzombies (Author) commented

#697 Looks to be related.

cjpais (Collaborator) commented Mar 15, 2025

yep, similar. an upstream sync is needed. not sure when it might happen.

however, if someone submits a PR adding support for some of these directly, it will be pulled in. i hear you on QwQ, it would be good to have proper support.
