
Max completion tokens #720

Closed

Conversation

joshuacoles
This PR builds on the changes from #716, adding the max_completion_tokens property to the ChatGPT API's ChatCompletionRequest in line with the OpenAI API spec. It lets the client set a limit on the maximum number of tokens returned from inference, provided the limit does not exceed the node.max_generate_tokens value specified on the command line. Both the new max_completion_tokens and the older max_tokens option are supported, with the latter used as a fallback when the former is not present.
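The fallback-and-clamp behaviour described above could be sketched roughly as follows (this is an illustrative sketch, not the PR's actual code; the class layout and the helper function are assumptions, while the field names max_completion_tokens, max_tokens, and the node-level max_generate_tokens limit come from the description):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChatCompletionRequest:
    # Other request fields omitted for brevity.
    max_completion_tokens: Optional[int] = None
    max_tokens: Optional[int] = None  # legacy option, kept for compatibility

def resolve_max_tokens(req: ChatCompletionRequest, node_max_generate_tokens: int) -> int:
    """Prefer max_completion_tokens, fall back to max_tokens, and clamp
    the result to the node-level limit."""
    requested = (
        req.max_completion_tokens
        if req.max_completion_tokens is not None
        else req.max_tokens
    )
    if requested is None:
        return node_max_generate_tokens
    return min(requested, node_max_generate_tokens)
```

With this shape, a request carrying only the legacy max_tokens still gets a per-request limit, and no request can exceed the node's configured ceiling.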

The main structural change in this PR is the addition of a GenerationOptions parameter to the SendPrompt and SendTensor gRPC methods and the methods they call, which allows request-specific options during inference. I kept this separate from the existing InferenceState, since that seemed focused on stable-diffusion-specific data that can change over the course of inference.
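The shape of that parameter might look like the following minimal sketch (the exact fields and the process_prompt signature here are assumptions for illustration, not the PR's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class GenerationOptions:
    """Per-request inference options, kept separate from InferenceState."""
    # Client-requested cap on generated tokens; None falls back to the node default.
    max_completion_tokens: Optional[int] = None

# Hypothetical call site: the options object is threaded from the gRPC
# handler through to the node's processing methods.
def process_prompt(prompt: str, generation_options: GenerationOptions) -> None:
    limit = generation_options.max_completion_tokens
    ...
```

Keeping the options immutable (frozen=True) reflects that they are fixed per request, in contrast to InferenceState, which may mutate during inference.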

If any more changes are made to the parent PR I will rebase this PR to match.

@joshuacoles
Author

Also @AlexCheema, while adding this I noticed that SendPrompt and SendTensor both return a Tensor per the gRPC definitions, but the corresponding methods on Node (Node#process_prompt and Node#process_tensor) do not return anything, so this tensor is always empty.

I see this was changed from Tensor -> Empty in c9ded9b and then back in 9954ce8. I have left the signature unchanged, but wanted to check: which is the expected behaviour?

@joshuacoles
Author

Closed in favour of the combined PR #734

@joshuacoles joshuacoles closed this Mar 7, 2025