Consolidated ChatGPT API improvements: Improve Compatibility, add requests specific token limits, and textual stop sequences #734
base: main
Conversation
…ocess_inference_result`
…o more closely align with OpenAI
…tion from the ChatGPT API
…und for textual stop sequence matches
This removes direct references to the internals of BufferedOutput to allow for better abstraction
Force-pushed from 1e54165 to 3731605
@AlexCheema following our discussions yesterday, I've chosen to consolidate all my changes up to but not including structured generation (and hence function calling) into this PR to make it easier to review. I am going to add some tests, and then this will be ready to review. I will rebase the latter two features on top of this, with a git history containing fewer false starts.
I've added the tests and ensured bench.py runs successfully. This is now ready to review!
I have used the official OpenAI SDK to perform the ChatGPT API tests, as this is the reference client implementation. I have not, however, added it to any requirements/setup file, as I could not find the other testing dependencies (e.g. pytest and pytest-asyncio) listed anywhere.
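For illustration, a test along these lines might look like the following sketch. The base URL, port, model name, and assertions here are assumptions for the sake of the example, not copied from the PR:

```python
import pytest
from openai import AsyncOpenAI

# Hypothetical test sketch: the official OpenAI SDK used as the reference
# client, pointed at a locally running exo node. The base_url, port and
# model name below are illustrative assumptions.
client = AsyncOpenAI(base_url="http://localhost:52415/v1", api_key="not-needed")

@pytest.mark.asyncio
async def test_stop_sequence_truncates_output():
    completion = await client.chat.completions.create(
        model="llama-3.2-1b",
        messages=[{"role": "user", "content": "Count upwards: 1, 2, 3, 4, 5"}],
        stop=["3"],
        max_tokens=50,
    )
    text = completion.choices[0].message.content
    # The stop sequence itself must not appear in the returned text.
    assert "3" not in text
```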
…e zero-length token output
Building on #720 and #716, this introduces textual stop sequences to the ChatGPT API, aligning with the OpenAI API Reference and allowing a request to specify that generation should cease after a given textual sequence has been generated.
This is implemented by buffering tokens on the final node of the inference loop until enough characters have been generated to guarantee that none of the listed stop sequences is present. If a stop sequence is found in the middle of a token or spanning tokens, the tokens returned from the final node to the ChatGPT API will not be a faithful replication of the tokens that the inference process emitted, but rather the text prior to the stop sequence, retokenised.
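A minimal sketch of the buffering idea follows. The class and method names are hypothetical (the actual implementation lives inside exo's inference loop and differs in detail); the key point is that the last `longest_stop - 1` characters must be held back, since a stop sequence could still be completing inside that tail:

```python
class StopSequenceBuffer:
    """Buffers decoded text until it is safe to emit, so that a stop
    sequence split across token boundaries is still caught."""

    def __init__(self, stop_sequences: list[str]):
        self.stop_sequences = stop_sequences
        # A stop sequence could still be forming inside the final
        # (longest_stop - 1) characters, so that tail is withheld.
        self.hold_back = max((len(s) for s in stop_sequences), default=1) - 1
        self.pending = ""

    def push(self, decoded_token: str) -> tuple[str, bool]:
        """Returns (text safe to emit now, whether generation should stop)."""
        self.pending += decoded_token
        # Earliest completed stop sequence anywhere in the buffer, if any.
        cut = min(
            (i for i in (self.pending.find(s) for s in self.stop_sequences) if i != -1),
            default=-1,
        )
        if cut != -1:
            # Emit only the text before the stop sequence and signal stop.
            emit, self.pending = self.pending[:cut], ""
            return emit, True
        # Otherwise emit everything except the tail that could still be
        # the start of a stop sequence.
        safe = len(self.pending) - self.hold_back
        if safe <= 0:
            return "", False
        emit, self.pending = self.pending[:safe], self.pending[safe:]
        return emit, False

    def flush(self) -> str:
        """At end of generation the remaining buffered text is safe to emit."""
        emit, self.pending = self.pending, ""
        return emit
```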
This data is passed around in the GenerationOptions object as introduced in #720. This PR should be considered after its two dependencies; I will rebase this and apply any changes to it as needed.
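The exact shape of GenerationOptions is defined in #720 rather than shown here, but conceptually it carries the per-request settings that this series of PRs adds. A hypothetical sketch, with field names assumed for illustration:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of the per-request data GenerationOptions might
# carry; the real definition is in #720 and may differ.
@dataclass
class GenerationOptions:
    # Request-specific token limit, mirroring the OpenAI max_tokens field.
    max_tokens: Optional[int] = None
    # Textual stop sequences, mirroring the OpenAI stop field.
    stop: list[str] = field(default_factory=list)
```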