Run handler code via a timer in the main loop #41
Conversation
Thanks for the update! And I'm looking forward to getting this released. I'm a bit confused on the timer solution, though. Is this equivalent to running the handler code in a different thread? And if so, why not just use threading?

About the use of buffer-local variables, I think I can fix this, and in fact I'm working on something that will fix it now. It's part of my work in updating the package to use generic functions, to let packages define the parts that change while having the implementations of the main llm functions be written (mostly) just once. That part is done on the main branch. The one problematic case is Open AI's function streaming, where we need to keep a structure updated with streaming information. I don't know how I'm going to do that yet. I'm still working on this and it will probably take at least this weekend to straighten out.
Hi @ahyatt, you can think of the timer solution like this: instead of running the user callback when the process filter is called (and maybe delaying its completion), the user callback is scheduled to run "later". "Later" is when Emacs is accepting output from a sub-process again, which is after the process filter has completed. I'm not sure if this is "equivalent" to running the handler in a different Emacs thread. Emacs also only switches to a different thread when accepting output from a process, but I'm not sure if it is the same. Since this is a tricky area, I stayed as close as I could to jsonrpc.el, because our use cases are similar. I can experiment with threads if you want, but I'm not sure it will solve the issue with the shared buffer-local variable.

I think the issue I ran into with the tester is that multiple calls are in flight and share this variable. One test assumes it is a string it can concatenate onto, the other thinks it is an object for function calling. At least that was my impression. If we could close over that variable (similar to what Vertex does), I think the problem would go away.

I'm not sure if I follow the problematic case about OpenAI. Isn't the state there also local to the request?

I'm also looking forward to getting this out. I'm using this at work, mostly with Ellama, and I haven't noticed any major issues, except for these tricky things around the process filter. Nevertheless it's actually quite stable, and I hope we can also sort out the last issues.
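As a hedged sketch of the "close over that variable" idea (the function names here are illustrative, not the actual llm API): each request captures its own accumulator in a closure, so concurrent requests cannot clobber each other the way a shared buffer-local variable can.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch: `my-make-streaming-callback' is not part of
;; the llm library; it only illustrates closing over per-request state
;; instead of using a shared buffer-local variable.
(require 'cl-lib)

(defun my-make-streaming-callback (partial-callback)
  "Return a callback that accumulates chunks in private state.
PARTIAL-CALLBACK is called with the text accumulated so far."
  (let ((response ""))                  ; per-request, not buffer-local
    (lambda (chunk)
      (setq response (concat response chunk))
      (funcall partial-callback response))))

;; Two in-flight requests no longer share state:
(let ((cb1 (my-make-streaming-callback #'identity))
      (cb2 (my-make-streaming-callback #'identity)))
  (funcall cb1 "Hello")
  (funcall cb2 "42")
  (funcall cb1 ", world"))              ; cb1 accumulates "Hello, world"
```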
Thanks for the explanation! BTW, I'm looking at the event handlers now, and I have a question about what they return.
Hi @ahyatt, I'm a bit confused because you say "return". I think you mean that the handler is called with the event source object and the event object, right? Ideally, I would like the event source and the text/event-stream media type to follow the spec here. Now that I look at it, I probably got this wrong, because the handler is called with 2 arguments: the event source and the event object. I initially thought the handler might want access not only to the event, but to the source as well. So, I can change that.

But you are probably more interested in the media type handlers. Let me explain how they work right now: the text/event-stream :handlers option is a map from event type to a function. The event type is a symbol as defined in the spec, and the function takes 2 arguments, the event source and the event object. There are some pre-defined event types in the spec. And depending on the event type, the data slot of the event object carries different information.
For the message event, the data slot carries the payload string from the stream. For the other events I set the data slot to something I thought is useful to the consumer of that event. I thought the open and close events might be interested in the plz-response object, and the error event in the plz-error object. I'm not sure if we even need to register handlers for all of those, though.

So, that's what we have now. You seem to want the handler to be called with only the data slot of the event, and not the event object itself, is that right? This would be a bit against the event source spec, but I can come up with a modified media type that suits your needs. I don't fully understand why this is causing trouble, though. Can you point me to the code you have so far, so I can better understand it? I will look at this tomorrow then, since it is getting late here. Also, if I should change the media type, can you please explain how exactly the handlers should be called for the other media types as well?
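To make the shape of the :handlers map concrete, here is a hedged sketch. The alist shape and the `data` slot follow the description above; the exact class and slot names should be checked against plz-event-source itself.

```elisp
;;; -*- lexical-binding: t; -*-
;; Sketch of an event-type -> function map as described above.  Each
;; handler takes the event source and the event object; the event's
;; DATA slot carries type-specific information (payload string for
;; `message', plz-response for `open'/`close', plz-error for `error').
(require 'cl-lib)

(defvar my-event-handlers
  `((open    . ,(lambda (_source event)
                  (message "stream opened: %S" event)))
    (message . ,(lambda (_source event)
                  ;; For `message' events, DATA is the payload string.
                  (message "data: %s" (oref event data))))
    (error   . ,(lambda (_source event)
                  (message "stream error: %S" (oref event data))))
    (close   . ,(lambda (_source _event)
                  (message "stream closed")))))

;; The alist would then be passed to the media type, e.g. (assumed API):
;; (plz-media-type:text/event-stream :handlers my-event-handlers)
```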
Looking at the main branch, it seems you would like the APIs of the text/event-stream, json-array, and ndjson media types to be more unified. Is that right? Since the event source is fairly well specified, maybe the others should follow a similar API to text/event-stream. So, instead of a single :handler function they would also take a :handlers alist, and maybe also have "event types" like open, close, error, etc. Would that help?
I would separate building the data structure that describes a request from the action that performs the request, and let each provider tweak parts of the request description, most importantly the media types. But I would not expose setting the media type as a generic function. So, maybe something along these lines:

```elisp
(cl-defmethod llm-chat-streaming-request ((provider llm-standard-chat-provider) prompt partial-callback response-callback error-callback)
  (list :url (llm-provider-chat-streaming-url provider)
        :data (llm-provider-chat-request-data provider prompt t)
        :then (llm-provider-on-success provider response-callback)
        :else (llm-provider-on-error provider error-callback)))

(cl-defmethod llm-chat-streaming-request ((provider llm-openai) prompt partial-callback response-callback error-callback)
  (append (cl-call-next-method provider prompt partial-callback response-callback error-callback)
          (list :media-types
                (cons `(application/json
                        . ,(plz-media-type:application/json-array
                            :handlers `((message . ,(lambda (event)
                                                      (llm-provider-on-partial
                                                       provider partial-callback
                                                       (json-read-from-string (oref event data))))))))
                      plz-media-types))))

(cl-defmethod llm-chat-streaming-request ((provider llm-vertex) prompt partial-callback response-callback error-callback)
  (append (cl-call-next-method provider prompt partial-callback response-callback error-callback)
          (list :media-types
                (cons `(application/json
                        . ,(plz-media-type:application/json-array
                            :handler (lambda (object)
                                       (llm-provider-on-partial provider partial-callback object))))
                      plz-media-types))))

(cl-defmethod llm-chat-streaming-request ((provider llm-ollama) prompt partial-callback response-callback error-callback)
  (append (cl-call-next-method provider prompt partial-callback response-callback error-callback)
          (list :media-types
                (cons `(application/json
                        . ,(plz-media-type:application/x-ndjson
                            :handler (lambda (object)
                                       (llm-provider-on-partial provider partial-callback object))))
                      plz-media-types))))

(cl-defmethod llm-chat-streaming ((provider llm-standard-chat-provider) prompt partial-callback response-callback error-callback)
  (llm-request-plz (llm-chat-streaming-request provider prompt partial-callback response-callback error-callback)))
```
That's it for today. Talk to you tomorrow!
Thanks for the helpful messages. I think in all the cases you describe (open, error, message) the data is the thing important to the user, so it should be passed as-is to the callback in the event handler. It would be different if the various event handlers needed the other slots, such as the event's origin or last event ID.

I worked a few hours today and got everything working with plz again and the new structure. Your proposal is interesting as a way to structure the calls. I read it after I had a different solution, though, and I'm not (yet) sure it solves problems that aren't solved in a more straightforward way. But it may come in useful; if I run into problems I'll see if this design can help. Thank you!
It might be nice to have things more unified; I don't think it's necessary, but the fact that no other callback used a wrapper while the event handler did seemed wrong to me. It's currently fine with me if things remain separate, but I'd like to resolve the event-handler callback issue we've been discussing.
Hi @ahyatt , I updated the PR with the following changes:
I noticed that the indentation in many files seems to be strange. I saw the body of a when form indented to the left of the actual when symbol. I didn't touch that. I also updated my own provider to use the generic methods. It works very well and I could throw away a lot of code. Nice job! I have run the following providers with the changes in this PR through the llm-tester:
It all works and I haven't noticed anything bad.
Thanks for the changes and the fixes! About the indentation: yes, something weird is going on and I don't know what yet. It appears I have tabs in my source despite turning off tabs in Emacs. I'm still trying to figure out what's going on there. Going back to the discussion of returning just the event data: do my arguments convince you, or have I missed something? Also, about the timer change itself: although I merged this in, I'm curious whether there is or is not something wrong in plz that made it necessary.
Hi @ahyatt, about the event data, I'm not really convinced. I think this is hiding data from a user of an event source. The SSE spec has an example of this, where handlers registered per event type receive the event object and read its data attribute.
And the handlers in the media type serve the same purpose. For a media type that is advertised as being for text/event-stream, I think they should receive the same thing the spec says they should. So, I would like to keep it that way. Is that a problem for you? I didn't see where this is causing issues yet.

About the timer: I think alphapapa just mentioned that plz does this as well, when I said that I discovered that jsonrpc.el is scheduling work to be done on the main loop from the process filter. The original issue was such a race condition, and it happened in rare cases. I was using a function from plz in the process filter that temporarily narrowed the buffer to parse the response body. And sometimes the process finished earlier than the filter. In that case plz crashed, because it was trying to find the HTTP status code in a buffer that had been narrowed in the process filter. I have now changed this code and we don't narrow anymore. Additionally, the process filter doesn't run user code anymore, which in rare cases caused it to finish after the process had already exited. Since those changes I haven't seen this happening anymore. I think it's still possible that this can happen, but the timer is supposed to make it more unlikely. I initially had some code in the PR that tried to handle this case explicitly, but I removed it.
I see your point. For me, adherence to the spec isn't particularly interesting, but having consistency between the various streaming handler classes is. So I think we still disagree, but I'm willing to live with the current state for now, especially because I want to finish this work. AFAICT, finishing the work means landing the PR in the llm package.
Yes, we need to get the PR merged. My paperwork is complete, but I haven't heard back from Craig to send alphapapa the OK. I put you on CC on that mail. I will ping Craig again if he does not reply in the next few days.

About EIEIO: do we really have to split hairs about this? We are already all in with the other half of the Common Lisp extensions. Both are a part of Emacs, and how does removing it simplify anything here? It actually adds value by providing documentation and validation for the slots of the classes. Those classes are part of the public API, and users are supposed to set these slots. I invested time in writing documentation for them, and defined the types so users can better understand how to set them.

I can open another PR tomorrow to rename the modules. Do you want just the renamed modules, or the complete code including the testing infrastructure I have in my existing repositories? Please take a look at the repos to see what's there. If you want the tests as well, can we then also please set up automated tests in the llm repository?
About EIEIO: if the code will soon move out to its own library, then I don't mind much. But if not, the cost is that I need to potentially understand EIEIO when investigating issues, and it's just more complication for me as a maintainer. I do know that you will help if there's anything wrong with these classes while they are part of the llm package.

About your testing infrastructure: yes, it probably makes sense to include it (perhaps in a subdirectory, as you do in your own packages).
Forgive the intrusion, but another consideration regarding EIEIO may be that of performance. If these functions are being called often and in quick succession, the overhead of EIEIO dispatch might be worth considering vs. simpler alternatives.
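One way to put a number on that concern is a micro-benchmark. This is a hedged sketch; the class and function names are made up for illustration and do not come from any of the libraries discussed here.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical micro-benchmark comparing cl-defmethod dispatch on an
;; EIEIO object against a plain function call.  `benchmark-run'
;; returns (ELAPSED-SECONDS GC-COUNT GC-TIME).
(require 'eieio)
(require 'benchmark)
(require 'cl-lib)

(defclass my-media-type () ())          ; illustrative class

(cl-defmethod my-process ((_obj my-media-type) chunk)
  (length chunk))

(defun my-process-plain (_obj chunk)
  (length chunk))

(let ((obj (make-instance 'my-media-type)))
  (list (benchmark-run 100000 (my-process obj "chunk"))
        (benchmark-run 100000 (my-process-plain obj "chunk"))))
```

On a hot path like a process filter called once per chunk, numbers like these would show whether dispatch overhead actually matters; for per-request setup it almost certainly does not.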
What about this alternative: I ask on emacs-devel if they are interested in adding the 2 repositories to ELPA. Then anyone who wants to use those libraries can, and those who don't want to don't have to. I can develop them independently, and I don't have to plz anyone's coding style or political interest. We are now at least 2 months into this, and the most important issue hasn't been solved in a way Emacs users can benefit from easily: anyone wanting to make a streaming HTTP request today (let's take text/event-stream as an example) has to either:
Additionally, they have to write a parser for the text/event-stream format. This is what I set out to solve after our initial discussion on the plz issue. Remember, we were talking about JSONRPC and all those formats. We now have a (at least to me) satisfying solution. The plz-media-type repository ships with a couple of batteries-included classes that handle some commonly used formats. But users can also bring their own, plz-event-source being an example of this. Both streaming and non-streaming requests are supported.

What's now suggested is to bury the only solution we have inside the llm library and "hide" it from Emacs users. Why are we even doing this? Is this in the best interest of Emacs users? Say there is another project that wants to consume text/event-stream (I actually might have one soon, and @josephmturner quickly showed up with interest). Should I depend on the llm library? Write my own process filter? Yes, those 2 libraries might still not be perfect and have undiscovered issues. But this can be fixed. Nothing is perfect from the beginning, and it doesn't have to be. If we put all this code into the llm library now, I don't see myself "moving the code out" again in the immediate future. I already separated things, so I don't see any benefit in combining the code again, renaming files, and then "moving it out" again.

@ahyatt When do you think the "soon" you were talking about actually is? As I said previously, my time on this is ticking, and if the code does what it should, I want to move on to more interesting things. So, if @alphapapa isn't going to add something similar to plz itself, releasing this to some Emacs package archive is the next best solution in my opinion. For @ahyatt to use it, I guess it should be GNU ELPA, right?

When asking on emacs-devel I can imagine some people asking to support url-retrieve as well. I briefly looked into this, but once I realized that the url library is replacing content in the response buffer with hooks, I ran away.
I now know what @ahyatt meant when he said there is something strange with control characters going on. So, this library is probably tied to plz for the near-term future, at least the request handling. I may make the code that parses the buffer more reusable, but probably not now. If you have better ideas how Emacs users can benefit from this, I'm all ears. I still think plz-media-type and plz-event-source are good names for the projects. I can also remove the plz- prefix if that helps, and I'm open to other naming suggestions. I just think that renaming the files and copying the code into the llm library is only a good solution if emacs-devel doesn't want to add them to ELPA (which I doubt).
I agree that asking on the emacs-devel mailing list is a good idea. You can just send a standard GNU ELPA inclusion proposal (see how others do it for an example), and explain the background. I've seen them also be skeptical of EIEIO, though. They tend to accept packages, but there's been debate before about its use. As far as "soon", I was thinking 6 months or less. But I agree that trying GNU ELPA is a good approach, so that "soon" really is ASAP. Thanks for your patience!
I would be glad to have this functionality available. I don't think it would be appropriate for the plz library itself, though. As well, the EIEIO-based, content-type dispatching is more than I'd want to add to plz. But I think that support for both of those collections of functionality could be based on a lower-level API in plz.

If you do want to publish these as their own packages, that is of course up to you. Also, if you were to publish them on ELPA, would you be intending to maintain them in the long term? You've mentioned that your "time is ticking," and you "want to move on to more interesting things," so I'm curious about your plans for that. I can't say that there's an obligation to keep maintaining a package published to ELPA, but in general it's preferable that packages have active maintainers, and it's good to be clear about intentions. One other comment:
Please don't misunderstand me: the intention is not to bury or hide anything. The intention is (or would be) to allow the code to mature and stabilize before publishing it as a library intended to be supported and used more widely in the long term.
@alphapapa I don't have the time to continue working on this indefinitely. About my intentions: they are to support @ahyatt with the migration to use curl for making HTTP requests in the llm library. I don't have plans to make plz-media-type or plz-event-source compatible with a future libcurl-based alternative, nor with url-retrieve. The libraries are designed to work with plz, and that's it. For that reason I would also keep the plz- prefix, so users can easily find them. I recently saw there is https://github.com/astoff/plz-see.el, so I think it would make sense to keep that prefix for my libraries as well, since they are tied to plz.

@ahyatt what about the other approach I mentioned? I continue work in my repositories and make sure everything works with the llm library. We "vendor" plz-media-type and plz-event-source by copying them over from time to time, and distribute the files for now with the llm library. If nothing serious comes up, and they were used "in anger", we submit the plz-media-type and plz-event-source libraries to ELPA in a couple of months. I'll make sure nothing breaks; should I change or improve something, I will open PRs to the llm library and test all of this thoroughly. What I'm less keen on doing is moving all those files into the llm library and renaming them, just to tear them apart and rename them again once we publish this to ELPA.
@r0man, keeping the files in my library temporarily sounds reasonable, but please add comments to that effect in the files themselves, so people know these will soon move out to their own repository. Let's have a goal to move them out no later than 6 months from now (also note this in the files). Thanks! |
@ahyatt I pushed a change to the plz branch/PR with a note about the files being vendored for now, and when we plan to submit them to ELPA.
Hi @ahyatt,
here is another update. I have opened another PR on plz.el. This PR allows installing a process filter. alphapapa is fine with merging this when the FSF replies to him. My paperwork is basically done.
alphapapa/plz.el#50
On this PR we discussed another issue that I saw sometimes: plz was signaling an error.
I think I now also know why this is happening. It happened in rare cases when the process filter was still using a function from plz.el that narrowed the buffer for a moment. If the process filter isn't fast enough, it can happen that the process exits before the filter is done. If the buffer is narrowed in such a case, the error can occur.
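A self-contained sketch of the hazard (the buffer contents and the status-line check are illustrative, not plz's actual code): anything that runs while the filter's temporary narrowing is still in effect sees only part of the buffer, so a search for the HTTP status line fails.

```elisp
;;; -*- lexical-binding: t; -*-
;; Illustrative only: a response buffer narrowed to the body, the way
;; a filter might temporarily narrow it.  Code that runs before the
;; narrowing is undone (e.g. on process exit) cannot see the status
;; line at the top of the buffer.
(require 'cl-lib)

(with-temp-buffer
  (insert "HTTP/1.1 200 OK\n\nbody")
  (narrow-to-region (- (point-max) 4) (point-max)) ; only "body" visible
  (goto-char (point-min))
  (looking-at "HTTP/"))                 ; => nil, status line not found
```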
I now removed the code that narrows the buffer. alphapapa suggested that it is a bad idea to run callbacks in the process filter, since the filter should try to be as fast as possible. I looked around a bit and found the jsonrpc.el package. It also uses a process filter and also calls user code. But the way it does this is by scheduling the work on the main loop with a timer.
https://github.com/emacs-mirror/emacs/blob/master/lisp/jsonrpc.el#L785
This might be a better solution and I changed the code now to do the same.
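The pattern can be sketched like this. This is a hedged sketch in the jsonrpc.el style, not the actual plz or jsonrpc.el code, and the :my-callback process property is made up for illustration:

```elisp
;;; -*- lexical-binding: t; -*-
;; Illustrative filter: the filter itself only buffers output, then
;; hands the user callback to the main loop with a zero-delay timer,
;; so a slow or signaling callback cannot stall the filter or race
;; the process sentinel.
(defun my-process-filter (process output)
  (when (buffer-live-p (process-buffer process))
    (with-current-buffer (process-buffer process)
      (goto-char (point-max))
      (insert output)))
  ;; Schedule the user code to run "later", on the main loop.
  (run-at-time 0 nil
               (lambda ()
                 (let ((callback (process-get process :my-callback)))
                   (when callback
                     (funcall callback output))))))
```

It would be installed with `(set-process-filter proc #'my-process-filter)` after storing the callback via `(process-put proc :my-callback ...)`.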
So what now changed is the following:
I tested the Ollama and the Vertex provider with the llm-tester and they are working fine. I tested the OpenAI provider with Ellama and it also seems to work.
There is however an issue with the OpenAI provider and the llm-tester, which seems to be related to the buffer-local llm-openai-current-response variable.
It always fails, but at different steps. If I run the failing step alone, it works. There seems to be a race condition with the llm-openai-current-response variable, now that we are running code with the timer.
I have the suspicion that some test steps compete in using this variable.
Can you help me with this? Do you think we can somehow get rid of the llm-openai-current-response variable?
Thanks, Roman.