Writing Assistance APIs #1067
Chromium Intent to Prototype:
Thanks for bringing this to us. The capabilities here are relatively new and undergoing rapid changes. We don't know the exact right shape for APIs here and don't think it's the right time to stabilize them. A lot of exploration can already be done in userland today, either by calling out to server-side APIs or by delivering models to clients and executing them with WASM, WebGPU, and other low-level APIs. We expect to see continued and substantial improvements in model capability and performance. We also expect that the ways in which applications interact with models will evolve. We'd like to let cowpaths develop before we pave them and converge on new web APIs.

We acknowledge a downside of this approach: the lack of shared client storage for model weights. It would be a better experience if the browser only had to download large weights once. We don't know of a privacy-preserving way to do this, short of high-level APIs like these that abstract away the details of inference. But since we expect developers will need fine-grained control over inference, models, and prompts to effectively explore new user experiences, we think the extra storage (for sites that use client-side inference) is reasonable for the moment. For general-purpose models like LLMs, given their size, browsers will likely be forced to make some choices eventually, but we don't think it would be effective to stabilize at this time.

There's also a question about who bears the cost of inference and how to balance server- and client-side compute responsibilities. Sites are able to source their own resources for intensive compute operations on the server side, or to download and run models on clients. In neither case does the browser yet need to get involved.

There are some cases where exposing high-level ML interfaces to the web may make sense. For example, background blur is a widely adopted feature in video applications. While sites could theoretically download and run their own background-blur models client-side, it's a well-understood use case with a simple interface, and the browser has the hardware access to enable more efficient processing. We should look for these opportunities and standardize such capabilities selectively.

In the meantime, we are experimenting with a WebExtension trial API that will share model weights across sites in a more privileged context. While that is a very different context, we hope to learn more about use cases from extension developers and will share what we find.
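For concreteness, the userland path described above can look something like the sketch below, which runs a summarization model fully client-side with Transformers.js. The library choice, model name, and generation options are illustrative assumptions, not anything specified in this thread:

```js
// Sketch: client-side summarization in userland, with no browser-provided
// writing-assistance API involved. Assumes the Transformers.js library;
// the model name and options are illustrative.
import { pipeline } from '@huggingface/transformers';

// First use downloads the model weights and caches them for this origin
// only — the duplicated-storage downside discussed above.
const summarize = await pipeline('summarization', 'Xenova/distilbart-cnn-6-6');

const longArticleText = '...a long input document to condense...';
const [result] = await summarize(longArticleText, { max_new_tokens: 100 });
console.log(result.summary_text);
```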
Thanks for taking a look, and for the thoughtful comment! We agree that we need to experiment and gain more experience before stabilization. That is why we are currently running origin trials for these APIs before we move toward shipping. We will be sure to publicize the results and share them with the community. Our current plan is to do so in the WebML Community Group, where I believe Mozilla is also participating, but let us know if it would be helpful to report them back to this thread as well. Similarly, we look forward to hearing about the results of your own trial.
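For readers unfamiliar with the proposal under discussion, the explainer sketches an API shaped roughly like the following. This reflects the WICG writing-assistance-apis explainer around the time of the origin trial; exact names, option values, and availability states may have changed since:

```js
// Rough shape of the proposed Summarizer API, per the explainer.
// The option values and input text here are illustrative.
const availability = await Summarizer.availability();

if (availability !== 'unavailable') {
  // May trigger a one-time, browser-managed model download that can be
  // shared across sites — the storage benefit discussed in this thread.
  const summarizer = await Summarizer.create({
    type: 'key-points',
    format: 'markdown',
    length: 'short',
  });

  const articleText = '...a long input document to condense...';
  const summary = await summarizer.summarize(articleText, {
    context: 'This is a technical blog post.',
  });
  console.log(summary);
}
```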
I suggest we mark this as
Request for Mozilla Position on an Emerging Web Specification
@-mention GitHub accounts): @domenic

Other information
TAG review: w3ctag/design-reviews#991