
Streaming Result for Query/QueryRange HTTP API #10040

Open · bwplotka opened this issue Dec 17, 2021 · 6 comments

@bwplotka (Member)

In Thanos we are moving forward with query sharding, parallelization, and pushdown ideas 💪🏽 Recently we pushed a change that can safely push down certain aggregations to Prometheus (thanos-io/thanos#4917).

What the sidecar does now, for certain "safe" functions like min, max, group, min_over_time, and max_over_time, is switch from fetching all series through remote read to the HTTP Query API. We build a PromQL query from the select hints we get from the root Querier, execute it, and then transform the query result into the Thanos gRPC StoreAPI response. (A sketch of that hint-to-PromQL translation follows below.)
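For illustration, a minimal sketch of what such a hint-to-PromQL translation could look like. This is not the actual sidecar code: buildPushdownQuery is a hypothetical helper, only the "safe" functions named above are accepted, and the `without` grouping case is omitted.

```go
package pushdown

import (
	"fmt"
	"strings"

	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/storage"
)

// buildPushdownQuery (hypothetical) turns select hints plus the original
// matchers into a PromQL string for the query/query_range HTTP API.
// Anything it cannot safely express falls back to remote read.
func buildPushdownQuery(hints *storage.SelectHints, matchers []*labels.Matcher) (string, bool) {
	ms := make([]string, 0, len(matchers))
	for _, m := range matchers {
		ms = append(ms, m.String())
	}
	selector := "{" + strings.Join(ms, ",") + "}"

	switch hints.Func {
	case "min", "max", "group":
		grouping := ""
		if hints.By && len(hints.Grouping) > 0 {
			grouping = fmt.Sprintf(" by (%s)", strings.Join(hints.Grouping, ", "))
		}
		return fmt.Sprintf("%s%s(%s)", hints.Func, grouping, selector), true
	case "min_over_time", "max_over_time":
		// hints.Range is in milliseconds, matching the hint timestamps.
		return fmt.Sprintf("%s(%s[%dms])", hints.Func, selector, hints.Range), true
	default:
		return "", false // not safe to push down
	}
}
```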

Now we have an efficiency problem with the Query API, because it's not streamed (something I have been moaning about for a long time 🙈). This means we unnecessarily wait for the full response payload instead of streaming the result series by series, as we do with remote read. AFAIK in most cases PromQL performs calculations series by series, so the first series is fully calculated before the next one; that would fit the server side of this API too.

To solve our use case, there are two options that would improve Prometheus here and bring wins in other areas too:

A) Introduce streaming (per series) for the query and query range Prometheus HTTP APIs. We could do this exactly as we did with remote read: don't stream by default, and opt in via header negotiation to a streamed JSON response, reusing the chunking logic we use with protobuf in the new remote read. It's not straightforward, since the current JSON result format was never designed to be streamed (see the sketch after this list).
B) Allow remote read to perform PromQL using hints. It's not hard to implement, but quite weird, as there would then be a strong overlap between Query/QueryRange and a remote read that can do PromQL. Plus it's easy to make a mistake and pass wrong hints (aggregations/functions that are NOT safe to push down).
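To make A concrete, here is one purely illustrative shape such an API could take: a handler that writes one JSON frame per series and flushes after each, with opt-in negotiation via the Accept header. Nothing here is an agreed format; the content type and frame layout are placeholders.

```go
package streamjson

import (
	"encoding/json"
	"net/http"
)

// seriesFrame is an illustrative per-series frame; the real format
// would need community agreement.
type seriesFrame struct {
	Metric map[string]string `json:"metric"`
	Values [][2]interface{}  `json:"values"` // [unixSeconds, "value"] pairs
}

// streamQueryRange writes one newline-delimited JSON frame per series
// and flushes after each, so a client can start decoding before the
// last series is even serialized.
func streamQueryRange(w http.ResponseWriter, r *http.Request, result []seriesFrame) {
	// Hypothetical opt-in negotiation, mirroring how remote read
	// negotiates its chunked protobuf content type.
	if r.Header.Get("Accept") != "application/x-streamed-json" {
		// Fall back to today's buffered JSON response.
		json.NewEncoder(w).Encode(result)
		return
	}
	w.Header().Set("Content-Type", "application/x-streamed-json")
	flusher, canFlush := w.(http.Flusher)
	enc := json.NewEncoder(w)
	for _, s := range result {
		if err := enc.Encode(s); err != nil { // one frame per series
			return
		}
		if canFlush {
			flusher.Flush()
		}
	}
}
```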

I would vote for A. I believe a streaming query response would help with many other use cases too (e.g. the Prometheus -> Grafana interaction, and Querier -> Query Frontend in Thanos).

If we are OK to pursue A, I could try to work with the community on proposing an exact format for the streamed JSON API.

cc @tomwilkie @roidelapluie @juliusv @fpetkovski @brian-brazil @moadz @RichiH

@brian-brazil (Contributor)

AFAIK in most cases PromQL performs calculations series by series,

That's not the case. It's true for range vector functions internally; however, basically everything else is evaluated step by step, including the functions you list.
So there's not much to be streamed, and you'd have to rewrite a large chunk of PromQL to make it happen, plus ensure that failure semantics are handled correctly, and somehow ensure that resource usage isn't adversely affected, given you'd now be doing more work and keeping more intermediate results in memory.
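A schematic of the evaluation order described here, assuming nothing about the real engine's internals beyond the step-by-step loop structure:

```go
package main

import "fmt"

// Schematic only, not the real PromQL engine: a range query advances
// step by step across *all* series, so the first series is complete
// only after the final step. Per-series streaming would need the
// loops swapped, with all the consequences described above.
func main() {
	seriesNames := []string{"a", "b", "c"}
	steps := []int64{0, 15, 30} // evaluation timestamps

	out := map[string][]int64{}
	for _, ts := range steps { // outer loop: time steps
		for _, s := range seriesNames { // inner loop: series
			out[s] = append(out[s], ts) // stand-in for evalAt(s, ts)
		}
	}
	// out["a"] only becomes complete in the last outer iteration, so it
	// could not have been flushed to a client any earlier.
	fmt.Println(out)
}
```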

@juliusv (Member)

juliusv commented Dec 17, 2021

Yeah, making the PromQL computation itself streamable on a per-series basis sounds hard, but the final JSON result could of course be streamed quite easily. Maybe that's still worth it for large payloads in some use cases?
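A minimal sketch of that idea: keep today's query_range JSON envelope, but encode the already-computed matrix one series at a time instead of marshaling one giant buffer. The types below are simplified stand-ins, not the real API structs.

```go
package streamjson

import (
	"encoding/json"
	"io"
)

type series struct {
	Metric map[string]string `json:"metric"`
	Values [][2]interface{}  `json:"values"`
}

// writeMatrixResult hand-writes the response envelope so each series
// can be encoded (and flushed by the caller) on its own, keeping the
// serialized-payload memory bounded to roughly one series at a time.
func writeMatrixResult(w io.Writer, matrix []series) error {
	if _, err := io.WriteString(w, `{"status":"success","data":{"resultType":"matrix","result":[`); err != nil {
		return err
	}
	enc := json.NewEncoder(w)
	for i, s := range matrix {
		if i > 0 {
			if _, err := io.WriteString(w, ","); err != nil {
				return err
			}
		}
		if err := enc.Encode(s); err != nil { // one series per Encode call
			return err
		}
	}
	_, err := io.WriteString(w, `]}}`)
	return err
}
```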

@brian-brazil (Contributor)

Once we have all the result samples, streaming from there makes sense. I think I ended up not doing it when making the json iterator changes, as it turned out worse performance-wise, but that could have changed.

@bwplotka (Member, Author)

Thanks for the input. Looks like we could descope this to just the API change and check whether it helps (it helps the Thanos sidecar case for sure).

On a separate note then...

you'd have to rewrite a large chunk of PromQL to make it happen, plus ensure that failure semantics are handled correctly, and somehow ensure that resource usage isn't adversely affected, given you'd now be doing more work and keeping more intermediate results in memory.

@brian-brazil you made me curious now! Do you think there is room to make PromQL compute as much as possible series by series internally, even if it doesn't currently? 🤔

@brian-brazil (Contributor)

brian-brazil commented Dec 17, 2021 via email

@beorn7 (Member)

beorn7 commented Aug 13, 2024

Hello from the bug scrub!

In the meantime, a lot has happened in the area of "yet another PromQL engine". Thanos has one, and Mimir is working on one, but they are all built for their specific use cases and not necessarily meant to be plugged into vanilla Prometheus as-is.

Having said that, the topic of creating a new PromQL engine for vanilla Prometheus still comes up quite often, and with a new engine, streaming as proposed here might be feasible. We'll keep this issue open as a reminder.
