-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] 0.27.1 introduced an bug when two queries are sent one after another #1210
Comments
@MaximillionMai Can you confirm which version this breaks on for you? That will help me do a comparison between that version and the previous version to figure out what might have changed here. |
Hi @designatednerd! We're experiencing the same issue. Testing shows that it reproduces in 0.27.1 and does NOT reproduce in 0.24.1 and 0.25.1. |
I have just noticed this too, in v0.27.1. In my case it's when calling |
We've been seeing this occasionally also, have not had time to look into it with more depth as we were assuming it was the dogshit wifi we have currently. Just a series of normal fetches in our case |
Looks like the problem is here - the data object that's been dequeued for the task which has just finished is This was introduced in Thought: Are any of you calling |
Nope! The only place I reliably see it is when I'm calling a series of fetch requests one after another, at one point I kick off a request in parallel for each of the objects received, but the request that fails isn't always one of those. |
@sneakyness Got a rough guess on how many requests at once you've got going before you see this issue? |
Also, that error has an associated value that should have the |
2-3 completed (each one must finish before the next can start), then 1+ fetch for one type while also making a singular request of another. The 1+ is commonly 4-10+, making more does not seem to change how sporadic it is. I'll start logging that now but IME it's always either of the last two |
Thanks. Woof, sporadic bugs are my faaaavorite to troubleshoot 😆 |
Unfortunately I'm in the middle of an API Model refactor, if I'm lucky I should be able to automate firing this process over and over to try and reproduce by end of week |
Just pushed up a branch where I added a test that checks if 300 identical URL requests hitting the client at the same time (from different threads) create different IDs, and it passes. Throwing some breakpoints in and doing some The only place the I've added this level of detail because I'm basically stumped. Clearly, something's causing the |
I'm using I'm not doing this specifically for the network requests, there's a decent amount of objects being initialized and sorted, just happened to be a clean spot to snag an id I needed for the request. Actually I've just realized that it might get called multiple times per index, do you think that's causing it? 🤡 |
I'm not sure what you mean by this |
If it's parallel can't it rarely call the same index twice? Especially in cases where the number of tasks is >8 (afaik 8 is the max number of threads it'll kick off) |
I still am missing something - what is "it" that could be calling the same index multiple times in this context - do you mean the |
concurrentPerform, so it might be calling the same fetch multiple times, but that doesn't explain why it would fail with "dataForRequestNotFound" instead of eventually just succeeding |
Ah! That makes more sense. I updated the test to use |
(basically if multiple tasks with the same ID are getting generated, then my "I will unique this based on the task identifier" strategy falls apart entirely) |
My case may be different from others in that the first request was fine. If we know for sure this is the cause, I can do whatever to make sure it's only called once. If there was an implementation of fetch/watch/etc like |
The first request would probably be fine in what you were describing because it's not trying to make multiple requests at the same time, it's once you start making multiple concurrent requests that the ID incrementor fails. Kind of wish they'd gone for a I'll think on that - for now I need to look at what to do about the fact that the ID isn't unique if it's hit concurrently repeatedly, because I thought Apple was handling that and they ain't. 🙃 |
👋Can I ask a couple of you folks to try out the |
Just tested the fix/multifetch branch. Unfortunately the issue still persists. In our case we just make two identical serial calls to fetch function on the same queue, and for one of these invocations resultHandler is not called back. It happens roughly 4/5 times, sometimes it works as expected. |
this sounds like completion handler overwritten somewhere without cancelling and new call created without canceling old one |
@RolandasRazma That's definitely what was happening when using the task identifier as the key, since the task identifier would not get incremented if the session was getting hit quickly enough, and everything for the particular key would get hammered by the next task. @yuriiik - Are you making other calls at the same time, or just the two identical calls? Also, thank you for testing, it's a bit frustrating that the |
@designatednerd – just two identical calls, one after another. |
The var is different in mine but otherwise they are the same call. Applied branch, saw the bug once, set a breakpoint, never saw it again 👻 The process of triggering the request was streamlined, I was able to run it on repeat long enough to forget about it and still got nothing! |
@sneakyness Did it continue to work fine after you removed the breakpoint? I'm thinking it might be a Heisenbug (in that it changes when observed, probably because the breakpoint slows things up enough to stop whatever was happening concurrently before) if it goes back to not working. If it randomly broke once and then worked forevermore, I have no goddamned idea 🙃 @yuriiik Something occurred to me - can you confirm that you're not using Query Watchers? There's a bug in the current version where we're super-agressively cancelling in-flight fetches if a cache result comes in that's fixed with #1156 (which is merged but not shipped). If you are using query watchers, do you mind pulling |
That lines up with what I figured was going on earlier, the only time it was happening was either after the backend had been rebuilt and the query did a cold run, or if we were in the "best worst zone" of wifi coverage which caused a similar amount of delay. This weekend I've got some nebulous and grand refactoring planned for the backend, I suspect once the schema tidies up this will disappear. If I manage to cause it again after that I will let you know. I'm inclined to say that this error is a "smell" and that ultimately it's best to find a way to get the result you want with less requests, even if it shouldn't happen |
@designatednerd Sorry to get back to you late! It was indeed introduced in version 0.27.1. |
@designatednerd we're not using Query Watchers, so that's not the case I think. |
OK, I think I got it. Basically: Concurrency is hard. 🙃 Please see #1227. |
#1227 was merged and is now available as part of Thanks for uncovering this! |
We can no longer reproduce the issue on |
For this function in ApolloClient,
@discardableResult public func fetch<Query: GraphQLQuery>(query: Query,
cachePolicy: CachePolicy = .returnCacheDataElseFetch,
context: UnsafeMutableRawPointer? = nil,
queue: DispatchQueue = DispatchQueue.main,
resultHandler: GraphQLResultHandler<Query.Data>? = nil) -> Cancellable
[Bug] 0.27.1 introduced a bug when calling the function above, one after another, with different queries. One of the two queries would not call the "resultHandler" callback. Sometimes the one that does call the callback would come back with this error: dataForRequestNotFound.
This bug does not reproduce in 0.24.1. I will check whether it reproduces in 0.25.1 and report back.
The text was updated successfully, but these errors were encountered: