box_auth_service() randomly returns a 400 (Failed to connect to box.com API) message #166
Comments
I don't know, I have seen this happen more-and-more recently myself... We were very fortunate for #148 so that we can be more resilient, thanks @jameslamb! @nathancday - do you have any insights here? |
I've always noticed similar behavior with other API auth apps, I'm not sure what causes it, my gut feel is that it happens more often in R-sessions with multiple auth attempts. So my solve is to restart the R session and retry the auth, which seems to work for me most of the time. Beyond those feels, I think the automatic retry with backoff behavior added by Chicago R Collab is a reasonable attempt that balances end-user utility with API respect. I'm open to suggestions or demonstrations in other packages. |
In #148 I kept all the defaults for I know this is the hardest type of bug to address because it's so hard to reproduce 😬 |
Thanks so much, all, for engaging with this. Wondering what @ijlyttle and @nathancday you think of the idea of increasing the time that the function waits between retries. I wonder if that could reduce the frequency of errors from roughly 1/10 to something more like 1/20 ... or more. That would make it a lot more feasible to reliably connect via shinyapps.io (what I've been working on doing... per #165). Open to other ideas to address that goal and others here, of course, and happy to try things out and report back what works. Thank you again for creating a fantastic package that made it possible for me to do what I was trying to do - and for talking about this. |
One way we could go is to implement an
httr::RETRY(
  <other args>
  times = getOption("boxr.retry.times", default = 5)
)
If you wanted to modify the number of times, you could:
|
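A minimal sketch of how setting such an option might look from the user's side. It assumes the option name `boxr.retry.times` proposed above were adopted; that name is only a proposal in this thread, so boxr may not actually read it. The helper `retry_times()` is hypothetical and exists only to show the lookup.

```r
# Hypothetical: "boxr.retry.times" is the option name proposed in this thread,
# not a documented boxr option. retry_times() is an illustrative helper.
retry_times <- function() {
  getOption("boxr.retry.times", default = 5)
}

retry_times()                          # falls back to the default when unset

old <- options(boxr.retry.times = 10)  # raise the retry count for this session
retry_times()                          # now picks up the user's value

options(old)                           # restore the previous setting
```

Reading the value through `getOption()` at call time keeps the package default conservative while letting a user opt in to more retries.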
We have a testable hypothesis: does increasing the back-off reduce the number of 400 errors for authentication? I'm not in favor of retrying more times: the HTTP specification says a client should not repeat a request that returned a 400 without modifying it. I think we should view this as a back-off only problem, so we remain good API citizens. |
I agree - we want to remain good API citizens, and should with our defaults. At some point, I might like to supply the options, though. A couple of questions that are bugging me:
|
Hi, all, just thinking about this and wondering about next steps, first and foremost:
Happy to help however I can. My concern - zero percent yours - I am hoping to go "live" with this to collect data via a Shiny app sometime in November, so am for that reason especially motivated to help however I might! |
Hi @jrosen48 For me, this happens intermittently, almost like the occasional "bad hair day" - is it the same for you? Or is it more constant? Of course, intermittent is still unacceptable if there is a fix at hand. |
I think it is approximately 1 time in 10, based on the logs and from using and sharing it with colleagues. About how often I have a bad hair day (okay, maybe a bit less often). It is available from here: https://faast.shinyapps.io/generality-shiny/ I think the issue on my end is that the entire app fails if Box has an authentication error. I'm uncertain as to whether that or collecting no data is worse, but this does make errors very apparent. A very practical work-around is to simply ask people to re-load when it fails. |
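One way to keep a failed authentication from taking down the whole app might be to catch the error and report it instead. This is only a sketch: `safe_auth()` and `auth_fn` are hypothetical names, and it assumes the failure surfaces as an R error rather than just a printed message.

```r
# Sketch only: `auth_fn` stands in for whatever call performs authentication.
# Returns TRUE if the call completed, FALSE if it raised an error.
safe_auth <- function(auth_fn) {
  tryCatch(
    {
      auth_fn()
      TRUE
    },
    error = function(e) {
      # report the failure but let the caller decide what to do next
      message("Box authentication failed: ", conditionMessage(e))
      FALSE
    }
  )
}
```

The app could then check the return value and show a "please reload" notice rather than crashing.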
I did a bit of googling and came across this. As a short-term response, we could modify the code so that a 400 error on JWT auth (seems to be the only place I see it, too) will also print the |
I did two little experiments on this yesterday. The goal was to understand how frequently this occurs and if the delay between auth attempts has an effect on frequency. There are two levels 10 seconds and 100 seconds between auth attempts. Two interesting things popped out to me:
https://gist.github.com/nathancday/e7a038d70c93169afead1d131ce752cc |
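For readers who want to run a similar experiment, here is a rough sketch of the design described above (repeated auth attempts with a fixed delay between them). It is not the code from the gist; `failure_rate()` and `attempt` are hypothetical names, and `attempt` is assumed to return TRUE on success.

```r
# Sketch only: `attempt` is a hypothetical zero-argument function that returns
# TRUE when authentication succeeds and FALSE otherwise.
failure_rate <- function(attempt, n = 10, delay = 10, sleep_fn = Sys.sleep) {
  ok <- logical(n)
  for (i in seq_len(n)) {
    ok[i] <- isTRUE(attempt())
    if (i < n) sleep_fn(delay)  # fixed pause (seconds) between attempts
  }
  mean(!ok)  # share of attempts that failed
}
```

Calling it once with `delay = 10` and once with `delay = 100` would give the two levels mentioned above.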
I have a first theory as to what is going on - I started getting the failures for a while today. The failures were to get the token (in addition to the test request). When we make the request, we assert that our claim is valid for a limited time; see the
If your expiry claim is invalid, you get a 400 error - I was able to (maybe) reproduce the error by setting the expiry to an invalid value (1 hour in the future). I say maybe because I don't know if this is the problem. However, if the local computer's clock is sufficiently different from Box's clock, I can see where you might get a problem, and I can see how that could be intermittent. If that is the problem, is there a way to ask Box what time it is? |
One of the frustrating things about this is that it is not suited to a reprex. That said, I am having the problem again - I looked at the timestamps in the requests and responses, and compared my clock to the world time API - all good (which is frustrating). Then, I changed the Thinking to write up a small PR that will try different expiry times... |
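The idea of trying different expiry times could be sketched roughly as below. This is not the code from any PR; `try_expiries()` and `request_token` are hypothetical names, the candidate offsets are arbitrary, and `request_token` is assumed to take a list of JWT claims and return TRUE if the token request succeeded.

```r
# Sketch only: `request_token` is a hypothetical function that sends a token
# request built from `claims` and returns TRUE on success.
try_expiries <- function(request_token, offsets = c(60, 30, 15, 1),
                         now = as.integer(Sys.time())) {
  for (offset in offsets) {
    claims <- list(exp = now + offset)  # expiry as Unix time, in seconds
    if (isTRUE(request_token(claims))) {
      return(offset)  # first offset the server accepted
    }
  }
  NA_real_  # none of the candidate expiries worked
}
```

Starting from the longest offset and working down means a client whose clock runs ahead of the server still gets a chance with a shorter, more conservative expiry.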
Woohoo!
Thank you @ijlyttle and @nathancday for your incredible persistence in tracking down this very unusual (to me) kind of issue. |
@jrosen48 Thank you for bringing this to the surface and for your continued patience and willingness to test solutions (both working and non-working)! |
Thanks! Each level of |
Well then that's just perfect! |
fixed in #193 |
Now I'm getting this again:
In an email, @nathancday thinks that this may have to do with frequent auth attempts - why not check it out? To implement a back-off, one could initialize a sleep time, then double it and add some noise should we need to repeat the loop. |
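The back-off described above (initialize a sleep time, double it, add some noise) could be sketched like this. The function and argument names are hypothetical, and `attempt` is assumed to return TRUE on success; this is an illustration, not boxr's implementation.

```r
# Sketch only: `attempt` is a hypothetical zero-argument function that returns
# TRUE on success and FALSE on failure (e.g. a wrapper around an auth call).
retry_with_backoff <- function(attempt, max_tries = 5, initial_sleep = 1,
                               max_jitter = 0.5, sleep_fn = Sys.sleep) {
  sleep_time <- initial_sleep
  for (i in seq_len(max_tries)) {
    if (isTRUE(attempt())) {
      return(TRUE)
    }
    if (i < max_tries) {
      # wait, with a little random noise so repeated clients do not sync up
      sleep_fn(sleep_time + stats::runif(1, min = 0, max = max_jitter))
      sleep_time <- sleep_time * 2  # double the base wait for the next round
    }
  }
  FALSE  # every attempt failed
}
```

With the defaults, the base waits between attempts are 1, 2, 4 and 8 seconds, each with up to half a second of added noise.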
I also got this:
|
This is puzzling and frustrating |
Thanks once again for a great package. I noticed that box_auth_service() sometimes returns a 400 (Failed to connect to box.com API) message. The frustrating part is that it appears to be mostly random when it does. For example, below I run the function 10 times, and 4 of those 10 times were associated with 400 messages, three of which were retried and succeeded, and one of which did not.
I recognize that this appears to be on Box's - not {boxr}'s - side, but do you have any suggestions or ideas about how to fix or understand what is happening here?