Timeline status update may fail on landing zone deletion #1894
Comments
Still looking for a way to reproduce this. An interesting thing I observed is that in an affected project, other landing zone operations have been successfully triggered after the zone deletion. This means we reach the project unlock code in … Hence my interest is now targeted at this bit in run_flow():
There is no error in the log. Based on that, it looks like neither of these conditions is met. This is a weird one. I could add some debug logging to the aforementioned point in |
I queried our production database. Out of 901 deletions, in 15 the timeline event has gotten stuck in the … I'll try to come up with and test some variables which could affect this, but if I have no luck, I'll leave this ongoing for now and see if I can observe more of it in production. Mental note: here's a two-liner to grab affected events in the Django shell.
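As a rough illustration of the selection criteria only (not the actual Django query), the check for affected events can be sketched in plain Python over dictionaries. The field names `event_name`, `statuses` and `zone_status`, as well as the helper `find_stuck_events`, are hypothetical and not taken from the code base; a real query would go through the timeline and landing zone models instead.

```python
from typing import Dict, List


def find_stuck_events(events: List[Dict]) -> List[Dict]:
    """Return zone_delete events whose latest status is still SUBMIT
    although the related zone is already DELETED.

    Each event is a plain dict with hypothetical keys:
    event_name (str), statuses (list of str, oldest first), zone_status (str).
    """
    stuck = []
    for event in events:
        if event["event_name"] != "zone_delete":
            continue  # Only zone deletion events are of interest
        if not event["statuses"]:
            continue  # No status recorded at all, nothing to compare
        latest = event["statuses"][-1]
        if latest == "SUBMIT" and event["zone_status"] == "DELETED":
            stuck.append(event)
    return stuck


# Made-up sample data for demonstration
events = [
    {"id": 1, "event_name": "zone_delete",
     "statuses": ["SUBMIT", "OK"], "zone_status": "DELETED"},
    {"id": 2, "event_name": "zone_delete",
     "statuses": ["SUBMIT"], "zone_status": "DELETED"},
    {"id": 3, "event_name": "zone_move",
     "statuses": ["SUBMIT"], "zone_status": "MOVED"},
]
stuck_ids = [e["id"] for e in find_stuck_events(events)]
print(stuck_ids)
```

Only the second sample event matches, since it is a deletion whose zone is already DELETED while the event never moved past SUBMIT.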
One thing I noticed is that at least some of the zones in question (maybe all, I need to check) had previously failed with |
I'll keep an eye out for further occurrences of this issue. It might simply be caused by an admin setting the deleted status in the shell and neglecting to update or delete the related timeline status. If this is the case, the proper fix would be to provide suitable admin tools. For that, see #1892. This may well be the same issue as #1798, so I'm setting that one to |
This is a very random bug I've observed in production a couple of times (two times at the time of writing, to be exact):
- The landing zone status is set to DELETED
- The zone_delete event status remains at SUBMIT
I'll try to see how to reproduce this. Setting the zone status is the last task in the landing_zone_delete flow. A possible explanation could be an error/crash in run_flow() after the flow has returned. I have to look into server logs to see if any errors have been raised.
One possible cause might be an iRODS timeout similar to #1458 and #1893. OTOH, today I witnessed this happen in production at a time when iRODS connections were otherwise working without issues.
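To make a silent failure after the flow returns easier to spot, the final status update could be wrapped so that any exception is logged instead of vanishing. The following is only a minimal sketch under assumptions: `update_timeline_status`, the `set_status` callable and the `OK`/`FAILED` status names are hypothetical stand-ins and do not reflect the actual run_flow() code.

```python
import logging

logger = logging.getLogger(__name__)


def update_timeline_status(set_status, flow_ok: bool, error_msg: str = "") -> bool:
    """Call set_status() with the final status and report whether it worked.

    set_status is a hypothetical callable taking (status, message).
    Returns False and logs the traceback if the update itself raises.
    """
    status = "OK" if flow_ok else "FAILED"
    try:
        set_status(status, error_msg)
    except Exception:
        # Log with traceback so a stuck event can be traced in server logs
        logger.exception("Timeline status update to %s failed", status)
        return False
    return True


# Usage with a working and a failing updater
recorded = []
ok = update_timeline_status(lambda s, m: recorded.append(s), flow_ok=True)


def broken(status, msg):
    raise TimeoutError("simulated timeout")


failed = update_timeline_status(broken, flow_ok=True)
print(ok, failed, recorded)
```

With something along these lines, a stuck event would at least leave a traceback in the server log, which could help tell apart a failed update from an update that was never attempted.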