-
Notifications
You must be signed in to change notification settings - Fork 51
Submission false positives recorded during load testing? #154
Comments
That's a good question. I'll throw in a few thoughts and perhaps you can look into which ones seem more likely. One thing I remember is that Umbraco doesn't guarantee the database stuff will work from a background thread. It could be that, under high load, that lack of guarantee is especially relevant. IIRC, Formulate submits to the database from a background thread. It could also be that there are errors happening (e.g., with database locks or something) that are happening on that background thread; maybe they aren't being logged properly, so something to look into. Note that Formulate will return a success=true result pretty much all the time. This happens nearly immediately, as most form submission handlers do the majority of their work on a background thread. |
Thanks for your feedback. After a week of testing every other thing we thought might be contributing to this issue, we're back to square one. In one test today we lost 3 of the first 10 records in a test of 750 submissions spread over 5 minutes (including record #1) and then many more after those. In similar tests on Friday we lost none. So server congestion may be playing a part. But in this situation the submission should fail on the front-end - not present a success message. Naturally we're disappointed Formulate is not using a system to ensure the record is written to the database before displaying a success message. Our trust in the platform has taken a real hit. Right now we're trying to figure out if we can write our own save workflow before the success message (in the Any advice you might have that could assist us (and prevent us from losing submissions) would be greatly appreciated. Cheers. |
Would you be willing to submit a pull request? If the issue is what I imagine it to be, I imagine what you could do is change the "Store Data" handler so that it makes a web request from the background thread, then in the new request would do the database work (which would no longer be a background thread, but would also not slow down the form submission request). You may also want to try and verify first that this is what's happening. Perhaps the background thread is throwing an error that isn't being logged, so adding a bit of logging first may help with testing. Here is the current handler code for reference: https://github.com/rhythmagency/formulate/blob/master/src/formulate.app/Forms/Handlers/StoreData/StoreDataHandler.cs#L225-L244 Here is where the background thread is created (looks like it does have a bit of logging, FYI): https://github.com/rhythmagency/formulate/blob/master/src/formulate.api/Submissions.cs#L249-L265 |
Well, it looks like what is happening is duplicate GUIDs are being inserted resulting in the following
I have left off the parts that are not relevant of stack trace for the moment. I thought what was happening was a race condition. Keep in mind, we are hitting this form pretty hard, so if two submissions race to the insert and the I thought putting a thread lock around the database insert statement in the Then I realized that I notice that the To test this, I took off the thread generation and it submitted to the database without any errors like the one above. To mitigate this issue, while still allowing to spawn a new thread for This does not quite get rid of the issue of the spawned thread passing back any errors, but at least it deals with the Form getting hammered by our stress tests. To deal with the asynchronous issues, AJAX would have to deal with a promise and three states rather than success or failure:
I am not quite familiar with your javascript side of things to handle any of this at the moment, without a bit of guidance where to look. I am interested in your thoughts about creating instances for active FormHandlers. Thanks |
I noticed in your Change Releases:
This seems to be exactly the issue. If I put the I am looking into using |
Somebody is looking into the error relating to the GUID. The plan is to handle that as part of this ticket. Thanks for the pointers; that should make the fix happen much faster. Regarding changing how Formulate works, please create a new GitHub issue for that so we can track the issues separately. My thought on that one is that we'd want to have two API calls (the first to send the data, the second to complete once the background tasks have completed). Then we can emit a new event ("formulate: submission verified" or something like that) on top of the existing event(s). That way, you can choose which event you want to respond to. |
@TFAstudio FYI, we've validated that your analysis is correct and we have a fix we've spoken about. Just need to find some time to implement it; that should happen this week, I'm guessing. The fix is relatively simple, so it shouldn't take long. |
Excellent. It would be really good to get the duplicate GUID fix by the end of the week, we have another scheduled load testing run at the beginning of next week. I will open a new issue regarding the threading aspects soon. Thanks again |
…er data being shared among multiple threads. See: #154
FYI, I have a fix for Umbraco 7 (not yet released): 273b149 I you are able to test that fix before I release it, that would be super. I'll see if I can work on a similar fix for Umbraco 8 (supposing it's an issue with both versions). |
I did a quick review of the Umbraco 8 code and it seems likely that it will not be impacted by this issue, so no fix should be required. I'm thinking that because the FormHandlerTypeCollectionBuilder has a transient lifetime, which I interpret to mean that the form handler instances will be created fresh on each web request: formulate/src/formulate.app/CollectionBuilders/FormHandlerTypeCollectionBuilder.cs Line 11 in 3004d96
And if each request has their own instance, then there should be no issue of data being shared across threads. |
Actually, that's just for the instantiation of the IFormHandlerType instances. I'm not sure about the IFormHandler instances. Caching may still cause issues. Will need to review Umbraco 8 a bit more. |
…er data being shared among multiple threads. See: #154
Created an Umbraco 8 fix too: 9e9405e |
FYI, Formulate 2.4.8 has the fix included: https://github.com/rhythmagency/formulate/releases/tag/v2.4.8 Would be great if @TFAstudio could test it to be sure it's fixed. |
Great news! We'll test ASAP (probably tomorrow, as our lead dev is out sick today). |
I built off the tagged 2.4.8 version for Umbraco 7 and deployed to our testing server. Our load testing has shown that there are no more duplicate IDs, so that fix seems to have worked! We have not tested the Umbraco 8 version. Thanks for your quick response. |
Great, thanks for the detailed report and follow up. FYI, I'm going to hold off on releasing the Umbraco 8 version for a bit since there seems to be less urgency behind it (though the fix is already in the repo). Closing this ticket now. |
We've done serious load testing of the solution today and the results are very promising. So here's our questions
|
IIRC, the order is important and a failure in one handler will prevent the subsequent handlers from running (which is why I tend to make the "Store Data" handler the first one). |
We recently used Jmeter to test the limits of a form on some website hosting infrastructure. The page response times were excellent and errors were minuscule. Hosting seems terrific for the traffic volumes we anticipate.
However, we're currently investigating why some submissions did not land in the database. Jmeter recorded Success:True for all attempted form submissions but our resulting csv download revealed missing records. Tests were conducted from 500 submissions in 5 minutes to 8000 in the same time frame.
Is it likely the environment Jmeter was run in, or the test configuration scripts, were somehow able to generate false positives? Perhaps the error reporting was not catching failures as it should?
Or could the validation or submit-to-success round trip be misfiring under load?
We've not seen this kind of thing before and would welcome any theories.
The text was updated successfully, but these errors were encountered: