rhizome.db growing without shrinking through meshms flood #106
Journal bundles use fewer network bytes to transfer, but we currently keep the superseded copies on disk as well. We could impose a meshms ply size limit and advance the tail of the bundle. We've also planned for multiple rhizome bundles to be created for the same conversation (see line 586 in ebb7500); that code might need to run on some other trigger, and probably isn't being tested very well.

We've also wanted to build a new storage layer for some time, with a number of improvements in mind. In other words, a better git object store. I keep wanting to start this, but we haven't had a pressing need or the time.
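For illustration only, here is a minimal sketch of the ply-size-limit idea mentioned above; the constant and the function are hypothetical, not serval-dna APIs. Once a ply grows beyond some cap, its tail could be advanced so the store can discard the oldest content:

```c
/* Hypothetical sketch, not serval-dna code: cap how much of a meshms ply is
 * retained, and advance the journal tail once the ply grows past the cap.
 * A real implementation would also round the new tail forward to a message
 * boundary. */
#include <stdint.h>

#define MESHMS_PLY_SIZE_LIMIT (64 * 1024)  /* hypothetical 64 KiB cap */

/* Given the ply's total size and its current tail, return the new tail. */
static uint64_t meshms_new_tail(uint64_t ply_size, uint64_t current_tail)
{
  if (ply_size - current_tail <= MESHMS_PLY_SIZE_LIMIT)
    return current_tail;                    /* still under the cap: keep everything */
  return ply_size - MESHMS_PLY_SIZE_LIMIT;  /* keep only the newest bytes */
}
```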
Thanks for your explanation so far.
And all versions are kept, wasting so much disk space (for 3 messages, 6 are kept on disk). A few kilobytes of text can end up as several tens or even hundreds of megabytes on disk. In our test a 95 kB conversation produced a 111 MB rhizome.db :D This keeps growing and growing, so it doesn't really scale well. At least for our paper it makes it hard to sell, because large local communities using MeshMS as a daily communication system will run into a big storage problem on their nodes in a short amount of time. A few hundred messages per conversation is not really much if you look at usage statistics from current messengers. Especially during an emergency people will write even more messages in a shorter time because of panic, eyewitness reports, contacting family etc. Taking the small router storage and the precious space on mobile devices into account, this is a big problem! And we're not even talking about malicious individuals or hackers here.

SQLite might cause some problems and make things slow, but here the problem is more the concept itself; storing the same data directly on disk doesn't help either. I understand what you wanted to achieve here, but the trade-off of network bytes vs. disk space imho is not working. If I have to store roughly 2 kB to transmit 7 messages of 53 bytes each (= 371 bytes, + overhead = 483 bytes of real data), and it gets even worse over time, I might be better off transmitting the whole 371 bytes over the wire/air, at least with Bluetooth or WiFi links. Sure, modern computers have lots of disk space (our TP-Link routers don't :( ), but while testing I had 35 GB of blobs in my database just for one long conversation.

Also, I'm not sure the garbage collection will help that much. Timing problems aside, getting rid of the oldest entries means getting rid of the smallest. Even if we only keep the 3 newest per conversation, once the database has reached a significant size (long-term use) the historic copies will also be quite large. But short term we can reduce a 100 MB database back to 1 MB or something like that, which would be good for the moment.

Getting back to finding a solution: I don't like complaining to someone else without real solutions myself, but this problem is probably not one that can be fixed with a few lines of code, and it has big implications for long-term use by larger communities, or for selling it in our disaster scenarios.
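To make the growth concrete: if every superseded copy of a journal ply is kept, the space used grows quadratically with the number of messages. A rough, self-contained calculation (only the 53-byte message size is taken from the flood test in this issue; the rest is an assumption, and manifest and database overhead are ignored):

```c
/* Back-of-the-envelope sketch, not serval-dna code: if every superseded copy
 * of a journal ply is kept on disk, storing n messages of s bytes each costs
 * roughly s*1 + s*2 + ... + s*n = s*n*(n+1)/2 bytes, i.e. quadratic growth. */
#include <stdio.h>

int main(void)
{
  const unsigned long long s = 53;                 /* bytes per message, from the flood test */
  for (unsigned long long n = 250; n <= 4000; n *= 2) {
    unsigned long long live = s * n;               /* size of the current journal */
    unsigned long long kept = s * n * (n + 1) / 2; /* all retained versions together */
    printf("%5llu msgs: journal %8llu bytes, copies kept %11llu bytes\n",
           n, live, kept);
  }
  return 0;
}
```

At around 2,000 messages this already works out to roughly 100 MB of retained copies for about 100 kB of live conversation, which is in the same ballpark as the 95 kB to 111 MB figure reported above.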
Also, wouldn't some kind of ACK from the conversation parties be enough to discard messages both parties have already received? There's no need for all nodes in the network to keep the history for eternity. If a new node comes along that has old messages, it can discard them as soon as it also sees the ACK, and middle nodes only need to keep a journal of what hasn't been ACKed so far. It would be a bit more management and meta-information that needs to be distributed, but it could help in the long run to keep the network clean and in a working state...
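As a sketch of that ACK-based idea (the struct and field names are invented for illustration and do not exist in serval-dna): once the other party has acknowledged everything up to some offset in a ply, a node could advance the journal tail to that offset and drop the acknowledged history.

```c
/* Illustrative only: not serval-dna's data structures. */
#include <stdint.h>

struct ply_view {
  uint64_t tail;            /* first byte still stored locally */
  uint64_t size;            /* total bytes ever appended to the ply */
  uint64_t peer_ack_offset; /* highest offset the other party has ACKed */
};

/* Everything below the peer's ACK has been delivered and could be discarded. */
static uint64_t ack_pruned_tail(const struct ply_view *p)
{
  return p->peer_ack_offset > p->tail ? p->peer_ack_offset : p->tail;
}
```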
I'm not disagreeing with you. Running the rhizome cleanup will hopefully tidy everything up, and it's what we try to run every 30 minutes. Clearly for your test case this isn't often enough. Triggering a cleanup based on some kind of used/free ratio there, and deleting orphan payloads shortly after the manifest is replaced, will probably help too.

But there's another reason we delay removing old payloads. If one of our neighbours is fetching the current version and a new version arrives, we want to ensure that we can complete the delivery of the current version. If you have a bundle like this that might be rapidly changing, it's better to complete each transfer than to abort it because you have a newer version.

Solving all of these issues without creating new ones is complicated. I would much rather nuke it from orbit and start again (Plan A). Anyway, on to Plan B: we teach the rhizome store layer to handle journal bundles differently. If we can't create a new link on this filesystem (errno == EXDEV?), there's still a way to save space, but things are a bit more complicated. For other errors, just fall back to the current code path. I think that should do it. Of course we'll need some test cases...
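For what it's worth, a minimal sketch of how that link-or-copy step could look: try to reuse the existing payload file via link(2), and only fall back to writing a full copy when the link can't be created (e.g. errno == EXDEV for a cross-filesystem link). The file-name arguments and the copy helper are made up for illustration; this is not serval-dna's actual store code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Fallback: write a fresh copy of the payload (roughly the current behaviour). */
static int store_payload_copy(const char *src, const char *dst)
{
  FILE *in = fopen(src, "rb");
  FILE *out = fopen(dst, "wb");
  if (!in || !out) {
    if (in) fclose(in);
    if (out) fclose(out);
    return -1;
  }
  char buf[8192];
  size_t n;
  while ((n = fread(buf, 1, sizeof buf, in)) > 0)
    fwrite(buf, 1, n, out);
  fclose(in);
  return fclose(out) == 0 ? 0 : -1;
}

/* Preferred path: hard-link the old payload so the shared data isn't written
 * twice; fall back to copying when the link fails. */
int store_journal_payload(const char *old_payload, const char *new_payload)
{
  if (link(old_payload, new_payload) == 0)
    return 0;
  if (errno == EXDEV)
    fprintf(stderr, "payloads are on different filesystems, copying instead\n");
  else
    fprintf(stderr, "link() failed (%s), copying instead\n", strerror(errno));
  return store_payload_copy(old_payload, new_payload);
}
```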
Also note that using the meshms command line while rhizome synchronisation is occurring will cause delays and perhaps failures due to database locking. And using curl to send messages via the RESTful API may be adding a 1-second delay: by default curl sends an "Expect:" header and waits for a "100 Continue" response. You can avoid this delay by adding a '-H "Expect:"' argument. I've just added this to our existing test cases.
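The same Expect/100-continue delay shows up when driving the RESTful API from C with libcurl; appending an empty "Expect:" header disables it, just like '-H "Expect:"' does on the command line. The URL, port, SIDs, credentials and form field below are placeholders, not confirmed values:

```c
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  curl_global_init(CURL_GLOBAL_DEFAULT);
  CURL *curl = curl_easy_init();
  if (!curl)
    return 1;

  /* Placeholder endpoint and credentials -- substitute your own. */
  curl_easy_setopt(curl, CURLOPT_URL,
                   "http://localhost:4110/restful/meshms/SENDERSID/RECIPIENTSID/sendmessage");
  curl_easy_setopt(curl, CURLOPT_USERPWD, "user:password");

  /* Multipart form body, as the command-line tests do with --form. */
  curl_mime *form = curl_mime_init(curl);
  curl_mimepart *part = curl_mime_addpart(form);
  curl_mime_name(part, "message");
  curl_mime_data(part, "hello", CURL_ZERO_TERMINATED);
  curl_easy_setopt(curl, CURLOPT_MIMEPOST, form);

  /* An empty "Expect:" header suppresses curl's default
   * "Expect: 100-continue" handshake and the wait it can add. */
  struct curl_slist *headers = curl_slist_append(NULL, "Expect:");
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

  CURLcode res = curl_easy_perform(curl);
  if (res != CURLE_OK)
    fprintf(stderr, "request failed: %s\n", curl_easy_strerror(res));

  curl_slist_free_all(headers);
  curl_mime_free(form);
  curl_easy_cleanup(curl);
  curl_global_cleanup();
  return res == CURLE_OK ? 0 : 1;
}
```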
While testing the new version of servald we encountered a big bug regarding SQLite and MeshMS.

We flooded one machine from another with 32,000 messages. After a few messages we can only send one message per second, and for each message a new file entry is created in the database, but the old journal is not removed, so each new entry is as big as the old one plus the new message. After a certain rhizome.db size is reached, blobs are written to the filesystem.
A simple script to flood another host with messages can be found here: https://github.com/umr-ds/serval-tests/blob/master/meshms-flood
Even after a few hundred messages you can see that every few seconds the database grows by another megabyte, even though the messages are only 53 bytes long.

Using the RESTful API or the command line doesn't make a difference.