Possible speed-up for HDD initial sync: pre-sizing the database to its current size instead of constant resizing #9822
As you may (or may not) recall, LMDB was designed to have applications set a single maximum size at the time a DB is created and never bother with it again. Yes, resizing incrementally is a major waste of time. But ignorant whiners complained about dedicating all that disk space to the blockchain, and so we had to write the resizing capability.

Another thing to note - on most Linux filesystems, setting a large size on a file doesn't actually allocate the space for the requested size. It's only Windows and some Apple filesystems without sparse-file support that require all of the space to be allocated immediately. Yes, ext4fs uses extent-based allocation and so should be able to grow files more quickly than older filesystem designs; some other high-performance filesystems like XFS and JFS do as well. Regardless, none of these filesystems will force preallocation of the requested space, so there's no performance impact at creation time for setting a large size.

IMO any fresh run of monerod should just check the amount of free space on the blockchain storage device, set the DB size to "all available space", and be done with it. This question is not specific to HDDs; it should just be the standard operating procedure, always.
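A minimal sketch of that "size it once" idea, using the py-lmdb binding purely as an illustration (monerod's DB layer is C++, and the path below is made up, not monerod's actual default):

```python
import os
import shutil
import lmdb  # py-lmdb, used here only to illustrate the LMDB map_size concept

db_dir = "/var/lib/monero/lmdb"  # hypothetical location for the data file
os.makedirs(db_dir, exist_ok=True)

# Set the map size once, to (nearly) all free space on the device. On
# filesystems with sparse-file support this reserves no blocks up front,
# so there is no creation-time cost for asking for a huge size.
free_bytes = shutil.disk_usage(db_dir).free
env = lmdb.open(db_dir, map_size=free_bytes)
```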
Well, this seems absolutely foolish. The blockchain will eventually take up all that space, so doing it incrementally doesn't provide any benefit for those whiners.
Correct. It was never a good idea, but it's hard to go against user perception. There's a lot of added locking code in the DB driver now just to enable online resizing. Multithread concurrency would improve if all that overhead was removed. Probably the best approach would be to have the DB use all available space by default, with a commandline option to set a specific smaller size limit if the space needs to be shared with other programs.
Well, for the initial sync the resizing doesn't make any sense. But we still need all that for the constant growth of the DB, right?
If you've already allocated as much space as is available, then the only time you'll be resizing is when you've upgraded to larger storage. At that point, I don't think it's unreasonable to stop monerod and restart it with a new, larger size specified on the commandline. I don't see why it needs to be an automatic online operation. It makes no sense to constantly pay the overhead cost of supporting that, when it should be an extremely infrequent event.
Gotcha. I guess I don't understand resizing then. I was thinking that as you are getting new blocks from the network after initial sync, you would need to resize the database. My point with this issue is that at the time of a release, we could hard-code the current size of the Monero database, so that during initial sync there are no resizing events. As the node continues to stay synced with the network after initial sync, the database would need resizing (though I'm not sure about that, based on what you said). Sure, if the release has been sitting out for a while (2 years), then a new user would end up "non-resize synchronizing" up to the database size at release, but then encounter resize events for the blocks that formed after release. Or we could get fancy and incorporate the current database size into the seed node information / DNS info so that users always experience "non-resize synchronizing" for the initial sync.
Edited the title to be more specific, and to get rid of "LMDB database" (lightning memory-mapped database database).
Built into the checkpoint system, we list specific blocks. Given that we mainly have one database scheme (whatever is in
That at least eliminates most runtime overhead without requiring users to re-run |
Again, that's wasted work.
It is pointless and stupid to try to track the currently used amount of space since a blockchain always grows. |
If online resizing isn't active, would monerod just crash out when the 500 GB is hit? Or would the behavior be something like throwing an error message that says "allocated database space full, please restart with a new database size indicated"?
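For reference, at the LMDB level a write that exceeds the map size fails with MDB_MAP_FULL rather than corrupting anything; how monerod itself would surface that is a separate question, but an application can catch the error and exit with exactly that kind of message. A small py-lmdb sketch (path and size are made up):

```python
import os
import lmdb

# Deliberately tiny 16 MiB map so the failure is quick to reproduce.
env = lmdb.open("/tmp/lmdb-mapfull-demo", map_size=16 * 2**20)

try:
    while True:
        with env.begin(write=True) as txn:
            txn.put(os.urandom(16), os.urandom(2**20))  # 1 MiB values
except lmdb.MapFullError:
    # The failed transaction aborts cleanly; previously committed data is intact.
    raise SystemExit("Allocated database space full, please restart with a larger DB size.")
```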
So when testing the new 2for1 writes patches on HDDs, I often noticed that I would find monerod stuck. I thought I would dive into the logs for the "time gaps" that would appear. I turned to Claude for a Python script to grab these time jumps:
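A rough sketch of the kind of script in question (the timestamp regex and gap threshold are assumptions you would tune for your own monerod log format and log level):

```python
#!/usr/bin/env python3
# Report long gaps between consecutive timestamped log lines.
# Assumes each line starts with "YYYY-MM-DD HH:MM:SS"; adjust the regex if
# your log format differs.
import re
import sys
from datetime import datetime

STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
GAP_SECONDS = 30  # report any silence longer than this

prev_time, prev_line = None, None
for line in open(sys.argv[1], errors="replace"):
    m = STAMP.match(line)
    if not m:
        continue
    t = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    if prev_time is not None and (t - prev_time).total_seconds() >= GAP_SECONDS:
        gap = (t - prev_time).total_seconds()
        print(f"gap of {gap:.0f}s")
        print(f"  before: {prev_line.rstrip()}")
        print(f"  after:  {line.rstrip()}")
    prev_time, prev_line = t, line
```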
It turns out it's LMDB resizing.
Here's an example:
So, just grepping and cutting the output file of part of my sync (I was using a high log level, so I don't have the entire sync in logs), I got 2.7 hours of database resizing.
Now, this is where the limits of my computer science knowledge and my understanding of LMDB start to show, and those that hate AI will want to bludgeon me. But in conversations with Claude, it came up that pre-allocating the space on a spinny HDD could reduce fragmentation, placing all the blockchain data closer together. Apparently ext4 should be doing this itself to a degree, but perhaps Windows users don't get to experience that. And of course there would be a massive amount of time spent at the beginning of sync allocating 230 GB or whatever.
This would also cause an immediate failure if the user doesn't have enough space for the initial allocation, which would be good info for the user and would get them to use --prune mode off the bat, instead of waiting for an error and having to resync, because you can't prune an existing database when you have no space left.
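For concreteness, the distinction being discussed (sparse sizing vs. real preallocation) as a small Linux-only sketch; nothing here is monerod-specific, and the path and size are made up:

```python
import os

size = 230 * 1024**3  # ~230 GB, the ballpark figure from above
fd = os.open("/var/lib/monero/data.mdb", os.O_RDWR | os.O_CREAT, 0o644)

# Sparse "sizing": the file reports 230 GB, but no blocks are reserved yet.
os.ftruncate(fd, size)

# Real preallocation: reserves the blocks now (contiguously where the
# filesystem can manage it), and fails immediately with ENOSPC if the
# disk is too small -- the early, clear error a new user would want.
os.posix_fallocate(fd, 0, size)
os.close(fd)
```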
Anyhoo, just musing about it. Perhaps we don't need to worry about spinny HDDs for much longer... but plenty of folks still have them.