Steadily increasing memory consumption even with memory_strategy = "autoclean" #1257
Comments
What happens with
Then the memory usage remains constant and low (after more than 4 hours now).
Thanks for checking. So now it looks like futures are somehow holding onto superfluous data.
Also, what
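For context on the point above about futures holding onto data: anything a future expression references is exported to the worker and stays reachable until the future is resolved and collected. A minimal, generic sketch with the future package (not code from this thread):

```r
library(future)
plan(multisession, workers = 2)

big <- rnorm(1e7)        # ~80 MB in the main session
f <- future(sum(big))    # 'big' is exported to the worker as a global
value(f)                 # once resolved, the exported copy can be collected
rm(big); gc()            # the local copy is freed only when we drop it ourselves
```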
I am trying to reproduce your issue, and so far I am not successful. Here is a workflow with targets large enough to noticeably impact memory.
Each target takes around 282 MB in memory.

```r
x <- do.call(rbind, replicate(1e5, mtcars))
pryr::object_size(x)
#> 282 MB
```

With the default
So it looks like
By the way, I also noticed you set
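The full workflow from this comment is not preserved in this copy. A minimal sketch of a drake plan with comparably large targets, using the settings discussed in this thread (the target and helper names here are made up):

```r
library(drake)

# ~282 MB per target (3.2 million rows of mtcars), built quickly by row indexing.
make_big <- function() mtcars[rep(seq_len(nrow(mtcars)), 1e5), ]

plan <- drake_plan(
  big_1 = make_big(),
  big_2 = make_big(),
  big_3 = make_big(),
  big_4 = make_big()
)

make(plan, memory_strategy = "autoclean", garbage_collection = TRUE)
```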
In #1257 (comment) I had a typo in the code that made targets small. Memory on Windows was still constant with time on average, but up around 2400 MB. Not sure why it was so much higher, but autoclean + garbage collection still appears to be working.
Thanks for your effort so far! I use plan("multisession"). I will update to the new version and try again. I'll let you know if it works.
Upgrading to 7.12.1 does not seem to have an effect. Another thought: in my plan, I call Stata via PowerShell. Might this be a problem? Apart from this, there's nothing unusual, I guess...
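For reference, a hypothetical sketch of how a target might shell out to Stata through PowerShell; the actual call is not shown in this thread, and the executable path and batch flags below are assumptions:

```r
# Hypothetical helper: run a Stata do-file in batch mode via PowerShell.
run_stata <- function(do_file) {
  cmd <- sprintf('& "C:/Program Files/Stata16/StataMP-64.exe" /e do "%s"', do_file)
  system2("powershell", args = c("-NoProfile", "-Command", cmd))
}
```

If the call looks roughly like this, Stata runs in a separate child process, so it should not by itself grow the R session's memory.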
Are there any failed targets? That's where I thought 7.12.1 would help.
I am not sure; I am not familiar with Stata. How are you calling it? Does Stata run in a child process? Is there a way to test your workflow without Stata?
No, all targets build just fine. And yes, I can skip the Stata targets. I'll let you know tomorrow if this has any impact.
Skipping the Stata targets didn't solve the problem either. I think I need to dig deeper to create a reprex, so that you can actually see what's going on. Thanks for already spending time on this! I'll post a reprex as soon as I've figured out in which cases exactly the problem occurs.
Small update: if the issue is in fact related to a large cache, this might be why it's hard to create a reprex that shows the problem. Are there any known issues with large caches? And is ~690 GB large in drake terms?
690 GB is larger than most
Any change since 7bb9b51 on this issue? Also, do you use any dynamic branching in your plan?
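A rough, generic way to check how much disk the default .drake/ cache occupies (not from the thread; assumes the fs package is installed):

```r
library(fs)
info <- dir_info(".drake", recurse = TRUE)  # every file in the cache
sum(info$size)                              # total bytes used on disk
```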
No changes yet. Finished the plan sequentially since I needed the results. Didn't have time to review the issue yet, sorry! But I will work on a plan with targets of similar size this week; maybe I'll get some insights from that. I use only static branching in my plan.
Thanks, that helps. Also, I wonder if it is something to do with the data structures you are using. ggplot and lm objects contain their own special environments, and the data in them can get surprisingly large. Maybe check to see if later targets are actually larger than earlier ones.
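One way to follow this suggestion and compare target sizes, including any environments captured inside them (a sketch, not from the thread; assumes pryr is installed and each target fits in memory):

```r
library(drake)

# Read each cached target and measure it, environments included.
target_sizes <- vapply(
  cached(),
  function(name) as.numeric(pryr::object_size(readd(name, character_only = TRUE))),
  numeric(1)
)
head(sort(target_sizes, decreasing = TRUE), 10)  # the ten largest targets
```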
Sorry for the late reply. Large objects are only data.tables. Apart from that, just small lists or vectors. The targets differ in size, but not substantially.
Glad you eliminated that possible explanation, that helps. To troubleshoot further, I think we've reached the point where we really do need a reprex.
I'm pretty busy at the moment, but I'll try to make a reprex. Side note: my stakes here are quite high, since my colleagues and I are working on a proof of concept for our future data production which (at the moment) includes drake (which I'm pretty excited about). This was only possible due to your patient help during the last months!
This issue has been up for a while, so I am closing it until we can reproduce it. Please ping me again when you have an end-to-end reprex.
Prework

- Read and abide by drake's code of conduct.
- If possible, install the development version (remotes::install_github("ropensci/drake")) and mention the SHA-1 hash of the Git commit you install. [Sorry, I cannot install from GitHub from this machine.]
Description
I use drake for producing a large research dataset (~1TB), which is chunked into pieces of 5 to 10 GB. My machine has 192 GB RAM and I run only 3 jobs in parallel.
In order to keep memory usage low, I specify the following configuration:
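The configuration block itself is not preserved in this copy. A minimal sketch of settings consistent with the description in this issue (the parallelism backend, job count, and file paths are assumptions):

```r
# _drake.R (sketch): r_make() sources this file and runs the returned config.
library(drake)
source("R/plan.R")  # hypothetical path; defines `plan`

drake_config(
  plan,
  memory_strategy = "autoclean",   # drop targets from memory once built
  garbage_collection = TRUE,       # run gc() after each target
  parallelism = "future",
  jobs = 3
)
```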
I run my plan via `r_make()` and everything seems to work nicely. However, despite configuring `memory_strategy = "autoclean"` and `garbage_collection = TRUE`, memory usage grows steadily (over several hours). Finally, the machine crashes and I have to start over again (thanks to drake I can pick up right where it crashed).

From what I read in the documentation, I would expect rather constant memory usage, since every target is discarded from memory after it is finished and only its direct dependencies are loaded beforehand. None of my targets has dependencies of more than 3 GB (stored as fst in the cache). Thus, I do not expect memory usage of more than 40 to 60 GB.
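One generic way to narrow this down is to have each command report the R session's memory after it finishes, to see whether the growth tracks particular targets. A sketch only; run_chunk() is a hypothetical stand-in for a real command in the plan:

```r
library(drake)

# Wrap a command's result and log the session's memory footprint after it is built.
with_mem_log <- function(value, name) {
  message(sprintf("%s done; mem_used = %.0f MB", name, as.numeric(pryr::mem_used()) / 1e6))
  value
}

plan <- drake_plan(
  chunk_1 = with_mem_log(run_chunk(1), "chunk_1"),
  chunk_2 = with_mem_log(run_chunk(2), "chunk_2")
)
# Then run as usual, e.g. via r_make() or make(plan, ...).
```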
Reproducible example
Since the error occurs only after several hours and the dataset is confidential, it is hard to generate a simple reproducible example. Please comment if you have suggestions.
Expected result
Memory usage should not grow steadily, and `r_make()` should finish without issues.

Session info