What happened? (You can include a screenshot if it helps explain)
In ert 13.0.4, using REALIZATION_MEMORY to allocate RAM to the compute nodes does not seem to work. The SLURM command seff indicates that the compute nodes have been granted 2 GB of memory (the default for our system configuration) despite setting "REALIZATION_MEMORY 10G".
We have included a zip file with an ERT case that runs a very simple forward model that allocates a NumPy array of variable size.
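For readers without the attachment, a forward model of this kind might look roughly like the sketch below. This is not the script from the zip file: the function names and the command-line argument are hypothetical and chosen only for illustration. It uses `np.ones` rather than `np.zeros` because writing to every element forces the operating system to actually commit the pages, so the resident memory matches the requested size.

```python
import argparse


def elements_for_gib(size_gib: float, itemsize: int = 8) -> int:
    """Number of array elements needed to occupy roughly size_gib GiB."""
    return int(size_gib * 1024**3) // itemsize


def allocate(size_gib: float):
    """Allocate a float64 array of about size_gib GiB and return it."""
    # NumPy is imported here so the size helper above works without it.
    import numpy as np

    # np.ones writes to every element, so the memory is really used;
    # np.zeros may leave pages untouched until they are first written.
    return np.ones(elements_for_gib(size_gib), dtype=np.float64)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Allocate a NumPy array of a given size."
    )
    parser.add_argument("size_gib", type=float, help="array size in GiB")
    args = parser.parse_args()
    data = allocate(args.size_gib)
    print(f"allocated {data.nbytes / 1024**3:.2f} GiB")
```

Running such a script with a size below the 2 GB default and then with a size between 2 GB and the configured REALIZATION_MEMORY is enough to distinguish the two cases described in this report.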
The 'logs' folder contains two files: one of a run that was set to low memory consumption (which finished) and one with high memory consumption (which failed). One thing that stands out is that the command which runs SLURM does not include the --mem option which is typically used to allocate memory for a node.
NOTE: the SLURM tool sacct does not always give the same error message for a killed node. Sometimes it is OUT_OF_MEMORY and sometimes it simply says FAILED, but our system administrator confirmed that the failure was caused by OOM.
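The check for a missing memory option can be sketched as below, assuming the submit command has been copied out of the log as a single string. The helper name and the example commands are hypothetical and are not taken from the attached logs; `--mem`, `--mem-per-cpu` and `--mem-per-gpu` are the standard sbatch memory options.

```python
import shlex

# Standard sbatch options that request memory for a job.
MEMORY_FLAGS = ("--mem", "--mem-per-cpu", "--mem-per-gpu")


def memory_options(command: str) -> list[str]:
    """Return the memory-related options found in a submit command line."""
    found = []
    for token in shlex.split(command):
        # Handle both "--mem=10G" and a bare "--mem" followed by a value.
        name = token.split("=", 1)[0]
        if name in MEMORY_FLAGS:
            found.append(token)
    return found


if __name__ == "__main__":
    # Hypothetical command lines, for illustration only.
    print(memory_options("sbatch --job-name=test --mem=10G job.sh"))
    print(memory_options("sbatch --job-name=test job.sh"))
```

An empty result for the logged command would be consistent with the job receiving the cluster's default memory allocation.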
ert_test.zip
What did you expect to happen?
We expect the forward model not to crash if its memory consumption remains less than REALIZATION_MEMORY at all times. Instead, the run with high memory consumption fails even though it stays below that limit.
Steps to reproduce
Environment where bug has been observed