Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/sina/hdf5 output support #1480

Open
wants to merge 43 commits into
base: develop
Choose a base branch
from

Conversation

doutriaux1
Copy link

Summary

  • This PR is a feature
  • It does the following (modify list as needed):
    • Adds the capacity to output sina files as hdf5 and opens the door for more formats in the future

message << "The '" << RELATIONSHIPS_KEY
<< "' element of a document must be an array";
throw std::invalid_argument(message.str());
conduit::Node relationship_nodes = asNode[RELATIONSHIPS_KEY];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be a ref to avoid a copy

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed as many copies as I could, this one underwent additional changes

conduit::Node modifiedRelationshipsNode;

removeSlashes(relationshipNode, modifiedRelationshipsNode);
relationshipsNode.append() = modifiedRelationshipsNode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could pass ref to append result into removeSlashes to avoid one extra copy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy removed

Copy link
Member

@cyrush cyrush Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are the changes pushed? I see:

removeSlashes(relationshipNode, modifiedRelationshipsNode);
        relationshipsNode.append() = modifiedRelationshipsNode;

however, I expect something like:

removeSlashes(relationshipNode, relationshipsNode.append());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cyrush @gwaegner cannot push. I just pushed the changes.

src/spack Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this sub module add intended?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submodule removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the PR is trying to make unintended changes to submodules (blt, data, radiuss-spack-configs)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submodule changes came from develop merges, double checked for intentionality

Copy link
Member

@bgunnar5 bgunnar5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some suggestions for you. Also wondering if the fortran changes are tested?

@doutriaux1
Copy link
Author

@gwaegner please take a look at the failures

@kennyweiss kennyweiss requested a review from bmhan12 January 27, 2025 22:19
Copy link
Contributor

@bmhan12 bmhan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated ReadTheDocs documentation for this PR can be previewed here: https://axom.readthedocs.io/en/feature-sina-hdf5_output_support/axom/sina/docs/sphinx/index.html

Comment on lines 196 to 200
/**
* \brief Dump this document as an HDF5 File
*
* \return None, conduit automatically dumps the hdf5 file without a return
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of \return..., this comment can be something like:
\param [out] filename hdf5 file to dump the document

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation updated

@@ -203,9 +226,10 @@ class Document
*
* \param document the Document to save
* \param fileName the location to which to save the file
* \param protocol the file type requested to save as, default = JSON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe want to note here that HDF5 file type is also supported, or update Document class description.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation updated

Comment on lines 51 to 53
/**
* \brief An object representing the top-level object of a Sina JSON file
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the class header description for Document be updated, generalized to include hdf5?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation updated

@doutriaux1
Copy link
Author

@kennyweiss I think @gwaegner adressed everything and the tests are now passing. Can you please take on last look to make sure we covered everything. Thanks!

Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the branch @gwaegner

I built it locally and fixed the submodules and includes for axom/config.hpp, which needs to be brought in before checking for AXOM_USE_HDF5. I also added a few guards around uses of axom::Protocol::HDF5.

I think this branch still needs a bit more work to handle configurations without hdf5. Specifically, one decision is about whether sina::Protocol::HDF5 should always be defined or if it should only be defined when AXOM_USE_HDF5. In the former case, users can use the enum value without guarding against HDF5, but need some runtime checks or try/catch loops to handle cases where it's not available. Another issue relates to how the Fortran wrapper and examples handle this case.

I tested the branch by building a library set without hdf5 --

> ./scripts/uberenv/uberenv.py --spec "%[email protected]+devtools~hdf5+c2c~scr+profiling  ^conduit~hdf5"

and then configuring axom with the host-config file that it generated.

In doing so, I discovered some test failures in sidre and quest tests. I'll fix these in a separate branch.

@@ -15,13 +15,24 @@

#include "axom/sina/core/Document.hpp"

#include "axom/config.hpp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: We need to #include "axom/config.hpp" to bring in compiler defines such as AXOM_USE_HDF5. If we don't, they will always appear to be off.

I updated the includes in this branch to bring in axom/config.hpp

#include "axom/sina.hpp"

int main (void) {
axom::sina::Document myDocument = axom::sina::loadDocument("MySinaData.hdf5", axom::sina::Protocol::HDF5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: W/ the current implementation, this snippet will raise a compilation error when the code is built without HDF5, so effectively, HDF5 is required to build the code!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HDF5 is always defined so there shouldn't be a compilation error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HDF5 is always defined so there shouldn't be a compilation error

I'm not sure I understand this.
Are you saying it's always defined in our configurations that build the docs?

I'm suggesting either adding an ifdef guard to this snippet, or mentioning in the surrounding text that this snippet assumes axom is built w/ hdf5 support.

Comment on lines 314 to 318
else
{
throw std::invalid_argument(
"Invalid format choice. Please enter 'json' or 'hdf5'.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the code ever get to this else?

Copy link

@gwaegner gwaegner Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the above default, the code will hit this else if a user attempts Protocol::HDF5 with no HDF5 support since the else if(Protocol::HDF5) case will not be loaded

@kennyweiss
Copy link
Member

Note: I created a PR to fix some tests for Axom configs without hdf5 (independent of this PR): #1504

@doutriaux1
Copy link
Author

I've seen that. Thanks @kennyweiss . @gwaegner please take a look at Kenny's PR as well to see how he took care of things there.

@gwaegner
Copy link

gwaegner commented Mar 5, 2025

All requested changes should be accounted for. Is someone able to give this another look over?

Copy link
Member

@kennyweiss kennyweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gwaegner for incorporating all the suggestions.

I added a few more relatively small comments.
My only remaining concern is about how sina is installed, which should be easy to remedy. @white238 -- would you mind looking at the CMake files?

Please also update the RELEASE_NOTES to add a note about the new HDF5 output support.

Also, when it's ready, this PR will have to be squashed merged due to the submodule changes in its history.

Comment on lines 61 to 62
blt_list_append( TO sina_sources
ELEMENTS interface/sina_fortran_interface.cpp interface/sina_fortran_interface.f)
ELEMENTS interface/sina_fortran_interface.cpp interface/sina_fortran_interface.f.in)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to include the template file interface/sina_fortran_interface.f.in,

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to remove .f.in's inclusion

if (ENABLE_FORTRAN)
blt_list_append( TO sina_headers ELEMENTS interface/sina_fortran_interface.h)
blt_list_append( TO sina_sources
ELEMENTS interface/sina_fortran_interface.cpp interface/sina_fortran_interface.f)
ELEMENTS interface/sina_fortran_interface.cpp interface/sina_fortran_interface.f.in)
configure_file(interface/sina_fortran_interface.f.in interface/sina_fortran_interface.f @ONLY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll likely also need to install the generated file interface/sina_fortran_interface.f

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.f file is now explicitly installed.

Comment on lines +53 to +57
if(AXOM_USE_HDF5 AND ENABLE_FORTRAN)
set(AXOM_USE_HDF5_FORTRAN ".true.")
else()
set(AXOM_USE_HDF5_FORTRAN ".false.")
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, exposing the enum values will be a lot cleaner w/ shroud

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kennyweiss yes at some point we should convert to shroud as a separate PR. But it's kind of "if it's ain't broken why fix it?" at the moment 😉

Comment on lines 8 to 12
try:
import h5py
AXOM_USE_HDF5 = True
except ImportError:
AXOM_USE_HDF5 = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better way to do this?
It seems like this will depend more on the user's python environment than on their Axom config.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configured to now base AXOM_USE_HDF5 on the preprocessor version using a config.py.in file

@@ -116,8 +123,153 @@ def test_validate_contents_of_record(self):
self.assertEqual(double_arr, record["curve_sets"][curveset]["dependent"][double_2_name]["value"])


#HDF5 Test
@unittest.skipUnless(AXOM_USE_HDF5, "Requires h5py for HDF5-dependent tests")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice solution!


* HDF5 offers better size and speed efficiency when dealing with larger files/curve sets
and only outperforms more dramatically as size increases
* Hierarchal structure leads to being 2.5x more size efficient and 5x faster at our largest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Hierarchal structure leads to being 2.5x more size efficient and 5x faster at our largest
* Hierarchical structure leads to being 2.5x more size efficient and 5x faster at our largest

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -0,0 +1,55 @@
.. ## Copyright (c) 2017-2024, Lawrence Livermore National Security, LLC and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run the update copyright script on this branch

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

#include "axom/sina.hpp"

int main (void) {
axom::sina::Document myDocument = axom::sina::loadDocument("MySinaData.hdf5", axom::sina::Protocol::HDF5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HDF5 is always defined so there shouldn't be a compilation error

I'm not sure I understand this.
Are you saying it's always defined in our configurations that build the docs?

I'm suggesting either adding an ifdef guard to this snippet, or mentioning in the surrounding text that this snippet assumes axom is built w/ hdf5 support.

Comment on lines +36 to +46
enum class Protocol
{
JSON,
HDF5
};

const std::vector<std::string> supported_types = {"JSON",
#ifdef AXOM_USE_HDF5
"HDF5"
#endif
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +312 to +335
if(protocol == Protocol::JSON)
{
protocolWarn(".json", fileName);
auto asJson = document.toJson();
std::ofstream fout {tmpFileName};
fout.exceptions(std::ostream::failbit | std::ostream::badbit);
fout << asJson;
fout.close();
}
#ifdef AXOM_USE_HDF5
else if(protocol == Protocol::HDF5)
{
protocolWarn(".hdf5", fileName);
document.toHDF5(tmpFileName);
}
#endif
else
{
std::ostringstream message;
message << "Invalid format choice. Please choose from one of the supported "
"protocols: "
<< get_supported_file_types();
throw std::invalid_argument(message.str());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor (consistency):
Is there a reason to use a switch in loadDocument() but if/else conditions in saveDocument()?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switch statements do not allow us to instantiate variables within cases which causes problems for the JSON case. If we used a switch statement we would end up running auto asJson = document.toJson() even when saving as HDF5 or getting an unsupported protocol.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants