In Memory only Dataset #3094

Merged: 206 commits, Nov 10, 2021

@jenshnielsen jenshnielsen commented Jun 8, 2021

Implements an in-memory dataset that can be used to measure without writing the raw data to the db.
Furthermore, this enables you to reload a dataset from a netcdf file, write the metadata back to a db file, and then plot it or explore it in other ways.
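
To illustrate the idea only, here is a toy model of "raw data in memory, metadata in the db". It is not the QCoDeS implementation: the class `ToyInMemoryRun`, its methods, the `runs` table layout, and the path `demo.nc` are all hypothetical names made up for this sketch.

```python
import sqlite3


class ToyInMemoryRun:
    """Toy model: raw results stay in a Python dict, only metadata goes to SQLite."""

    def __init__(self, conn, name):
        self._conn = conn
        self._data = {}  # parameter name -> list of measured values
        self._points_written = 0
        # Hypothetical, minimal runs table: it holds metadata, never raw data.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS runs "
            "(run_id INTEGER PRIMARY KEY, name TEXT, export_path TEXT)"
        )
        cur = conn.execute("INSERT INTO runs (name) VALUES (?)", (name,))
        self.run_id = cur.lastrowid

    def add_result(self, **values):
        # Raw data is appended to the in-memory cache; no SQL is issued here.
        for key, value in values.items():
            self._data.setdefault(key, []).append(value)
        self._points_written += 1

    def set_export_path(self, path):
        # Only metadata (where an exported file would live) is written to the db.
        self._conn.execute(
            "UPDATE runs SET export_path = ? WHERE run_id = ?",
            (path, self.run_id),
        )

    def __len__(self):
        # Counts points written, independent of any SQLite row count.
        return self._points_written


conn = sqlite3.connect(":memory:")
run = ToyInMemoryRun(conn, "demo")
for x in range(3):
    run.add_result(x=float(x), y=float(x) ** 2)
run.set_export_path("demo.nc")  # hypothetical path; no file is written here
```

In this sketch the database ends up with a single `runs` row pointing at the external file, while the measured values never touch SQLite.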

  • Write a changelog entry

codecov bot commented Jun 8, 2021

Codecov Report

Merging #3094 (fc38c14) into master (de6e7de) will increase coverage by 0.38%.
The diff coverage is 90.60%.

@@            Coverage Diff             @@
##           master    #3094      +/-   ##
==========================================
+ Coverage   65.31%   65.69%   +0.38%     
==========================================
  Files         223      225       +2     
  Lines       29969    30463     +494     
==========================================
+ Hits        19573    20012     +439     
- Misses      10396    10451      +55     

@jenshnielsen jenshnielsen force-pushed the dataset_in_memory branch 2 times, most recently from 2277422 to 9a18a70 Compare June 10, 2021 10:28

jenshnielsen commented Jun 10, 2021

Things to figure out.

  • If a dataset is loaded from netcdf, should we add it to the runs table in the current db? No, but see below.
  • Need to handle attempts to write extra metadata for a dataset loaded from netcdf to a file that does not have the dataset.
  • Should there be a function to add a dataset, including the results table, back to the db? Yes, but done outside this PR.
  • Currently, changes to metadata of any kind are not written to the db except when the run is marked as started and when it is marked as completed. They are also not added to exported data after the data has been exported. This is now done.
  • How do we handle the __len__ attribute? It is somewhat sqlite specific since it counts the number of sqlite rows. It now captures the number of points written.
  • Upgrade create_runs to somehow mark in the runs table that the data is not contained in the db but stored externally.
  • Not strictly part of this PR, but important: add a test for complex data export. Handled by "Export complex numbers to netcdf" #3126.
  • Test that data, including complex values, can round-trip correctly when exported to netcdf.
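
A minimal sketch of what a round-trip test for complex data checks, assuming the values are stored as separate real and imaginary parts. That is one common workaround for formats without a native complex type and is not necessarily what #3126 does. JSON stands in for the netcdf file so the example stays standard-library only, and the helpers `encode_complex` and `decode_complex` are hypothetical.

```python
import json


def encode_complex(values):
    """Split complex values into separate real and imaginary float lists."""
    return {
        "real": [v.real for v in values],
        "imag": [v.imag for v in values],
    }


def decode_complex(payload):
    """Recombine the real and imaginary lists into complex values."""
    return [complex(re, im) for re, im in zip(payload["real"], payload["imag"])]


original = [complex(1, 2), complex(0, -0.5), complex(3.25, 0)]

# "Export" to a serialized form, then "reload" it.
serialized = json.dumps(encode_complex(original))
restored = decode_complex(json.loads(serialized))

# The round trip must reproduce every value exactly.
assert restored == original
```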

@jenshnielsen jenshnielsen force-pushed the dataset_in_memory branch 3 times, most recently from c2dbe6b to 5052976 Compare June 18, 2021 10:32
@jenshnielsen jenshnielsen force-pushed the dataset_in_memory branch 6 times, most recently from a5a585e to 8350557 Compare July 2, 2021 08:30
@astafan8 astafan8 added this to the 0.28.0 milestone Jul 14, 2021
@jenshnielsen jenshnielsen force-pushed the dataset_in_memory branch 3 times, most recently from e61f790 to d434f59 Compare August 19, 2021 12:37
@jenshnielsen jenshnielsen force-pushed the dataset_in_memory branch 7 times, most recently from 32bc756 to 422165e Compare August 26, 2021 08:30
@astafan8 astafan8 modified the milestones: 0.28.0, 0.29.0 Aug 31, 2021
@astafan8 astafan8 modified the milestones: 0.29.0, 0.30.0 Sep 14, 2021
@jenshnielsen jenshnielsen merged commit faa6735 into microsoft:master Nov 10, 2021
@jenshnielsen jenshnielsen deleted the dataset_in_memory branch November 10, 2021 10:24
2 participants