Add support for ingesting datasets having multiple dimensions in earth engine #492

Open
wants to merge 9 commits into
base: main
from

Conversation

kbp45-cusp
Contributor

@kbp45-cusp kbp45-cusp commented Jan 23, 2025

resolves #305

Solution approach:

Because the dataset has dimensions other than latitude and longitude, its data arrays must be reduced to 2D grids before ingestion, which means removing every other dimension. We accept a list of dimensions from the user and partition the data on them. The user may not want to partition on every dimension; in that case the remaining dimensions are flattened, so the final grid is still 2D. The user can also specify how the resulting COGs are named, based on the partitioned dimensions.
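A minimal sketch of the partition-then-flatten idea, in plain Python so it runs without the dataset. The helper name `plan_partitions`, its arguments, and the `<variable>_<dim>_<value>` band naming are assumptions for illustration only, not the code in this PR.

```python
from itertools import product


def plan_partitions(variables, dim_values, partition_dims):
    """Return one (partition key, band names) pair per output 2D image.

    Hypothetical helper: `dim_values` maps each non-spatial dimension name
    to its coordinate values (latitude/longitude are left out on purpose).
    """
    unknown = [d for d in partition_dims if d not in dim_values]
    if unknown:
        raise ValueError(f"Unknown partition dimensions: {unknown}")

    # Dimensions not used for partitioning are flattened into band names.
    flatten_dims = [d for d in dim_values if d not in partition_dims]

    # Band names are identical for every partition:
    # variable name plus one suffix per flattened coordinate (assumed scheme).
    bands = []
    for var in variables:
        for combo in product(*(dim_values[d] for d in flatten_dims)):
            suffix = "".join(f"_{d}_{v}" for d, v in zip(flatten_dims, combo))
            bands.append(f"{var}{suffix}")

    # One output image per combination of partition-dimension values.
    return [
        (dict(zip(partition_dims, combo)), bands)
        for combo in product(*(dim_values[d] for d in partition_dims))
    ]


plans = plan_partitions(
    variables=["t", "u"],
    dim_values={
        "time": ["2025-01-01", "2025-01-02"],
        "step": [0, 6],
        "level": [500, 850],
    },
    partition_dims=["time", "step"],
)
```

With two `time` values and two `step` values, this yields four partitions, each carrying the same four bands (`t` and `u` at both levels).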

Key notes for each CLI argument:

  • partition_dims: The list of dimensions on which the dataset is partitioned. For example, if a dataset has 5 dimensions (time, step, level, latitude, longitude) and the user provides only [time, step], the 'level' dimension is flattened into variables.
  • asset_name_format: Lets the user decide the name of the resulting COG. The user provides a template string containing dimension names, and it may only contain dimensions used for partitioning. 'init_time' and 'valid_time' can also be used as dimension names, provided that dim_mapping is passed as well.
  • Need for dim_mapping: This mostly handles the case where the 'step' dimension (or any dimension holding forecast offsets) is not in datetime format (it is either a timedelta or an integer). The dictionary maps dimensions to 'init_time' and 'valid_time', which tells us which dimension the timedelta must be added to (that is, which dimension serves as init_time). With this information, we can include the forecast datetime in the asset name (if the user requires it) and add 'start_time' and 'end_time' values to the attributes, which are then ingested into Earth Engine.
  • date_format: The format for the datetime values used in the asset_name.
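To illustrate how asset_name_format, dim_mapping and date_format could fit together, here is a hedged sketch. The function `build_asset_name`, the shape of `dim_mapping` (`{"init_time": <dim>, "valid_time": <dim>}`), and the treatment of integer steps as hours are all assumptions made for this example; the PR's actual implementation may differ.

```python
from datetime import datetime, timedelta


def build_asset_name(asset_name_format, partition_values, dim_mapping=None,
                     date_format="%Y%m%d%H%M"):
    """Render an asset name from the values of the partitioned dimensions.

    Hypothetical helper. `partition_values` maps dimension names to the
    coordinate value of the current partition.
    """
    fields = dict(partition_values)
    if dim_mapping:
        init_time = partition_values[dim_mapping["init_time"]]
        step = partition_values[dim_mapping["valid_time"]]
        if not isinstance(step, timedelta):
            # Assumption: an integer step is a number of hours.
            step = timedelta(hours=step)
        fields["init_time"] = init_time
        fields["valid_time"] = init_time + step

    # Format datetimes with date_format; leave other values unchanged.
    rendered = {
        key: value.strftime(date_format) if isinstance(value, datetime) else value
        for key, value in fields.items()
    }
    return asset_name_format.format(**rendered)


name = build_asset_name(
    "forecast_{init_time}_{valid_time}",
    {"time": datetime(2025, 1, 23, 0), "step": 6},
    dim_mapping={"init_time": "time", "valid_time": "step"},
    date_format="%Y%m%d%H",
)
```

Here `name` is `forecast_2025012300_2025012306`. The same `init_time` and `valid_time` values could also be used to populate the 'start_time' and 'end_time' attributes mentioned above.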

@kbp45-cusp kbp45-cusp marked this pull request as draft January 23, 2025 13:08
@kbp45-cusp kbp45-cusp marked this pull request as ready for review January 27, 2025 06:18
Collaborator

@j9sh264 j9sh264 left a comment


Initial pass comments.

Development

Successfully merging this pull request may close these issues.

weather-mv ee does not support multiple time dimesions
2 participants