Add Control file module files and validation #2445

Merged
merged 64 commits into from
Feb 7, 2025

Contributor

@inv-jishnu inv-jishnu commented Jan 2, 2025

Description

This PR adds the files for the control file and its validation. A control file maps the column names specified in an import source file to the actual table columns.
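
For readers unfamiliar with the idea, here is a minimal sketch of what such a mapping does. It is an illustration only, not the code added in this PR: the class name `ControlFileMappingSketch`, the method `applyMappings`, and the choice to pass unmapped fields through unchanged are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not one of the classes added in this PR. */
final class ControlFileMappingSketch {

  private ControlFileMappingSketch() {}

  /**
   * Renames the fields of one source row to table column names.
   *
   * @param sourceToTarget source field name to target column name
   * @param sourceRow one row read from the import source file
   * @return a new map keyed by target column name
   */
  static Map<String, Object> applyMappings(
      Map<String, String> sourceToTarget, Map<String, Object> sourceRow) {
    Map<String, Object> targetRow = new LinkedHashMap<>();
    for (Map.Entry<String, Object> field : sourceRow.entrySet()) {
      // Assumption: a field without a mapping keeps its original name.
      String targetColumn = sourceToTarget.getOrDefault(field.getKey(), field.getKey());
      targetRow.put(targetColumn, field.getValue());
    }
    return targetRow;
  }
}
```

Given a mapping from `src_id` to `id`, a row containing `src_id` and `name` would come back keyed by `id` and `name`.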

Related issues and/or PRs

Please review and merge this PR once the following PRs are merged

Changes made

I have added the files for the control file and its validation.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Road map to merge remaining data loader core files. Current status

Release notes

NA

@ypeckstadt ypeckstadt marked this pull request as ready for review February 4, 2025 05:57
@inv-jishnu inv-jishnu requested a review from komamitsu February 4, 2025 06:07
Contributor

@Torch3333 Torch3333 left a comment

LGTM, thank you!

      throws ControlFileValidationException {
    Set<String> mappedTargetColumns = new HashSet<>();
    for (ControlFileTableFieldMapping mapping : controlFileTable.getMappings()) {
      if (mappedTargetColumns.contains(mapping.getTargetColumn())) {
Contributor

[minor] You can call mappedTargetColumns.add(columnName) instead as it returns false when the entry already exists.

I confused this line with https://github.com/scalar-labs/scalardb/pull/2445/files#diff-fe051a3fcbc578fb1d270a16e032dbb16af0284fbd34ef077783848fba783138R89...
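
The `Set.add` behavior mentioned in the comment above can be sketched as follows. This is an illustration only: `DuplicateColumnSketch` and `firstDuplicate` are hypothetical names, and the method takes a plain list of column names rather than the PR's `ControlFileTable`.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration only; not the validator from this PR. */
final class DuplicateColumnSketch {

  private DuplicateColumnSketch() {}

  /** Returns the first column name that occurs more than once, if any. */
  static Optional<String> firstDuplicate(List<String> targetColumns) {
    Set<String> seen = new HashSet<>();
    for (String column : targetColumns) {
      // Set.add returns false when the element is already present,
      // so no separate contains() lookup is needed.
      if (!seen.add(column)) {
        return Optional.of(column);
      }
    }
    return Optional.empty();
  }
}
```

With this idiom, the `contains` check and the later `add` collapse into a single call.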

   * @return Set of uniquely mapped target columns
   * @throws ControlFileValidationException when a duplicate mapping is found
   */
  private static Set<String> checkDuplicateColumnMappings(ControlFileTable controlFileTable)
Contributor

I think this function should be named getTargetColumnSet or something similar, since it converts the data structure rather than only checking it. What do you think?

@inv-jishnu inv-jishnu requested a review from komamitsu February 5, 2025 10:10
@inv-jishnu
Contributor Author

@komamitsu san,
I have made changes based on your feedback.
Please take a look at this again when you get a chance.
Thank you.

Contributor

@komamitsu komamitsu left a comment

LGTM! 👍

Collaborator

@brfrn169 brfrn169 left a comment

Left a couple of comments. Please take a look when you have time!

@@ -765,6 +765,44 @@ public enum CoreError implements ScalarDbError {
DATA_LOADER_TABLE_METADATA_RETRIEVAL_FAILED(
Category.USER_ERROR, "0166", "Failed to retrieve table metadata. Details: %s", "", ""),

Collaborator

Suggested change

@@ -0,0 +1,14 @@
package com.scalar.db.dataloader.core.dataimport.controlfile;

/** Represents the control file */
Collaborator

This javadoc is wrong, right?

Contributor Author

@brfrn169 san,

I have updated the javadoc with more details.
Please check it when you get a chance and if it needs further changes, I will update it.
Thank you.

@inv-jishnu inv-jishnu requested a review from brfrn169 February 6, 2025 07:45
Collaborator

@brfrn169 brfrn169 left a comment

Left a comment. Other than that, LGTM! Thank you!

"0173",
"Duplicated data mappings found for column '%s' in table '%s'",
"",
""),
//
Collaborator

Could you please add an empty line here?

Suggested change
//
//

@ypeckstadt ypeckstadt merged commit 8164078 into master Feb 7, 2025
48 checks passed
@ypeckstadt ypeckstadt deleted the feat/data-loader/control-file branch February 7, 2025 07:55
feeblefakie pushed a commit that referenced this pull request Feb 7, 2025
feeblefakie pushed a commit that referenced this pull request Feb 7, 2025
feeblefakie pushed a commit that referenced this pull request Feb 7, 2025
feeblefakie pushed a commit that referenced this pull request Feb 7, 2025
feeblefakie pushed a commit that referenced this pull request Feb 7, 2025
inv-jishnu added a commit that referenced this pull request Feb 7, 2025
Labels
enhancement New feature or request
5 participants