Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reporting missing fields deeper in structure #26

Closed
marek-kuticka opened this issue Jun 28, 2023 · 8 comments
Closed

Reporting missing fields deeper in structure #26

marek-kuticka opened this issue Jun 28, 2023 · 8 comments
Labels
duplicate This issue or pull request already exists

Comments

@marek-kuticka
Copy link

Having a json:

{"field1": "val", "field2" : {"field3" : "value", "unknown" : { "inner" : "val1", "inner2" : "val2" }}}
and structures:

type Dummy struct {
	Field1 string json:"field1"
	Field2 Inner  json:"field2"
}

type Inner struct {
	Field3 string json:"field3"
}

unmarshall.go does not report the "unknown" key at all.

for !d.lexer.IsDelim('}') {
		key := d.lexer.UnsafeFieldName(false)
		d.lexer.WantColon()
>>		refInfo, exists := fields[key].  // => exists is false
		if exists {
			value, isValidType := d.valueByReflectType(refInfo.t)
			if isValidType {
				if value != nil && doPopulate {
					field := refInfo.field(structValue)
					assignValue(field, value)
				}
				if !d.options.excludeKnownFieldsFromMap {
					if result != nil {
						result[key] = value
					}
					if clone != nil {
						clone[key] = value
					}
				}
			} else {
				switch d.options.mode {
				case ModeFailOnFirstError:
					return nil, false
				case ModeFailOverToOriginalValue:
					if !forcePopulate {
						result[key] = value
					} else {
						clone[key] = value
						d.lexer.WantComma()
						d.drainLexerMap(clone)
						return clone, false
					}
				}
			}
>>		} else {. // => both result and clone is nil, so nothing is reported
			value := d.lexer.Interface()
			if result != nil {
				result[key] = value
			}
			if clone != nil {
				clone[key] = value
			}
		}

see comments inside code from unmarshall.go

would there be an option to modify the code?
best would be to have an information, that "field2.unknown" field is not correctly mapped in data.

thank you

Marek

@avivpxi
Copy link
Collaborator

avivpxi commented Jun 28, 2023

@marek-kuticka thank you for this.
I'll need a few hours to get to it and make sure this is indeed a bug, but if it is it's an interesting one. Thanks for reporting it.
Since you've already dived into code and understand the cause and possible solutions, I'd encourage you to fork the repo and open a PR providing a fix.
Cheers 🙏

@marek-kuticka
Copy link
Author

thx a lot for the encouragement. I indeed know, where the issue might approx be, but as a Go beginner, I am not sure about possible solution.

I might try to do it however.

I must admit, that the combination of "known fields" and error handling for unknown ones seems to be quite nice.

I will let you know, if I can find some proper solution.

If possible, pls look at it too, to confirm, it's a bug.
thx a lot

Marek

@avivpxi avivpxi added the duplicate This issue or pull request already exists label Jun 28, 2023
@avivpxi
Copy link
Collaborator

avivpxi commented Jun 28, 2023

@marek-kuticka sorry for the slight delay, hope you haven't started.
This behavior is intentional - more details and the solution in this issue.

TL;DR - we want nested objects in the resulting map to be of the same type as in the resulting struct. in your example: we want field2 both in the result struct and the result map to be of type Inner. Think about it - otherwise, what's the added value of Marshmallow in the first place? you could simply do json.Unmarshal(data, &map[string]interface{}) and get all data including the unknown field. However, in this case you want the unmarshalling phase to account for known types, meaning you want field2 to be of type Inner.

However, the issue of losing unknown fields in nested objects has already been addressed in issue #15, and the solution is to implement the HandleJSONDatamethod - solution in go playground.

@avivpxi
Copy link
Collaborator

avivpxi commented Jun 28, 2023

This is briefly mentioned in the API section of the readme, but I would love to get your feedback on how else would you document this to make it clearer.

@marek-kuticka
Copy link
Author

marek-kuticka commented Jun 28, 2023

marek-kuticka@e4159cd

here I have attached link to my fork, where I added testcases to confirm the bug

I have noticed your coment also before , in the API Section, and I have also read through the mentioned issue, but ..

when looking into the code, the culprit imho lies in buildStruct method

value := reflect.New(structType).Interface()
	handler, ok := value.(JSONDataHandler)
	if !ok {
		return d.populateStruct(true, value, nil)
	}

as... I HAVE to have this "unknown" field in my class, so that I can assign a handler to it. , there is a method created for this in the playground showcase.

in the use-case I am solving right now ( so far no code in GitHub, that would be public ) is, that I have approx 40k JSON files, that COULD be a bit different in deeper levels. that means, although I create Go structures to cover most of the json files, there might be some member, deep in structure, that I omit. I don't even know, I have the field there... and I need to find out.

and I'd like exactly this to be caught by the code.

What do you think about this?

@marek-kuticka
Copy link
Author

marek-kuticka commented Jun 28, 2023

Think I know now, what you were meaning

after adding

func (a *PersonAddress) HandleJSONData(data map[string]interface{}) {
	for key := range data {
		fmt.Printf("from handle: %s value: %s \n", key, data[key])
	}
}

to the test class, I got output about field "unknown", which indeed is the expected behavior, only issue I feel here is, that I had to create this method for the PersonAddress inner struct. if this is created for main struct only, then I will not catch this error.

@avivpxi
Copy link
Collaborator

avivpxi commented Jun 28, 2023

@marek-kuticka
Yes, you have to implement the HandleJSONData method in any nested struct you wish to capture unknown fields for.
Not sure what your code looks like, if you have a lot of different structs it might get tedious, but any other way I think of is not very clear.
If the API allows implementing such a function only at the root level and then get the raw data of all it's nested fields - there's no simple way of passing where exactly in the subtree we're currently looking at.

I would suggest you simply implement HandleJSONData for each nested struct in your code. However, if you feel like you can think of a better API - feel free to pitch it here. Although I warn you - we don't want Marshmallow to become fragmented. It's already complicated as it as - I'm not sure 2-3 different approaches for capturing nested fields is a good idea. However, I will be curious to hear if you think of a simple and interesting implementation for this.

@marek-kuticka
Copy link
Author

I agree, that this approach you suggested is working, and I also agree, that it's not so easy to remember, where in the tree we are currently traversing.

thx for valuable discussion. closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists
Projects
None yet
Development

No branches or pull requests

2 participants