-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support multiple data configs in evaluate #283
Conversation
return [DATASET_CONFIGS[dataset_name] for dataset_name in EVAL_DATASETS[eval_name]] | ||
elif isinstance(data_config, list): | ||
return data_config | ||
elif isinstance(data_config, tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we handling the case where data_config
is a tuple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Python is a dynamic language with type annotations, rather than strictly enforcing types, it seemed like passing in a tuple would be an easy mistake for users to make and falling through to the else to return [(cfg1, cfg2)]
would be a needlessly annoying/hard-to-debug consequence for that
- There could be fancier options for more complete
Sequence
support, like checkinghasattr(data_config, "__iter__")
and__getitem__
(which I believe would include dicts, not ideal) or checkingisinstance(data_config, collections.abc.Sequence)
(which would include strings - also weird)... But idk if you might want to support more sequence-like properties on DataConfig itself one day? - We could explicitly check
isinstance(data_config, DataConfig)
rather than having a plainelse
, and throw an error if none of the branches match, but doingisinstance
on our custom class felt like an even bigger violation of Python's duck typing convention
...So my compromise was just to explicitly support both the basic builtin array-likes but not try to handle other weird collections
that might crop up more rarely? 🤷♂️ But agree it's a little odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion for or against adding the elif
branch for handling tuples, but I personally think it makes the code somewhat confusing, since the type annotation for data_config
doesn't include the tuple
case.
I don't think it's very likely for someone to pass in a tuple, as lists are predominantly used. To be honest, I can't even remember the last time I encountered a function that accepted a tuple instead of a list.
Extend EvalAlgorithmInterface.evaluate() interface to support specifying a list of multiple data_config objects. Previously the function already returned a list of results by dataset because when run with no data_config argument, all built-in datasets would be tested. However, users were not able to specify multiple datasets via data_config.
Rebased to current Appreciate if y'all can help get this merged so we can have a more intuitive API 🙏 |
Issue #, if available: #269
Description of changes:
Extend
EvalAlgorithmInterface.evaluate()
interface to support specifying a list of multipledata_config
objects. evaluate() already returns a list of results by dataset, because when run with nodata_config
argument all applicable built-in datasets are analyzed. As mentioned in the attached issue, it was weird and confusing that users couldn't explicitly specify a set of more than one datasets to use.Testing done:
get_dataset_configs()
FactualKnowledge
) where multiple integration test datasets had already been defined.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.