Fix bug in WER plugin caused by special characters in field name #544
Codecov Report
Coverage diff, main vs. #544:
  Coverage   73.98% -> 73.96%   (-0.02%)
  Files      284 -> 284
  Lines      23521 -> 23538     (+17)
  Hits       17401 -> 17409     (+8)
  Misses     6120 -> 6129       (+9)
b3c4225 to 2f51dae
…characters-in-field-name
for pattern in camel_case_patterns:
    key = pattern.sub(r"\1_\2", key)

# Keep only basic characters in key
key = re.sub(r"[^a-zA-Z0-9_]", "", key)
I foresee two possibly unwanted results:
- The key becomes empty ("невидимый")
- Or you might end up with duplicate keys (áErreur, éErreur); lowercasing might also cause this
If that is no problem, then it does not matter of course.
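To make the two cases concrete, here is a minimal sketch that applies only the character filter from the diff above. The helper name `strip_special` is hypothetical, and the camel-case step and any lowercasing done by the plugin are left out, so this is an illustration rather than the plugin's actual code path.

```python
import re


def strip_special(key: str) -> str:
    # Same filter as in the diff: keep ASCII letters, digits and underscores.
    # (Hypothetical helper; the camel-case substitution is omitted here.)
    return re.sub(r"[^a-zA-Z0-9_]", "", key)


# A key made up entirely of non-ASCII characters ends up empty.
print(repr(strip_special("невидимый")))  # ''

# Two distinct keys collapse to the same result once the accented
# first character is removed (\u00e1 is á, \u00e9 is é).
print(strip_special("\u00e1Erreur"), strip_special("\u00e9Erreur"))  # Erreur Erreur
```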
True! I rearranged the code a bit so that it can warn the user when these edge cases happen. I chose a warning because I don't think it's worth fully handling these edge cases. But please let me know if you think differently about this.
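For readers following along, a warning-based approach could look roughly like the sketch below. It is not the code from this PR: the function name `sanitize_keys`, the message wording, and the use of the standard `warnings` module are all assumptions made for illustration, and only the character filter is taken from the diff.

```python
import re
import warnings


def sanitize_keys(keys):
    """Map each original key to its sanitized form, warning on edge cases.

    Hypothetical helper, not the plugin's real implementation.
    """
    seen = {}    # sanitized key -> first original key that produced it
    result = {}  # original key -> sanitized key
    for original in keys:
        clean = re.sub(r"[^a-zA-Z0-9_]", "", original)
        if not clean:
            warnings.warn(f"Key {original!r} is empty after sanitizing")
        elif clean in seen:
            warnings.warn(
                f"Keys {seen[clean]!r} and {original!r} both become {clean!r}"
            )
        else:
            seen[clean] = original
        result[original] = clean
    return result
```

Called with `["\u00e1Erreur", "\u00e9Erreur", "невидимый"]`, this sketch would emit one warning for the duplicate and one for the empty key, while still returning a mapping for every input. Whether to continue or raise after the warning is a design choice this sketch leaves open.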
(DIS-2507)