-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default custom regex enhancement #571
Conversation
Codecov Report
@@ Coverage Diff @@
## master #571 +/- ##
=======================================
Coverage 89.45% 89.45%
=======================================
Files 54 54
Lines 2561 2561
Branches 327 327
=======================================
Hits 2291 2291
Misses 203 203
Partials 67 67
Continue to review full report at Codecov.
|
anitya/tests/test_config.py
Outdated
@@ -52,7 +52,7 @@ | |||
|
|||
librariesio_platform_whitelist = ['pypi', 'rubygems'] | |||
|
|||
default_regex = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also just split the string over multiple lines, as you prefer :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already fixed this by using dummy regex in the test.
Your solution will not work (tested), because pytoml
has issue with reading multiline string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got one small change request, but otherwise it looks good!
anitya/lib/backends/__init__.py
Outdated
|
||
REGEX = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_]'\ | ||
'(?:minsrc|src|source|asc))?\.(?:tar|t[bglx]z|tbz2|zip)' | ||
REGEX = anitya_config.get('DEFAULT_REGEX') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since 'DEFAULT_REGEX' should always be present in the dictionary, it'd be better here to use anitya_config['DEFAULT_REGEX']
. That way if (somehow) it's not present it fails early and in an obvious way at import time rather than being None later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will change this, I looked how other value is read in the code and just copied it.
files/anitya.toml.sample
Outdated
@@ -51,6 +51,9 @@ social_auth_redirect_is_https = true | |||
# via Libraries.io. | |||
librariesio_platform_whitelist = [] | |||
|
|||
# Default regular expression used for backend | |||
default_regex = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe TOML requires double quotes (") for strings. In case you'd like to try and break it up into multiple lines, the syntax for multi-line strings without adding extraneous whitespace is at https://github.com/toml-lang/toml#string. There's only so much that can be done to make the regex readable, though, so do what you think looks best here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already tried that, but it looks like pytoml has issue with this syntax and it fails with error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you are right, this should be at least in double quotes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it looks like the issue is in backslash character. When trying to parse text without special characters I didn't have any issue with multiline string.
I will check if there is any escape character for toml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yep, \\
should escape backslashes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is weird, It looks like the regex is read correctly from the configuration, but I don't know why it is shown without backslashes on the page.
Right now I have this as input:
default_regex = """\
%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\\]+?)(?i)(?:[-_]\
(?:minsrc|src|source|asc|release))?\\.(?:tar|t[bglx]z|tbz2|zip).\
"""
And this is what is saved to config dictionary:
DEFAULT_REGEX = %(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issues with missing backslash in frontend is there even when not using multiline string.
So this looks like issue that was there before, but nobody noticed it.
This will cause the application to fail if the DEFAULT_REGEX key is not defined.
Fix render issue with backslashes in HTML render
anitya/lib/backends/custom.py
Outdated
@@ -31,6 +31,7 @@ class CustomBackend(BaseBackend): | |||
more_info = 'More information in the '\ | |||
'<a href=\'/about#test-your-regex\'>about#test-your-regex</a>' | |||
default_regex = REGEX % {'name': '{project name}'} | |||
default_regex_html = default_regex.replace('\\', '\\\\') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is only relevant in the HTML template, I'd recommend just putting it there. Jinja lets you write Python in the templates.
anitya/templates/project_new.html
Outdated
@@ -109,7 +109,7 @@ <h1>{{ context }} project</h1> | |||
|
|||
examples["{{ plugin.name }}"]="{{ plugin.examples | format_examples }}"; | |||
more_info["{{ plugin.name }}"]="{{ plugin.more_info}}"; | |||
default_regex["{{ plugin.name }}"]="{{ plugin.default_regex }}"; | |||
default_regex["{{ plugin.name }}"]="{{ plugin.default_regex_html }}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this could just be plugin.default_regex.replace('\\', '\\\\')
, I believe
I think this is fine, but just in case you hit some more complex (or user-provided) input that requires sanitizing, https://github.com/mozilla/bleach is a solid library used elsewhere in Fedora infra apps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor simplification and I think this will be good to go.
This should solve the issue #482
The lint test fails because
pytoml
can't parse multiline string.