Default custom regex enhancement #571

Zlopez · 2018-08-06T13:19:02Z

This should solve issue #482.

The lint test fails because pytoml can't parse the multi-line string.

codecov-io · 2018-08-06T14:13:54Z

Codecov Report

Merging #571 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #571   +/-   ##
=======================================
  Coverage   89.45%   89.45%           
=======================================
  Files          54       54           
  Lines        2561     2561           
  Branches      327      327           
=======================================
  Hits         2291     2291           
  Misses        203      203           
  Partials       67       67

Impacted Files	Coverage Δ
anitya/config.py	`100% <ø> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 74ea662...ceddcdc.

pypingou · 2018-08-06T14:15:02Z

anitya/tests/test_config.py

@@ -52,7 +52,7 @@

 librariesio_platform_whitelist = ['pypi', 'rubygems']

-default_regex = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip)'


You could also just split the string over multiple lines, as you prefer :)

I already fixed this by using a dummy regex in the test.
Your solution will not work (tested), because pytoml has an issue reading multi-line strings.

jeremycline

I've got one small change request, but otherwise it looks good!

jeremycline · 2018-08-06T14:11:50Z

anitya/lib/backends/__init__.py

-
-REGEX = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_]'\
-        '(?:minsrc|src|source|asc))?\.(?:tar|t[bglx]z|tbz2|zip)'
+REGEX = anitya_config.get('DEFAULT_REGEX')


Since 'DEFAULT_REGEX' should always be present in the dictionary, it'd be better here to use anitya_config['DEFAULT_REGEX']. That way if (somehow) it's not present it fails early and in an obvious way at import time rather than being None later.

OK, I will change this. I looked at how other values are read in the code and just copied that.
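
The difference between the two lookups can be sketched in plain Python. The `anitya_config` dictionary below is a stand-in with a made-up key, not the project's real configuration object:

```python
# Hypothetical stand-in for the loaded configuration.
# DEFAULT_REGEX is deliberately missing to show both behaviours.
anitya_config = {"SOME_OTHER_KEY": "value"}

# .get() hides the problem: it quietly returns None for a missing key,
# and the failure only shows up later, wherever the regex is first used.
regex_via_get = anitya_config.get("DEFAULT_REGEX")

# Indexing raises KeyError right where the value is read, which for a
# module-level assignment means at import time.
try:
    regex_via_index = anitya_config["DEFAULT_REGEX"]
    failed_early = False
except KeyError:
    failed_early = True
```

With `.get()`, `regex_via_get` is `None` and nothing complains yet; with indexing, the missing key is reported immediately.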

jeremycline · 2018-08-06T14:12:55Z

files/anitya.toml.sample

@@ -51,6 +51,9 @@ social_auth_redirect_is_https = true
 # via Libraries.io.
 librariesio_platform_whitelist = []

+# Default regular expression used for backend
+default_regex = '%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\s]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip)'


I believe TOML requires double quotes (") for strings. In case you'd like to try and break it up into multiple lines, the syntax for multi-line strings without adding extraneous whitespace is at https://github.com/toml-lang/toml#string. There's only so much that can be done to make the regex readable, though, so do what you think looks best here :)

I already tried that, but it looks like pytoml has an issue with this syntax and fails with an error.

But you are right, this should at least be in double quotes.

OK, it looks like the issue is the backslash character. When parsing text without special characters I didn't have any issue with the multi-line string.
I will check whether TOML has an escape character.

Ah, yep, \\ should escape backslashes

It looks like the escaped backslashes aren't visible on the HTML page :-(

This is weird. It looks like the regex is read correctly from the configuration, but I don't know why it is shown without backslashes on the page.
Right now I have this as input:

default_regex = """\
%(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\\]+?)(?i)(?:[-_]\
(?:minsrc|src|source|asc|release))?\\.(?:tar|t[bglx]z|tbz2|zip).\
"""

And this is what is saved to config dictionary:

DEFAULT_REGEX = %(name)s(?:[-_]?(?:minsrc|src|source))?[-_]([^-/_\]+?)(?i)(?:[-_](?:minsrc|src|source|asc|release))?\.(?:tar|t[bglx]z|tbz2|zip).

The issue with the missing backslashes in the frontend is there even when not using a multi-line string.
So this looks like an issue that was there before, but nobody noticed it.

This will cause the application to fail if the DEFAULT_REGEX key is not defined.

Fix render issue with backslashes in HTML render

jeremycline · 2018-08-17T10:08:44Z

anitya/lib/backends/custom.py

@@ -31,6 +31,7 @@ class CustomBackend(BaseBackend):
    more_info = 'More information in the '\
        '<a href=\'/about#test-your-regex\'>about#test-your-regex</a>'
    default_regex = REGEX % {'name': '{project name}'}
+    default_regex_html = default_regex.replace('\\', '\\\\')


Since this is only relevant in the HTML template, I'd recommend just putting it there. Jinja lets you write Python in the templates.

jeremycline · 2018-08-17T10:12:43Z

anitya/templates/project_new.html

@@ -109,7 +109,7 @@ <h1>{{ context }} project</h1>

        examples["{{ plugin.name }}"]="{{ plugin.examples | format_examples }}";
        more_info["{{ plugin.name }}"]="{{ plugin.more_info}}";
-        default_regex["{{ plugin.name }}"]="{{ plugin.default_regex }}";
+        default_regex["{{ plugin.name }}"]="{{ plugin.default_regex_html }}";


So this could just be plugin.default_regex.replace('\\', '\\\\'), I believe

I think this is fine, but just in case you hit some more complex (or user-provided) input that requires sanitizing, https://github.com/mozilla/bleach is a solid library used elsewhere in Fedora infra apps.
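
For context, the template writes the regex into a JavaScript string literal, and inside such a literal a backslash starts an escape sequence, so `\.` is read as `.`. That is presumably why the backslashes vanished on the page. A minimal sketch of the doubling fix follows; the helper name `escape_for_js_string` and the shortened pattern are hypothetical, and `json.dumps` is shown only as a broader alternative, not as what the PR uses:

```python
import json


def escape_for_js_string(value: str) -> str:
    """Double every backslash so a JavaScript string literal preserves it."""
    return value.replace("\\", "\\\\")


# Shortened, hypothetical pattern standing in for the full default regex.
pattern = r"[^-/_\s]+?\.(?:tar|zip)"

# Same transformation as the replace('\\', '\\\\') call in the review.
escaped = escape_for_js_string(pattern)

# json.dumps additionally escapes quotes and control characters and adds
# the surrounding double quotes, giving a complete string literal.
js_literal = json.dumps(pattern)
```

For a pattern without quotes or non-ASCII characters, the two approaches produce the same escaped text; the plain `replace` only handles backslashes.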

jeremycline

Just one minor simplification and I think this will be good to go.

Zlopez added 2 commits August 6, 2018 15:16

Default custom regex enhancement

b3bfc8f

Fix lint test

a4f619b

pypingou reviewed Aug 6, 2018

jeremycline suggested changes Aug 6, 2018

Zlopez added 4 commits August 6, 2018 16:41

Change how the default_regex value is initialized

6d37e74

This will cause the application to fail if the DEFAULT_REGEX key is not defined.

Use basic string instead of literal string

6b3796a

Format regex as mutliline string

3d45718

Fix render issue

50021a4

Fix render issue with backslashes in HTML render

jeremycline reviewed Aug 17, 2018

Move regex update to HTML template

ceddcdc

jeremycline approved these changes Aug 17, 2018

Zlopez merged commit 7e171ce into fedora-infra:master Aug 17, 2018

Zlopez deleted the custom_regex_enhancement branch August 17, 2018 12:40
