Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue-373: Fix UniqueFieldValidator to handle unicode catalog queries better #466

Merged
merged 6 commits into from
Dec 12, 2017

Conversation

ramonski
Copy link
Contributor

Description of the issue/feature this PR addresses

Linked issue: #373

Current behavior before PR

Unique field validation fails if unicode characters are entered, e.g. for Calculations.

Desired behavior after PR is merged

If a catalog query failed to fetch the object candidates, fall back to api.get_parent(context).objectValues() to get a list of objects to check.

--
I confirm I have tested this PR thoroughly and coded it according to PEP8
and Plone's Python styleguide standards.

Copy link
Member

@xispa xispa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just focused on your effective changes, haven't paid too much attention to PEP8. I've suggested a change that I think could reduce some overhead.


# return directly if nothing changed
if value == field.get(context):
return True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move this check at the very beginning of __call__ so no extra calls api.get_uid(context), getToolByName(...) will be done if the field value has not changed

# N.B. We want to use the catalog to speed things up, because using
# `parent.objectValues` is very expensive if the parent object contains
# many items and causes the UI to block too long
catalog_query = self.make_catalog_query(context, field, value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside make_catalog_query you check if there is an index for the current field and if so, you add it in the query as a parameter. Note you only generate the query if there is an index for the field, otherwise you always return None.
So, in fact, if catalog_query is not None then you don't need to go further waking up objects (map inside query_parent_objects) and iterating through siblings later.

Something like this would work I guess:

if catalog_query:
    catalog = api.get_catalogs_for(context)[0]
    brains = catalog(catalog_query)
    return len(brains) == 0

With this approach you could also remove query_parent_objects function and just call get_parent_objects instead:

catalog_query = self.make_catalog_query(context, field, value)
if catalog_query:
    catalog = api.get_catalogs_for(context)[0]
    brains = catalog(catalog_query)
    return len(brains) == 0
parent_objects = self.get_parent_objects(context)
[...]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm actually not sure if a field query would return a unique value,
e.g. if I entered "Foo" as the title for a new Client, the catalog result might probably contain brains with the titles "Foo-1", "Foo-2" and "Foo-3".
However, none of them is equal to "Foo" and the validation for the title "Foo" might falsly fail.

Also you have to distinguish between a successful catalog query which just returned an empty list and a failed one, where either the field is not indexed or an UnicodeDecodeError occurred.

In the latter case (catalog query failed) I'm doing the self.get_parent_objects(context), but I do not want to do so for an empty successful result...

Copy link
Member

@xispa xispa Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not wrong, searches against FieldIndex type indexes only return values that matches entirely with the search term (http://zope.readthedocs.io/en/latest/zope2book/SearchingZCatalog.html#searching-field-indexes), so I if the search term is "Foo", it will not return those brains with values "Foo-1" or "Foo-2" for the same index. But understand that this would mean an extra check to be sure the type of the index is FieldIndex. Worth to mention that another solution would be to use the uniqueValues() function, that retrieves all unique values for a certain index: https://docs.plone.org/develop/plone/searching_and_indexing/query.html#unique-values

Re successful catalog query, you are pretty right here, but well.. if we have to take into account the eventuality of a non indexed field everywhere, then "Houston, we have a problem". Re UnicodeDecodeError (as well as other errors that may rise):

catalog_query = self.make_catalog_query(context, field, value)
if catalog_query:
    try:
        catalog = api.get_catalogs_for(context)[0]
        brains = catalog(catalog_query)
        return len(brains) == 0
    except (IndexError, UnicodeDecodeError, ParseError, api.BikaLIMSError) as e:
        logger.warn('Something really wrong happened!')
parent_objects = self.get_parent_objects(context)
[...]

Of course, would be better if all this block of code was encapsulated in another function, or even reuse make_catalog_query by renaming it and let it make the query and return the brains if all is ok or None otherwise. An extra-bonus is that then, you'll not need to call api.get_catalogs_for(context) twice:

brains = self.try_catalog_query(context, field, value)
if brains != None:
    return len(brains) == 0
parent_objects = self.get_parent_objects(context)
[...]

fieldname in item.Schema() and \
str(item.Schema()[fieldname].get(item)) == str(value):
str(item.Schema()[fieldname].get(item)) == str(value).strip():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value returned from calling the getter should always have priority over the value stored in the Schema's field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so you mean I should rather first check for an "accessor" of the field? E.g.:

str(item.getField(fieldName).getAccessor()()) == str(value).strip()

Copy link
Member

@xispa xispa Dec 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes... unfortunately, I am afraid we have some accessors that may return a value not consistent with the value stored in the Schema's field... Don't think there are too many places, but indeed, this is something for which we need to find some time to look in detail. Maybe a function in the api like get_value(obj, field_name, default=None) to deal with this would be helpful on this regard.

@xispa xispa merged commit 3cb9bb0 into master Dec 12, 2017
@xispa xispa deleted the bugfix/issue-373-unique-field-validator branch December 17, 2017 01:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

2 participants