Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow unicode in test names #30

Closed
wants to merge 1 commit into from
Closed

Allow unicode in test names #30

wants to merge 1 commit into from

Conversation

davehunt
Copy link
Collaborator

@The-Compiler unicode hurts my head! I encountered a test with as a parameter. This caused the report to fail due to UnicodeDecodeError. This patch fixes it, but would you mind taking a look to see if there's a smarter approach? I also found it difficult to write a test for this use case.

@The-Compiler
Copy link
Member

Any idea how I can reproduce the issue? I tried running py.test --html foo.html test_foo.py with this file and python 2:

# encoding: utf-8

import pytest

@pytest.mark.parametrize('val', ['〈'])
def test_foo(val):
    pass

but that passes without any trouble.

@davehunt
Copy link
Collaborator Author

It's this test and the parameter value is here. I'll see if I can create a simplified test case.

@davehunt
Copy link
Collaborator Author

@The-Compiler here you go:

# -*- coding: utf-8 -*-

import pytest

@pytest.mark.parametrize('val', ['foo'], ids=['〈'])
def test_unicode(val):
    pass

@The-Compiler
Copy link
Member

So what I think is happening is that pytest passes through the IDs verbatim, i.e. as a python2 str (i.e. bytes with a particular encoding, utf-8 in this case).

Then when you try to get unicode from the HTML document, py.xml internally calls unicode() (without an encoding) on it, which fails for non-ascii chars.

I think the proper solution would be to do something similar to what pytest does:

if _PY3:
    import codecs

    def _escape_bytes(val):
        """
        If val is pure ascii, returns it as a str(), otherwise escapes
        into a sequence of escaped bytes:
        b'\xc3\xb4\xc5\xd6' -> u'\\xc3\\xb4\\xc5\\xd6'
        note:
           the obvious "v.decode('unicode-escape')" will return
           valid utf-8 unicode if it finds them in the string, but we
           want to return escaped bytes for any byte, even if they match
           a utf-8 string.
        """
        if val:
            # source: http://goo.gl/bGsnwC
            encoded_bytes, _ = codecs.escape_encode(val)
            return encoded_bytes.decode('ascii')
        else:
            # empty bytes crashes codecs.escape_encode (#1087)
            return ''
else:
    def _escape_bytes(val):
        """
        In py2 bytes and str are the same type, so return it unchanged if it
        is a full ascii string, otherwise escape it into its binary form.
        """
        try:
            return val.decode('ascii')
        except UnicodeDecodeError:
            return val.encode('string-escape')

pytest seems to not do that when ids is given (only when generating them).

@nicoddemus - sorry for all the highlights today, but I'd like your opinion on this 😉 Does this sound correct? Should pytest core call _escape_bytes even if bytes are given via ids?

@The-Compiler
Copy link
Member

Oh, here is the stacktrace of @davehunt's example:

foo.py .Traceback (most recent call last):
  File "/home/florian/.venv2/bin/py.test", line 11, in <module>
    sys.exit(main())
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/config.py", line 48, in main
    return config.hook.pytest_cmdline_main(config=config)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
    res = hook_impl.function(*args)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/main.py", line 115, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/main.py", line 110, in wrap_session
    exitstatus=session.exitstatus)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 595, in execute
    return _wrapped_call(hook_impl.function(*args), self.execute)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 249, in _wrapped_call
    wrap_controller.send(call_outcome)
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/terminal.py", line 361, in pytest_sessionfinish
    outcome.get_result()
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 279, in get_result
    _reraise(*ex)  # noqa
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 264, in __init__
    self.result = func()
  File "/home/florian/.venv2/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
    res = hook_impl.function(*args)
  File "/home/florian/.venv2/lib/python2.7/site-packages/pytest_html/plugin.py", line 269, in pytest_sessionfinish
    unicode_doc = doc.unicode(indent=2)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 69, in unicode
    HtmlVisitor(l.append, indent, shortempty=False).visit(self)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 141, in list
    self.visit(elem)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 141, in list
    self.visit(elem)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 158, in Tag
    self.visit(x)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 126, in visit
    visitmethod(node)
  File "/home/florian/.venv2/lib/python2.7/site-packages/py/_xmlgen.py", line 132, in __object
    self.write(escape(unicode(obj)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 21: ordinal not in range(128)

@nicoddemus
Copy link
Member

@The-Compiler

pytest seems to not do that when ids is given (only when generating them).

Hmm if you convert user-provided not ascii test ids into "escaped" ascii, won't that be surprising to users? Because in the end the HTML output will contain escaped bytes instead of their UTF-8 encoded strings... on the other hand, they did provide a utf-8-encoded bytes instance to ids= instead of a unicode instance, and that in Python 2 is asking for trouble. 😁

Does this work?

# -*- coding: utf-8 -*-

import pytest

@pytest.mark.parametrize('val', ['foo'], ids=[u'〈'])
def test_unicode(val):
    pass

@davehunt
Copy link
Collaborator Author

I see your point @nicoddemus, unfortunately your suggestion gives a different exception:

../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/runner.py:149: in __init__
    self.result = func()
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/main.py:431: in _memocollect
    return self._memoizedcall('_collected', lambda: list(self.collect()))
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/main.py:311: in _memoizedcall
    res = function()
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/main.py:431: in <lambda>
    return self._memoizedcall('_collected', lambda: list(self.collect()))
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:604: in collect
    return super(Module, self).collect()
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:458: in collect
    res = self.makeitem(name, obj)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:470: in makeitem
    collector=self, name=name, obj=obj)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py:724: in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py:338: in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py:333: in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py:595: in execute
    return _wrapped_call(hook_impl.function(*args), self.execute)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py:249: in _wrapped_call
    wrap_controller.send(call_outcome)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:329: in pytest_pycollect_makeitem
    res = list(collector._genfunctions(name, obj))
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:500: in _genfunctions
    subname = "%s[%s]" %(name, callspec.id)
../../.virtualenvs/tmp-dd105f0f1432d97c/lib/python2.7/site-packages/_pytest/python.py:863: in id
    return "-".join(map(str, filter(None, self._idlist)))
E   UnicodeEncodeError: 'ascii' codec can't encode character u'\u3008' in position 0: ordinal not in range(128)

@nicoddemus
Copy link
Member

Thanks @davehunt.

It was less a suggestion and more curiosity if that would work... but I didn't have my hopes up. ☺️

I think we should try to implement escaping the bytes as @The-Compiler suggested.

@davehunt
Copy link
Collaborator Author

I think we should try to implement escaping the bytes as @The-Compiler suggested.

Just to clarify, are you saying this should be done in pytest core?

@nicoddemus
Copy link
Member

Hmmm I think so, not sure if it is possible to fix in pytest-html

@davehunt
Copy link
Collaborator Author

davehunt commented Feb 1, 2016

@nicoddemus I'd be happy to raise an issue or even submit a patch for a test (possibly even a fix) but I'd need a little bit of guidance. I'm not too familiar with the pytest codebase, and it's not clear to me how best to expose/address this issue.

@nicoddemus
Copy link
Member

Sure @davehunt, please open up an issue. @The-Compiler any chance you can tackle this? I'm pretty busy this week, unfortunately...

@The-Compiler
Copy link
Member

I can't promise anything right now - busy with preparing a pytest talk for next week and a pytest training for in 3 weeks. 😉

@davehunt
Copy link
Collaborator Author

davehunt commented Feb 1, 2016

Raised as pytest-dev/pytest#1351

@davehunt
Copy link
Collaborator Author

This should be fixed in pytest core.

@davehunt davehunt closed this Feb 18, 2016
@davehunt davehunt deleted the allow-unicode branch November 28, 2016 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants