Decode error AND Add catalan language to LANGUAGE_MAPPING #5

Merged (3 commits) on Jan 30, 2015

@dmiro dmiro commented Jan 27, 2015

1. Add Catalan language to LANGUAGE_MAPPING.
I previously added the stop words file to the "stop-words" project.

2. Decode error

stop_words = [line.strip().decode('utf-8')
             for line in language_file.readlines()]

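strip() returns a copy of the string with leading and trailing whitespace characters removed.
But if the byte string contains non-ASCII characters, calling strip() before decode() can cause a UnicodeDecodeError
(e.g. UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 34: unexpected end of data).
The "unexpected end of data" message suggests that strip() removed the trailing byte of a multi-byte UTF-8 character, leaving a lone 0xc3 at the end of the line.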

The workaround is to reorder the call:

stop_words = [line.decode('utf-8').strip()
             for line in language_file.readlines()]
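
To illustrate why the order of the calls matters, here is a minimal sketch written for Python 3. It is not the library's code. The names `RAW_LINE`, `BYTE_WHITESPACE`, `strip_then_decode`, `decode_then_strip` and `load_stop_words` are hypothetical. It assumes that the byte-level strip() treated 0xA0 as whitespace on the affected platform, which is emulated here by passing the byte set explicitly.

```python
import io

# UTF-8 bytes for the Catalan word "serà" plus a newline; "à" is 0xC3 0xA0.
RAW_LINE = b"ser\xc3\xa0\n"

# Assumption: emulate a byte-level strip() that counts 0xA0 as whitespace.
BYTE_WHITESPACE = b" \t\r\n\xa0"


def strip_then_decode(raw):
    """Original order: strip bytes first, which can cut a character in half."""
    return raw.strip(BYTE_WHITESPACE).decode("utf-8")


def decode_then_strip(raw):
    """Reordered call: decode to text first, then strip whitespace."""
    return raw.decode("utf-8").strip()


def load_stop_words(path):
    """Hypothetical helper: io.open decodes the file, so strip() sees text."""
    with io.open(path, encoding="utf-8") as language_file:
        return [line.strip() for line in language_file]
```

With these definitions, `strip_then_decode(RAW_LINE)` raises UnicodeDecodeError because only the lead byte 0xC3 is left, while `decode_then_strip(RAW_LINE)` returns the whole word. Opening the file with an explicit encoding, as in `load_stop_words`, is one way to avoid handling raw bytes at all.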

dmiro added 3 commits January 27, 2015 15:15
Alir3z4 added a commit that referenced this pull request Jan 30, 2015
Decode error AND Add catalan language to LANGUAGE_MAPPING

Thanks @dmiro
@Alir3z4 Alir3z4 merged commit 8323629 into Alir3z4:master Jan 30, 2015
@Alir3z4 (Owner) commented Jan 30, 2015

@dmiro good job.
I've merged the patches in both repos.

I'll update the sub-module for this repo and release soon.

Thanks ;)
