Posted to dev@avro.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2010/06/06 05:02:05 UTC
[jira] Created: (AVRO-565) Investigate Python encoding error
Investigate Python encoding error
---------------------------------
Key: AVRO-565
URL: https://issues.apache.org/jira/browse/AVRO-565
Project: Avro
Issue Type: Bug
Components: python
Reporter: Jeff Hammerbacher
Tyler B is seeing the following encoding error: http://avro.pastebin.com/b4HSYjCz.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (AVRO-565) Investigate Python encoding error
Posted by "R. Tyler Ballance (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877886#action_12877886 ]
R. Tyler Ballance commented on AVRO-565:
----------------------------------------
I found the source of our issues: in some places we have UTF-8 encoded str objects floating around (with non-ASCII UTF-8 byte sequences in them) that were failing to "re-encode" to UTF-8.
The solution we're working on is removing encoded strings from the code base and using unicode objects for everything. I think we can close this ticket.
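A minimal sketch of that approach, assuming the byte strings in question are valid UTF-8: decode once at the boundary where data enters the code, and pass only text objects onward. The helper name `to_text` and the sample value are hypothetical, not part of Avro or of the code base discussed here.

```python
def to_text(value, encoding="utf-8"):
    """Return a text object: decode byte strings, pass text through unchanged.

    Hypothetical helper for illustration. On Python 2, `bytes` is an alias
    for `str` and the decoded result is a `unicode` object.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Sample input (made up): UTF-8 bytes containing non-ASCII sequences.
raw = b"caf\xc3\xa9 \xe2\x98\x83"

text = to_text(raw)             # decode once, at the boundary
encoded = text.encode("utf-8")  # encoding a text object is safe
```

With every datum normalized this way before it reaches the writer, a later `.encode("utf-8")` call always operates on text and never triggers an implicit ASCII decode.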
[jira] Commented: (AVRO-565) Investigate Python encoding error
Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876044#action_12876044 ]
Philip Zeyliger commented on AVRO-565:
--------------------------------------
What's the client call? What's the value of datum?
I believe you see this error when you pass a byte string (str) containing non-ASCII bytes: on encode, Python 2 first tries to decode it as ASCII, and that implicit decode is what fails.
{quote}
>>> "\xe2".encode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
{quote}
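The hidden step can be made explicit. The sketch below is an emulation, not Python's or Avro's actual code: the function name `py2_style_encode` is hypothetical, and it spells out the implicit ASCII decode that Python 2 performs when `.encode()` is called on a byte string. It is written so that it also runs on Python 3, where byte strings have no `.encode()` method.

```python
def py2_style_encode(raw, encoding="utf-8"):
    """Emulate Python 2's str.encode(): implicit ASCII decode, then encode.

    Hypothetical helper for illustration only.
    """
    text = raw.decode("ascii")    # the hidden step; fails on non-ASCII bytes
    return text.encode(encoding)  # only reached for pure-ASCII input


try:
    py2_style_encode(b"\xe2")
except UnicodeDecodeError as exc:
    # The failure is a *decode* error even though the caller asked to encode.
    message = str(exc)
    print(message)
```

This is why the exception type is UnicodeDecodeError and the codec named in the message is 'ascii' rather than 'utf-8'.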
[jira] Commented: (AVRO-565) Investigate Python encoding error
Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875980#action_12875980 ]
Jeff Hammerbacher commented on AVRO-565:
----------------------------------------
Copying the error here in case the pastebin expires:
{code}
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 134, in request
    self.write_call_request(message_name, request_datum, buffer_encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 181, in write_call_request
    self.write_request(message.request, request_datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 185, in write_request
    datum_writer.write(request_datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 720, in write
    self.write_data(self.writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 755, in write_data
    self.write_record(writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 843, in write_record
    self.write_data(field.type, datum.get(field.name), encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 753, in write_data
    self.write_union(writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 833, in write_union
    self.write_data(writers_schema.schemas[index_of_schema], datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 733, in write_data
    encoder.write_utf8(datum)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 328, in write_utf8
    datum = datum.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 6: ordinal not in range(128)
{code}