Posted to dev@avro.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2010/06/06 05:02:05 UTC

[jira] Created: (AVRO-565) Investigate Python encoding error

Investigate Python encoding error
---------------------------------

                 Key: AVRO-565
                 URL: https://issues.apache.org/jira/browse/AVRO-565
             Project: Avro
          Issue Type: Bug
          Components: python
            Reporter: Jeff Hammerbacher


Tyler B is seeing the following encoding error: http://avro.pastebin.com/b4HSYjCz.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-565) Investigate Python encoding error

Posted by "R. Tyler Ballance (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877886#action_12877886 ] 

R. Tyler Ballance commented on AVRO-565:
----------------------------------------

I found the source of our issues: in some places we have UTF-8 encoded str objects floating around (byte strings containing non-ASCII bytes), and these were failing to "re-encode" to UTF-8.

The solution we're working on is to remove encoded strings from the code base and use unicode objects for everything. I think we can close this ticket.
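
A minimal sketch of that approach, written for Python 3, where bytes plays the role of the Python 2 str type. The helper names to_text and normalize_datum are hypothetical and not part of Avro. The sketch assumes every byte string in the datum is UTF-8 text; a schema with Avro bytes fields would need schema-aware handling instead.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings exactly once.

    Hypothetical helper: non-bytes values are passed through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def normalize_datum(datum):
    """Recursively replace byte strings in dicts and lists with text.

    Assumption: all byte strings in the datum hold UTF-8 encoded text.
    """
    if isinstance(datum, dict):
        return {to_text(key): normalize_datum(val) for key, val in datum.items()}
    if isinstance(datum, (list, tuple)):
        return [normalize_datum(item) for item in datum]
    return to_text(datum)


# Decode at the boundary, then hand only text objects to the writer.
request_datum = normalize_datum({"name": b"caf\xc3\xa9", "tags": [b"a", "b"]})
```

Decoding once where data enters the program means the writer's later encode("utf-8") call always starts from a text object, so no implicit ASCII decode is involved.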


[jira] Commented: (AVRO-565) Investigate Python encoding error

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876044#action_12876044 ] 

Philip Zeyliger commented on AVRO-565:
--------------------------------------

What's the client call?  What's the value of datum?

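I believe you see this error when you pass a byte string (str) that contains non-ASCII bytes. Python 2 implicitly decodes a str as ASCII before encoding it, and that implicit decode is what fails.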

{quote}
>>> "\xe2".encode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
{quote}
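
For readers on Python 3, where bytes has no encode method, the failure above can be approximated as follows. This is a sketch, not Avro code: py2_style_encode is a hypothetical helper that makes explicit the ASCII decode which Python 2 performs implicitly when encode is called on a byte string.

```python
def py2_style_encode(datum):
    """Mimic Python 2's str.encode("utf-8") on Python 3 (hypothetical helper).

    Byte strings are first decoded with the ASCII codec, which is the
    hidden step that raises UnicodeDecodeError for bytes >= 0x80.
    """
    if isinstance(datum, bytes):
        datum = datum.decode("ascii")
    return datum.encode("utf-8")


try:
    py2_style_encode(b"\xe2")
    error_message = None
except UnicodeDecodeError as exc:
    # Same codec, byte, and position as in the quoted session.
    error_message = str(exc)

# Pure-ASCII byte strings pass through, which is why the problem
# only shows up once non-ASCII data reaches the writer.
ascii_result = py2_style_encode(b"plain")
```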


[jira] Commented: (AVRO-565) Investigate Python encoding error

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875980#action_12875980 ] 

Jeff Hammerbacher commented on AVRO-565:
----------------------------------------

Copying the error here in case the pastebin expires:

{code}
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 134, in request
    self.write_call_request(message_name, request_datum, buffer_encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 181, in write_call_request
    self.write_request(message.request, request_datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/ipc.py", line 185, in write_request
    datum_writer.write(request_datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 720, in write
    self.write_data(self.writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 755, in write_data
    self.write_record(writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 843, in write_record
    self.write_data(field.type, datum.get(field.name), encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 753, in write_data
    self.write_union(writers_schema, datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 833, in write_union
    self.write_data(writers_schema.schemas[index_of_schema], datum, encoder)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 733, in write_data
    encoder.write_utf8(datum)
  File "/usr/local/lib/python2.6/site-packages/avro/io.py", line 328, in write_utf8
    datum = datum.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 6: ordinal not in range(128)
{code}
