Posted to dev@thrift.apache.org by "Trung Ly (JIRA)" <ji...@apache.org> on 2014/08/13 01:14:12 UTC

[jira] [Commented] (THRIFT-1460) why not add unicode strings support to python directly?

    [ https://issues.apache.org/jira/browse/THRIFT-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094865#comment-14094865 ] 

Trung Ly commented on THRIFT-1460:
----------------------------------

I realize this is an old issue, but I don't understand why this ticket was closed. It still appears to be a problem in 0.9.1:

{code}
>>> from thrift.Thrift import TType
>>> class MyThriftModel(object):
...     def __init__(self, mystring):
...         self.mystring=mystring
...     def write(self, prot):
...         prot.writeFieldBegin('mystring', TType.STRING, 1)
...         prot.writeString(self.mystring)
...         prot.writeFieldEnd()
...
>>> from thrift.TSerialization import serialize
>>> serialize(MyThriftModel(u'Shaq\u2013Kobe feud'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tly/.virtualenvs/test/lib/python2.7/site-packages/thrift/TSerialization.py", line 28, in serialize
    thrift_object.write(protocol)
  File "<stdin>", line 6, in write
  File "/Users/tly/.virtualenvs/test/lib/python2.7/site-packages/thrift/protocol/TBinaryProtocol.py", line 123, in writeString
    self.trans.write(str)
  File "/Users/tly/.virtualenvs/test/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 223, in write
    self._buffer.write(buf)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 4: ordinal not in range(128)
{code}

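The problem is the use of cStringIO (in TMemoryBuffer): when it is handed a unicode object, it implicitly encodes it as ASCII, so the write fails as soon as the string contains a non-ASCII character such as u'\u2013'.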
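
A possible caller-side workaround is to encode text as UTF-8 before it reaches the transport. The sketch below is only an illustration, not the Thrift library's fix: `to_utf8_bytes` is a hypothetical helper name, and `io.BytesIO` stands in for the buffer that TMemoryBuffer wraps.

```python
import io


def to_utf8_bytes(value):
    """Return value as UTF-8 bytes; byte strings pass through unchanged.

    Hypothetical helper for illustration, not part of the Thrift API.
    """
    if isinstance(value, bytes):
        return value
    return value.encode('utf-8')


# io.BytesIO is a stand-in for the byte buffer inside TMemoryBuffer.
buf = io.BytesIO()
buf.write(to_utf8_bytes(u'Shaq\u2013Kobe feud'))
encoded = buf.getvalue()
```

In the model above, the same idea would mean passing the encoded value to prot.writeString, so that only bytes ever reach the buffer.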

> why not add unicode strings support to python directly?
> -------------------------------------------------------
>
>                 Key: THRIFT-1460
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1460
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Library
>    Affects Versions: 0.8
>         Environment: redhat linux
>            Reporter: shen guanpu
>            Assignee: Jake Farrell
>              Labels: python, unicode
>             Fix For: 0.9
>
>
> I installed the Thrift Python library with easy_install,
> but in version 0.8 I still don't see the patch (https://issues.apache.org/jira/browse/THRIFT-395) included in the library.
> I had to apply the patch by hand (https://issues.apache.org/jira/secure/attachment/12404198/0003-THRIFT-395.-python-Phase-Two-of-support-for-unicode.patch).
> Can any of the developers fix the problem?


