Posted to dev@thrift.apache.org by "bryan newbold (Updated) (JIRA)" <ji...@apache.org> on 2011/12/01 06:19:40 UTC

[jira] [Updated] (THRIFT-1229) Python fastbinary.c can not handle unicode as generated python code

     [ https://issues.apache.org/jira/browse/THRIFT-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bryan newbold updated THRIFT-1229:
----------------------------------

    Attachment: python_fastbinary_utf8.patch

This git-style patch adds an extra argument to the fastbinary methods; see also https://github.com/octopart/thrift/commit/79611ef2cad47714d8addaa429bec4ce51bdf297

As a disclaimer, I don't have much experience writing Python C-API code, and this was not tested with Python 3 at all. Using global variables or separate functions ('encode_binary_utf8') may be a more appropriate style.

The binary encode function checks every string argument it is passed and UTF-8 encodes only Unicode PyObjects, which is arguably poor behavior but fit our use case best. If the utf8strings flag is set, then all strings read are decoded as UTF-8. This could lead to a situation where a client with the utf8strings flag set writes a non-UTF-8 byte string without any error, but the server (also with the utf8strings flag set) has trouble decoding it.
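
To make that asymmetry concrete, here is a rough pure-Python sketch of the behavior described above. It is not the patch itself: the names encode_string and decode_string are hypothetical, the real logic lives in C in fastbinary.c, and the sketch uses Python 3 types (str for Unicode text, bytes for byte strings), whereas the patch was written against Python 2.

```python
# Hypothetical helpers that only mimic the behavior described in this comment;
# they are not functions exported by fastbinary.

def encode_string(value):
    """Return the bytes that would be written for a string field."""
    if isinstance(value, str):
        # Unicode objects are the only values that get UTF-8 encoded.
        return value.encode("utf-8")
    if isinstance(value, bytes):
        # Byte strings pass through untouched, even if they are not valid UTF-8.
        return value
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def decode_string(raw, utf8strings=False):
    """Return the value handed back to the caller after reading a string field."""
    if utf8strings:
        # With the flag set, every string read is decoded as UTF-8.
        return raw.decode("utf-8")
    return raw


if __name__ == "__main__":
    # The writer accepts a byte string that is not valid UTF-8 without complaint.
    wire = encode_string(b"\xff\xfe")
    try:
        decode_string(wire, utf8strings=True)
    except UnicodeDecodeError:
        # The reader is where the problem finally surfaces.
        print("reader could not decode the payload as UTF-8")
```

Under these assumptions, the failure only shows up on the reading side, which is the situation described above.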

Code generated with the optional utf8strings flag to fastbinary would require the most recent version of the Python libraries to be installed; I'm not sure whether that kind of backwards incompatibility is an issue.

Fastbinary.py is non-functional; see https://github.com/octopart/thrift/commit/1152508165783dcf624471ac66458dac3ca67e62 for a partial fix.
                
> Python fastbinary.c can not handle unicode as generated python code
> -------------------------------------------------------------------
>
>                 Key: THRIFT-1229
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1229
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Compiler, Python - Library
>    Affects Versions: 0.7
>         Environment: mac osx 10.6
>            Reporter: Favo
>         Attachments: python_fastbinary_utf8.patch
>
>
> #THRIFT-395 ([r959516|http://svn.apache.org/viewvc?view=revision&revision=959516]) fixed Python Unicode support by adding a parameter to the thrift command line for the py generator. However, this does not affect fastbinary.c. A typical generated read/write function looks like the one below; notice that the function returns before reaching the Unicode handling logic.
> {code:title=TType.py|borderStyle=solid}
>   def write(self, oprot):
>     if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
>       oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
>       return
>     if self.ip is not None:
>       oprot.writeFieldBegin('ip', TType.STRING, 6)
>       oprot.writeString(self.ip.encode('utf-8'))
>       oprot.writeFieldEnd()
> {code}
> Any suggestions for this?
