Posted to user@thrift.apache.org by "John R. Frank" <jr...@mit.edu> on 2013/04/24 13:25:44 UTC

thrift compile to CPython objects?

Thrift Developers,

We have noticed that reading (deserializing) messages in python is ~100x 
slower than deserializing the exact same data in C++.  Is this expected? 
(I can provide examples, if useful.)

Is there a better compilation flag to use than py:new_style,slots?

Seems like it would be possible/useful to compile directly to PyObject 
structures: http://docs.python.org/2/c-api/structures.html

Does this exist or has anyone looked into it?


jrf


Re: thrift compile to CPython objects?

Posted by Matthew Chambers <mc...@wetafx.co.nz>.
We're writing out C++ and then wrapping it with Cython not only for 
speed, but to provide a more pythonic API.

https://github.com/sqlboy/plow/tree/master/lib/python/src

On 25/04/13 07:30, Randy Abernethy wrote:
> Hello John,
>
> Have you guys tried the fastbinary protocol or twisted?
>
> Having the Thrift compiler emit C++ to compile into a Python library with
> PyObjects for the Client/Server stubs and structs would be interesting.
> These could in turn use the existing C++ lib for protocols and transports
> making everything after the method call on the Python client compiled
> code. There is no such support in the Thrift main branch but it is
> possible someone somewhere has done it, though I have not seen such a
> thing. Would be great to hear how it works for you guys if you take a
> crack at it.
>
> Regards,
> Randy


Re: thrift compile to CPython objects?

Posted by "John R. Frank" <jr...@mit.edu>.
>> Have you guys tried the fastbinary protocol or twisted?

After reading more source (and some great help from Randy Abernethy), I 
*think* I understand how to use fastbinary in Python.

For an application to make thrift-generated Python code use fastbinary, 
it seems necessary to use a try/except import like the one below to 
select between TBinaryProtocol and TBinaryProtocolAccelerated.

Is this the best practice?  Is there a reason that TBinaryProtocol does 
not do this for us under the hood?

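     from thrift.protocol.TBinaryProtocol import TBinaryProtocol, TBinaryProtocolAccelerated

     ## keep the exception around so the application can report why
     ## the accelerated path was not available
     fastbinary_import_failure = None
     try:
         ## fastbinary is the optional C extension; importing it only
         ## checks that it was built and installed
         from thrift.protocol import fastbinary
         ## use faster C program to read/write
         protocol = TBinaryProtocolAccelerated

     except Exception as exc:
         fastbinary_import_failure = exc
         ## fall back to pure python
         protocol = TBinaryProtocol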



Also, in order to make TBinaryProtocolAccelerated work with 
TBufferedTransport wrapping a flat file, I had to add this "readAll" 
method to the file object before passing it into TBufferedTransport.

How does this differ from simply using read(sz)?

     def readAll(self, sz):
         '''
         This method allows TBinaryProtocolAccelerated to actually function.
         It keeps calling read() until exactly sz bytes have been collected.
         Copied from here
         http://svn.apache.org/repos/asf/hive/trunk/service/lib/py/thrift/transport/TTransport.py
         '''
         buff = ''
         have = 0
         while have < sz:
             chunk = self.read(sz - have)
             if len(chunk) == 0:
                 ## read() returned nothing: the data ended before sz bytes
                 raise EOFError()
             have += len(chunk)
             buff += chunk

         return buff
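
On the read(sz) question, here is a minimal sketch of where the two can differ. It assumes a stream whose read() may hand back fewer bytes than requested, as a socket or pipe can. ShortReadStream and max_chunk are hypothetical names made up for this illustration; they are not part of Thrift. For a regular local file, read(sz) normally returns the full sz bytes unless the file ends first, so in the flat-file case the visible difference is probably just that readAll raises EOFError instead of returning a short string.

```python
import io


class ShortReadStream(object):
    """Hypothetical stream that returns at most max_chunk bytes per read(),
    the way a socket or pipe may. Not part of Thrift."""

    def __init__(self, data, max_chunk):
        self._inner = io.BytesIO(data)
        self._max_chunk = max_chunk

    def read(self, sz):
        # Deliberately return fewer bytes than requested.
        return self._inner.read(min(sz, self._max_chunk))

    def readAll(self, sz):
        # Same idea as the readAll above: loop until sz bytes have arrived,
        # and raise EOFError if the data runs out first.
        chunks = []
        have = 0
        while have < sz:
            chunk = self.read(sz - have)
            if not chunk:
                raise EOFError()
            chunks.append(chunk)
            have += len(chunk)
        return b''.join(chunks)


payload = b'0123456789'

# A single read() may come back short ...
short = ShortReadStream(payload, max_chunk=4).read(10)

# ... while readAll() keeps going until it has everything it was asked for.
full = ShortReadStream(payload, max_chunk=4).readAll(10)

print(len(short), len(full))
```

Here the single read(10) gives back only the first 4 bytes, while readAll(10) returns all 10. Asking readAll for more bytes than the stream holds raises EOFError rather than silently returning less.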


The code is all here:

https://github.com/trec-kba/streamcorpus/commit/68d752cf7f09c726bf5701324a65d78787349570


jrf

Re: thrift compile to CPython objects?

Posted by "John R. Frank" <jr...@mit.edu>.
> Have you guys tried the fastbinary protocol or twisted?

No.  Can you point me toward any docs on how to use those?

The interface to fastbinary differs from TBinaryProtocol.TBinaryProtocol, 
so I hunted for examples of how to use it, but haven't found one yet.


I tried using TBinaryProtocol.TBinaryProtocolAccelerated, and got errors 
like these:

>       retstring += self.__trans.readAll(reqlen - len(retstring))
E       AttributeError: 'file' object has no attribute 'readAll'


Or when I used TFramedTransport:

E TypeError: refill claimed to have refilled the buffer, but didn't!!



Sorry if I am missing an obvious place to find documentation on these!


Thanks!
John

Re: thrift compile to CPython objects?

Posted by Randy Abernethy <Ra...@rx-m.com>.
Hello John,

Have you guys tried the fastbinary protocol or twisted?

Having the Thrift compiler emit C++ to compile into a Python library with
PyObjects for the Client/Server stubs and structs would be interesting.
These could in turn use the existing C++ lib for protocols and transports
making everything after the method call on the Python client compiled
code. There is no such support in the Thrift main branch but it is possible
someone somewhere has done it, though I have not seen such a thing. Would
be great to hear how it works for you guys if you take a crack at it.

Regards,
Randy



-- 
Randy Abernethy
Managing Partner, RX-M, LLC
randy.abernethy@rx-m.com
Cell: +1-415-624-6447
San Francisco: +1-415-800-2922
Tokyo: +81-50-5532-8040
www.rx-m.com
@rxmllc