Posted to dev@arrow.apache.org by David Li <li...@gmail.com> on 2019/07/29 14:16:31 UTC

Further Flight optimizations (was Re: BigQuery Storage API now supports Arrow)

This is getting rather off the original topic, so I changed the subject.

This is the code in gRPC-Python, where incoming message data is copied
into a Python bytearray:
https://github.com/grpc/grpc/blob/b8b6df08ae6d9f60e1b282a659d26b8c340de5c9/src/python/grpcio/grpc/_cython/_cygrpc/operation.pyx.pxi#L165-L173

In fact, I think the `bytes(bytearray)` call at the end is an additional copy.
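
To make the double copy concrete, here is a minimal sketch of the pattern in plain Python. It is not the gRPC-Python code itself; `assemble_with_bytearray` and the sample slices are hypothetical names used only for illustration. It shows that `bytes(bytearray)` yields an independent object, so the data is copied a second time:

```python
def assemble_with_bytearray(slices):
    """Append each slice to a bytearray, then convert the result to bytes.

    Hypothetical helper mimicking the pattern described above.
    """
    staging = bytearray()
    for s in slices:
        staging += s  # first copy: slice data into the bytearray
    # second copy: bytearray contents into a new immutable bytes object
    return staging, bytes(staging)


slices = [b"arrow", b"-", b"flight"]
staging, message = assemble_with_bytearray(slices)

# Overwrite part of the bytearray in place. If bytes(staging) had shared
# memory with it, `message` would change too. It does not.
staging[0:5] = b"ARROW"
print(staging)
print(message)
```

If the two objects shared storage, the in-place write would be visible through `message`; since it is not, the conversion must have copied.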

We do something similar in Flight-C++:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/serialization-internal.cc#L105-L118

It's an open question whether we can get gRPC to avoid these copies.

Somewhat related, Flight-Java performance is hindered by this gRPC
issue: https://github.com/grpc/grpc-java/issues/5433

Essentially, the backpressure signal in gRPC-Java is currently not
related to actual network conditions at all. Alluxio implemented their
own flow control for a 30% throughput improvement:
https://github.com/Alluxio/alluxio/commit/6f02b41ea529b9f59c0c42de216f402b3b4c9882

Best,
David

On 7/29/19, Antoine Pitrou <an...@python.org> wrote:
>
> On 29/07/2019 at 15:13, David Li wrote:
>> Ah, sorry, I was unclear - the performance issue is not with Flight at
>> all, but with putting Arrow over gRPC naively.
>>
>> At some point, we benchmarked gRPC-Python carrying Arrow data, and
>> found that it only achieved ~half the throughput of Flight-Python. So
>> implementing BigQuery-Flight would also avoid that performance
>> pitfall, assuming the client library for BigQuery-Arrow uses
>> gRPC-Python.
>>
>> The reason we found is that, since gRPC technically does not require
>> Protobuf, it copies message payloads into a CPython bytestring. The
>> Python code then turns around and hands that to Protobuf, which
>> copies the data into its own data structures and gives it back to
>> Python.
>
> gRPC shouldn't need to copy the payload into a CPython bytestring.
> Instead, it could instantiate a buffer-like Python object pointing to
> the original data.  This is "easily" done in Cython, and gRPC-Python
> already uses Cython:
> https://cython.readthedocs.io/en/latest/src/userguide/buffer.html
> https://docs.python.org/3/c-api/buffer.html
>
> Regards
>
> Antoine.
>

Re: Further Flight optimizations (was Re: BigQuery Storage API now supports Arrow)

Posted by Antoine Pitrou <an...@python.org>.
On 29/07/2019 at 16:16, David Li wrote:
> This is getting rather off the original topic, so I changed the subject.
> 
> This is the code in gRPC-Python, where incoming message data is copied
> into a Python bytearray:
> https://github.com/grpc/grpc/blob/b8b6df08ae6d9f60e1b282a659d26b8c340de5c9/src/python/grpcio/grpc/_cython/_cygrpc/operation.pyx.pxi#L165-L173
> 
> In fact, I think the `bytes(bytearray)` call at the end is an additional copy.

Right.  This is definitely not optimal.  Ideally they would do a single
b''.join(...) on a list of memoryviews (or, if there is a single slice,
return a memoryview to that slice).
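
As a rough sketch of that idea in plain Python (the `assemble` helper and the sample data are hypothetical, and ordinary bytes-like objects stand in for the gRPC slices):

```python
def assemble(slices):
    """Build a message from slices with at most one copy.

    Hypothetical helper: `slices` is any list of bytes-like objects.
    """
    views = [memoryview(s) for s in slices]  # wrapping does not copy the data
    if len(views) == 1:
        # Single slice: hand back a view onto the original memory, no copy.
        return views[0]
    # Several slices: one allocation and one copy into the result.
    return b"".join(views)


# Multi-slice case: the result is an ordinary bytes object.
joined = assemble([b"arrow", b"-", b"flight"])
print(joined)

# Single-slice case: the view aliases the original buffer, so an in-place
# write to the buffer is visible through the view.
single = bytearray(b"only-slice")
view = assemble([single])
single[0:4] = b"ONLY"
print(bytes(view))
```

The caller has to accept either `bytes` or a `memoryview` here, and in the single-slice case the underlying buffer must stay alive for as long as the view is in use.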

Since this is off-topic for Flight, I'll leave it here though :-)

Regards

Antoine.

