Posted to github@arrow.apache.org by "lidavidm (via GitHub)" <gi...@apache.org> on 2023/09/27 12:17:05 UTC

[GitHub] [arrow] lidavidm commented on issue #37900: Memory tracking for arrow flight over grpc

lidavidm commented on issue #37900:
URL: https://github.com/apache/arrow/issues/37900#issuecomment-1737277716

   I don't think there is anything we can do on the Arrow side about gRPC. By the time the message reaches Arrow code, gRPC has already allocated the memory for it. (And even if we redesigned the protocol to send message headers separately from the data, which would be a massive breaking change, there's no guarantee that gRPC won't read ahead in the stream anyway while the application is processing the first message. That might be tunable, of course. [Relevant SO answer from a gRPC team member.](https://stackoverflow.com/questions/47577031/unbuffered-bidirectional-data-streaming-with-grpc-how-to-get-the-size-of-the-cl))
   
   gRPC used to support a custom allocator API, but they removed it a long time ago: https://github.com/grpc/proposal/blob/master/L60-core-remove-custom-allocator.md
   
   From the justifications there, it sounds unlikely they'd care to bring it back (though perhaps you could persuade them to use one only for message data).
   
   One workaround, at the cost of a ton of extra round trips: you could build a handshake on top of DoExchange at the application level, where the server tells the client the size of each message and then waits for the client to explicitly confirm that it is ready before sending the data.
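
To make the shape of that handshake concrete, here is a rough sketch. It does not use the Flight API at all: two plain Python queues stand in for the two directions of a DoExchange stream, and every name in it (`server`, `client`, `run_exchange`, the `"size"`/`"ready"`/`"stop"` control messages, `budget_bytes`) is made up for illustration. In a real service the control messages would presumably travel as application metadata on the stream, but that part is an assumption and is not shown.

```python
import queue
import threading


def server(batches, to_client, from_client):
    """Announce each payload's size, then wait for an explicit go-ahead."""
    for payload in batches:
        to_client.put(("size", len(payload)))
        reply = from_client.get()  # blocks until the client answers
        if reply != "ready":
            break  # client declined, so stop sending
        to_client.put(("data", payload))
    to_client.put(("done", None))


def client(to_client, from_client, budget_bytes):
    """Accept a payload only if its announced size fits the budget."""
    received = []
    while True:
        kind, value = to_client.get()
        if kind == "done":
            return received
        if kind == "size":
            # Hypothetical policy: refuse anything larger than the budget.
            from_client.put("ready" if value <= budget_bytes else "stop")
        elif kind == "data":
            received.append(value)


def run_exchange(batches, budget_bytes):
    """Run server and client against each other over in-memory queues."""
    to_client, from_client = queue.Queue(), queue.Queue()
    worker = threading.Thread(
        target=server, args=(batches, to_client, from_client)
    )
    worker.start()
    result = client(to_client, from_client, budget_bytes)
    worker.join()
    return result
```

Each payload costs one extra round trip (size announcement, then confirmation) before any data moves, which is the overhead mentioned above. Note that this only gives the client a chance to refuse or delay a message; it does not change how gRPC itself allocates memory once the data is sent.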

