Posted to issues@arrow.apache.org by "Amol Umbarkar (Jira)" <ji...@apache.org> on 2020/05/18 09:23:00 UTC

[jira] [Commented] (ARROW-8845) [c++] Selective compression on the wire

    [ https://issues.apache.org/jira/browse/ARROW-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110073#comment-17110073 ] 

Amol Umbarkar commented on ARROW-8845:
--------------------------------------

Response from Wes:
Thanks for pointing that out. Such a heuristic (observing compression
ratios of stream messages) could be implemented at some point so that
compression could be toggled off mid-stream if it doesn't seem to be
helping. Feel free to open a JIRA issue about this.

I just opened https://issues.apache.org/jira/browse/ARROW-8823
since we don't track "what the uncompressed size would have been"
without compression turned on.
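
To make the heuristic above concrete, here is a rough sketch in Python of a sender that tracks the aggregate compression ratio over a stream and switches compression off when it is not helping. This is not an Arrow or Flight API: the class name AdaptiveStreamCompressor, the min_messages and max_ratio parameters, and the use of zlib are all assumptions made for illustration only.

```python
import os
import zlib


class AdaptiveStreamCompressor:
    """Hypothetical helper: stop compressing a stream once it stops paying off."""

    def __init__(self, min_messages=8, max_ratio=0.9):
        self.min_messages = min_messages  # assumed warm-up before deciding
        self.max_ratio = max_ratio        # assumed cutoff: compressed / raw
        self.enabled = True
        self.raw_bytes = 0
        self.compressed_bytes = 0
        self.messages_seen = 0

    def encode(self, body: bytes):
        """Return (is_compressed, data) for one message body."""
        if not self.enabled or not body:
            return False, body
        compressed = zlib.compress(body)
        # Track both sizes so the aggregate ratio is known.
        self.raw_bytes += len(body)
        self.compressed_bytes += len(compressed)
        self.messages_seen += 1
        # After the warm-up, disable compression for the rest of the stream
        # if the aggregate ratio is worse than the cutoff.
        if (self.messages_seen >= self.min_messages
                and self.compressed_bytes > self.max_ratio * self.raw_bytes):
            self.enabled = False
        # Never send a "compressed" body that is not smaller than the original.
        if len(compressed) >= len(body):
            return False, body
        return True, compressed


# Usage: random bytes do not compress, so the stream gives up after the warm-up.
noisy = AdaptiveStreamCompressor()
for _ in range(8):
    noisy.encode(os.urandom(4096))

# Repetitive bytes compress well, so compression stays on.
tidy = AdaptiveStreamCompressor()
for _ in range(8):
    tidy.encode(b"arrow " * 1000)
```

In this sketch the decision is one-way: once compression is off, no more ratios are observed, so it never turns back on. Re-enabling would need occasional probe messages, which is left out here.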
 

> [c++] Selective compression on the wire
> ---------------------------------------
>
>                 Key: ARROW-8845
>                 URL: https://issues.apache.org/jira/browse/ARROW-8845
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: FlightRPC
>            Reporter: Amol Umbarkar
>            Priority: Major
>
> Dask seems to apply compression selectively, only when it is found to be useful. It takes roughly a 10 kB sample up front to estimate the compression ratio, and compresses the whole batch only if the result is good. This seems to save decompression effort on the receiver side.
>  
> Please take a look at [https://blog.dask.org/2016/04/14/dask-distributed-optimizing-protocol#problem-3-unwanted-compression]
>  
> I thought this could be relevant to Arrow batch transfers as well.
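
The sample-first approach described in the issue could look roughly like the sketch below. It is a simplified illustration, not Dask's actual code and not an Arrow API: the function names maybe_compress and decode, the 0.9 cutoff, taking the sample from the start of the payload, and the use of zlib are all assumptions.

```python
import os
import zlib

SAMPLE_SIZE = 10_000  # roughly the 10 kB sample mentioned in the issue
MIN_RATIO = 0.9       # assumed cutoff: compressed size / original size


def maybe_compress(payload: bytes, sample_size=SAMPLE_SIZE, min_ratio=MIN_RATIO):
    """Return (is_compressed, data), compressing only when it looks worthwhile."""
    sample = payload[:sample_size]
    if not sample:
        return False, payload
    # Cheap test: compress only the sample and look at its ratio.
    if len(zlib.compress(sample)) > min_ratio * len(sample):
        return False, payload
    # The sample looked promising, so compress the whole batch.
    compressed = zlib.compress(payload)
    # Final check in case the rest of the payload was less compressible.
    if len(compressed) > min_ratio * len(payload):
        return False, payload
    return True, compressed


def decode(is_compressed: bool, data: bytes) -> bytes:
    """Receiver side: decompress only when the sender actually compressed."""
    return zlib.decompress(data) if is_compressed else data


# Usage: repetitive data is compressed, random data is sent as-is.
text_flag, text_data = maybe_compress(b"arrow " * 50_000)
rand_flag, rand_data = maybe_compress(os.urandom(100_000))
```

The flag has to travel with the message so the receiver knows whether to decompress. When it is false, the receiver skips decompression entirely, which is the saving the issue points at.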



--
This message was sent by Atlassian Jira
(v8.3.4#803005)