Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/11/10 16:29:00 UTC

[jira] [Commented] (ARROW-8845) [C++] Selective compression on the wire

    [ https://issues.apache.org/jira/browse/ARROW-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441830#comment-17441830 ] 

Antoine Pitrou commented on ARROW-8845:
---------------------------------------

One limitation is that compression is enabled for entire record batches, whereas it's quite conceivable that some fields, or even individual buffers, would compress very well while others would not.

cc [~emkornfield]  [~lidavidm] 
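
To illustrate the per-buffer idea above, here is a minimal Python sketch, not Arrow code. It assumes the buffers are plain byte strings and uses zlib from the standard library as a stand-in codec; `compress_buffers`, `decompress_buffers` and the `min_saving` threshold are hypothetical names chosen for this example. Each buffer is compressed on its own, and the compressed form is kept only when it saves enough space.

```python
import os
import zlib


def compress_buffers(buffers, min_saving=0.1):
    """Compress each buffer independently (hypothetical helper).

    Returns a list of (is_compressed, body) pairs. A buffer is kept raw
    unless compression shrinks it by at least `min_saving` (a fraction).
    """
    encoded = []
    for raw in buffers:
        packed = zlib.compress(raw)
        if len(packed) <= len(raw) * (1 - min_saving):
            encoded.append((True, packed))
        else:
            # Not worth it: the receiver can skip decompression entirely.
            encoded.append((False, raw))
    return encoded


def decompress_buffers(encoded):
    """Undo compress_buffers using the per-buffer flag."""
    return [zlib.decompress(body) if is_compressed else body
            for is_compressed, body in encoded]


# One highly compressible buffer (all zeros) and one incompressible
# buffer (random bytes), standing in for two buffers of a record batch.
buffers = [bytes(64 * 1024), os.urandom(64 * 1024)]
encoded = compress_buffers(buffers)
flags = [is_compressed for is_compressed, _ in encoded]
```

The per-buffer flag is what lets the receiver decompress only the buffers that were actually compressed. How such a flag would be carried in the IPC metadata is outside the scope of this sketch.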

> [C++] Selective compression on the wire
> ---------------------------------------
>
>                 Key: ARROW-8845
>                 URL: https://issues.apache.org/jira/browse/ARROW-8845
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, FlightRPC
>            Reporter: Amol Umbarkar
>            Priority: Major
>
> Dask seems to compress selectively, only when compression is found to be useful. It takes a sample of about 10 kB upfront to estimate the compression ratio, and if the result is good, the whole batch is compressed. This seems to save decompression effort on the receiver side.
>  
> Please take a look at [https://blog.dask.org/2016/04/14/dask-distributed-optimizing-protocol#problem-3-unwanted-compression]
>  
> Thought this could be relevant to Arrow batch transfers as well. 
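
The sampling heuristic described in the issue can be sketched roughly as follows. This is an illustration only, neither Dask's nor Arrow's actual implementation: zlib stands in for the codec, the sample is simply the first 10 kB of the payload, and `maybe_compress` and the `max_ratio` threshold are hypothetical names and values chosen for this example.

```python
import os
import zlib

SAMPLE_SIZE = 10 * 1024  # 10 kB sample, as mentioned in the issue
MAX_RATIO = 0.9          # hypothetical cut-off: compressed/raw size


def maybe_compress(payload, sample_size=SAMPLE_SIZE, max_ratio=MAX_RATIO):
    """Compress `payload` only if a small sample compresses well.

    Returns (is_compressed, body). Using the payload prefix as the
    sample is a simplification made for this sketch.
    """
    sample = payload[:sample_size]
    if not sample:
        return False, payload
    ratio = len(zlib.compress(sample)) / len(sample)
    if ratio > max_ratio:
        # Sample barely shrinks: send raw and spare the receiver
        # the decompression step.
        return False, payload
    return True, zlib.compress(payload)


compressible = bytes(1024 * 1024)         # 1 MiB of zeros
incompressible = os.urandom(1024 * 1024)  # 1 MiB of random bytes

sent_zeros = maybe_compress(compressible)
sent_random = maybe_compress(incompressible)
```

Only the 10 kB sample is compressed when the data turns out to be incompressible, so the sender avoids most of the wasted compression work as well.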



--
This message was sent by Atlassian Jira
(v8.20.1#820001)