Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/27 09:11:39 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue #209: Arrow Flight Dictionary Support

tustvold opened a new issue #209:
URL: https://github.com/apache/arrow-datafusion/issues/209


   The dictionary support added in #1262 hydrates dictionaries (expands them into plain value arrays) before sending them over Arrow Flight. In some situations it should be possible to do better and send the data dictionary-encoded.
   
   This is somewhat complicated because a dictionary may be shared across columns in some record batches, whereas the dictionary ID is encoded in the schema and must stay constant for a given column.
   
   A very basic protocol would assign each dictionary column in the schema a unique dictionary ID and, before sending each record batch, send a non-differential dictionary update containing that column's dictionary for that record batch.
   
   This is potentially wasteful and would likely benefit from heuristics for deciding when it is better to hydrate the values and/or re-encode the dictionary, but it should be easy to implement.
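
   As a rough illustration of the basic protocol described above, here is a minimal Python sketch. It does not use the Arrow libraries: `DictionaryUpdate`, `RecordBatchMessage`, `assign_dictionary_ids` and `encode_stream` are hypothetical names made up for this example, and a batch is modelled as a plain mapping from column name to `(dictionary values, indices)`. It only shows the message ordering, not a real Flight or IPC encoding.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Sequence, Tuple, Union


@dataclass(frozen=True)
class DictionaryUpdate:
    """Hypothetical stand-in for a dictionary message."""
    dictionary_id: int
    values: Tuple[str, ...]
    is_delta: bool = False  # non-differential: replaces the previous dictionary


@dataclass(frozen=True)
class RecordBatchMessage:
    """Hypothetical stand-in for a record batch: column name -> dictionary indices."""
    columns: Dict[str, Tuple[int, ...]]


# A batch maps each column name to (dictionary values, indices into them).
Batch = Dict[str, Tuple[Sequence[str], Sequence[int]]]
Message = Union[DictionaryUpdate, RecordBatchMessage]


def assign_dictionary_ids(column_names: Sequence[str]) -> Dict[str, int]:
    """Give every column its own dictionary ID, fixed for the whole stream."""
    return {name: i for i, name in enumerate(column_names)}


def encode_stream(column_names: Sequence[str], batches: List[Batch]) -> Iterator[Message]:
    """Before each batch, resend every column's dictionary in full."""
    ids = assign_dictionary_ids(column_names)
    for batch in batches:
        for name in column_names:
            values, _indices = batch[name]
            # Even if two columns share one dictionary in this batch, each
            # column gets its own update under its own constant ID.
            yield DictionaryUpdate(ids[name], tuple(values))
        yield RecordBatchMessage(
            {name: tuple(batch[name][1]) for name in column_names}
        )
```

   In this sketch a dictionary shared by two columns is simply sent twice, once per column ID, which is the kind of waste the heuristics mentioned above would aim to reduce.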


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] tustvold closed issue #209: Arrow Flight Dictionary Support

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #209:
URL: https://github.com/apache/arrow-datafusion/issues/209


   





[GitHub] [arrow-datafusion] tustvold edited a comment on issue #209: Arrow Flight Dictionary Support

Posted by GitBox <gi...@apache.org>.
tustvold edited a comment on issue #209:
URL: https://github.com/apache/arrow-datafusion/issues/209#issuecomment-827451022


   Created on the wrong repository... :facepalm: 
   
   I believe a repo owner should be able to delete this...





[GitHub] [arrow-datafusion] tustvold commented on issue #209: Arrow Flight Dictionary Support

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #209:
URL: https://github.com/apache/arrow-datafusion/issues/209#issuecomment-827451022


   Created on the wrong repository... :facepalm: 

