Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/02 12:50:05 UTC

[GitHub] [arrow-datafusion] Dandandan opened a new issue #240: Left join could use bitmap for left join instead of `Vec`

Dandandan opened a new issue #240:
URL: https://github.com/apache/arrow-datafusion/issues/240


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   To save some memory, and potentially also run faster, the data in `visited_left_side` in the `HashJoinStream` could be stored in a bitmap instead of a `Vec<bool>`. This would save ~7/8 of a byte per left row.
   If we store _only_ 32-bit integers on the left, the savings would be ~4-5%, assuming we use 4 bytes for the items and roughly 16 bytes per left-side row for the hashmap. Not too big, but a nice win in some cases. The relative savings could be bigger if we use a more memory-efficient data structure for the hashmap.
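
As a rough check of that estimate, the arithmetic can be sketched as follows. The function name and its parameters are hypothetical and only restate the per-row assumptions above (4 bytes per item, roughly 16 bytes of hashmap overhead, 1 byte per `bool` versus 1 bit per bitmap entry); this is not a measurement of DataFusion itself.

```rust
/// Fraction of per-row memory saved by tracking "visited" with one bit
/// instead of one `bool`. Both parameters are per-row estimates in bytes
/// (hypothetical inputs, taken from the assumptions in the text).
fn visited_savings_fraction(item_bytes: f64, hashmap_bytes: f64) -> f64 {
    let vec_bool_bytes = 1.0; // `Vec<bool>` stores one byte per row
    let bitmap_bytes = 1.0 / 8.0; // a bitmap stores one bit per row
    let before = item_bytes + hashmap_bytes + vec_bool_bytes;
    let after = item_bytes + hashmap_bytes + bitmap_bytes;
    (before - after) / before
}
```

With 4 bytes per item and 16 bytes of hashmap overhead, this gives (7/8) / 21, or about 4.2% per left row.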
   
   Additionally, when no left row is matched or every left row is matched, the implementation could include a fast path for those cases.
   
   **Describe the solution you'd like**
   Use a bitmap instead of `Vec<bool>`. The bitmap could be from arrow or maybe the `bitvec` crate.
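
A minimal sketch of the idea, assuming a hand-rolled bitmap over `Vec<u64>` rather than the arrow or `bitvec` types. `VisitedBitmap` and its methods are hypothetical names for illustration and are not part of `HashJoinStream`. Counting set bits makes the two fast paths (no row matched, every row matched) cheap to detect.

```rust
/// Hypothetical stand-in for `visited_left_side`: one bit per left-side row.
struct VisitedBitmap {
    words: Vec<u64>, // 64 rows per word
    len: usize,      // number of left-side rows tracked
}

impl VisitedBitmap {
    /// Creates a bitmap for `len` rows, all initially unvisited.
    fn new(len: usize) -> Self {
        Self { words: vec![0; (len + 63) / 64], len }
    }

    /// Marks a left-side row as matched.
    fn set(&mut self, row: usize) {
        assert!(row < self.len, "row out of bounds");
        self.words[row / 64] |= 1u64 << (row % 64);
    }

    /// Returns whether a left-side row has been matched.
    fn get(&self, row: usize) -> bool {
        assert!(row < self.len, "row out of bounds");
        (self.words[row / 64] >> (row % 64)) & 1 == 1
    }

    /// Number of matched rows, computed one word at a time.
    fn count_visited(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }

    /// Indices of left-side rows that never matched.
    fn unmatched_rows(&self) -> Vec<usize> {
        match self.count_visited() {
            // Fast path: nothing matched, so every row is unmatched.
            0 => (0..self.len).collect(),
            // Fast path: everything matched, so there is nothing to emit.
            n if n == self.len => Vec::new(),
            // General case: test each bit.
            _ => (0..self.len).filter(|&i| !self.get(i)).collect(),
        }
    }
}
```

For 70 left rows this sketch allocates two `u64` words (16 bytes) where a `Vec<bool>` would use 70 bytes.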
   
   **Describe alternatives you've considered**
   Keep using a `Vec<bool>`
   
   **Additional context**
   n/a


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb closed issue #240: Left join could use bitmap for left join instead of `Vec`

Posted by GitBox <gi...@apache.org>.
alamb closed issue #240:
URL: https://github.com/apache/arrow-datafusion/issues/240



[GitHub] [arrow-datafusion] boaz-codota commented on issue #240: Left join could use bitmap for left join instead of `Vec`

Posted by GitBox <gi...@apache.org>.
boaz-codota commented on issue #240:
URL: https://github.com/apache/arrow-datafusion/issues/240#issuecomment-841678377


   I have a PR waiting for this issue: https://github.com/apache/arrow-datafusion/pull/342