Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/16 18:55:57 UTC

[GitHub] [arrow-datafusion] realno commented on pull request #1831: determine build side in hash join by `total_byte_size` instead of `num_rows`

realno commented on pull request #1831:
URL: https://github.com/apache/arrow-datafusion/pull/1831#issuecomment-1042033212


   @xudong963 @Dandandan These are good ideas. +1 on not using the number of rows and on providing alternative logic to estimate size. I think it makes sense to add a size estimator to consolidate that logic. An additional thought: is it possible to get the size (or an estimate of it) from Arrow directly? I think this is one of the benefits the Arrow format provides.
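
   To make the size-estimator idea concrete, here is a minimal sketch using only the standard library. Everything in it is an assumption for illustration: `InputStats`, `BuildSide`, `estimate_bytes`, `choose_build_side`, and the `assumed_row_width` parameter are hypothetical names and are not DataFusion's or Arrow's actual API. The two fields only echo the `total_byte_size` and `num_rows` quantities named in the PR title.

```rust
/// Hypothetical per-input statistics (illustrative only, not DataFusion's type).
#[derive(Debug, Clone, Copy, Default)]
struct InputStats {
    /// Row count, if known.
    num_rows: Option<usize>,
    /// Total size in bytes, if known.
    total_byte_size: Option<usize>,
}

/// Which join input to build the hash table from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BuildSide {
    Left,
    Right,
}

/// Estimate the size of an input in bytes.
/// Prefers the known byte size; otherwise falls back to
/// `num_rows * assumed_row_width`, where the width is a caller-supplied guess.
fn estimate_bytes(stats: &InputStats, assumed_row_width: usize) -> Option<usize> {
    stats.total_byte_size.or_else(|| {
        stats
            .num_rows
            .map(|rows| rows.saturating_mul(assumed_row_width))
    })
}

/// Pick the side with the smaller estimated byte size as the build side.
/// If either estimate is missing, or on a tie, keep the left side.
fn choose_build_side(
    left: &InputStats,
    right: &InputStats,
    assumed_row_width: usize,
) -> BuildSide {
    match (
        estimate_bytes(left, assumed_row_width),
        estimate_bytes(right, assumed_row_width),
    ) {
        (Some(l), Some(r)) if r < l => BuildSide::Right,
        _ => BuildSide::Left,
    }
}
```

   With this sketch, an input with few but very wide rows is no longer picked as the build side just because its row count is lower. Keeping the fallback in one function is the "consolidate the logic" part; a real estimator could replace the fixed row width with sizes reported by Arrow itself.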

