Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/04/28 16:52:00 UTC

[jira] [Created] (IMPALA-9704) Consider doing remote reads for small dimension tables instead of scan+exchange

Tim Armstrong created IMPALA-9704:
-------------------------------------

             Summary: Consider doing remote reads for small dimension tables instead of scan+exchange
                 Key: IMPALA-9704
                 URL: https://issues.apache.org/jira/browse/IMPALA-9704
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Tim Armstrong


The remote data cache changes the calculus for certain broadcast join plans. Previously we always did local reads and then broadcast the output of the scan. With the data cache, it could make sense to read small tables into the data cache on every node and replicate the scan on all nodes, without the exchange.

There are a variety of factors in play, including the cost of predicate evaluation and runtime filter evaluation (which a replicated scan would repeat on every node) and whether the remote data cache is enabled, so it's not guaranteed to be a win.
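
To make the trade-off concrete, here is a rough sketch of how the two plan shapes might be compared on total work. This is not Impala planner code: every name here (DimScanEstimate, prefer_replicated_scan, the per-byte cost constants) is hypothetical, and the constants are made-up relative weights rather than measurements.

```python
from dataclasses import dataclass


@dataclass
class DimScanEstimate:
    """Hypothetical planner inputs; none of these names come from Impala."""
    table_bytes: int           # bytes the scan has to read for the table
    output_bytes: int          # bytes left after predicates and runtime filters
    num_nodes: int             # nodes that need the dimension table's rows
    cache_hit_fraction: float  # expected share of reads served by the data cache


# Illustrative relative costs per byte (assumptions, not measured values).
LOCAL_READ = 1.0
CACHED_READ = 1.0
REMOTE_READ = 4.0
NETWORK_SEND = 2.0
FILTER_EVAL = 0.5


def broadcast_cost(e: DimScanEstimate) -> float:
    # Scan and filter the table once, then ship the output to every node.
    scan = e.table_bytes * (LOCAL_READ + FILTER_EVAL)
    exchange = e.output_bytes * e.num_nodes * NETWORK_SEND
    return scan + exchange


def replicated_scan_cost(e: DimScanEstimate) -> float:
    # Every node reads and filters the whole table itself; no exchange.
    read = (e.cache_hit_fraction * CACHED_READ
            + (1.0 - e.cache_hit_fraction) * REMOTE_READ)
    return e.num_nodes * e.table_bytes * (read + FILTER_EVAL)


def prefer_replicated_scan(e: DimScanEstimate, data_cache_enabled: bool) -> bool:
    # Without the data cache, every node would do a remote read on each
    # query, so this sketch simply keeps the existing scan+exchange plan.
    if not data_cache_enabled:
        return False
    return replicated_scan_cost(e) < broadcast_cost(e)
```

Under these assumed weights, a fully cached table whose scan output is about as large as the table favours the replicated scan, while a highly selective predicate favours the existing plan, because the filtering work is repeated on every node but the exchange it avoids is small.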



--
This message was sent by Atlassian Jira
(v8.3.4#803005)