Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/04/28 16:52:00 UTC
[jira] [Created] (IMPALA-9704) Consider doing remote reads for small dimension tables instead of scan+exchange
Tim Armstrong created IMPALA-9704:
-------------------------------------
Summary: Consider doing remote reads for small dimension tables instead of scan+exchange
Key: IMPALA-9704
URL: https://issues.apache.org/jira/browse/IMPALA-9704
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Tim Armstrong
The remote data cache changes the calculus for certain broadcast join plans. Previously we always did local reads and then broadcast the output of the scan. With the data cache, it could make sense to read small tables into the data cache on every node and replicate the scan on all nodes, with no exchange.
A variety of factors are in play, including the cost of predicate evaluation and runtime filter evaluation, and whether the remote data cache is enabled, so it's not guaranteed to be a win.
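To make the tradeoff concrete, here is a rough sketch of how the two plan shapes might be compared. This is not Impala's planner code: the class, field names, and cost formulas are all hypothetical, and the per-byte costs are made-up units chosen only to show how the factors above interact.

```python
from dataclasses import dataclass


@dataclass
class DimScanEstimate:
    """Hypothetical planner inputs; none of these names exist in Impala."""
    table_bytes: int              # size of the dimension table
    selectivity: float            # fraction of bytes left after predicates and runtime filters
    num_nodes: int                # nodes executing the join
    cpu_cost_per_byte: float      # cost to scan and evaluate predicates/filters on one byte
    network_cost_per_byte: float  # cost to move one byte between nodes
    cache_hit_ratio: float        # expected fraction of reads served by the data cache
    data_cache_enabled: bool


def broadcast_cost(e: DimScanEstimate) -> float:
    # Current plan shape: scan the table once (local reads), then
    # send the filtered output to every other node via an exchange.
    scan = e.table_bytes * e.cpu_cost_per_byte
    exchange = (e.table_bytes * e.selectivity
                * (e.num_nodes - 1) * e.network_cost_per_byte)
    return scan + exchange


def replicated_scan_cost(e: DimScanEstimate) -> float:
    # Proposed plan shape: every node scans the whole table itself,
    # so predicate and filter evaluation is repeated on each node.
    scan = e.num_nodes * e.table_bytes * e.cpu_cost_per_byte
    # Assumption: bytes not found in the data cache need a remote read.
    hit = e.cache_hit_ratio if e.data_cache_enabled else 0.0
    remote = (e.num_nodes * e.table_bytes
              * (1.0 - hit) * e.network_cost_per_byte)
    return scan + remote


def prefer_replicated_scan(e: DimScanEstimate) -> bool:
    return replicated_scan_cost(e) < broadcast_cost(e)
```

Under this toy model, a warm cache with unselective predicates favors the replicated scan, while a disabled cache or highly selective predicates (which shrink the exchange but not the repeated scan work) favor the existing scan+exchange plan.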
--
This message was sent by Atlassian Jira
(v8.3.4#803005)