Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2021/10/19 11:14:00 UTC
[jira] [Created] (IMPALA-10973) Empty scan nodes are scheduled to the (exclusive) coordinator
Csaba Ringhofer created IMPALA-10973:
----------------------------------------
Summary: Empty scan nodes are scheduled to the (exclusive) coordinator
Key: IMPALA-10973
URL: https://issues.apache.org/jira/browse/IMPALA-10973
Project: IMPALA
Issue Type: Bug
Components: Backend
Reporter: Csaba Ringhofer
Currently, fragments with scan nodes that have no scan ranges are scheduled to the coordinator, even if it is an exclusive coordinator:
https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L805
As "parent" fragments are often scheduled to be collocated with their children, the condition of "being scheduled to the coordinator" can spread through the plan tree.
This can be disastrous for scalability in clusters with many executors but few coordinators. It is also very counter-intuitive, as scanning an empty table shouldn't have a major effect on the query.
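The spreading effect can be illustrated with a toy model. This is only a sketch of the behavior as described in this report, not Impala's scheduler code: the names ScanNode, Fragment, scan_hosts, fragment_hosts and the host names are hypothetical, and collocation is simplified to "a parent runs on every host its collocated children run on".

```python
from dataclasses import dataclass, field

# Hypothetical host name standing in for the exclusive coordinator.
COORDINATOR = "coordinator"


@dataclass
class ScanNode:
    name: str
    # Hosts that hold this scan's ranges; empty for an empty table.
    range_hosts: list = field(default_factory=list)


@dataclass
class Fragment:
    name: str
    scans: list = field(default_factory=list)
    # Children this fragment is collocated with (toy assumption).
    collocated_children: list = field(default_factory=list)


def scan_hosts(scan):
    # Rule described in the report: no scan ranges -> coordinator.
    return set(scan.range_hosts) or {COORDINATOR}


def fragment_hosts(fragment):
    # A fragment runs wherever its scans run, plus wherever its
    # collocated children run, so the coordinator placement spreads upward.
    hosts = set()
    for scan in fragment.scans:
        hosts |= scan_hosts(scan)
    for child in fragment.collocated_children:
        hosts |= fragment_hosts(child)
    return hosts


alltypes = ScanNode("alltypes", ["executor-1", "executor-2"])
emptytable = ScanNode("emptytable")

plain = Fragment("plain", scans=[alltypes])
union = Fragment("union", scans=[emptytable, alltypes])
parent = Fragment("parent", collocated_children=[union])

print(sorted(fragment_hosts(plain)))
print(sorted(fragment_hosts(union)))
print(sorted(fragment_hosts(parent)))
```

In this model the plain scan stays on the two executors, while the fragment containing the empty scan, and any parent collocated with it, also lands on the coordinator.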
To reproduce locally:
bin/start-impala-cluster.py --use_exclusive_coordinators -c 1
In Impala shell:
select id from functional.alltypes;
profile; -- scan nodes will be scheduled to 2 hosts
select f2 from functional.emptytable union all select id from functional.alltypes;
profile; -- scan nodes will be scheduled to 3 hosts