Posted to issues@impala.apache.org by "Lars Volker (JIRA)" <ji...@apache.org> on 2018/04/10 22:48:00 UTC

[jira] [Created] (IMPALA-6834) Enforce consistent, pseudo-random replica order during local, non-random scheduling

Lars Volker created IMPALA-6834:
-----------------------------------

Summary: Enforce consistent, pseudo-random replica order during local, non-random scheduling
Key: IMPALA-6834
URL: https://issues.apache.org/jira/browse/IMPALA-6834
Project: IMPALA
Issue Type: Improvement
Components: Backend, Frontend
Affects Versions: Impala 2.12.0
Reporter: Lars Volker

When scheduling local, non-cached reads, the scheduler breaks ties between otherwise equivalent replicas by picking the first one in the list of candidates as reported by the frontend ([scheduler.h:375|https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.h#L375]). We do this to optimize the use of OS buffer caches. We also provide the random_replica query option to improve cases where the default behavior leads to CPU hotspots.

The frontend merely passes the replicas of a block to the backend in the order in which HDFS returns them. Because the scheduler always picks the first candidate, this can result in poor load distribution for scans with many small scan ranges, depending on the order in which HDFS returns block locations.

To alleviate this, we should add another level of pseudo-random shuffling. The main idea is to change the order in which replicas are picked while keeping that order consistent across backends.

For example, we can compute a hash for each replica over (replica address, THdfsFileSplit::file_name, THdfsFileSplit::offset) and use that hash as the key to find the min_element in the list of candidates ([scheduler.cc:772|https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L772]). We would then use this element instead of the first one.
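
A minimal sketch of this idea, not the actual scheduler code: FileSplit, ReplicaKey and PickReplica are hypothetical names, replicas are represented as plain address strings, and FNV-1a is only an assumed choice of hash. It is used here because, unlike std::hash, it yields the same value in every process and on every host, which the consistency requirement depends on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a scan range; the fields mirror
// THdfsFileSplit::file_name and THdfsFileSplit::offset.
struct FileSplit {
  std::string file_name;
  int64_t offset;
};

// 64-bit FNV-1a over a byte range, continuing from a running hash value.
inline uint64_t Fnv1a(const void* data, size_t len, uint64_t hash) {
  const unsigned char* bytes = static_cast<const unsigned char*>(data);
  for (size_t i = 0; i < len; ++i) {
    hash ^= bytes[i];
    hash *= 1099511628211ULL;  // FNV prime
  }
  return hash;
}

// Hash over (replica address, file name, offset). The offset is serialized
// byte by byte in little-endian order so the key does not depend on the
// host's endianness.
inline uint64_t ReplicaKey(const std::string& replica_address,
                           const FileSplit& split) {
  const unsigned char separator = 0;
  uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  h = Fnv1a(replica_address.data(), replica_address.size(), h);
  h = Fnv1a(&separator, 1, h);
  h = Fnv1a(split.file_name.data(), split.file_name.size(), h);
  h = Fnv1a(&separator, 1, h);
  unsigned char offset_bytes[8];
  for (int i = 0; i < 8; ++i) {
    offset_bytes[i] = static_cast<unsigned char>(
        (static_cast<uint64_t>(split.offset) >> (8 * i)) & 0xff);
  }
  return Fnv1a(offset_bytes, sizeof(offset_bytes), h);
}

// Picks the candidate with the smallest key instead of the first one.
// Hash collisions are resolved by comparing addresses, so the result does
// not depend on the order of the candidate list.
inline const std::string& PickReplica(const std::vector<std::string>& candidates,
                                      const FileSplit& split) {
  assert(!candidates.empty());
  return *std::min_element(
      candidates.begin(), candidates.end(),
      [&split](const std::string& a, const std::string& b) {
        const uint64_t key_a = ReplicaKey(a, split);
        const uint64_t key_b = ReplicaKey(b, split);
        return key_a != key_b ? key_a < key_b : a < b;
      });
}
```

Because the key depends only on the replica and the split, the same replica is picked for a given scan range no matter in which order HDFS lists the block locations, while different files and offsets map to different winners.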

We need to make sure that we run sufficiently comprehensive performance tests on this change.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)