Posted to issues-all@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2019/08/14 23:21:00 UTC

[jira] [Assigned] (IMPALA-8677) Removing an unused node does not leave consistent remote scheduling unchanged

     [ https://issues.apache.org/jira/browse/IMPALA-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell reassigned IMPALA-8677:
-------------------------------------

    Assignee: Joe McDonnell

> Removing an unused node does not leave consistent remote scheduling unchanged
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-8677
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8677
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>
> When working on IMPALA-8630, I discovered that SchedulerTest::RemoteExecutorCandidateConsistency works mostly by happenstance.
> The root of the issue is that in Scheduler::GetRemoteExecutorCandidates() we want to avoid returning duplicates, so we put all the IpAddrs in a set:
> {code:java}
> set<IpAddr> distinct_backends;
> ...
> distinct_backends.insert(*executor_addr);
> ...
> for (const IpAddr& addr : distinct_backends) {
>   remote_executor_candidates->push_back(addr);
> }{code}
> Iterating over the set visits the IpAddrs in sorted order, so remote_executor_candidates does not hold the elements in the order in which they were encountered on the hashring.
> Suppose that we are running with num_remote_executor_candidates=2 and random replica set to false. There is exactly one file. GetRemoteExecutorCandidates() returns these executor candidates (IpAddrs):
> {192.168.1.2, 192.168.1.3}
> The first entry is chosen because it is first. Nothing was scheduled on 192.168.1.3, but removing it may still change the scheduling outcome because of the sort. Suppose 192.168.1.3 is gone, and the next closest executor is 192.168.1.1 (or any address that sorts before 192.168.1.2). Even though it is farther away on the hashring, GetRemoteExecutorCandidates() would return:
> {192.168.1.1, 192.168.1.2}
> and the first entry would be chosen.
> To eliminate this inconsistency, it might be useful to retain the order in which elements match via the hashring.
> In terms of impact, the sort increases the number of files whose scheduling could change when a node leaves, so some of those changes are unnecessary. If random replica is set to true, it doesn't matter. It is unclear how large the impact is otherwise.
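
The difference between sorted and hashring-ordered candidates can be sketched in a few lines. This is a minimal illustration, not the Impala implementation: IpAddr is a hypothetical std::string alias, the hashring walk is assumed to be given as a plain vector in encounter order, and SortedCandidates() and OrderedCandidates() are made-up names. The first function mimics the behavior described in the issue; the second uses the set only for duplicate detection and keeps encounter order.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for Impala's IpAddr type (assumption for this sketch).
using IpAddr = std::string;

// Mimics the behavior described in the issue: the first n distinct addresses
// are collected into a std::set, so the result comes back in sorted order
// rather than in hashring order.
std::vector<IpAddr> SortedCandidates(const std::vector<IpAddr>& ring_order,
                                     std::size_t n) {
  std::set<IpAddr> distinct;
  for (const IpAddr& addr : ring_order) {
    if (distinct.size() >= n) break;
    distinct.insert(addr);
  }
  return std::vector<IpAddr>(distinct.begin(), distinct.end());
}

// Order-preserving variant: the set is used only to detect duplicates, and
// each address is appended the first time it is encountered.
std::vector<IpAddr> OrderedCandidates(const std::vector<IpAddr>& ring_order,
                                      std::size_t n) {
  std::set<IpAddr> seen;
  std::vector<IpAddr> candidates;
  for (const IpAddr& addr : ring_order) {
    if (candidates.size() >= n) break;
    // insert().second is true only when addr was not already in the set.
    if (seen.insert(addr).second) candidates.push_back(addr);
  }
  return candidates;
}
```

With an assumed hashring walk of 192.168.1.2, 192.168.1.3, 192.168.1.1 and n=2, both functions put 192.168.1.2 first. After dropping the unused 192.168.1.3, SortedCandidates() puts 192.168.1.1 first, while OrderedCandidates() still puts 192.168.1.2 first.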



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
