Posted to issues@impala.apache.org by "Sahil Takiar (Jira)" <ji...@apache.org> on 2020/01/16 21:12:00 UTC

[jira] [Created] (IMPALA-9301) Aux error info should detect multiple RPC failures

Sahil Takiar created IMPALA-9301:
------------------------------------

             Summary: Aux error info should detect multiple RPC failures
                 Key: IMPALA-9301
                 URL: https://issues.apache.org/jira/browse/IMPALA-9301
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Sahil Takiar


Suggested during the review of ([IMPALA-9296|http://issues.cloudera.org/browse/IMPALA-9296]) [https://gerrit.cloudera.org/#/c/15046/]

{quote}

I'm not sure that this is the right way to do it, since it means that if a backend sees multiple RPC failures in a single query, only one will ever be reported to the coordinator.

Of course, I've been advocating for being aggressive about blacklisting. Suppose there were two RPC failures. There are two cases here. Either both RPCs were to the same executor, in which case the two failures make us more confident that something is going on with that executor, and we might actually want to blacklist it twice (which will just extend the amount of time that it stays blacklisted). Or the two RPCs were to different executors, in which case, if we blacklist only one of them and then retry the query, it may very well fail again.

And even if we do want to stay more conservative about blacklisting, you've suggested before (and I agree) that it's generally preferable to report as much info about errors as we've got, and then centralize the logic for deciding how to act on those errors in the coordinator.

{quote}
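
The two cases in the review comment can be illustrated with a minimal sketch. This is not Impala code: the names RpcFailure, AuxErrorInfo, and count_failures_by_executor, the executor addresses, and the error codes are all hypothetical, chosen only to show a backend reporting every RPC failure and the coordinator grouping them by destination executor.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class RpcFailure:
    """Hypothetical record of one failed RPC seen by a backend."""
    dest_address: str  # executor the RPC was sent to
    error_code: int    # illustrative error code for the failed RPC


@dataclass
class AuxErrorInfo:
    """Hypothetical per-backend report that keeps every failure, not just one."""
    failures: List[RpcFailure] = field(default_factory=list)

    def add_failure(self, dest_address: str, error_code: int) -> None:
        self.failures.append(RpcFailure(dest_address, error_code))


def count_failures_by_executor(reports: List[AuxErrorInfo]) -> Dict[str, int]:
    """Coordinator-side aggregation: number of failures per destination executor."""
    counts: Counter = Counter()
    for report in reports:
        for failure in report.failures:
            counts[failure.dest_address] += 1
    return dict(counts)


# One backend saw three failures: two to the same executor, one to another.
report = AuxErrorInfo()
report.add_failure("executor-a:27000", 111)
report.add_failure("executor-a:27000", 111)
report.add_failure("executor-b:27000", 110)

counts = count_failures_by_executor([report])
print(counts)
```

With all failures reported, the coordinator can see both that executor-a failed repeatedly and that executor-b failed at all, and can apply whatever blacklisting policy it chooses in one place.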



--
This message was sent by Atlassian Jira
(v8.3.4#803005)