Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/03/18 16:30:00 UTC

[jira] [Commented] (IMPALA-10577) Add retrying of AdmitQuery

    [ https://issues.apache.org/jira/browse/IMPALA-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304262#comment-17304262 ] 

ASF subversion and git services commented on IMPALA-10577:
----------------------------------------------------------

Commit c823f17f2c842d679b5b36e3b214b5a88b473b45 in impala's branch refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c823f17 ]

IMPALA-10577: Add retrying of AdmitQuery

This patch adds retries of the AdmitQuery rpc by coordinators.
This helps to ensure that if an admissiond goes down and is restarted
or is temporarily unreachable, queries won't fail.

The retries are done with backoff and jitter to avoid overloading the
admissiond in these scenarios.

A new flag, --admission_max_retry_time_s, is added to control how long
queries will continue retrying before giving up.
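
As a rough illustration of the retry behaviour described above (not the
actual Impala code, which lives in the C++ backend), the sketch below
retries a failing call with exponential backoff and random jitter until a
total time budget runs out. The names admit_with_retry and RpcError, the
backoff parameters, and the "full jitter" scheme are all assumptions made
for this example; only the idea of a maximum retry time, as controlled by
--admission_max_retry_time_s, comes from the patch description.

```python
import random
import time


class RpcError(Exception):
    """Hypothetical stand-in for a failed or unreachable AdmitQuery rpc."""


def admit_with_retry(admit_query, max_retry_time_s, initial_backoff_s=0.1,
                     max_backoff_s=5.0, sleep=time.sleep,
                     clock=time.monotonic, rng=random.random):
    """Call admit_query() until it succeeds or max_retry_time_s elapses.

    All parameter names and default values here are illustrative
    assumptions, not Impala's real settings.
    """
    deadline = clock() + max_retry_time_s
    backoff_s = initial_backoff_s
    while True:
        try:
            return admit_query()
        except RpcError:
            remaining = deadline - clock()
            if remaining <= 0:
                # Retry budget exhausted: give up and surface the error.
                raise
            # Jitter: sleep a random fraction of the current backoff so that
            # many coordinators do not all retry at the same instant.
            sleep(min(rng() * backoff_s, remaining))
            # Exponential backoff, capped so waits do not grow unbounded.
            backoff_s = min(backoff_s * 2, max_backoff_s)
```

The sleep, clock and rng arguments exist only so the loop can be driven
with fake time in a test; a caller would normally leave them at their
defaults.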

The AdmitQuery rpc is made idempotent - if a query is submitted with
the same query id as one the admissiond already knows about,
AdmitQuery will return OK without submitting the query to be scheduled
again.
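
The idempotency described above can be sketched as follows. This is a
minimal in-memory model, assuming the service tracks the query ids it has
already seen; the AdmissionService class, its fields, and the string "OK"
return value are hypothetical and are not taken from the Impala source.

```python
import threading


class AdmissionService:
    """Hypothetical in-memory stand-in for the admissiond."""

    def __init__(self):
        self._lock = threading.Lock()
        self._known_query_ids = set()
        # Query ids handed to the scheduler, in submission order.
        self.scheduled = []

    def admit_query(self, query_id):
        with self._lock:
            if query_id in self._known_query_ids:
                # Duplicate submission, e.g. a retry after a lost response:
                # report success without scheduling the query a second time.
                return "OK"
            self._known_query_ids.add(query_id)
            self.scheduled.append(query_id)
            return "OK"
```

With this shape, a coordinator that retries after a timeout cannot cause
the same query to be scheduled twice, which is what makes blind retries
safe.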

Testing:
- Added a custom cluster test that checks that queries won't fail when
  the admissiond goes down.

Change-Id: I8bc0cac666bbd613a1143c0e2c4f84d3b0ad003a
Reviewed-on: http://gerrit.cloudera.org:8080/17188
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add retrying of AdmitQuery
> --------------------------
>
>                 Key: IMPALA-10577
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10577
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Thomas Tauber-Marshall
>            Assignee: Thomas Tauber-Marshall
>            Priority: Major
>
> Currently, in the admission control service, if the AdmitQuery rpc fails, the entire query fails immediately. This means that if the admissiond goes down or is temporarily unreachable, any new queries will fail.
> We should make this more fault tolerant by adding retries of AdmitQuery.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
