Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/07/09 16:10:00 UTC

[jira] [Commented] (IMPALA-9502) Avoid copying TExecRequest when retrying queries

    [ https://issues.apache.org/jira/browse/IMPALA-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154699#comment-17154699 ] 

ASF subversion and git services commented on IMPALA-9502:
---------------------------------------------------------

Commit 65722d3e9051d6a08cb1e69fd36a06684745c226 in impala's branch refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=65722d3 ]

IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

Fixed two warnings reported by TSAN. One of them is a data race on
exec_request_. The other is a lock-inversion warning.

The data race on exec_request_ occurs because the original and retried
queries share the same exec_request_. Retrying a query sets a new query
id in the exec_request_, so a retry mutates the shared object. However,
even after a query is retried, the exec_request_ can still be accessed
for the original query. Specifically, an exec status report from a
fragment can still be propagated even after the query has been cancelled
(e.g. ControlService::ReportExecStatus can still access the original
exec_request_). IMPALA-9199 attempted to be smart about re-using the
TExecRequest for retried queries, but given the race conditions it seems
like a premature optimization. We can revisit IMPALA-9502 if we think
it actually makes a performance difference.
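
As a rough illustration of the copy-on-retry idea described above, here is
a minimal C++ sketch. It is not the Impala code: ExecRequest, RequestState,
and CreateRetriedState are hypothetical stand-ins invented for this example,
and the real TExecRequest is a much larger Thrift-generated struct. The point
is only that the retried query gets its own copy to mutate, so late readers
of the original request never observe a write.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the Thrift-generated TExecRequest.
struct ExecRequest {
  std::string query_id;
  std::string sql;
};

// Hypothetical stand-in for per-query state. The request is const after
// construction, so concurrent readers (e.g. a late status report) are safe.
class RequestState {
 public:
  explicit RequestState(ExecRequest request)
      : exec_request_(std::move(request)) {}

  const ExecRequest& exec_request() const { return exec_request_; }

 private:
  const ExecRequest exec_request_;
};

// Builds the state for a retried query. The original request is copied and
// only the copy receives the new query id; the original is left untouched.
std::unique_ptr<RequestState> CreateRetriedState(
    const RequestState& original, const std::string& new_query_id) {
  ExecRequest copy = original.exec_request();
  copy.query_id = new_query_id;
  return std::make_unique<RequestState>(std::move(copy));
}
```

The cost of this approach is one extra copy of the request per retry, which
is the trade-off the commit message accepts in exchange for removing the race.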

The lock-inversion warning is between the sharded locks in
ShardedQueryMap (specifically the QueryDriverMap query_driver_map_ in
ImpalaServer) and QueryDriver::client_request_state_lock_. The correct
lock ordering is to acquire the sharded lock in query_driver_map_ first
and then the client_request_state_lock_.
QueryDriver::RetryQueryFromThread acquired them in the reverse order. I
wasn't able to actually produce a deadlock, but fixing the ordering
issue was trivial.
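
The lock-ordering rule above can be sketched in a few lines of C++. Again,
this is only an illustration under assumed names: Driver, DriverMapShard,
RecordRetry, and ReadRetryCount are hypothetical and do not mirror the actual
ImpalaServer or QueryDriver classes. What it shows is that every code path
takes the map's shard lock first and the per-query state lock second, so no
two threads can each hold one lock while waiting for the other.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a query driver with its own state lock.
struct Driver {
  std::mutex client_request_state_lock;
  int retry_count = 0;
};

// Hypothetical stand-in for one shard of a sharded map of drivers.
struct DriverMapShard {
  std::mutex shard_lock;
  Driver driver;
};

// Writer path: shard lock first, then the state lock.
int RecordRetry(DriverMapShard& shard) {
  std::lock_guard<std::mutex> map_guard(shard.shard_lock);
  std::lock_guard<std::mutex> state_guard(
      shard.driver.client_request_state_lock);
  return ++shard.driver.retry_count;
}

// Reader path: uses the same acquisition order as the writer path.
int ReadRetryCount(DriverMapShard& shard) {
  std::lock_guard<std::mutex> map_guard(shard.shard_lock);
  std::lock_guard<std::mutex> state_guard(
      shard.driver.client_request_state_lock);
  return shard.driver.retry_count;
}
```

A function that took client_request_state_lock first and shard_lock second
would reintroduce the inversion that TSAN reports, even if a deadlock is hard
to trigger in practice.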

Testing:
* Ran core tests
* Manually checked that the query retry tests are TSAN clean

Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Reviewed-on: http://gerrit.cloudera.org:8080/16148
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Avoid copying TExecRequest when retrying queries
> ------------------------------------------------
>
>                 Key: IMPALA-9502
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9502
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> There are a few issues that occur when re-using a {{TExecRequest}} across query retries. We should investigate whether there is a way to work around those issues so that the {{TExecRequest}} does not need to be copied when retrying a query.


