You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/11/06 01:06:00 UTC
[jira] [Commented] (IMPALA-4063) Make fragment instance reports per-query (or per-host) instead of per-fragment instance.

    [ https://issues.apache.org/jira/browse/IMPALA-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675960#comment-16675960 ] 

ASF subversion and git services commented on IMPALA-4063:
---------------------------------------------------------

Commit 941038229ae7073ddf7b9c6f58e9eaf866b89b2c in impala's branch refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=9410382 ]

IMPALA-4063: Merge report of query fragment instances per executor

Previously, each fragment instance executing on an executor will
independently report its status to the coordinator periodically.
This creates a huge amount of RPCs to the coordinator under highly
concurrent workloads, causing lock contention in the coordinator's
backend states when multiple fragment instances send them at the
same time. In addition, due to the lack of coordination between query
fragment instances, a query may end without collecting the profiles
from all fragment instances when one of them hits an error before
another fragment instance manages to finish Prepare(), leading to
missing profiles for certain fragment instances.

This change fixes the problem above by making a thread per QueryState
(started by QueryExecMgr) to be responsible for periodically reporting
the status and profiles of all fragment instances of a query running
on a backend. As part of this refactoring, each query fragment instance
will not report their errors individually. Instead, there is a cumulative
status maintained per QueryState. It's set to the error status of the first
fragment instance which hits an error or any general error (e.g. failure
to start a thread) when starting fragment instances. With this change,
the status reporting threads are also removed.

Testing done: exhaustive tests

This patch is based on a patch by Sailesh Mukil

Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Reviewed-on: http://gerrit.cloudera.org:8080/11615
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Make fragment instance reports per-query (or per-host) instead of per-fragment instance.
> ----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-4063
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4063
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.7.0
>            Reporter: Sailesh Mukil
>            Assignee: Michael Ho
>            Priority: Major
>              Labels: impala-scalability-sprint-08-13-2018, performance
>
> Currently we send a report per-fragment instance to the coordinator every 5 seconds (by default; modifiable via query option 'status_report_interval').
> For queries with a large number of fragment instances, this generates tremendous amounts  of network traffic to the coordinator, which will only be aggravated with higher a DOP.
> We should instead queue per-fragment instance reports and send out a per-query report to the coordinator instead.
> For code references, see:
> PlanFragmentExecutor:: ReportProfile()
> PlanFragmentExecutor:: SendReport()
> FragmentExecState:: ReportStatusCb()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org