Posted to issues@hive.apache.org by "Seonggon Namgung (Jira)" <ji...@apache.org> on 2023/01/26 09:40:00 UTC

[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

    [ https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17680933#comment-17680933 ] 

Seonggon Namgung commented on HIVE-26986:
-----------------------------------------

The attached image files ("Query71 TezDAG.png" and "Query71 OperatorGraph.png") show the Tez DAG and the OperatorGraph of TPC-DS query 71.
I set tez.generate.debug.artifacts to true to get a dot file of the Tez DAG.
The OperatorGraph is created after ParallelEdgeFixer is applied.

The number of clusters in the OperatorGraph is 10, but the number of vertices in the Tez DAG is 12.
The difference comes from cluster 3 of the OperatorGraph, which contains 3 TS operators and a UNION operator.

The current OperatorGraph creates a singleton cluster for each operator and merges the parent operator's cluster into the child operator's cluster unless the parent operator is a ReduceSink operator.
As a result, there can be a cluster with multiple root operators, which cannot form a single vertex in the Tez DAG.
This mismatch between the Tez DAG and the OperatorGraph causes false positives when detecting parallel edges and leads to the insertion of unnecessary concentrator RS operators.
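
To make the merging rule above concrete, here is a minimal sketch in plain Java. It is not Hive's OperatorGraph code: the class ClusterSketch, its methods, the operator names, and the "RS" name-prefix check for ReduceSink are all assumptions made for illustration, and the plan is a simplified shape (three TS operators feeding one UNION), not the real query71 plan.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; not Hive's OperatorGraph.
final class ClusterSketch {
    // Union-find leader pointers, keyed by operator name.
    private final Map<String, String> leader = new HashMap<>();
    // Operators that have at least one parent inside their own cluster.
    private final Set<String> hasInClusterParent = new HashSet<>();

    // Returns the representative of the cluster containing op,
    // registering op as a singleton cluster on first sight.
    private String find(String op) {
        leader.putIfAbsent(op, op);
        String l = leader.get(op);
        if (l.equals(op)) {
            return op;
        }
        String root = find(l);
        leader.put(op, root); // path compression
        return root;
    }

    // Assumption: an operator is a ReduceSink iff its name starts with "RS".
    private static boolean isReduceSink(String op) {
        return op.startsWith("RS");
    }

    // Merges the parent's cluster into the child's cluster
    // unless the parent is a ReduceSink operator.
    void addEdge(String parent, String child) {
        String childLeader = find(child);
        String parentLeader = find(parent);
        if (!isReduceSink(parent)) {
            hasInClusterParent.add(child);
            leader.put(parentLeader, childLeader);
        }
    }

    // Maps each cluster representative to the operators that act as
    // roots of that cluster (no parent inside the same cluster).
    Map<String, Set<String>> rootsPerCluster() {
        Map<String, Set<String>> result = new HashMap<>();
        for (String op : new ArrayList<>(leader.keySet())) {
            if (!hasInClusterParent.contains(op)) {
                result.computeIfAbsent(find(op), k -> new TreeSet<>()).add(op);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ClusterSketch g = new ClusterSketch();
        g.addEdge("TS_0", "UNION_3");
        g.addEdge("TS_1", "UNION_3");
        g.addEdge("TS_2", "UNION_3");
        g.addEdge("UNION_3", "RS_4");
        g.addEdge("RS_4", "GBY_5"); // RS parent: no merge, new cluster starts
        System.out.println(g.rootsPerCluster());
    }
}
```

In this sketch, the cluster that ends at RS_4 has three root operators (TS_0, TS_1, TS_2), while GBY_5 forms its own cluster. A cluster like the first one is the kind that cannot be mapped to a single Tez vertex.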

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> -----------------------------------------------------------
>
>                 Key: HIVE-26986
>                 URL: https://issues.apache.org/jira/browse/HIVE-26986
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>         Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as a parallel edge.
> We observed this problem by comparing the OperatorGraph and the Tez DAG when running TPC-DS query 71 on 1TB managed tables in ORC format.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)