Posted to dev@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2014/11/12 19:36:35 UTC
[jira] [Created] (HIVE-8840) Print prettier Spark work graph after HIVE-8793 [Spark Branch]
Xuefu Zhang created HIVE-8840:
---------------------------------
Summary: Print prettier Spark work graph after HIVE-8793 [Spark Branch]
Key: HIVE-8840
URL: https://issues.apache.org/jira/browse/HIVE-8840
Project: Hive
Issue Type: Improvement
Components: Spark
Reporter: Xuefu Zhang
Because of HIVE-8793, the Spark work graph may be modified by SplitSparkWorkResolver. Original graph:
{code}
Spark
Edges:
Reducer 2 <- Map 1 (SORT, 1)
Reducer 3 <- Reducer 2 (GROUP, 1)
Reducer 4 <- Reducer 2 (GROUP, 1)
{code}
New graph:
{code}
Spark
Edges:
Reducer 3 <- Reducer 5 (GROUP, 1)
Reducer 4 <- Reducer 6 (GROUP, 1)
Reducer 5 <- Map 1 (SORT, 1)
Reducer 6 <- Map 1 (SORT, 1)
{code}
where Reducer 2 was split into Reducer 5 and Reducer 6.
Two types of ordering can be considered:
1. Topological order
{code}
Spark
Edges:
Reducer 5 <- Map 1 (SORT, 1)
Reducer 6 <- Map 1 (SORT, 1)
Reducer 3 <- Reducer 5 (GROUP, 1)
Reducer 4 <- Reducer 6 (GROUP, 1)
{code}
2. DFS
{code}
Spark
Edges:
Reducer 5 <- Map 1 (SORT, 1)
Reducer 3 <- Reducer 5 (GROUP, 1)
Reducer 6 <- Map 1 (SORT, 1)
Reducer 4 <- Reducer 6 (GROUP, 1)
{code}
Both seem better than the current output, though topological order seems more suitable for a graph. Please feel free to create a patch on trunk if needed.
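For illustration only, the two orderings above could be produced along the following lines. This is a minimal Python sketch, not Hive code: the `EDGES` dictionary, the helper names, and the use of Kahn's algorithm for the topological order are all assumptions made for this example. Ties are broken by plain string sorting, which is a simplification (for instance, "Reducer 10" would sort before "Reducer 2").

```python
from collections import deque

# Hypothetical in-memory form of the "new graph" from this issue:
# child work -> list of (parent work, edge type, parallelism).
EDGES = {
    "Reducer 3": [("Reducer 5", "GROUP", 1)],
    "Reducer 4": [("Reducer 6", "GROUP", 1)],
    "Reducer 5": [("Map 1", "SORT", 1)],
    "Reducer 6": [("Map 1", "SORT", 1)],
}


def _children(edges):
    """Invert the child -> parents map into parent -> sorted children."""
    children = {}
    for child in sorted(edges):
        for parent, _, _ in edges[child]:
            children.setdefault(parent, []).append(child)
    return children


def _roots(edges):
    """Works that appear only as parents (no incoming edges), e.g. Map 1."""
    parents = {p for ins in edges.values() for p, _, _ in ins}
    return sorted(parents - set(edges))


def topological_order(edges):
    """Kahn's algorithm: emit a work once all of its parents are processed."""
    children = _children(edges)
    remaining = {child: len(ins) for child, ins in edges.items()}
    queue = deque(_roots(edges))
    order = []
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            remaining[child] -= 1
            if remaining[child] == 0:
                order.append(child)
                queue.append(child)
    return order


def dfs_order(edges):
    """Pre-order depth-first walk from the roots, following one branch at a time."""
    children = _children(edges)
    order, seen = [], set()

    def visit(node):
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                visit(child)

    for root in _roots(edges):
        visit(root)
    return order


def format_edges(edges, order):
    """Render the edge list in the layout shown in this issue."""
    lines = ["Spark", "Edges:"]
    for child in order:
        ins = ", ".join(f"{p} ({kind}, {n})" for p, kind, n in edges[child])
        lines.append(f"{child} <- {ins}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_edges(EDGES, topological_order(EDGES)))
    print(format_edges(EDGES, dfs_order(EDGES)))
```

One difference worth noting: when a work has more than one parent, the pre-order DFS walk can list it before some of its parents, whereas the topological order never does.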