Posted to dev@hive.apache.org by "Rui Li (JIRA)" <ji...@apache.org> on 2014/08/14 14:09:13 UTC
[jira] [Commented] (HIVE-7731) Incorrect result returned when a map
work has multiple downstream reduce works
[ https://issues.apache.org/jira/browse/HIVE-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096899#comment-14096899 ]
Rui Li commented on HIVE-7731:
------------------------------
Some quick thoughts: I suspect we have hit the output collector problem again. If an ExecMapper (or ExecReducer) has multiple RS (ReduceSink operators), maybe each of them should have its own output collector.
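To illustrate the suspicion, here is a minimal, self-contained sketch. It is not Hive code: Collector, Sink and run are hypothetical names made up for this example, and the rows are invented. It only shows how two sinks that share one collector would send everything to a single downstream consumer, while separate collectors keep the two outputs apart.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CollectorSketch {
    /** Hypothetical stand-in for an output collector feeding one downstream reduce. */
    interface Collector {
        void collect(int key);
    }

    /** Hypothetical stand-in for a reduce sink: forwards keys to the collector it was given. */
    static final class Sink {
        private Collector out;

        void setCollector(Collector out) {
            this.out = out;
        }

        void emit(int key) {
            out.collect(key);
        }
    }

    /** Runs two sinks over made-up (x, y) rows and returns what each reduce received. */
    static Map<String, List<Integer>> run(boolean sharedCollector) {
        List<Integer> toReduce1 = new ArrayList<>();
        List<Integer> toReduce2 = new ArrayList<>();

        Sink sinkX = new Sink(); // would feed the "group by x" reduce
        Sink sinkY = new Sink(); // would feed the "group by y" reduce

        if (sharedCollector) {
            // Assumed faulty wiring: both sinks write to the same collector.
            Collector shared = key -> toReduce1.add(key);
            sinkX.setCollector(shared);
            sinkY.setCollector(shared);
        } else {
            // Proposed wiring: one collector per sink.
            sinkX.setCollector(key -> toReduce1.add(key));
            sinkY.setCollector(key -> toReduce2.add(key));
        }

        int[][] rows = {{1, 10}, {2, 20}}; // invented sample rows of table1(x, y)
        for (int[] row : rows) {
            sinkX.emit(row[0]);
            sinkY.emit(row[1]);
        }

        Map<String, List<Integer>> received = new LinkedHashMap<>();
        received.put("reduce1", toReduce1);
        received.put("reduce2", toReduce2);
        return received;
    }

    public static void main(String[] args) {
        System.out.println("shared:   " + run(true));
        System.out.println("separate: " + run(false));
    }
}
```

With the shared collector, the second reduce receives nothing and the first one gets a mix of x and y values, which resembles the symptom reported below. Whether this is what actually happens in the Spark path still needs to be confirmed.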
> Incorrect result returned when a map work has multiple downstream reduce works
> ------------------------------------------------------------------------------
>
> Key: HIVE-7731
> URL: https://issues.apache.org/jira/browse/HIVE-7731
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Rui Li
>
> Encountered when running on Spark. Suppose we have three tables:
> {noformat}
> table1(x int, y int);
> table2(x int);
> table3(x int);
> {noformat}
> I run the following query:
> {noformat}
> from table1
> insert overwrite table table2 select x group by x
> insert overwrite table table3 select y group by y;
> {noformat}
> The query generates 1 map and 2 reduces. The map operator has 2 RS, so I suppose it has output for both reduces.
> The problem is that all results go to table2 (and they are incorrect), while table3 is empty.
> I tried the same query on MR and it gives correct results.
--
This message was sent by Atlassian JIRA
(v6.2#6252)