Posted to dev@hive.apache.org by "Kevin Wilfong (JIRA)" <ji...@apache.org> on 2013/01/18 07:26:12 UTC

[jira] [Commented] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

    [ https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556994#comment-13556994 ] 

Kevin Wilfong commented on HIVE-3915:
-------------------------------------

https://reviews.facebook.net/D8019
                
> Union with map-only query on one side and two MR job query on the other produces wrong results
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3915
>                 URL: https://issues.apache.org/jira/browse/HIVE-3915
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.11.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>
> When a query contains a union with a map-only subquery on one side and a subquery involving two sequential map-reduce jobs on the other, it can produce wrong results.  It appears that if the map-only query's TableScan operator is processed first, the task involving the union is made a root task.  Then, when the other subquery is processed, the second map-reduce job gains the task involving the union as a child and is itself made a root task.  This means that both the first and second map-reduce jobs are root tasks, so the dependency between the two is ignored.  If they are run in parallel (i.e. the cluster has more than one node), no results are produced for the side of the union with the two map-reduce jobs, and only the results of the other side of the union are returned.
> The order in which TableScan operators are processed is crucial to reproducing this bug.  That order is determined by the order in which values are retrieved from a map and is therefore hard to predict, so the bug does not always reproduce.
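
To make the task-graph problem described above concrete, here is a minimal Python sketch. It is not Hive code: the Task class, the helper functions, and the task names (mr1, mr2, union) are hypothetical stand-ins, and the two root lists only model the end state the description reports, not how the query processor arrives at it. The sketch assumes an executor that launches every root task right away, and it checks the invariant that a root task should have no parent.

```python
class Task:
    """Hypothetical stand-in for a plan task; not Hive's own Task class."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []

    def add_child(self, child):
        # Record the dependency in both directions.
        self.children.append(child)
        child.parents.append(self)


def first_wave(root_tasks):
    """Names of the tasks a parallel executor could launch immediately."""
    return sorted(t.name for t in root_tasks)


def roots_with_parents(root_tasks):
    """Root tasks that still depend on another task (invariant violation)."""
    return sorted(t.name for t in root_tasks if t.parents)


# Dependency chain from the description: mr1 -> mr2 -> union task.
mr1, mr2, union = Task("mr1"), Task("mr2"), Task("union")
mr1.add_child(mr2)
mr2.add_child(union)

# Assumed intended plan: only the first map-reduce job is a root.
correct_roots = [mr1]
# End state reported in the description: the second job is a root as well.
buggy_roots = [mr1, mr2]

print(first_wave(buggy_roots))          # ['mr1', 'mr2']
print(roots_with_parents(buggy_roots))  # ['mr2']
print(roots_with_parents(correct_roots))  # []
```

With the buggy root list, mr2 is eligible to start alongside mr1 even though it still has mr1 as a parent, which matches the ignored dependency described in the report.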

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira