Posted to dev@hive.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2009/03/03 22:52:56 UTC

[jira] Commented: (HIVE-318) [Hive] union all queries broken - all kinds of problems

    [ https://issues.apache.org/jira/browse/HIVE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678466#action_12678466 ] 

Jeff Hammerbacher commented on HIVE-318:
----------------------------------------

Hey Namit,

Quick comment: would it be easier to track these issues if they were filed as separate tickets? Perhaps you are preparing a solution as a single patch, or they are all manifestations of the same root issue, but otherwise, this seems like a ticket that could be busted up.

Later,
Jeff

> [Hive] union all queries broken - all kinds of problems
> -------------------------------------------------------
>
>                 Key: HIVE-318
>                 URL: https://issues.apache.org/jira/browse/HIVE-318
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>
> 1. Map-only job : same input
>    Hangs because the mapper tries to open the same file twice, and the Hadoop filesystem complains.
>    Fix: Initialize only once - keep that state at the Operator level. Should do the same for Close.
> 2. Map-only job : different inputs
>    Loss of data due to rename.
>    Fix: change rename to move files to the directory.
> 3. Map-only job in subquery + ReduceSink (RS): works currently
> 4. 2 variables: so 4 sub-cases
>    Number of sub-queries having map-reduce jobs (1 or 2)
>    Operator after Union: ReduceSink (RS) or FileSink (FS)
>    
> a.   Number of sub-queries having map-reduce jobs: 1
>      Operator after Union: RS
>      Can be done in 2 MR jobs - really difficult with current infrastructure.
>      Should do with 3 MR jobs now - break on top of UNION.
>      Future optimization: move operators between Union and RS before Union.
> b.   Number of sub-queries having map-reduce jobs: 2
>      Operator after Union: RS
>      Needs 3 MR jobs - should do with 3 MR jobs - break on top of UNION.
>      Future optimization: move operators between Union and RS before Union.
> c.   Number of sub-queries having map-reduce jobs: 1
>      Operator after Union: FS
>      Can be done in 1 MR job - really difficult with current infrastructure.
>      Can be easily done with 2 MR jobs by removing UNION and cloning operators between Union and FS.
>      Should do with 3 MR jobs now - break on top of UNION.
>      Follow-up optimization: 2 MR jobs should be able to handle this case.
> d.   Number of sub-queries having map-reduce jobs: 2
>      Operator after Union: FS
>      Can be easily done with 2 MR jobs by removing UNION and cloning operators between Union and FS.
>      Should do with 3 MR jobs now - break on top of UNION.
>      Follow-up optimization: 2 MR jobs should be able to handle this case.
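
For illustration only, the fix proposed in item 1 (initialize once, keep the state at the operator level, and do the same for close) might look roughly like the sketch below. The class and method names here (InitOnceSketch, SharedSink, initialize, close) are hypothetical and are not taken from Hive's actual Operator code; the "open" and "close" events stand in for opening and closing the output file.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOnceSketch {

    /** Hypothetical stand-in for an operator reachable from both branches of a UNION ALL. */
    static class SharedSink {
        // State kept at the operator level, as the fix in item 1 suggests.
        private boolean initialized = false;
        private boolean closed = false;

        // Records each real open/close so the effect of the guards is visible.
        final List<String> events = new ArrayList<>();

        void initialize() {
            if (initialized) {
                return; // already opened via another branch: do not open again
            }
            initialized = true;
            events.add("open");
        }

        void close() {
            if (!initialized || closed) {
                return; // never opened, or already closed via another branch
            }
            closed = true;
            events.add("close");
        }
    }

    public static void main(String[] args) {
        SharedSink sink = new SharedSink();
        sink.initialize(); // reached through the first sub-query
        sink.initialize(); // reached again through the second sub-query: no-op
        sink.close();
        sink.close();      // second close is also a no-op
        System.out.println(sink.events); // prints [open, close]
    }
}
```

The guard flags make both calls idempotent, so it does not matter how many parents of the shared operator trigger them.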
