Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/01/05 09:07:44 UTC

[jira] Created: (HIVE-205) file sink operator needs unique file names

file sink operator needs unique file names
------------------------------------------

                 Key: HIVE-205
                 URL: https://issues.apache.org/jira/browse/HIVE-205
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Zheng Shao


A query like "SELECT * FROM (SELECT 'a' from table UNION ALL SELECT 'b' from table) c" will fail. Because both subqueries of the "UNION ALL" read the same input, they are executed inside the same mapper and write to the same destination table, so the 2 file sink operators end up with the same output file name.

We need to append the file sink operator id to the file name to make it unique.
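
A minimal sketch of the proposed fix, written in Python for illustration only. It is not Hive's actual code: the function names, the task ID format, and the operator IDs ("FS_1", "FS_2") are all hypothetical. It only shows why two sinks in one mapper collide when the file name is derived from the task alone, and how appending the operator id avoids that.

```python
from typing import Dict, List, Optional


def sink_file_name(task_id: str, operator_id: Optional[str] = None) -> str:
    """Build an output file name for one file sink (hypothetical scheme)."""
    if operator_id is None:
        # Name derived from the task alone: every sink in the same
        # mapper gets the same name.
        return task_id
    # Proposed fix: append the file sink operator id.
    return f"{task_id}_{operator_id}"


def open_sink_files(task_id: str, operator_ids: List[str],
                    append_operator_id: bool) -> List[str]:
    """Simulate every file sink in one mapper opening its output file."""
    opened: Dict[str, str] = {}
    for op_id in operator_ids:
        name = sink_file_name(task_id, op_id if append_operator_id else None)
        if name in opened:
            raise FileExistsError(
                f"{name} already opened by operator {opened[name]}")
        opened[name] = op_id
    return sorted(opened)


if __name__ == "__main__":
    # Two file sinks (one per UNION ALL branch) in the same mapper.
    print(open_sink_files("task_000000_0", ["FS_1", "FS_2"], True))
    try:
        open_sink_files("task_000000_0", ["FS_1", "FS_2"], False)
    except FileExistsError as err:
        print("collision:", err)
```

With the operator id appended, each branch of the UNION ALL writes to its own file; without it, the second sink fails on the name the first one already took.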


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-205) file sink operator needs unique file names

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660781#action_12660781 ] 

Ashish Thusoo commented on HIVE-205:
------------------------------------

What happens in the test

./ql/src/test/queries/clientpositive/union.q

? It seems to have the same general pattern as the reported query.





[jira] Resolved: (HIVE-205) file sink operator needs unique file names

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao resolved HIVE-205.
-----------------------------

    Resolution: Duplicate




[jira] Commented: (HIVE-205) file sink operator needs unique file names

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660866#action_12660866 ] 

Zheng Shao commented on HIVE-205:
---------------------------------

The test runs in a different mode, I think. I remember that the output file names in the tests look different from the file names produced by jobs on the cluster.




[jira] Updated: (HIVE-205) file sink operator needs unique file names

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-205:
-------------------------------

    Priority: Critical  (was: Major)

Needed for the first release.


