Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2010/08/22 23:48:16 UTC

[jira] Created: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'
------------------------------------------------------------------------------

                 Key: HIVE-1582
                 URL: https://issues.apache.org/jira/browse/HIVE-1582
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: He Yongqiang


hive> SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
hive> SET hive.exec.compress.output=false;
hive> INSERT OVERWRITE DIRECTORY 'xxxxx'
    > SELECT yyyy from a;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
......
Ended Job = job_201008191557_54169
Ended Job = 450290112, job is filtered out (removed at runtime).
Launching Job 2 out of 2
.....

The second job should not get started.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901250#action_12901250 ] 

Ning Zhang commented on HIVE-1582:
----------------------------------

I'm confused. Do you mean the second job should not be started, or the second job should not be filtered out? I've tested the behavior before and after HIVE-1307; it is the same in both cases, and the merge job always fires.


[jira] Resolved: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang resolved HIVE-1582.
------------------------------

    Resolution: Not A Problem

Talked to Namit and Yongqiang; this is not a bug. INSERT OVERWRITE to an (HDFS) directory should be merged as before. INSERT OVERWRITE LOCAL DIRECTORY cannot be merged, but that is not the case here.


[jira] Commented: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901260#action_12901260 ] 

Namit Jain commented on HIVE-1582:
----------------------------------

@Ning, there should be no merge job for insert directory; previously, we only merged when inserting into tables and partitions.


[jira] Commented: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901239#action_12901239 ] 

Ning Zhang commented on HIVE-1582:
----------------------------------

Is hive.merge.mapfiles=true? If so the second merge job should be fired. Am I missing something?
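
For reference, a minimal sketch of how the setting asked about here could be checked and turned off for the session before re-running the query from the report. The path 'xxxxx', column yyyy, and table a are the placeholders from the original report; whether disabling the setting suppresses the second job in this particular build is an assumption, not something confirmed in this thread.

```sql
-- Print the current value of the map-only merge setting.
SET hive.merge.mapfiles;

-- Assumption: with merging of map-only output disabled, no merge job is planned.
SET hive.merge.mapfiles=false;

-- Same statement as in the report (placeholders unchanged).
INSERT OVERWRITE DIRECTORY 'xxxxx'
SELECT yyyy FROM a;
```

If the setting is the cause, the "Total MapReduce jobs" line in the console output would be expected to reflect the change.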


[jira] Commented: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901264#action_12901264 ] 

Ning Zhang commented on HIVE-1582:
----------------------------------

@namit, merging happened even before HIVE-1307. There does not seem to be a unit test for this feature (no merge when inserting to a directory). BTW, what's the rationale behind this?


[jira] Commented: (HIVE-1582) merge mapfiles task behaves incorrectly for 'inserting overwrite directory...'

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901242#action_12901242 ] 

He Yongqiang commented on HIVE-1582:
------------------------------------

Ended Job = 450290112, job is filtered out (removed at runtime).

The second job seems to be filtered out at runtime.
