Posted to dev@hive.apache.org by "Esteban Gutierrez (Created) (JIRA)" <ji...@apache.org> on 2012/01/23 17:25:40 UTC
[jira] [Created] (HIVE-2737) CombineFileInputFormat fails if mapred.job.tracker is set to local with a sub-query
CombineFileInputFormat fails if mapred.job.tracker is set to local with a sub-query
-----------------------------------------------------------------------------------
Key: HIVE-2737
URL: https://issues.apache.org/jira/browse/HIVE-2737
Project: Hive
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Esteban Gutierrez
If CombineFileInputFormat and mapred.job.tracker=local are used together, CombineFileInputFormat throws a java.io.FileNotFoundException when the query statement contains a sub-query:
{code}
hive> select count(*) from (select count(*), a from hivetest2 group by a) x;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Execution log at: /tmp/esteban/esteban_20120119134040_5d105797-1444-43ce-8ca8-3b4735b7a70d.log
Job running in-process (local Hadoop)
2012-01-19 13:40:49,618 null map = 100%, reduce = 100%
Ended Job = job_local_0001
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Execution log at: /tmp/esteban/esteban_20120119134040_5d105797-1444-43ce-8ca8-3b4735b7a70d.log
java.io.FileNotFoundException: File does not exist: /tmp/esteban/hive_2012-01-19_13-40-45_277_494412568828098242/-mr-10002/000000_0
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:546)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:462)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256)
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212)
at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileInputFormatShim.getSplits(Hadoop20SShims.java:347)
at org.apache.hadoop.hive.shims.Hadoop20SShims$CombineFileInputFormatShim.getSplits(Hadoop20SShims.java:313)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:377)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:971)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:963)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:671)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:1092)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /tmp/esteban/hive_2012-01-19_13-40-45_277_494412568828098242/-mr-10002/000000_0)'
{code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2737) CombineFileInputFormat fails if mapred.job.tracker is set to local with a sub-query
Posted by "Carl Steinbach (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2737:
---------------------------------
Component/s: Query Processor
[jira] [Commented] (HIVE-2737) CombineFileInputFormat fails if mapred.job.tracker is set to local with a sub-query
Posted by "Navis (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212293#comment-13212293 ]
Navis commented on HIVE-2737:
-----------------------------
This looks similar to https://issues.apache.org/jira/browse/HIVE-2778.
Hadoop-CDH3 removes the scheme part of the path URI in CombineFileInputFormat (in this case 'file:///'), which causes various problems in Hive (HIVE-2778, HIVE-2784).
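To illustrate the scheme loss described above, here is a minimal sketch using only java.net.URI. It is not the Hadoop or Hive code path: the class name SchemeLossSketch, both helper methods, and the default filesystem URI hdfs://namenode.example:8020/ are assumptions made up for the example. It shows why a 'file:///' path that loses its scheme ends up being looked up on the default filesystem, which would be consistent with DistributedFileSystem.getFileStatus appearing at the top of the reported stack trace.

```java
import java.net.URI;

// Illustration only: plain java.net.URI, not Hadoop's Path or FileSystem classes.
final class SchemeLossSketch {

    // Hypothetical helper: keep only the path component, dropping scheme and authority.
    static URI stripScheme(URI qualified) {
        return URI.create(qualified.getPath());
    }

    // Hypothetical helper: resolve a path against a default filesystem URI.
    // A URI that already has a scheme is returned unchanged; a scheme-less
    // absolute path inherits the scheme and authority of the default.
    static URI resolveAgainstDefault(URI defaultFs, URI path) {
        return defaultFs.resolve(path);
    }

    public static void main(String[] args) {
        // Path taken from the exception message, with the local scheme restored.
        URI local = URI.create(
            "file:///tmp/esteban/hive_2012-01-19_13-40-45_277_494412568828098242/-mr-10002/000000_0");
        // Assumed default filesystem; the real value depends on the cluster configuration.
        URI defaultFs = URI.create("hdfs://namenode.example:8020/");

        URI stripped = stripScheme(local);

        System.out.println("qualified, resolved: " + resolveAgainstDefault(defaultFs, local));
        System.out.println("stripped,  resolved: " + resolveAgainstDefault(defaultFs, stripped));
    }
}
```

With the scheme intact, the path still points at the local file system. Once the scheme is gone, the same path resolves against the default filesystem, where the intermediate output of the first (local) job does not exist.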