Posted to common-dev@hadoop.apache.org by "Marc Limotte (JIRA)" <ji...@apache.org> on 2013/02/22 23:06:12 UTC
[jira] [Created] (HADOOP-9328) INSERT INTO a S3 external table with no reduce phase results in FileNotFoundException

Marc Limotte created HADOOP-9328:
------------------------------------

             Summary: INSERT INTO a S3 external table with no reduce phase results in FileNotFoundException
                 Key: HADOOP-9328
                 URL: https://issues.apache.org/jira/browse/HADOOP-9328
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.9.0
         Environment: YARN, Hadoop 2.0.2-alpha, Ubuntu
            Reporter: Marc Limotte


Observed with YARN, Hadoop 2.0.2-alpha, and Hive 0.9.0.

The destination is an S3-backed external table; the source for the query is a small Hive-managed table.

CREATE EXTERNAL TABLE payout_state_product (
  state STRING,
  product_id STRING,
  element_id INT,
  element_value DOUBLE,
  number_of_fields INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://com.weatherbill.foo/bar/payout_state_product/';

A simple query to copy the contents of the Hive-managed table into the S3 table:

hive> INSERT OVERWRITE TABLE payout_state_product 
SELECT * FROM payout_state_product_cached; 

Total MapReduce jobs = 2 
Launching Job 1 out of 2 
Number of reduce tasks is set to 0 since there's no reduce operator 
Starting Job = job_1360884012490_0014, Tracking URL = http://i-9ff9e9ef.us-east-1.production.climatedna.net:8088/proxy/application_1360884012490_0014/ 
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=i-9ff9e9ef.us-east-1.production.climatedna.net:8032 -kill job_1360884012490_0014 
Hadoop job information for Stage-1: number of mappers: 100; number of reducers: 0 
2013-02-22 19:15:46,709 Stage-1 map = 0%, reduce = 0% 
...snip... 
2013-02-22 19:17:02,374 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 427.13 sec 
MapReduce Total cumulative CPU time: 7 minutes 7 seconds 130 msec 
Ended Job = job_1360884012490_0014 
Ended Job = -1776780875, job is filtered out (removed at runtime). 
Launching Job 2 out of 2 
Number of reduce tasks is set to 0 since there's no reduce operator 
java.io.FileNotFoundException: File does not exist: /tmp/hive-marc/hive_2013-02-22_19-15-31_691_7365912335285010827/-ext-10002/000000_0 
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:782) 
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:493) 
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:284) 
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:244) 
at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69) 
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:386) 
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:352) 
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.processPaths(CombineHiveInputFormat.java:419) 
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:390) 
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479) 
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471) 
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367) 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367) 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435) 
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) 
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) 
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) 
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326) 
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951) 
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) 
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) 
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689) 
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /tmp/hive-marc/hive_2013-02-22_19-15-31_691_7365912335285010827/-ext-10002/000000_0)' 
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask 
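
A possible workaround sketch, not verified in this report: the trace fails while computing splits through CombineHiveInputFormat for Job 2, which looks like the conditional small-file merge stage that Hive schedules after a map-only insert. Assuming that reading is right, turning off the merge for map-only jobs (hive.merge.mapfiles, a standard Hive setting) should skip the second job. Whether this actually avoids the exception on this cluster is an assumption.

```sql
-- Assumption: Job 2 is the small-file merge stage Hive adds after a map-only job.
-- Disabling the merge for this session should leave only the map-only job to run.
SET hive.merge.mapfiles=false;

-- Same statement as in the report, unchanged.
INSERT OVERWRITE TABLE payout_state_product
SELECT * FROM payout_state_product_cached;
```

The trade-off is that each mapper's output file is left as written, with no merge pass afterwards, so the table location may end up with many small files.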
