Posted to dev@pig.apache.org by "Renato Javier Marroquín Mogrovejo (JIRA)" <ji...@apache.org> on 2012/08/30 07:47:07 UTC

[jira] [Commented] (PIG-2371) attempt to load a non-existent file results in a stack trace rather than an error message

    [ https://issues.apache.org/jira/browse/PIG-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444708#comment-13444708 ] 

Renato Javier Marroquín Mogrovejo commented on PIG-2371:
--------------------------------------------------------

Hi,

We could fix this within the getSplits() method of the PigInputFormat class. My guess is that the inputs loaded from conf.get("pig.inputs") come back empty, so we could either add a validation there or let ReadToEndLoader verify it. What is the consensus on this one? IMHO ReadToEndLoader should do it in its init() method, right before this code:
{code}
try {
    inpSplits = inputFormat.getSplits(HadoopShims.createJobContext(conf,
            new JobID()));
} catch (InterruptedException e) {
    throw new IOException(e);
}
{code}
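Just to illustrate the idea, a minimal sketch of the kind of guard I have in mind (the class and method names here are hypothetical, not actual Pig API; the real check would live in ReadToEndLoader.init() and inspect the inputs derived from "pig.inputs"):

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: fail fast with a clear message when no input
// paths were configured, instead of letting getSplits() blow up later
// with a raw stack trace.
public class InputValidationSketch {

    // Illustrative helper, not part of Pig's real ReadToEndLoader.
    static void validateInputs(List<String> inputs) throws IOException {
        if (inputs == null || inputs.isEmpty()) {
            throw new IOException(
                "ERROR 2118: no input paths configured (pig.inputs is empty)");
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate the failure case: nothing was loaded from conf.
            validateInputs(Collections.<String>emptyList());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this in place, the user would see a single error line rather than the InvalidInputException stack trace quoted below.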
                
> attempt to load a non-existent file results in a stack trace rather than an error message
> -----------------------------------------------------------------------------------------
>
>                 Key: PIG-2371
>                 URL: https://issues.apache.org/jira/browse/PIG-2371
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2
>         Environment: -bash-3.1$ hadoop version
> Hadoop 0.23.0.1111080202
> Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0/hadoop-common-project/hadoop-common -r 1196973
> Compiled by hadoopqa on Tue Nov  8 02:12:04 PST 2011
> From source with checksum 4e42b2d96c899a98a8ab8c7cc23f27ae
> -bash-3.1$ pig -version
> Apache Pig version 0.9.2.1111101150 (r1200499)
> compiled Nov 10 2011, 19:50:15
>            Reporter: Araceli Henley
>            Priority: Trivial
>
> a = load 'does_not_exist' using PigStorage();
> dump a;
> Failed Jobs:
> JobId   Alias   Feature Message Outputs
> job_1321041443489_2010  a       MAP_ONLY        Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://gsbl90892.blue.ygrid.yahoo.com:8020/user/hadoopqa/does_not_exist
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:445)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:462)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:360)
>         at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1159)
>         at org.apache.hadoop.mapreduce.Job$2.run(Job.java:1156)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1156)
>         at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
>         at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://gsbl90892.blue.ygrid.yahoo.com:8020/user/hadoopqa/does_not_exist
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:243)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:269)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
>         ... 12 more
>         hdfs://gsbl90892.blue.ygrid.yahoo.com:8020/tmp/temp1739481333/tmp-502339,
> Backend error message
> ---------------------
> AttemptID:attempt_1321041443489_2008_m_000001_0 Info:Error: java.lang.RuntimeException: java.io.IOException: Can't get JobTracker Kerberos principal for use as renewer
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.SkewedPartitioner.setConf(SkewedPartitioner.java:119)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:627)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:695)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
> Caused by: java.io.IOException: Can't get JobTracker Kerberos principal for use as renewer
>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:106)
>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:90)
>         at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:83)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:205)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:269)
>         at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:154)
>         at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:116)
>         at org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil.loadPartitionFileFromLocalCache(MapRedUtil.java:101)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.SkewedPartitioner.setConf(SkewedPartitioner.java:114)
>         ... 10 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira