Posted to mapreduce-issues@hadoop.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2014/02/13 08:26:45 UTC

[jira] [Commented] (MAPREDUCE-5309) 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900091#comment-13900091 ] 

Viraj Bhat commented on MAPREDUCE-5309:
---------------------------------------

This is also an issue when parsing job history logs generated by Hadoop 0.23.9.10.
Viraj

> 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5309
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mrv2
>    Affects Versions: 2.0.4-alpha
>            Reporter: Vrushali C
>         Attachments: Test20JobHistoryParsing.java, job_2_0_3-KILLED.jhist
>
>
> When the 2.0.4 JobHistoryParser tries to parse a job history file generated by Hadoop 2.0.3, it throws the following error:
> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
>     at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
>     at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
>     at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
>     at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Test code and the job history file are attached.
> Test code:
> package com.twitter.somepackage;
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
> import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
> import org.junit.Test;
> import org.apache.hadoop.yarn.YarnException;
> public class Test20JobHistoryParsing {
>
>   @Test
>   public void testFileAvro() throws IOException {
>     Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
>     JobHistoryParser parser2 =
>         new JobHistoryParser(FileSystem.getLocal(new Configuration()), local_path2);
>     try {
>       JobInfo ji2 = parser2.parse();
>       System.out.println(" job info: " + ji2.getJobname() + " "
>           + ji2.getFinishedMaps() + " "
>           + ji2.getTotalMaps() + " "
>           + ji2.getJobId());
>     } catch (IOException e) {
>       throw new YarnException("Could not load history file "
>           + local_path2.getName(), e);
>     }
>   }
> }
> This seems to stem from the fix in https://issues.apache.org/jira/browse/MAPREDUCE-4693
> that added counters to the history server for failed tasks.
> This breaks backward compatibility with the JobHistoryServer.
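
For readers trying to picture the failure in the stack trace above, here is a minimal, self-contained sketch of the general mechanism: a record that fills its fields by position and casts each value to the type it expects at that index. It does not use Avro and is not the generated TaskAttemptUnsuccessfulCompletion code. The class names, the three-field layout, and the position of the counters field are all assumptions made for illustration. The point is only that when values written in an older layout are applied by position to a newer layout containing an extra field, a list can land in the counters slot and the cast fails.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionalPutSketch {

    /** Hypothetical stand-in for a counters type such as JhCounters. */
    static final class Counters {
        final String name;
        Counters(String name) { this.name = name; }
    }

    /**
     * Hypothetical record in the "newer" layout: a counters field was
     * inserted at index 1, shifting the list field to index 2.
     */
    static final class NewLayoutRecord {
        String attemptId;          // index 0
        Counters counters;         // index 1 (assumed to be the added field)
        List<Integer> clockSplits; // index 2

        @SuppressWarnings("unchecked")
        void put(int field, Object value) {
            switch (field) {
                case 0: attemptId = (String) value; break;
                case 1: counters = (Counters) value; break;
                case 2: clockSplits = (List<Integer>) value; break;
                default: throw new IndexOutOfBoundsException("Bad index: " + field);
            }
        }
    }

    /**
     * Applies values to a NewLayoutRecord strictly by position and returns
     * the index of the first field whose cast fails, or -1 if all succeed.
     */
    static int firstFailingField(Object[] valuesInWriterOrder) {
        NewLayoutRecord record = new NewLayoutRecord();
        for (int i = 0; i < valuesInWriterOrder.length; i++) {
            try {
                record.put(i, valuesInWriterOrder[i]);
            } catch (ClassCastException e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Older layout (assumed): attemptId, clockSplits. No counters field.
        Object[] oldLayout = { "attempt_1", new ArrayList<Integer>() };
        // Newer layout (assumed): attemptId, counters, clockSplits.
        Object[] newLayout = { "attempt_1", new Counters("c"), new ArrayList<Integer>() };

        System.out.println("old layout fails at index: " + firstFailingField(oldLayout));
        System.out.println("new layout fails at index: " + firstFailingField(newLayout));
    }
}
```

In this sketch the older-layout values fail at the counters index because a list arrives where a Counters object is expected, which resembles the GenericData$Array to JhCounters cast in the trace. Whether the real parser fails for exactly this reason would need to be confirmed against the EventReader and schema sources.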



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)