
Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/04/29 08:02:05 UTC

[jira] [Comment Edited] (TEZ-2377) RandomWriter ends up using TextOutputFormat instead of SequenceFileOutputFormat

    [ https://issues.apache.org/jira/browse/TEZ-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518785#comment-14518785 ] 

Jeff Zhang edited comment on TEZ-2377 at 4/29/15 6:01 AM:
----------------------------------------------------------

This issue only happens when translating an MR job to Tez. MROutput may be associated with either a mapper or a reducer, while MRInput can only be associated with a mapper, so MRInput doesn't have this kind of issue.
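
One way to read the point above, as a minimal sketch and not the actual Tez code or the attached patch: the output side has two candidate "new API" flags to consult, and the right one depends on whether the job has a reduce stage. The keys mapred.mapper.new-api, mapred.reducer.new-api and mapreduce.job.reduces are standard Hadoop configuration keys; the class OutputApiChooser, the method outputUsesNewApi and the plain Map standing in for a Configuration are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; this is not Tez code.
class OutputApiChooser {
    // Standard Hadoop configuration keys.
    static final String MAPPER_NEW_API = "mapred.mapper.new-api";
    static final String REDUCER_NEW_API = "mapred.reducer.new-api";
    static final String NUM_REDUCES = "mapreduce.job.reduces";

    /** Returns true if the final output should go through the new (mapreduce) API. */
    public static boolean outputUsesNewApi(Map<String, String> conf) {
        // Assumption: one reducer when the key is unset, as in mapred-default.xml.
        int reduces = Integer.parseInt(conf.getOrDefault(NUM_REDUCES, "1"));
        // In a map-only job the mapper writes the final output, so its flag decides;
        // otherwise the reducer's flag does.
        String key = (reduces == 0) ? MAPPER_NEW_API : REDUCER_NEW_API;
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    public static void main(String[] args) {
        // A map-only job written against the new API, similar in shape to RandomWriter.
        Map<String, String> conf = new HashMap<>();
        conf.put(NUM_REDUCES, "0");
        conf.put(MAPPER_NEW_API, "true");
        System.out.println("output uses new API: " + outputUsesNewApi(conf));
    }
}
```

If the reducer flag were consulted for such a map-only job, the configured output format class would be looked up under the wrong API and a default could be used instead. The input side has no such choice to make, since it is always attached to the mapper.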



was (Author: zjffdu):
This issue only happen when translating MR job to Tez.  MROutput may be associated with either mapper or reducer while MRInput can only be associated with mapper. So MRInput don't have this kind of issue. 


> RandomWriter ends up using TextOutputFormat instead of SequenceFileOutputFormat
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-2377
>                 URL: https://issues.apache.org/jira/browse/TEZ-2377
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2377.1.patch
>
>
> {code}
> yarn jar ./dist/tez/tez-tests-0.7.0-SNAPSHOT.jar randomwriter "-Dmapreduce.randomwriter.totalbytes=10737418" /tmp/test1
> {code}
> This ends up writing the output with TextOutputFormat.
> {code}
> yarn jar ./dist/tez/tez-tests-0.7.0-SNAPSHOT.jar sort "-Dmapreduce.framework.name=yarn-tez" -r 5 /tmp/test1 /tmp/test_sorted
> {code}
> This ends up throwing the following error:
> {noformat}
> Failure while running task:java.io.IOException: hdfs://tez-vm:56565/tmp/test1/part-00000 not a SequenceFile
>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1851)
>         at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
>         at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>         at org.apache.tez.mapreduce.lib.MRReaderMapReduce.setupNewRecordReader(MRReaderMapReduce.java:149)
>         at org.apache.tez.mapreduce.lib.MRReaderMapReduce.<init>(MRReaderMapReduce.java:78)
>         at org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:475)
>         at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108)
> {noformat}
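
The "not a SequenceFile" error in the trace above is raised when the file header does not match what the reader expects. A SequenceFile begins with the bytes 'S', 'E', 'Q' followed by a version byte, whereas TextOutputFormat writes plain text lines. The sketch below checks only those first three bytes; it is a rough diagnostic under that assumption, not a replacement for SequenceFile.Reader, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; checks only the 3-byte "SEQ" prefix.
class SequenceFileMagic {
    /** Returns true if the given leading bytes start with the SequenceFile magic "SEQ". */
    public static boolean hasSequenceFileMagic(byte[] head) {
        return head.length >= 3
                && head[0] == 'S'
                && head[1] == 'E'
                && head[2] == 'Q';
    }

    public static void main(String[] args) {
        // Header shape of a SequenceFile: magic plus a version byte (6 is used here as an example).
        byte[] seqHeader = {'S', 'E', 'Q', 6};
        // First bytes of a made-up tab-separated text line, as text output would look.
        byte[] textHeader = "key\tvalue\n".getBytes(StandardCharsets.UTF_8);

        System.out.println("sequence-like header: " + hasSequenceFileMagic(seqHeader));
        System.out.println("text header: " + hasSequenceFileMagic(textHeader));
    }
}
```

Applying a check like this to the first bytes of /tmp/test1/part-00000 would help confirm whether RandomWriter wrote text rather than a SequenceFile.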



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)