Posted to common-user@hadoop.apache.org by Xavier Stevens <Xa...@fox.com> on 2008/12/04 01:00:04 UTC
Reading Protocol Buffers as BytesWritable value
I am trying to read back values that I previously wrote out as Text,BytesWritable pairs into a SequenceFile. The key is Hadoop's Text writable, and the value is a Protocol Buffer byte array wrapped in a BytesWritable. Here is a snippet showing the output configuration:
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, DefaultCodec.class);
SequenceFileAsBinaryOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);
SequenceFileAsBinaryOutputFormat.setSequenceFileOutputKeyClass(job, Text.class);
SequenceFileAsBinaryOutputFormat.setSequenceFileOutputValueClass(job, BytesWritable.class);
FileOutputFormat.setOutputPath(job, outDataPath);
In another job I am trying to read this back in:
job.setInputFormat(org.apache.hadoop.mapred.SequenceFileAsBinaryInputFormat.class);
public static class Map extends MapReduceBase implements Mapper<Text,BytesWritable,Text,LongWritable> { ... }
I get an error like this:
java.io.IOException: hdfs://localhost:4000/user/myuser/step1-out/part-00003.deflate not a SequenceFile
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1458)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1420)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1415)
at org.apache.hadoop.mapred.SequenceFileAsBinaryInputFormat$SequenceFileAsBinaryRecordReader.<init>(SequenceFileAsBinaryInputFormat.java:67)
at org.apache.hadoop.mapred.SequenceFileAsBinaryInputFormat.getRecordReader(SequenceFileAsBinaryInputFormat.java:48)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:219)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
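For context, the "not a SequenceFile" message comes from SequenceFile.Reader checking the first bytes of the file. Below is a minimal standalone sketch of that kind of check, assuming the header is the ASCII bytes "SEQ" followed by a single version byte. The class and method names (SeqHeaderCheck, looksLikeSequenceFile) are hypothetical and not part of Hadoop, and the second sample array is only an assumed example of what the start of a zlib-compressed file might look like.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SeqHeaderCheck {
    // Assumed magic prefix of a SequenceFile: ASCII "SEQ"; one version byte follows.
    private static final byte[] MAGIC = {'S', 'E', 'Q'};

    /** Returns true if the stream starts with "SEQ" plus one version byte. */
    public static boolean looksLikeSequenceFile(InputStream in) throws IOException {
        byte[] header = new byte[MAGIC.length + 1];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until the header is full.
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return false; // stream ended before a full header was read
            }
            read += n;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data, not taken from the job output above.
        byte[] seqLike = {'S', 'E', 'Q', 6, 0, 0};
        byte[] zlibLike = {0x78, (byte) 0x9c, 0x01, 0x00};
        System.out.println(looksLikeSequenceFile(new ByteArrayInputStream(seqLike)));
        System.out.println(looksLikeSequenceFile(new ByteArrayInputStream(zlibLike)));
    }
}
```

Running the same check against the first bytes of a part file such as the one named in the exception would show whether it was written with a SequenceFile header at all.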
Am I doing something wrong here? Or is there just some inherent problem with what I am trying to do?
-Xavier