Posted to dev@crunch.apache.org by Rahul <rs...@xebia.com> on 2012/07/09 10:55:25 UTC
SeqFileReaderFactory give exception
Guys,
I have a SequenceFile with LongWritable keys and Text values. I am
using SequenceFileSource with MRPipeline, and that works. But when I use
MemPipeline it throws the following exception.
3503 [main] INFO com.cloudera.crunch.io.seq.SeqFileReaderFactory - Error reading from path: file:/home/rahul/software/crunch/sampleFile
java.io.IOException: wrong key class: org.apache.hadoop.io.ObjectWritable is not class org.apache.hadoop.io.LongWritable
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1895)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
at com.cloudera.crunch.io.seq.SeqFileReaderFactory$1.hasNext(SeqFileReaderFactory.java:68)
at com.cloudera.crunch.io.CompositePathIterable$2.hasNext(CompositePathIterable.java:81)
This happens because the file contains LongWritable keys but the reader
is using a NullWritable to read them. The error shows up in MemPipeline
only. It works in MRPipeline because the key class is passed there
through Hadoop's MapContext, so it is the correct one. I modified
SeqFileReaderFactory to pass the key class as well, but is that the
correct way of doing it?
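
To make the mismatch concrete, here is a minimal, self-contained sketch of the idea. It does not use Hadoop or Crunch: ToyReader, Key, LongKey and PlaceholderKey are hypothetical stand-ins, and the check in next() only imitates the "wrong key class" test seen in the trace above. It contrasts a reader that assumes a key type with one that builds its key from the class the reader reports. It is not the actual change made to SeqFileReaderFactory.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class KeyClassSketch {

    /** Hypothetical key holder, loosely modelled on a Hadoop Writable. */
    interface Key {
        void set(long value);
        long get();
    }

    /** Stand-in for the key type the file was written with. */
    static class LongKey implements Key {
        private long value;
        public void set(long value) { this.value = value; }
        public long get() { return value; }
    }

    /** Stand-in for a key type that does not match the file. */
    static class PlaceholderKey implements Key {
        public void set(long value) { /* discards the value */ }
        public long get() { return 0L; }
    }

    /** Toy reader: the stored key class plays the role of the file header. */
    static class ToyReader {
        private final Class<? extends Key> keyClass;
        private final Iterator<Long> records;

        ToyReader(Class<? extends Key> keyClass, List<Long> records) {
            this.keyClass = keyClass;
            this.records = records.iterator();
        }

        Class<? extends Key> getKeyClass() { return keyClass; }

        /** Rejects a key holder whose class differs from the recorded one. */
        boolean next(Key key) throws IOException {
            if (key.getClass() != keyClass) {
                throw new IOException("wrong key class: " + key.getClass().getName()
                        + " is not " + keyClass);
            }
            if (!records.hasNext()) {
                return false;
            }
            key.set(records.next());
            return true;
        }
    }

    /** Drains the reader into a list using the supplied key holder. */
    static List<Long> readAll(ToyReader reader, Key key) throws IOException {
        List<Long> keys = new ArrayList<>();
        while (reader.next(key)) {
            keys.add(key.get());
        }
        return keys;
    }

    /** Hard-codes the key type, as a reader unaware of the file's key class would. */
    static List<Long> readWithAssumedKey(ToyReader reader) throws IOException {
        return readAll(reader, new PlaceholderKey());
    }

    /** Instantiates the key from the class the reader reports. */
    static List<Long> readWithReportedKey(ToyReader reader)
            throws IOException, ReflectiveOperationException {
        Key key = reader.getKeyClass().getDeclaredConstructor().newInstance();
        return readAll(reader, key);
    }

    public static void main(String[] args) throws Exception {
        List<Long> data = Arrays.asList(10L, 20L, 30L);
        try {
            readWithAssumedKey(new ToyReader(LongKey.class, data));
        } catch (IOException e) {
            System.out.println("assumed key failed: " + e.getMessage());
        }
        System.out.println(readWithReportedKey(new ToyReader(LongKey.class, data)));
    }
}
```

In the sketch the fix is to ask the reader which key class it recorded rather than to hand that class in from outside. Either approach avoids the mismatch, and which one suits SeqFileReaderFactory depends on what the caller already knows about the file.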
regards
Rahul
Re: SeqFileReaderFactory give exception
Posted by Josh Wills <jo...@gmail.com>.
SequenceFileTableSource will let you read the file as a PTable,
which is probably the quickest way to get what you want.