Posted to common-user@hadoop.apache.org by Lajos <la...@protulae.com> on 2009/09/21 17:53:26 UTC
Using ArrayWritable as a key?
Hi all,
I seem to have a problem using ArrayWritable (of Texts) as a key in my
MR jobs. I want my Mapper output key to be ArrayWritable, and both the
input and output keys of my Reducer to be the same type.
I've tried this with both the mapred and mapreduce APIs (I'm using
0.20.0 here).
I also tried extending ArrayWritable as TextArrayWritable:
public class TextArrayWritable extends ArrayWritable {
    public TextArrayWritable() {
        super(TextArrayWritable.class);
    }
}
Regardless of what I do, I get the error:
09/09/24 15:10:36 INFO mapred.JobClient: Task Id : attempt_200909102223_0017_m_000000_0, Status : FAILED
java.lang.ClassCastException: class org.apache.hadoop.io.ArrayWritable
    at java.lang.Class.asSubclass(Class.java:3018)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:664)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:689)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:348)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Is there some reason I'm not aware of why this class can't be used as a key?
TIA,
Lajos
Re: Using ArrayWritable as a key?
Posted by Todd Lipcon <to...@cloudera.com>.
Hi Lajos,
ArrayWritable does not implement WritableComparable, so it can't currently
be used as a mapper output key - those keys have to be sorted during the
shuffle, and thus the type must be WritableComparable.
-Todd
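
[Editor's sketch] The stack trace in the original message shows Class.asSubclass being called from JobConf.getOutputKeyComparator, which is consistent with the explanation above: the key class is checked against a comparable type before the sort is set up. The following is a minimal illustration of that kind of check in plain Java. The Writable and WritableComparable interfaces declared here are local stand-ins, not Hadoop's own, and PlainArrayKey, ComparableKey and KeyTypeCheck are hypothetical names used only so the example runs without Hadoop on the classpath.

```java
// Local stand-ins for Hadoop's interfaces (assumption: declared here only
// so the sketch compiles without the Hadoop jars).
interface Writable {}
interface WritableComparable<T> extends Writable, Comparable<T> {}

// Plays the role of ArrayWritable: serializable, but defines no ordering.
class PlainArrayKey implements Writable {}

// Plays the role of a key type such as Text: serializable and comparable.
class ComparableKey implements WritableComparable<ComparableKey> {
    @Override
    public int compareTo(ComparableKey other) {
        return 0; // ordering is irrelevant for this demonstration
    }
}

class KeyTypeCheck {
    /** Returns true if keyClass passes an asSubclass check against WritableComparable. */
    static boolean canBeSorted(Class<?> keyClass) {
        try {
            // Class.asSubclass throws ClassCastException when keyClass
            // is not a subtype of the requested class.
            keyClass.asSubclass(WritableComparable.class);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ComparableKey sortable: " + canBeSorted(ComparableKey.class));
        System.out.println("PlainArrayKey sortable: " + canBeSorted(PlainArrayKey.class));
    }
}
```

Subclassing a type that fails this check does not change the result, which would explain why wrapping ArrayWritable in TextArrayWritable made no difference.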
Re: Using ArrayWritable as a key?
Posted by Lajos <la...@protulae.com>.
Apologies, I should'a checked the source first ... I see that keys have
to be WritableComparable, and hence I'll have to implement that
interface in my custom class.
Lajos
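
[Editor's sketch] One possible shape for such a custom key class is outlined below, under some loud assumptions: TextArrayKey is a hypothetical name, it holds String elements instead of Text, and it implements only java.lang.Comparable so that it runs without Hadoop on the classpath. In a real job the class would instead declare that it implements org.apache.hadoop.io.WritableComparable, whose write(DataOutput) and readFields(DataInput) methods have the same signatures as the ones shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical key class; in a real job it would implement
// org.apache.hadoop.io.WritableComparable<TextArrayKey>.
class TextArrayKey implements Comparable<TextArrayKey> {
    private String[] values;

    // No-arg constructor, so an instance can be created before readFields().
    TextArrayKey() {
        this.values = new String[0];
    }

    TextArrayKey(String... values) {
        this.values = values.clone();
    }

    // Serialize as: element count, then each element.
    // Note: writeUTF limits each string to 65535 encoded bytes.
    public void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (String v : values) {
            out.writeUTF(v);
        }
    }

    // Deserialize in the same order that write() used.
    public void readFields(DataInput in) throws IOException {
        values = new String[in.readInt()];
        for (int i = 0; i < values.length; i++) {
            values[i] = in.readUTF();
        }
    }

    // Element-by-element ordering; an array that is a prefix of another sorts first.
    @Override
    public int compareTo(TextArrayKey other) {
        int n = Math.min(values.length, other.values.length);
        for (int i = 0; i < n; i++) {
            int c = values[i].compareTo(other.values[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(values.length, other.values.length);
    }

    // equals() and hashCode() are kept consistent with compareTo(), since
    // hash-based partitioning relies on equal keys having equal hash codes.
    @Override
    public boolean equals(Object o) {
        return o instanceof TextArrayKey && Arrays.equals(values, ((TextArrayKey) o).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }

    // Demonstration helper: serialize a key and read it back into a new instance.
    static TextArrayKey roundTrip(TextArrayKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            TextArrayKey copy = new TextArrayKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The ordering chosen here (lexicographic by element, then by length) is only one option; any total order that agrees with equals() would do. As a side note, ArrayWritable's constructor takes the class of the array's elements, so a subclass meant to hold Texts would normally pass Text.class to super().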