Posted to common-user@hadoop.apache.org by Lajos <la...@protulae.com> on 2009/09/21 17:53:26 UTC

Using ArrayWritable as a key?

Hi all,

I'm having a problem using ArrayWritable (of Text elements) as a key in
my MR jobs. I want my Mapper output key to be an ArrayWritable, and the
input and output keys of my Reducer to be the same type.

I've tried this with both the mapred and mapreduce APIs (I'm using
0.20.0 here).

I also tried extending ArrayWritable as TextArrayWritable:

public class TextArrayWritable extends ArrayWritable {
	public TextArrayWritable() {
		// ArrayWritable's constructor takes the element class (Text),
		// not the class of the array wrapper itself.
		super(Text.class);
	}
}

Regardless of what I do, I get the error:

09/09/24 15:10:36 INFO mapred.JobClient: Task Id : attempt_200909102223_0017_m_000000_0, Status : FAILED
java.lang.ClassCastException: class org.apache.hadoop.io.ArrayWritable
         at java.lang.Class.asSubclass(Class.java:3018)
         at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:664)
         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:689)
         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:348)
         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
         at org.apache.hadoop.mapred.Child.main(Child.java:170)


Is there a reason I'm not aware of why this class can't be used as a key?

TIA,

Lajos

Re: Using ArrayWritable as a key?

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Lajos,

ArrayWritable does not implement WritableComparable, so it can't currently
be used as a mapper output key - those keys have to be sorted during the
shuffle, and thus the type must be WritableComparable.

-Todd
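
The top frame of the stack trace, Class.asSubclass, is where this
requirement surfaces: asSubclass throws a ClassCastException when the
class it is called on is not a subtype of the requested one. A minimal
JDK-only sketch of that behavior follows. The class and method names are
made up for illustration, and java.lang.Comparable stands in for
WritableComparable so that the example runs without Hadoop on the
classpath.

```java
import java.util.ArrayList;

public class AsSubclassDemo {
    // Returns true when keyClass is a subtype of Comparable, and false when
    // Class.asSubclass rejects it with a ClassCastException.
    // Comparable is only a stand-in here for Hadoop's WritableComparable.
    static boolean isSortableKey(Class<?> keyClass) {
        try {
            keyClass.asSubclass(Comparable.class);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSortableKey(String.class));    // prints "true"
        System.out.println(isSortableKey(ArrayList.class)); // prints "false"
    }
}
```

ArrayList plays the role of ArrayWritable in this sketch: a perfectly
usable container that simply does not declare an ordering.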

Re: Using ArrayWritable as a key?

Posted by Lajos <la...@protulae.com>.
Apologies, I should have checked the source first. I see that keys have
to be WritableComparable, so I'll have to implement that interface in my
custom class.

Lajos
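
For readers landing on this thread, here is a rough sketch of what such
a key might look like. It is not code from this thread. To keep it
runnable without Hadoop it uses only the JDK: the class name
TextArrayKey is hypothetical, String stands in for Text, and the class
implements plain Comparable. The write and readFields methods have the
same signatures as those of Hadoop's Writable interface, so in a real
job the class would presumably declare
"implements WritableComparable<TextArrayKey>" instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical key type; Comparable is a stand-in for WritableComparable.
public class TextArrayKey implements Comparable<TextArrayKey> {
    private String[] values = new String[0];

    // No-arg constructor so an empty instance can be created and then
    // filled in by readFields.
    public TextArrayKey() {}

    public TextArrayKey(String... values) {
        this.values = values.clone();
    }

    // Serialize as a length prefix followed by each element.
    public void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (String v : values) {
            out.writeUTF(v);
        }
    }

    // Read back exactly what write() produced.
    public void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        values = new String[n];
        for (int i = 0; i < n; i++) {
            values[i] = in.readUTF();
        }
    }

    // Lexicographic order: first differing element decides, then length.
    @Override
    public int compareTo(TextArrayKey other) {
        int n = Math.min(values.length, other.values.length);
        for (int i = 0; i < n; i++) {
            int c = values[i].compareTo(other.values[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(values.length, other.values.length);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TextArrayKey
                && Arrays.equals(values, ((TextArrayKey) o).values);
    }

    // Content-based hash, consistent with equals.
    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }

    public static void main(String[] args) throws IOException {
        TextArrayKey a = new TextArrayKey("x", "y");
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        a.write(new DataOutputStream(buf));

        TextArrayKey b = new TextArrayKey();
        b.readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(a.equals(b));                               // prints "true"
        System.out.println(a.compareTo(new TextArrayKey("x", "z")) < 0); // prints "true"
    }
}
```

The content-based hashCode matters as well as compareTo, since hash
partitioning of map output relies on equal keys producing equal hashes.
The lexicographic ordering is just one reasonable choice; any total
order that is consistent with equals would do.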

