Posted to common-user@hadoop.apache.org by "W.P. McNeill" <bi...@gmail.com> on 2011/09/11 20:41:45 UTC

Can't print an ArrayWritable as a key even though it implements the WritableComparable interface

I have an array-of-strings class called ItemSet, defined like so:

   public class ItemSet extends ArrayWritable
           implements WritableComparable<ItemSet>

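For concreteness, here is a minimal sketch of what such a class might look
like; the Text element type, the constructors, and the compareTo logic below
are illustrative assumptions rather than the actual ItemSet code:

    import java.util.Arrays;

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class ItemSet extends ArrayWritable
            implements WritableComparable<ItemSet> {

        // Hadoop needs a no-arg constructor for deserialization, and
        // ArrayWritable must be told the element class.
        public ItemSet() {
            super(Text.class);
        }

        public ItemSet(String[] items) {
            super(Text.class, toTexts(items));
        }

        private static Text[] toTexts(String[] items) {
            Text[] texts = new Text[items.length];
            for (int i = 0; i < items.length; i++) {
                texts[i] = new Text(items[i]);
            }
            return texts;
        }

        // Illustrative ordering only: compare the string forms of the sets.
        @Override
        public int compareTo(ItemSet other) {
            return Arrays.toString(toStrings())
                    .compareTo(Arrays.toString(other.toStrings()));
        }
    }
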
I am able to use ItemSet as both a key and a value in Hadoop jobs that I have
written. However, when I try to use "hadoop fs -text" to dump a sequence file
containing ItemSet keys and values, I get a ClassCastException. I am passing
the jar that contains ItemSet via the "-libjars" argument, so the class
should be visible.

When I stepped into the FsShell code in the debugger, I saw that the problem
occurs in the org.apache.hadoop.fs.TextRecordInputStream constructor. When
this constructor tries to instantiate the key class, it calls
asSubclass(WritableComparable.class) on the ItemSet class object, and that
call throws the ClassCastException.

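For context, asSubclass here is just java.lang.Class.asSubclass: it narrows a
Class object to a subtype of the given class and throws ClassCastException
when the runtime class is not actually assignable to it. A standalone sketch
of that behavior (my own illustration, not the FsShell source):

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class AsSubclassDemo {
        public static void main(String[] args) {
            // Text implements WritableComparable, so this narrowing succeeds.
            Class<? extends WritableComparable> ok =
                    Text.class.asSubclass(WritableComparable.class);
            System.out.println("ok: " + ok.getName());

            // String does not implement WritableComparable, so this throws
            // the same kind of ClassCastException described above.
            String.class.asSubclass(WritableComparable.class);
        }
    }
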
   1. Shouldn't it be possible to cast ItemSet to WritableComparable?
   2. Why don't I see this exception when I run code other than FsShell
      that uses ItemSet as a key?
   3. Is this the same issue as
      HADOOP-4466 <https://issues.apache.org/jira/browse/HADOOP-4466>?

This is Hadoop 0.20.203.0.

Re: Can't print an ArrayWritable as a key even though it implements the WritableComparable interface

Posted by "W.P. McNeill" <bi...@gmail.com>.
Now I'm not able to reproduce this bug. The only thing I've changed recently
is moving ItemSet into a sub-package (from wpmcn to wpmcn.structure), but I'd
be surprised if that was the problem.

I'll have to see if I can repro with an older version of the code.

I've implemented ItemSet differently than you've implemented your custom
writable. See https://gist.github.com/1214627. I don't think this is the
issue, though.

Re: Can't print an ArrayWritable as a key even though it implements the WritableComparable interface

Posted by Harsh J <ha...@cloudera.com>.
Hello W.P.,

Could you paste your exact exception/stack trace?

I tried to reproduce your issue and failed on both 0.20 and trunk.
Here is the stuff I used to ape yours (mostly dummy code):
https://gist.github.com/1210817

Perhaps I am not doing some step that you are doing.

My suspicion is that your ItemSet import may be wrong, or that a
different, serializable class is getting loaded instead of the
WritableComparable one when you run your job, due to an improper import
statement and similar names.

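To illustrate that suspicion, here is a hypothetical sketch (the class names
are made up): only one of two similarly named classes is a WritableComparable,
and which one the import or classloader resolves to decides whether the
asSubclass check in FsShell passes or throws.

    import java.io.Serializable;

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class SimilarNamesDemo {
        // The "right" class: a real WritableComparable.
        static class GoodItemSet extends ArrayWritable
                implements WritableComparable<GoodItemSet> {
            public GoodItemSet() { super(Text.class); }
            @Override
            public int compareTo(GoodItemSet o) { return 0; }
        }

        // The "wrong" class: serializable, but not a WritableComparable.
        static class BadItemSet implements Serializable { }

        public static void main(String[] args) {
            // Passes: the class object really is a WritableComparable subtype.
            GoodItemSet.class.asSubclass(WritableComparable.class);

            // Throws ClassCastException: serializable, but not a
            // WritableComparable.
            BadItemSet.class.asSubclass(WritableComparable.class);
        }
    }
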
On Mon, Sep 12, 2011 at 12:49 AM, W.P. McNeill <bi...@gmail.com> wrote:
> With a bit more debugging I may have partially answered question (3) for
> myself.
>
> When I run a regular Hadoop job (rather than "hadoop fs -text"), an ItemSet
> key is created by the org.apache.hadoop.io.WritableComparator.newKey method,
> which looks like this:
>
>  public WritableComparable newKey() {
>    return ReflectionUtils.newInstance(keyClass, null);
>  }
>
> Here keyClass is equal to ItemSet.class. Compare this to the call in the
> TextRecordInputStream constructor.
>
>      key = ReflectionUtils.newInstance(
>          r.getKeyClass().asSubclass(WritableComparable.class), getConf());
>
> This does look related to
> HADOOP-4466 <https://issues.apache.org/jira/browse/HADOOP-4466>, but that
> is listed as fixed in 0.19.0.
>



-- 
Harsh J

Re: Can't print an ArrayWritable as a key even though it implements the WritableComparable interface

Posted by "W.P. McNeill" <bi...@gmail.com>.
With a bit more debugging I may have partially answered question (3) for
myself.

When I run a regular Hadoop job (rather than "hadoop fs -text"), an ItemSet
key is created by the org.apache.hadoop.io.WritableComparator.newKey method,
which looks like this:

  public WritableComparable newKey() {
    return ReflectionUtils.newInstance(keyClass, null);
  }

Here keyClass is equal to ItemSet.class. Compare this to the call in the
TextRecordInputStream constructor.

      key = ReflectionUtils.newInstance(
          r.getKeyClass().asSubclass(WritableComparable.class), getConf());

This does look related to
HADOOP-4466 <https://issues.apache.org/jira/browse/HADOOP-4466>, but that
is listed as fixed in 0.19.0.
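
For comparison, here is a side-by-side sketch of the two instantiation paths
(Text stands in for ItemSet, and the class and variable names are mine, not
the Hadoop source):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.util.ReflectionUtils;

    public class NewKeyPaths {
        public static void main(String[] args) {
            // Stand-in for what SequenceFile.Reader.getKeyClass() returns.
            Class<?> keyClass = Text.class;

            // Path used by WritableComparator.newKey(): no subtype check, the
            // raw Class object is handed straight to ReflectionUtils.
            WritableComparable viaComparator =
                    (WritableComparable) ReflectionUtils.newInstance(keyClass, null);

            // Path used by FsShell's TextRecordInputStream: asSubclass() first
            // verifies that the loaded Class really is a WritableComparable and
            // throws ClassCastException if it is not (e.g. the class does not
            // implement the interface, or a duplicate was loaded by a
            // different classloader).
            WritableComparable viaFsShell =
                    ReflectionUtils.newInstance(
                            keyClass.asSubclass(WritableComparable.class),
                            new Configuration());

            System.out.println(viaComparator.getClass() + " / "
                    + viaFsShell.getClass());
        }
    }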