Posted to mapreduce-user@hadoop.apache.org by Guruprasad DV <dv...@gmail.com> on 2012/07/03 09:55:35 UTC

implementing a generic list writable

Hi,

I am building a pipeline of MapReduce jobs (with one MR job's output
feeding into another as input). The values being passed around are fairly
complex: there are lists of different types, and hash maps with lists as
values. The Hadoop API does not seem to have a ListWritable. I am trying to
write a generic one, but it seems I can't instantiate a generic type in my
readFields implementation unless I pass in the class type itself:

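import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Writable;

public class ListWritable<T extends Writable> implements Writable {
    private List<T> list;
    private Class<T> clazz;

    // The element class must be supplied because T is erased at runtime.
    public ListWritable(Class<T> clazz) {
        this.clazz = clazz;
        list = new ArrayList<T>();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Element count first, then each element in order.
        out.writeInt(list.size());
        for (T element : list) {
            element.write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        int count = in.readInt();
        this.list = new ArrayList<T>();
        for (int i = 0; i < count; i++) {
            try {
                T obj = clazz.newInstance();
                obj.readFields(in);
                list.add(obj);
            } catch (InstantiationException e) {
                // Fail the read instead of continuing with a misaligned stream.
                throw new IOException(e);
            } catch (IllegalAccessException e) {
                throw new IOException(e);
            }
        }
    }
}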


But Hadoop requires every Writable to have a no-argument constructor so
that it can create the object before reading the values back. Has anybody
tried to do the same and solved this problem?
TIA.
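
To make the constraint concrete: the framework creates Writable instances
reflectively, without arguments, so a class that only offers a constructor
like ListWritable(Class<T>) cannot be created that way. The following is a
minimal sketch using plain java.lang.reflect. It only mimics that kind of
lookup and is not Hadoop's own code; the names WithArg, NoArg and
ConstructorCheck are hypothetical.

```java
import java.lang.reflect.Constructor;

// Hypothetical class with only a one-argument constructor,
// shaped like ListWritable(Class<T>).
class WithArg {
    final Class<?> clazz;
    WithArg(Class<?> clazz) { this.clazz = clazz; }
}

// Hypothetical class that offers a no-argument constructor.
class NoArg {
    NoArg() {}
}

class ConstructorCheck {
    // Returns true if the class can be created reflectively without arguments.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here when no such constructor exists.
            return false;
        }
    }
}
```

Here hasNoArgConstructor(WithArg.class) is false and
hasNoArgConstructor(NoArg.class) is true.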

-- 
Thanks and regards,
guru

Re: implementing a generic list writable

Posted by Dave Beech <db...@apache.org>.
You could serialise the class name along with the list items in an extra
Text writable, and then in the readFields method get the class back using
Class.forName(textClassName.toString())

That way the class no longer has to be a field passed into the
constructor. In the write method, just take the class of the first item in
your list and serialise that (this assumes the list is non-empty and all
items share one concrete class):

textClassName.set(list.get(0).getClass().getName());
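
Putting the suggestion together, a minimal sketch might look like the
following. Several things here are assumptions rather than part of the
thread: the Writable interface is a local stand-in with the same two
methods as org.apache.hadoop.io.Writable, so that the example compiles
without Hadoop on the classpath; writeUTF/readUTF replace the extra Text
writable for the same reason; IntBox and RoundTrip are hypothetical helpers
for the demonstration; and an empty list is handled by writing only the
count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.hadoop.io.Writable (same two methods), declared
// here only so the sketch compiles without Hadoop on the classpath.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical element type used for the demonstration.
class IntBox implements Writable {
    int value;
    IntBox() {}                                  // needed for reflective creation
    IntBox(int value) { this.value = value; }
    public void write(DataOutput out) throws IOException { out.writeInt(value); }
    public void readFields(DataInput in) throws IOException { value = in.readInt(); }
}

class ListWritable<T extends Writable> implements Writable {
    private List<T> list = new ArrayList<T>();

    ListWritable() {}                            // no Class<T> argument required

    List<T> getList() { return list; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(list.size());
        if (list.isEmpty()) {
            return;                              // no element to take the class from
        }
        // Assumes every element has the same concrete class as the first one.
        out.writeUTF(list.get(0).getClass().getName());
        for (T element : list) {
            element.write(out);
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void readFields(DataInput in) throws IOException {
        int count = in.readInt();
        list = new ArrayList<T>();
        if (count == 0) {
            return;
        }
        String className = in.readUTF();
        try {
            Class<? extends Writable> clazz =
                Class.forName(className).asSubclass(Writable.class);
            for (int i = 0; i < count; i++) {
                T obj = (T) clazz.getDeclaredConstructor().newInstance();
                obj.readFields(in);
                list.add(obj);
            }
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + className, e);
        }
    }
}

// Hypothetical helper: serialises to a byte array and reads it back.
class RoundTrip {
    static <T extends Writable> ListWritable<T> copyOf(ListWritable<T> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            ListWritable<T> copy = new ListWritable<T>();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // in-memory streams should not fail
        }
    }
}
```

Storing the class name once per list keeps the overhead small, but it means
a list mixing several Writable subclasses would not round-trip correctly
with this layout.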

Cheers,
Dave
