Posted to common-user@hadoop.apache.org by Amit Sela <am...@infolinks.com> on 2013/02/10 13:00:40 UTC

Generic output key class

Hi all,

Has anyone ever used some kind of "generic output key" for a MapReduce
job?

I have a job running multiple tasks and I want them to be able to use both
Text and IntWritable as output key classes.

Any suggestions?

Thanks,

Amit.

Re: Generic output key class

Posted by Amit Sela <am...@infolinks.com>.
If I'm running only one MapReduce job then IntWritable output is OK, but if
I'm running several together and some have Text output, I don't want to
duplicate MapReduce jobs for the different output types. I'm trying to find
a more generic solution...

On Mon, Feb 11, 2013 at 3:18 AM, Michael Segel <mi...@hotmail.com> wrote:

> Why not just write out the int as a numeric string?
>
> On Feb 10, 2013, at 1:07 PM, Sandy Ryza <sa...@cloudera.com> wrote:
>
> Hi Amit,
>
> One way to accomplish this would be to create a custom writable
> implementation, TextOrIntWritable, that has fields for both.  It could look
> something like:
>
> class TextOrIntWritable implements Writable {
>   private boolean isText;
>   private Text text;
>   private IntWritable integer;
>
>   public void write(DataOutput out) throws IOException {
>     out.writeBoolean(isText);
>     if (isText) {
>       text.write(out);
>     } else {
>       integer.write(out);
>     }
>   }
>
>   [... readFields method that works in a similar way]
> }
>
> -Sandy
>
> On Sun, Feb 10, 2013 at 4:00 AM, Amit Sela <am...@infolinks.com> wrote:
>
>> Hi all,
>>
>> Has anyone ever used some kind of a "generic output key" for a mapreduce
>> job ?
>>
>> I have a job running multiple tasks and I want them to be able to use
>> both Text and IntWritable as output key classes.
>>
>> Any suggestions ?
>>
>> Thanks,
>>
>> Amit.
>>
>
>
> Michael Segel  <ms...@segel.com> | (m) 312.755.9623
>
> Segel and Associates
>
>

Re: Generic output key class

Posted by Michael Segel <mi...@hotmail.com>.
Why not just write out the int as a numeric string? 

On Feb 10, 2013, at 1:07 PM, Sandy Ryza <sa...@cloudera.com> wrote:

> Hi Amit,
> 
> One way to accomplish this would be to create a custom writable implementation, TextOrIntWritable, that has fields for both.  It could look something like:
> 
> class TextOrIntWritable implements Writable {
>   private boolean isText;
>   private Text text;
>   private IntWritable integer;
> 
>   public void write(DataOutput out) throws IOException {
>     out.writeBoolean(isText);
>     if (isText) {
>       text.write(out);
>     } else {
>       integer.write(out);
>     }
>   }
> 
>   [... readFields method that works in a similar way]
> }
> 
> -Sandy 
> 
> On Sun, Feb 10, 2013 at 4:00 AM, Amit Sela <am...@infolinks.com> wrote:
> Hi all, 
> 
> Has anyone ever used some kind of a "generic output key" for a mapreduce job ?
> 
> I have a job running multiple tasks and I want them to be able to use both Text and IntWritable as output key classes.
> 
> Any suggestions ?
> 
> Thanks, 
> 
> Amit.
> 

Michael Segel  | (m) 312.755.9623

Segel and Associates
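
One thing to keep in mind with the numeric-string approach: string keys sort as text, so "10" comes before "9". If the sort order of the keys matters, zero-padding is one possible workaround. A minimal sketch, using plain Java strings in place of Text; the class and method names are made up for illustration, and the padding shown only handles non-negative ints:

```java
import java.util.Arrays;

class NumericStringKeys {
  // Plain decimal rendering: simple, but sorts as text ("10" before "9").
  static String plain(int value) {
    return Integer.toString(value);
  }

  // Zero-padded to 10 digits so lexicographic order matches numeric order
  // for non-negative ints (Integer.MAX_VALUE has 10 digits).
  static String padded(int value) {
    if (value < 0) {
      throw new IllegalArgumentException("sketch only handles non-negative ints");
    }
    return String.format("%010d", value);
  }

  public static void main(String[] args) {
    String[] plainKeys = {plain(9), plain(10), plain(100)};
    String[] paddedKeys = {padded(9), padded(10), padded(100)};
    Arrays.sort(plainKeys);
    Arrays.sort(paddedKeys);
    System.out.println(Arrays.toString(plainKeys));   // [10, 100, 9]
    System.out.println(Arrays.toString(paddedKeys));  // [0000000009, 0000000010, 0000000100]
  }
}
```

If the keys are only used for grouping and their order does not matter, the plain rendering is probably enough.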



Re: Generic output key class

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Amit,

One way to accomplish this would be to create a custom writable
implementation, TextOrIntWritable, that has fields for both.  It could look
something like:

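import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

// Note: to be used as a key (rather than a value), the class would need to
// implement WritableComparable and add a compareTo method.
class TextOrIntWritable implements Writable {
  private boolean isText;
  private Text text = new Text();
  private IntWritable integer = new IntWritable();

  // Write a one-byte tag first, then only the field that is in use.
  @Override
  public void write(DataOutput out) throws IOException {
    out.writeBoolean(isText);
    if (isText) {
      text.write(out);
    } else {
      integer.write(out);
    }
  }

  [... readFields method that works in a similar way]
}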

-Sandy
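
For reference, the tag-then-payload idea above can be tried out without a Hadoop classpath. The following is only a sketch under stated assumptions: TextOrIntKey is a hypothetical stand-in that uses String and int instead of Text and IntWritable, and its write/readFields methods merely mirror the shape of the Writable methods. In a real job the class would implement WritableComparable and delegate to the wrapped Hadoop types. The ordering chosen in compareTo (ints before text) is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Hadoop key type (not part of any library).
class TextOrIntKey implements Comparable<TextOrIntKey> {
  private boolean isText;
  private String text = "";
  private int integer;

  public void setText(String value) { isText = true; text = value; }
  public void setInt(int value) { isText = false; integer = value; }
  public boolean isText() { return isText; }
  public String getText() { return text; }
  public int getInt() { return integer; }

  // Same shape as Writable.write(DataOutput): tag first, then the payload.
  public void write(DataOutput out) throws IOException {
    out.writeBoolean(isText);
    if (isText) {
      out.writeUTF(text);
    } else {
      out.writeInt(integer);
    }
  }

  // Same shape as Writable.readFields(DataInput): read the tag, then branch.
  public void readFields(DataInput in) throws IOException {
    isText = in.readBoolean();
    if (isText) {
      text = in.readUTF();
    } else {
      integer = in.readInt();
    }
  }

  // Keys have to be comparable for sorting; here ints sort before text.
  @Override
  public int compareTo(TextOrIntKey other) {
    if (isText != other.isText) {
      return isText ? 1 : -1;
    }
    return isText ? text.compareTo(other.text)
                  : Integer.compare(integer, other.integer);
  }

  // Demo helper: serialize a key and read it back into a fresh instance.
  static TextOrIntKey roundTrip(TextOrIntKey key) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      key.write(new DataOutputStream(bytes));
      TextOrIntKey copy = new TextOrIntKey();
      copy.readFields(new DataInputStream(
          new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    TextOrIntKey a = new TextOrIntKey();
    a.setInt(42);
    TextOrIntKey b = new TextOrIntKey();
    b.setText("hello");
    System.out.println(roundTrip(a).getInt());   // 42
    System.out.println(roundTrip(b).getText());  // hello
  }
}
```

A key class used in a real job would most likely also need consistent equals and hashCode methods, since the default HashPartitioner relies on hashCode to assign keys to reducers.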

On Sun, Feb 10, 2013 at 4:00 AM, Amit Sela <am...@infolinks.com> wrote:

> Hi all,
>
> Has anyone ever used some kind of a "generic output key" for a mapreduce
> job ?
>
> I have a job running multiple tasks and I want them to be able to use both
> Text and IntWritable as output key classes.
>
> Any suggestions ?
>
> Thanks,
>
> Amit.
>
