Posted to mapreduce-user@hadoop.apache.org by Paul van Hoven <pa...@googlemail.com> on 2013/02/27 19:34:27 UTC

Custom output value for map function

In most Hadoop examples, the output value type of the map function is
declared something like this:

public static class Map extends Mapper<LongWritable, Text, outputKey,
outputValue>

Normally outputValue is something like Text or IntWritable.

I have a custom class with its own properties, like

public class Dog {
   String name;
   Date birthday;
   double weight;
}

How would I write a map function that uses Dog as its output value, like this:

public static class Map extends Mapper<LongWritable, Text, IntWritable, Dog>

?

Re: Custom output value for map function

Posted by Sandy Ryza <sa...@cloudera.com>.
That's right, the data needs to be read in the same order it was written.
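
The ordering rule can be checked with plain java.io streams, without Hadoop.
The following is only a sketch: the class and method names (FieldOrderDemo,
writeRecord, readInWriteOrder, readInSwappedOrder) are made up for
illustration. It writes a string followed by an int, then reads the bytes
back once in the same order and once swapped. The stream carries no type
tags, so the swapped read reinterprets the string's bytes as an int; with
this particular data the following readUTF then runs past the end of the
stream, but with other data a swapped read could just as well return wrong
values without any exception.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class FieldOrderDemo {

    // Writes a string followed by an int, as in the question.
    public static byte[] writeRecord() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF("blabla");
            out.writeInt(12);
        }
        return buffer.toByteArray();
    }

    // Reads the fields in the order they were written.
    public static String readInWriteOrder(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String text = in.readUTF();
            int number = in.readInt();
            return text + ":" + number;
        }
    }

    // Reads the int first, so the string's length prefix and first
    // characters are misinterpreted as a number.
    public static String readInSwappedOrder(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int number = in.readInt();
            String text = in.readUTF();
            return text + ":" + number;
        } catch (IOException e) {
            // Report the failure instead of propagating it, for the demo.
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeRecord();
        System.out.println(readInWriteOrder(bytes));
        System.out.println(readInSwappedOrder(bytes));
    }
}
```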

On Wed, Feb 27, 2013 at 11:04 AM, Paul van Hoven <
paul.van.hoven@googlemail.com> wrote:

> Great! Thank you.
>
> I guess the order for writing and reading the data this way is
> important. I mean, for
>
> out.writeUTF("blabla")
> out.writeInt(12)
>
> the following would be correct
>
> text = in.readUTF();
> number = in.readInt();
>
> and this would fail:
>
> number = in.readInt();
> text = in.readUTF();
>
> ?

Re: Custom output value for map function

Posted by Paul van Hoven <pa...@googlemail.com>.
Great! Thank you.

I guess the order for writing and reading the data this way is
important. I mean, for

out.writeUTF("blabla");
out.writeInt(12);

the following would be correct

text = in.readUTF();
number = in.readInt();

and this would fail:

number = in.readInt();
text = in.readUTF();

?

Re: Custom output value for map function

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Paul,

To do this, you need to make your Dog class implement Hadoop's Writable
interface, so that it can be serialized to and deserialized from bytes.
http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/io/Writable.html

The methods you implement would look something like this:

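// Serialize the fields; readFields must read them back in the same order.
@Override
public void write(DataOutput out) throws IOException {
  out.writeDouble(weight);
  out.writeUTF(name);
  out.writeLong(birthday.getTime());  // java.util.Date as epoch milliseconds
}

// Deserialize the fields in exactly the order write() emitted them.
@Override
public void readFields(DataInput in) throws IOException {
  weight = in.readDouble();
  name = in.readUTF();
  birthday = new Date(in.readLong());
}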
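
For reference, here is a minimal, self-contained sketch of what the whole
class could look like, with a round trip through a byte buffer to check that
write and readFields agree. Several things in it are assumptions made for
illustration: the wrapper class DogWritableDemo, the roundTrip helper, the
sample values, and above all the nested Writable interface, which is a local
stand-in declaring the same two methods as org.apache.hadoop.io.Writable so
that the example runs without Hadoop on the classpath. In a real job the
class would implement Hadoop's interface instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Date;

public class DogWritableDemo {

    // Local stand-in (assumption) with the same two methods as
    // org.apache.hadoop.io.Writable, so the sketch runs without Hadoop.
    public interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    public static class Dog implements Writable {
        String name;
        Date birthday;
        double weight;

        // No-arg constructor: the framework creates the object first
        // and then fills it in through readFields.
        public Dog() {}

        public Dog(String name, Date birthday, double weight) {
            this.name = name;
            this.birthday = birthday;
            this.weight = weight;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeDouble(weight);
            out.writeUTF(name);
            out.writeLong(birthday.getTime()); // epoch milliseconds
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Same field order as write().
            weight = in.readDouble();
            name = in.readUTF();
            birthday = new Date(in.readLong());
        }
    }

    // Serializes a Dog to bytes and reads it back into a fresh instance.
    public static Dog roundTrip(Dog original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            original.write(out);
        }
        Dog copy = new Dog();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        Dog rex = new Dog("Rex", new Date(1361989467000L), 12.5);
        Dog copy = roundTrip(rex);
        System.out.println(copy.name + " " + copy.weight + " " + copy.birthday.getTime());
    }
}
```

With Hadoop's own Writable in place of the stand-in, Dog can be used as the
last type parameter of the Mapper, and the driver would typically also
declare it with job.setMapOutputValueClass(Dog.class). Keep the no-arg
constructor, since Hadoop instantiates Writable values reflectively before
calling readFields.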

Hope that helps,
Sandy
