Posted to user@hadoop.apache.org by Adeel Qureshi <ad...@gmail.com> on 2013/09/01 17:43:14 UTC

Re: custom writablecomparable with complex fields

Okay, that makes sense: I read the fields back in the same order I wrote them.
Taking it a step further, in the compareTo method, how can I use the bytes
provided to do a comparison on, let's say, a list object?
On Aug 31, 2013 4:52 PM, "Harsh J" <ha...@cloudera.com> wrote:

> The idea behind write(…) and readFields(…) is simply that of ordering.
> You need to write your custom objects (i.e. a representation of them)
> in order, and read them back in the same order.
>
> One way to serialize a list is to first serialize its length (so you
> know how many items you'll need to read back), and then serialize each
> item appropriately, using delimiters or length prefixes, just as for
> the list itself.
>
> Mainly, you're required to tackle the serialization/deserialization on
> your own.
>
> This is one of the reasons I highly recommend using a library like
> Apache Avro instead. It's more powerful, faster, and yet simple to use:
> http://avro.apache.org/docs/current/gettingstartedjava.html and
> http://avro.apache.org/docs/current/mr.html. It is also popular and
> carries first-class support in several other Hadoop-ecosystem
> projects, such as Flume and Crunch.
>
> On Sun, Sep 1, 2013 at 1:23 AM, Adeel Qureshi <ad...@gmail.com>
> wrote:
> > I want to write a custom WritableComparable object with two List objects
> > within it:
> >
> > public class CompositeKey implements WritableComparable {
> >
> > private List<JsonKey> groupBy;
> > private List<JsonKey> sortBy;
> > ...
> > }
> >
> > What I am not sure about is how to write the readFields and write
> > methods for this object. Any help would be appreciated.
> >
> > Thanks
> > Adeel
>
>
>
> --
> Harsh J
>
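
A minimal sketch of the length-prefix approach described in the quoted
reply, not a definitive implementation. Assumptions: JsonKey is a
hypothetical stand-in holding a single String (the thread does not show its
real fields), the Hadoop interface is left off so the example compiles
against the JDK alone, and roundTrip is a helper added only for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the poster's JsonKey (real fields unknown).
class JsonKey {
    String name = "";

    JsonKey() {}

    JsonKey(String name) { this.name = name; }

    void write(DataOutput out) throws IOException {
        out.writeUTF(name); // writeUTF stores its own 2-byte length prefix
    }

    void readFields(DataInput in) throws IOException {
        name = in.readUTF();
    }
}

// In a real job this would be declared with
// "implements org.apache.hadoop.io.WritableComparable<CompositeKey>".
class CompositeKey {
    List<JsonKey> groupBy = new ArrayList<>();
    List<JsonKey> sortBy = new ArrayList<>();

    // Fields are written in a fixed order: groupBy first, then sortBy.
    public void write(DataOutput out) throws IOException {
        writeList(out, groupBy);
        writeList(out, sortBy);
    }

    // Fields are read back in exactly the order they were written.
    public void readFields(DataInput in) throws IOException {
        groupBy = readList(in);
        sortBy = readList(in);
    }

    private static void writeList(DataOutput out, List<JsonKey> keys) throws IOException {
        out.writeInt(keys.size()); // length prefix: how many items follow
        for (JsonKey key : keys) {
            key.write(out);
        }
    }

    private static List<JsonKey> readList(DataInput in) throws IOException {
        int count = in.readInt();
        // A fresh list each time, so a reused key object keeps no stale items.
        List<JsonKey> keys = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            JsonKey key = new JsonKey();
            key.readFields(in);
            keys.add(key);
        }
        return keys;
    }

    // Illustration only: serialize to a byte array and read it back.
    static CompositeKey roundTrip(CompositeKey key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            CompositeKey copy = new CompositeKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Building the lists afresh in readFields matters because Hadoop typically
reuses key instances between records.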

Re: custom writablecomparable with complex fields

Posted by Harsh J <ha...@cloudera.com>.
The easy way is to deserialize each byte stream into an object and then
compare the objects, which is pretty much what most of the default
comparators do.
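
A rough sketch of that "deserialize, then compare" route, under stated
assumptions: ListKey is a hypothetical simplified key holding one list of
strings in place of List<JsonKey>, the list order chosen here
(element by element, shorter list first on a tie) is just one reasonable
choice, and the compare method only mirrors the argument shape of Hadoop's
RawComparator.compare(byte[], int, int, byte[], int, int) without depending
on Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simplified key: a list of strings, written with a length prefix.
class ListKey implements Comparable<ListKey> {
    List<String> items = new ArrayList<>();

    ListKey(String... values) { items.addAll(Arrays.asList(values)); }

    void write(DataOutput out) throws IOException {
        out.writeInt(items.size());
        for (String item : items) {
            out.writeUTF(item);
        }
    }

    void readFields(DataInput in) throws IOException {
        int count = in.readInt();
        items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            items.add(in.readUTF());
        }
    }

    // The first differing element decides; on a tie the shorter list sorts first.
    @Override
    public int compareTo(ListKey other) {
        int shared = Math.min(items.size(), other.items.size());
        for (int i = 0; i < shared; i++) {
            int result = items.get(i).compareTo(other.items.get(i));
            if (result != 0) {
                return result;
            }
        }
        return Integer.compare(items.size(), other.items.size());
    }
}

class DeserializingComparator {
    // Rebuild both keys from their byte ranges, then delegate to compareTo.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            ListKey left = new ListKey();
            ListKey right = new ListKey();
            left.readFields(new DataInputStream(new ByteArrayInputStream(b1, s1, l1)));
            right.readFields(new DataInputStream(new ByteArrayInputStream(b2, s2, l2)));
            return left.compareTo(right);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Illustration only: serialize a key to a byte array.
    static byte[] toBytes(ListKey key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of this route is two object allocations and a full parse for every
comparison made during the sort.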

Comparing without deserializing the whole stream is much faster, and that is
the point behind true RawComparators. See
http://avro.apache.org/docs/current/spec.html#order for an example.
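
A hedged sketch of a raw comparison that walks the serialized bytes
directly. It assumes a hypothetical layout: a 4-byte big-endian item count
from DataOutput.writeInt, followed by each string as written by
DataOutput.writeUTF (a 2-byte length, then the encoded bytes). The class and
helper names are made up for illustration; I believe Hadoop's
WritableComparator ships similar static helpers (readInt,
readUnsignedShort, compareBytes), but this sketch uses only the JDK. The
byte-wise order agrees with String.compareTo for plain ASCII strings; other
characters are not guaranteed to sort the same way.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawListComparator {
    // Compare two serialized string lists without building any objects.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int count1 = readInt(b1, s1);
        int count2 = readInt(b2, s2);
        int pos1 = s1 + 4; // skip the 4-byte count
        int pos2 = s2 + 4;
        int shared = Math.min(count1, count2);
        for (int i = 0; i < shared; i++) {
            int len1 = readUnsignedShort(b1, pos1);
            int len2 = readUnsignedShort(b2, pos2);
            int result = compareBytes(b1, pos1 + 2, len1, b2, pos2 + 2, len2);
            if (result != 0) {
                return result; // stop at the first differing item
            }
            pos1 += 2 + len1; // advance past the length prefix and payload
            pos2 += 2 + len2;
        }
        return Integer.compare(count1, count2); // shorter list sorts first
    }

    static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24) | ((b[off + 1] & 0xff) << 16)
                | ((b[off + 2] & 0xff) << 8) | (b[off + 3] & 0xff);
    }

    static int readUnsignedShort(byte[] b, int off) {
        return ((b[off] & 0xff) << 8) | (b[off + 1] & 0xff);
    }

    // Unsigned lexicographic comparison of two byte ranges.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int shared = Math.min(l1, l2);
        for (int i = 0; i < shared; i++) {
            int diff = (b1[s1 + i] & 0xff) - (b2[s2 + i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return l1 - l2;
    }

    // Illustration only: produce the assumed layout for a list of strings.
    static byte[] serialize(String... items) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(items.length);
            for (String item : items) {
                out.writeUTF(item);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The comparison stops at the first differing item, so keys that differ early
are ordered after touching only a few bytes.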

On Sun, Sep 1, 2013 at 9:13 PM, Adeel Qureshi <ad...@gmail.com> wrote:
> Okay, that makes sense: I read the fields back in the same order I wrote them.
> Taking it a step further, in the compareTo method, how can I use the bytes
> provided to do a comparison on, let's say, a list object?
>
> On Aug 31, 2013 4:52 PM, "Harsh J" <ha...@cloudera.com> wrote:
>>
>> The idea behind write(…) and readFields(…) is simply that of ordering.
>> You need to write your custom objects (i.e. a representation of them)
>> in order, and read them back in the same order.
>>
>> One way to serialize a list is to first serialize its length (so you
>> know how many items you'll need to read back), and then serialize each
>> item appropriately, using delimiters or length prefixes, just as for
>> the list itself.
>>
>> Mainly, you're required to tackle the serialization/deserialization on
>> your own.
>>
>> This is one of the reasons I highly recommend using a library like
>> Apache Avro instead. It's more powerful, faster, and yet simple to use:
>> http://avro.apache.org/docs/current/gettingstartedjava.html and
>> http://avro.apache.org/docs/current/mr.html. It is also popular and
>> carries first-class support in several other Hadoop-ecosystem
>> projects, such as Flume and Crunch.
>>
>> On Sun, Sep 1, 2013 at 1:23 AM, Adeel Qureshi <ad...@gmail.com>
>> wrote:
>> > I want to write a custom WritableComparable object with two List objects
>> > within it:
>> >
>> > public class CompositeKey implements WritableComparable {
>> >
>> > private List<JsonKey> groupBy;
>> > private List<JsonKey> sortBy;
>> > ...
>> > }
>> >
>> > What I am not sure about is how to write the readFields and write
>> > methods for this object. Any help would be appreciated.
>> >
>> > Thanks
>> > Adeel
>>
>>
>>
>> --
>> Harsh J



-- 
Harsh J
