Posted to user@hadoop.apache.org by Yaron Gonen <ya...@gmail.com> on 2015/08/19 12:21:10 UTC

Custom comparator when using Kryo serializer for MapReduce serialization

Hi all,
(I'm using Hadoop 1.2.1)
I'm using Kryo <https://github.com/EsotericSoftware/kryo> (with chill
<https://github.com/twitter/chill>) as my serializer (instead of the
Writable interface).
However, I'm having trouble with the comparator. On one hand, since none of
my objects are Writable, I cannot use WritableComparator. On the other
hand, I can work with RawComparator, but that means deserializing the
byte array on every comparison, which does not seem very efficient.
Is there a way to supply just an implementation of Java's Comparator, or to
make the serialized object Comparable?

Regards,
Yaron

Re: Custom comparator when using Kryo serializer for MapReduce serialization

Posted by William Slacum <ws...@gmail.com>.
Being in a distributed system shouldn't matter too much in this case.
You're worried about two things: mapping your data into byte[], and then
comparing against other data that has been mapped to byte[].
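To illustrate the first of those two concerns, here is a minimal sketch of an order-preserving mapping from a key to byte[]. It assumes the key is a single signed long; the class name OrderedLongCodec and its methods are hypothetical and are not part of Kryo, chill, or Hadoop. Kryo's default output is generally not laid out this way (for example when variable-length integers are used), so a layout like this would have to come from a custom serializer for the key class.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only: encodes a signed long so that
// unsigned byte-by-byte order of the output matches numeric order of the input.
final class OrderedLongCodec {

    // Big-endian, fixed width, with the sign bit flipped so that negative
    // values sort before positive ones when bytes are compared as unsigned.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(value ^ Long.MIN_VALUE)
                .array();
    }

    // Inverse of encode: read the big-endian long and flip the sign bit back.
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of two whole arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }
}
```

With an encoding like this, comparing the encoded bytes gives the same answer as comparing the original values, so nothing has to be deserialized during the sort. Because the encoding is deterministic, every node produces the same bytes for the same key.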


On Fri, Aug 21, 2015 at 1:46 AM, Yaron Gonen <ya...@gmail.com> wrote:

> Thanks for the reply.
> How can I guarantee that in a distributed system?
>
> On Wed, Aug 19, 2015 at 8:06 PM, William Slacum <ws...@gmail.com> wrote:
>
>> In a general sense, if you can guarantee that your objects serialize in
>> lexicographical order, then you should be able to do a comparator on the
>> raw bytes themselves without any interpretation.
>>
>> On Wed, Aug 19, 2015 at 5:21 AM, Yaron Gonen <ya...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>> (I'm using Hadoop 1.2.1)
>>> I'm using Kryo <https://github.com/EsotericSoftware/kryo> (with chill
>>> <https://github.com/twitter/chill>) as my serializer (instead of the
>>> Writable interface).
>>> However, I'm having trouble with the comparator: on one hand, since none
>>> of my objects are Writable, I cannot use WritableComparator. On the
>>> other hand, I can work with the RawComparator, but it means to
>>> deserialize the bytes array each time - seems not very efficient...
>>> Is there a way to give just an implementation of Java's Comparator? or
>>> to make the serialized object Comparable?
>>>
>>> Regards,
>>> Yaron
>>>
>>
>>
>

Re: Custom comparator when using Kryo serializer for MapReduce serialization

Posted by Yaron Gonen <ya...@gmail.com>.
Thanks for the reply.
How can I guarantee that in a distributed system?

On Wed, Aug 19, 2015 at 8:06 PM, William Slacum <ws...@gmail.com> wrote:

> In a general sense, if you can guarantee that your objects serialize in
> lexicographical order, then you should be able to do a comparator on the
> raw bytes themselves without any interpretation.
>
> On Wed, Aug 19, 2015 at 5:21 AM, Yaron Gonen <ya...@gmail.com>
> wrote:
>
>> Hi all,
>> (I'm using Hadoop 1.2.1)
>> I'm using Kryo <https://github.com/EsotericSoftware/kryo> (with chill
>> <https://github.com/twitter/chill>) as my serializer (instead of the
>> Writable interface).
>> However, I'm having trouble with the comparator: on one hand, since none
>> of my objects are Writable, I cannot use WritableComparator. On the
>> other hand, I can work with the RawComparator, but it means to
>> deserialize the bytes array each time - seems not very efficient...
>> Is there a way to give just an implementation of Java's Comparator? or
>> to make the serialized object Comparable?
>>
>> Regards,
>> Yaron
>>
>
>

Re: Custom comparator when using Kryo serializer for MapReduce serialization

Posted by William Slacum <ws...@gmail.com>.
In a general sense, if you can guarantee that your objects serialize in
lexicographical order, then you should be able to do a comparator on the
raw bytes themselves without any interpretation.
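As a rough sketch of what a comparator on the raw bytes could look like: the method below takes the same six arguments as Hadoop's RawComparator.compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) and treats each byte as unsigned. The class name LexicographicBytes is hypothetical, and the sketch assumes the serialized keys already sort correctly byte by byte.

```java
// Hypothetical helper, for illustration only: unsigned lexicographic
// comparison of two byte ranges, given as (array, start offset, length).
final class LexicographicBytes {

    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            // Mask to 0..255 so that 0x80 sorts after 0x7f.
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }
}
```

In a job, the body of a RawComparator implementation could delegate to a method like this one. Hadoop also ships a static helper with the same argument list, WritableComparator.compareBytes, which should be usable even when the keys themselves are not Writable.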

On Wed, Aug 19, 2015 at 5:21 AM, Yaron Gonen <ya...@gmail.com> wrote:

> Hi all,
> (I'm using Hadoop 1.2.1)
> I'm using Kryo <https://github.com/EsotericSoftware/kryo> (with chill
> <https://github.com/twitter/chill>) as my serializer (instead of the
> Writable interface).
> However, I'm having trouble with the comparator: on one hand, since none
> of my objects are Writable, I cannot use WritableComparator. On the other
> hand, I can work with the RawComparator, but it means to deserialize the
> bytes array each time - seems not very efficient...
> Is there a way to give just an implementation of Java's Comparator? or to
> make the serialized object Comparable?
>
> Regards,
> Yaron
>
