Posted to common-dev@hadoop.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2008/05/20 10:04:55 UTC

[jira] Updated: (HADOOP-3380) need comparators in serializer framework

     [ https://issues.apache.org/jira/browse/HADOOP-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HADOOP-3380:
----------------------------------

    Attachment: comparator_wip1.patch

bq. I think the fact that the raw comparators depend on the serialization used is precisely why Doug wants to put it there. 
Yes, but my point is that the *default* comparator returned by a serialization cannot do much except deserialize the objects and call compareTo on them. Is this assumption incorrect? In either case, the developer has to write their own comparator for a specific class under a known serialization.
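
To make the "default" case concrete, here is a minimal sketch of such a deserializing comparator. It is not taken from the attached patch: RawBytesComparator, BytesDeserializer, DeserializingComparator and IntDeserializer are hypothetical stand-ins, deliberately simplified compared to the real Hadoop interfaces, and the key is assumed to be a big-endian int.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a raw comparator: compares two serialized
// records directly from their byte ranges.
interface RawBytesComparator {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}

// Hypothetical, simplified deserializer: turns a byte range back into an object.
interface BytesDeserializer<T> {
    T deserialize(byte[] buf, int offset, int length) throws IOException;
}

// The default strategy: deserialize both sides, then delegate to compareTo.
final class DeserializingComparator<T extends Comparable<T>> implements RawBytesComparator {
    private final BytesDeserializer<T> deserializer;

    DeserializingComparator(BytesDeserializer<T> deserializer) {
        this.deserializer = deserializer;
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            T left = deserializer.deserialize(b1, s1, l1);
            T right = deserializer.deserialize(b2, s2, l2);
            return left.compareTo(right);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Example deserializer for keys written as a 4-byte big-endian int
// (the layout produced by DataOutputStream.writeInt).
final class IntDeserializer implements BytesDeserializer<Integer> {
    @Override
    public Integer deserialize(byte[] buf, int offset, int length) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buf, offset, length))) {
            return in.readInt();
        }
    }
}
```

Every comparison here allocates two objects and pays the full deserialization cost, which is why a hand-written comparator for a specific class and serialization is usually still needed.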

If we want to allow different raw comparators for different serializations (of the same class), then we may define the API like this:
{code}
RawComparator c = new SerializationFactory(conf).getSerialization(MyKey.class).getComparator(MyKey.class);
{code}
Note that getComparator() takes the class as an argument so that it can return a comparator registered for that class, if there is one; otherwise it can return the default (deserializing) comparator.
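
The lookup-with-fallback behaviour could be sketched roughly as follows. This is only an illustration under assumptions: RawCmp, ComparatorRegistry and LexicographicBytes are hypothetical names, not part of the existing serialization framework or of the attached patch, and the default comparator is supplied by a factory function rather than built from a real deserializer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a raw comparator over serialized byte ranges.
interface RawCmp {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}

// Hypothetical per-serialization lookup: a registered comparator wins,
// otherwise the default (for example a deserializing one) is returned.
final class ComparatorRegistry {
    private final Map<Class<?>, RawCmp> registered = new ConcurrentHashMap<>();
    private final Function<Class<?>, RawCmp> defaultFactory;

    ComparatorRegistry(Function<Class<?>, RawCmp> defaultFactory) {
        this.defaultFactory = defaultFactory;
    }

    void register(Class<?> keyClass, RawCmp comparator) {
        registered.put(keyClass, comparator);
    }

    RawCmp getComparator(Class<?> keyClass) {
        RawCmp comparator = registered.get(keyClass);
        return comparator != null ? comparator : defaultFactory.apply(keyClass);
    }
}

// Example of an optimized comparator that never deserializes:
// unsigned lexicographic comparison of the raw bytes.
final class LexicographicBytes implements RawCmp {
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal, so the shorter range sorts first.
        return l1 - l2;
    }
}
```

With one such registry per serialization, the same key class can get a different raw comparator under each serialization, which is the case the getComparator(MyKey.class) call above is meant to cover.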

If we do not want to allow different raw comparators, then wouldn't the attached (half-baked) patch be enough?

> need comparators in serializer framework
> ----------------------------------------
>
>                 Key: HADOOP-3380
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3380
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Doug Cutting
>         Attachments: comparator_wip1.patch
>
>
> The new serialization framework permits Hadoop to incorporate different serialization systems, including Hadoop's Writable, Thrift, Java Serialization, etc.  It provides a generic, extensible means (SerializationFactory) to create serializers and deserializers for arbitrary Java classes.  However, it does not include a generic means to create comparators for these classes.  Comparators are required for MapReduce keys and many other computations.  Thus we should enhance the serialization framework to provide comparators too.
