Posted to dev@avro.apache.org by "Iván de Prado (JIRA)" <ji...@apache.org> on 2010/06/21 10:14:26 UTC
[jira] Commented: (AVRO-493) hadoop mapreduce support for avro data
[ https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880743#action_12880743 ]
Iván de Prado commented on AVRO-493:
------------------------------------
Writing a custom DeserializerComparator is necessary if this patch is to be useful in many applications. Otherwise you would need a different Avro schema, with a different sort order, for each kind of grouping you want to do in the reducer. I'm failing to create a custom DeserializerComparator:
{code:java}
public static class CustomComparator extends DeserializerComparator<AvroWrapper<GenericRecord>> {
    public CustomComparator() throws IOException {
        super(new AvroKeySerialization().getDeserializer(AvroWrapper.class));
    }

    @Override
    public int compare(AvroWrapper<GenericRecord> o1, AvroWrapper<GenericRecord> o2) {
        return o1.datum().get("word").toString().charAt(1) - o2.datum().get("word").toString().charAt(1);
    }
}
{code}
It raises the following exception:
{noformat}
Caused by: java.lang.NullPointerException
at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:98)
at org.apache.avro.mapred.AvroKeySerialization.getDeserializer(AvroKeySerialization.java:55)
....
{noformat}
The problem is in this line:
{code:java}
Schema schema = AvroJob.getMapOutputSchema(getConf());
{code}
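It looks for the datum schema in the job configuration but, unsurprisingly, it is not there: the comparator's constructor creates the AvroKeySerialization itself, so no job configuration has been set on it.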
Any ideas or workarounds for creating custom comparators for Avro?
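One direction that might work is to defer the schema lookup until a configuration is available instead of doing it in the constructor. The sketch below only models that pattern with plain Java so it can run on its own: Conf, SCHEMA_KEY, lookupSchema, EagerComparator and LazyComparator are hypothetical stand-ins, not Hadoop or Avro classes, and plain strings stand in for the "word" field of the datum. As far as I know, Hadoop instantiates comparators through ReflectionUtils.newInstance, which calls setConf on objects that implement Configurable, but I have not verified that this works with the classes in this patch.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class LazyComparatorSketch {
    /** Hypothetical configuration key; the real key used by AvroJob may differ. */
    static final String SCHEMA_KEY = "example.map.output.schema";

    /** Hypothetical stand-in for a job configuration (not Hadoop's Configuration). */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Stand-in for the schema lookup; dereferencing a null conf throws NullPointerException. */
    static String lookupSchema(Conf conf) {
        return conf.get(SCHEMA_KEY);
    }

    /** Mirrors the failing approach: the schema is resolved in the constructor, with no conf. */
    static class EagerComparator implements Comparator<String> {
        private final String schema;
        EagerComparator() { this.schema = lookupSchema(null); }
        @Override
        public int compare(String a, String b) {
            return Character.compare(a.charAt(1), b.charAt(1));
        }
    }

    /** Defers the schema lookup until the framework (here: the caller) supplies a conf. */
    static class LazyComparator implements Comparator<String> {
        private String schema;
        void setConf(Conf conf) { this.schema = lookupSchema(conf); }
        @Override
        public int compare(String a, String b) {
            if (schema == null) {
                throw new IllegalStateException("setConf must be called with a schema first");
            }
            // Same ordering as the original example: by the second character of the word.
            // Assumes every word has at least two characters.
            return Character.compare(a.charAt(1), b.charAt(1));
        }
    }

    public static void main(String[] args) {
        try {
            new EagerComparator();
        } catch (NullPointerException e) {
            System.out.println("eager construction failed: no configuration available");
        }
        Conf conf = new Conf();
        conf.set(SCHEMA_KEY, "\"string\"");
        LazyComparator comparator = new LazyComparator();
        comparator.setConf(conf);
        System.out.println(comparator.compare("cat", "dog") < 0);
    }
}
```

In a real job this would probably mean implementing RawComparator and Configurable directly rather than extending DeserializerComparator, since the latter needs its deserializer in the super(...) call, before any configuration can be set.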
> hadoop mapreduce support for avro data
> --------------------------------------
>
> Key: AVRO-493
> URL: https://issues.apache.org/jira/browse/AVRO-493
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.4.0
>
> Attachments: AVRO-493.patch, AVRO-493.patch
>
>
> Avro should provide support for using Hadoop MapReduce over Avro data files.