Posted to dev@crunch.apache.org by "Adric Eckstein (JIRA)" <ji...@apache.org> on 2015/10/16 14:10:05 UTC

[jira] [Commented] (CRUNCH-572) Avro schema error in Avros.tableOf() can't redefine org.apache.avro.mapred.Pair

    [ https://issues.apache.org/jira/browse/CRUNCH-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960570#comment-14960570 ] 

Adric Eckstein commented on CRUNCH-572:
---------------------------------------

Actually, I think I see where tableOf() checks whether the key or value has a pair schema and, if so, recreates that schema with a unique name.  The issue here might be the call to Avros.pairs(), which doesn't perform the same check on its inputs.

This can be mitigated by first calling tableOf() on the two types, then building the pair from the resulting key and value types:
PTableType<Pair<String, String>, Pair<String, Double>> t4 = Avros.tableOf(t1, t2);
AvroType<Pair<Pair<String, String>, Pair<String, Double>>> t5 = Avros.pairs(t4.getKeyType(), t4.getValueType());

So maybe Avros.pairs() needs to check whether its inputs are instances of PTableType and redefine their schemas?
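
To illustrate why the naming matters, here is a minimal standalone sketch of the collision. It does not use Crunch or Avro; the class PairNameCollisionSketch, the define() helper, the string field descriptions, and the "tuple_N" naming scheme are all made up for illustration. It only assumes that a schema name registry refuses to bind one full name to two different record definitions, which is what the "can't redefine" error suggests.

```java
import java.util.HashMap;
import java.util.Map;

class PairNameCollisionSketch {

    // Hypothetical registry rule: a full name may be bound to only one definition.
    static void define(Map<String, String> names, String fullName, String fields) {
        String existing = names.putIfAbsent(fullName, fields);
        if (existing != null && !existing.equals(fields)) {
            throw new IllegalStateException("Can't redefine: " + fullName);
        }
    }

    // Both pair records share one fixed name, as reported for tableOf().
    static boolean fixedNameCollides() {
        Map<String, String> names = new HashMap<>();
        try {
            define(names, "org.apache.avro.mapred.Pair", "key:string,value:string");
            define(names, "org.apache.avro.mapred.Pair", "key:string,value:double");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each record gets a fresh name; the "tuple_N" scheme is invented for this sketch.
    static boolean uniqueNamesCollide() {
        Map<String, String> names = new HashMap<>();
        int counter = 0;
        try {
            define(names, "tuple_" + counter++, "key:string,value:string");
            define(names, "tuple_" + counter++, "key:string,value:double");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed name collides: " + fixedNameCollides());
        System.out.println("unique names collide: " + uniqueNamesCollide());
    }
}
```

In this model, two differently shaped records under the same fixed name are rejected, while per-call unique names let both coexist, which is roughly the difference between the two code paths discussed here.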

> Avro schema error in Avros.tableOf() can't redefine org.apache.avro.mapred.Pair
> -------------------------------------------------------------------------------
>
>                 Key: CRUNCH-572
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-572
>             Project: Crunch
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Adric Eckstein
>
> The Avros.tableOf() method produces a record schema named "org.apache.avro.mapred.Pair", so the name cannot be unique across types. The Avros.pairs() method also creates a record schema, but it calls Avros.createTupleSchema(), which creates a unique name on every call.
> Example that fails when constructing the schema:
> AvroType<Pair<String, String>> t1 = Avros.tableOf(Avros.strings(), Avros.strings());
> AvroType<Pair<String, Double>> t2 = Avros.tableOf(Avros.strings(), Avros.doubles());
> AvroType<Pair<Pair<String, String>, Pair<String, Double>>> t3 = Avros.pairs(t1, t2);
> System.out.println("schema: " + t3.getSchema().toString());
> Can the tableOf() method be updated to call the createTupleSchema() method instead of the Avro Pair constructor?