Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2017/05/24 16:01:04 UTC

[jira] [Deleted] (PIG-5099) AvroStorage on Tez with exception on nested records

     [ https://issues.apache.org/jira/browse/PIG-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy deleted PIG-5099:
------------------------------------


> AvroStorage on Tez with exception on nested records
> ---------------------------------------------------
>
>                 Key: PIG-5099
>                 URL: https://issues.apache.org/jira/browse/PIG-5099
>             Project: Pig
>          Issue Type: Bug
>         Environment: HadoopVersion: 2.6.0-cdh5.8.0
> PigVersion: 0.16.0
> TezVersion: 0.7.0
>            Reporter: Sebastian Geller
>
> Hi,
> While migrating to the latest Pig version, we have run into a general issue with nested Avro records on Tez. Grouping by a field of a nested record fails with:
> {code}
> Caused by: java.io.IOException: class org.apache.pig.impl.util.avro.AvroTupleWrapper.write called, but not implemented yet
> 	at org.apache.pig.impl.util.avro.AvroTupleWrapper.write(AvroTupleWrapper.java:68)
> 	at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:139)
> ...
> {code}
> The setup is as follows.
> Schema:
> {code}
> {
>     "fields": [
>         {
>             "name": "id",
>             "type": "int"
>         },
>         {
>             "name": "property",
>             "type": {
>                 "fields": [
>                     {
>                         "name": "id",
>                         "type": "int"
>                     }
>                 ],
>                 "name": "Property",
>                 "type": "record"
>             }
>         }
>     ],
>     "name": "Person",
>     "namespace": "com.github.ouyi.avro",
>     "type": "record"
> }
> {code}
> Pig script (group_person.pig):
> {code}
> loaded_person =
>     LOAD '$input'
>     USING AvroStorage();
> grouped_records =
>     GROUP
>         loaded_person BY (property.id);
> STORE grouped_records
>     INTO '$output'
>     USING AvroStorage();
> {code}
> Sample data:
> {code}
> {"id":1,"property":{"id":1}}
> {code}
> Execution on Tez (fails):
> {code}
> pig -x tez_local -p input=file:///usr/lib/pig/pig-0.16.0/person-prop.avro -p output=file:///output group_person.pig
> ...
> Caused by: java.io.IOException: class org.apache.pig.impl.util.avro.AvroTupleWrapper.write called, but not implemented yet
> 	at org.apache.pig.impl.util.avro.AvroTupleWrapper.write(AvroTupleWrapper.java:68)
> 	at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:139)
> ...
> {code}
> Execution on MapReduce (succeeds):
> {code}
> pig -x local -p input=file:///usr/lib/pig/pig-0.16.0/person-prop.avro -p output=file:///output7 group_person.pig
> ...
> Output(s):
> Successfully stored 1 records in: "file:///output7"
> ...
> {code}
> I will attach the complete log files of both runs.
> I would expect the Pig script to work the same way on Tez and on MapReduce. Is there an underlying change when migrating to Tez that makes the schema invalid?
> Thanks,
> Sebastian



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)