Posted to dev@pig.apache.org by "Jagdish Kewat (JIRA)" <ji...@apache.org> on 2016/02/18 15:05:18 UTC
[jira] [Commented] (PIG-4813) AvroStorage doesn't work for schema from external file for EMR
[ https://issues.apache.org/jira/browse/PIG-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152358#comment-15152358 ]
Jagdish Kewat commented on PIG-4813:
------------------------------------
My store command in the script is shown below, followed by the error I am getting.
{code}
store records into 's3://my-bucket/my-output' using org.apache.pig.piggybank.storage.avro.AvroStorage('schema_file', 's3n://my-bucket/my-schema/records.avsc');
{code}
{code}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Output schema is null!
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:473)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:453)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1542)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:453)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1500)
at java.security.AccessController.doPrivileged(Native Method)
{code}
> AvroStorage doesn't work for schema from external file for EMR
> --------------------------------------------------------------
>
> Key: PIG-4813
> URL: https://issues.apache.org/jira/browse/PIG-4813
> Project: Pig
> Issue Type: Bug
> Reporter: Jagdish Kewat
>
> Hi Team,
> I couldn't get schema loading from an external file to work for AvroStorage as described in http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-etl-avro.html.
> It works fine if I provide the raw schema string with the 'schema' option as described in https://cwiki.apache.org/confluence/display/PIG/AvroStorage.
> On HDFS I don't even need to specify the schema with the store command.
> Some quick details on the versions:
> * Hadoop :
> {code}
> Hadoop 2.6.0-amzn-2
> Subversion git@aws157git.com:/pkg/Aws157BigTop -r 41f4e6be3ac5d6676a3464f77de79a33e8fdd9f3
> Compiled by ec2-user on 2015-11-16T20:56Z
> Compiled with protoc 2.5.0
> {code}
> * Pig :
> {code}
> Apache Pig version 0.14.0-amzn-0 (r: unknown)
> {code}
> * piggybank jar version:
> ** piggybank-0.14.0.jar
> * avro jar version :
> ** avro-1.7.7.jar
> * avro-ipc jar version :
> ** avro-ipc-1.7.7.jar
> * json-simple jar version :
> ** json-simple-1.1.jar
> I tried looking for an EMR-specific version of the piggybank jar, but had no luck. I fear I am not using the correct versions of the jars, since the feature should work as documented.
> Please advise if I am missing anything.
> Thanks,
> Jagdish
>