You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@pig.apache.org by Jack Alexandria <pa...@gmail.com> on 2014/12/05 06:57:11 UTC

pig use ParquetLoader load parquet data error

hi:
my pig script:

A = LOAD 'parquetDataPath' using ParquetLoader;
dump A;

ERROR log:
ERROR 2017: Internal error creating job configuration.

org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException:
ERROR 2017: Internal error creating job configuration.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:873)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:298)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:190)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1322)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1307)
at org.apache.pig.PigServer.execute(PigServer.java:1297)
at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
at org.apache.pig.PigServer.executeBatch(PigServer.java:353)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:607)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: Runtime parquet dependency not found
at org.apache.pig.builtin.ParquetLoader.setLocation(ParquetLoader.java:78)
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:498)
... 18 more
Caused by: java.lang.ClassNotFoundException: parquet.Version.class
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.pig.builtin.ParquetLoader.setLocation(ParquetLoader.java:76)


I get ParquetLoader source from github,find bug in this method:
@Override public void setLocation(String location, Job job) throws
IOException { try { JarManager.addDependencyJars(job, Class.forName(
"parquet.Version.class")); } catch (ClassNotFoundException e) { throw new
IOException("Runtime parquet dependency not found", e); }
super.setLocation(location,
job); }


update Class.forName("parquet.Version.class") ----------> Class.forName(
"parquet.Version")-------->compile-------->package------------>deploy

It's OK

thanks
Jack