Posted to user@hbase.apache.org by Manu S <ma...@gmail.com> on 2012/10/25 16:44:24 UTC

Pig + Hbase integration

Hi,

I am using Pig-0.10.0 & hbase-0.94.2.

I am trying to store the processed output to an HBase cluster using a Pig
script.

I registered the required jars and set the MapReduce and ZooKeeper
parameters within the script itself.

# cat input.pig
register jar/hbase-0.94.2.jar;
register jar/zookeeper-3.4.3.jar;
register jar/protobuf-java-2.4.0a.jar;
register jar/guava-11.0.2.jar;
register jar/pig-0.10.0.jar;

set fs.default.name hdfs://namenode:8020;
set mapred.job.tracker namenode:8021;
set hbase.cluster.distributed true;
set hbase.zookeeper.quorum namenode;
set hbase.master namenode:60000;
set hbase.zookeeper.property.clientPort 2181;

raw_data = LOAD 'sample_data.csv' USING PigStorage(',') AS (
    listing_id: chararray, fname: chararray, lname: chararray );

STORE raw_data INTO 'hbase://inputcsv' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('info:fname info:lname');
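
For context on the STORE above: HBaseStorage uses the first field of each tuple as the HBase row key and maps the remaining fields, in order, onto the columns named in its constructor ('info:fname info:lname' here). A sketch of that mapping with made-up sample rows:

```shell
#!/bin/sh
# Made-up sample input: listing_id,fname,lname per line.
cat > sample_data.csv <<'EOF'
1001,John,Doe
1002,Jane,Roe
EOF

# HBaseStorage('info:fname info:lname') would take field 1 (listing_id)
# as the row key and write fields 2 and 3 to info:fname and info:lname.
while IFS=, read -r id first last; do
  echo "row=$id  info:fname=$first  info:lname=$last"
done < sample_data.csv
```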

When I execute the script, I get this error:

# pig input.pig
2012-10-25 19:55:08,331 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0 (r1328203) compiled Apr 19 2012, 22:54:12
2012-10-25 19:55:08,332 [main] INFO  org.apache.pig.Main - Logging error messages to: /export/home/hadoop/devel/pig/pig_1351175108325.log
2012-10-25 19:55:08,944 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sangamt4:8020
2012-10-25 19:55:09,172 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: sangamt4:8021
2012-10-25 19:55:10,021 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
Details at logfile: /export/home/hadoop/devel/pig/pig_1351175108325.log


Appreciate your help on this.

Thanks,
Manu S

Re: Pig + Hbase integration

Posted by Manu S <ma...@gmail.com>.
Hi Jean,

This issue has been solved by following Cheolsoo's suggestions:

1) ClassNotFoundError

Even though you're "registering" jars in your script, they're not present
in the classpath, so you're seeing that ClassNotFound error. Can you try this?

PIG_CLASSPATH=<hbase_home>/hbase-0.94.1.jar:<hbase_home>/lib/zookeeper-3.4.3.jar:<hbase_home>/lib/protobuf-java-2.4.0a.jar
./bin/pig <your script>

The best way to use HBaseStorage is to install the hbase client locally, so
the jars are present in the classpath automatically. Then you don't have to add
them to PIG_CLASSPATH.
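
As an aside, the colon-separated list that PIG_CLASSPATH expects can also be assembled with a small loop instead of typed by hand. A minimal sketch, using a throwaway directory of empty jar files to stand in for a real <hbase_home> (all paths and versions below are made up for illustration):

```shell
#!/bin/sh
# Stand-in for a real HBase install: a temp tree with empty jar files.
hbase_home=$(mktemp -d)
mkdir -p "$hbase_home/lib"
touch "$hbase_home/hbase-0.94.2.jar" \
      "$hbase_home/lib/zookeeper-3.4.3.jar" \
      "$hbase_home/lib/protobuf-java-2.4.0a.jar"

# PIG_CLASSPATH is just jar paths joined by ':'.
# ${CP:+:} inserts the separator only when CP is already non-empty.
CP=""
for j in "$hbase_home"/hbase-*.jar "$hbase_home"/lib/*.jar; do
  [ -e "$j" ] && CP="$CP${CP:+:}$j"
done
echo "$CP"
# A real run would then look like: PIG_CLASSPATH="$CP" ./bin/pig input.pig
```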

2) pig-0.10.0.jar

Can you also make sure that you use pig-0.10.0-withouthadoop.jar instead of
pig-0.10.0.jar? Pig.jar embeds hbase-0.90, so you will run into
a compatibility issue if you run it against hbase-0.94.

Thanks,
Manu S
On Tue, Oct 30, 2012 at 4:25 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> [...]
>
> And what are the details like in that log? Is it a classpath problem?
>
> J-D

Re: Pig + Hbase integration

Posted by Jean-Daniel Cryans <jd...@apache.org>.
On Thu, Oct 25, 2012 at 7:44 AM, Manu S <ma...@gmail.com> wrote:
> [...]
> Details at logfile: /export/home/hadoop/devel/pig/pig_1351175108325.log

And what are the details like in that log? Is it a classpath problem?

J-D

Re: Pig + Hbase integration

Posted by Manu S <ma...@gmail.com>.
Wow, it's fixed :)

Thanks a lot, Cheolsoo, for your quick solution.

Thanks,
Manu S

On Fri, Oct 26, 2012 at 11:27 AM, Cheolsoo Park <ch...@cloudera.com> wrote:

> Hi Manu,
>
> Thanks for providing the log.
>
> 1) ClassNotFoundError
>
> Even though you're "registering" jars in your script, they're not present
> in classpath. So you're seeing that ClassNotFound error. Can you try this?
>
>
> PIG_CLASSPATH=<hbase_home>/hbase-0.94.1.jar:<hbase_home>/lib/zookeeper-3.4.3.jar:<hbase_home>/lib/protobuf-java-2.4.0a.jar
> ./bin/pig <your script>
>
> The best way to use HBaseStorage is to install the hbase client locally, so
> they're present in classpath automatically. Then, you don't have to add
> them to PIG_CLASSPATH.
>
> 2) pig-0.10.0.jar
>
> Can you also make sure that you use pig-0.10.0-withouthadoop.jar instead of
> pig-0.10.0.jar? Pig.jar embeds hbase-0.90, so you will run into
> a compatibility issue if you run it against hbase-0.94.
>
> Thanks,
> Cheolsoo
>
> On Thu, Oct 25, 2012 at 8:57 PM, Manu S <ma...@gmail.com> wrote:
> > [...]

Re: Pig + Hbase integration

Posted by Cheolsoo Park <ch...@cloudera.com>.
Hi Manu,

Thanks for providing the log.

1) ClassNotFoundError

Even though you're "registering" jars in your script, they're not present
in classpath. So you're seeing that ClassNotFound error. Can you try this?

PIG_CLASSPATH=<hbase_home>/hbase-0.94.1.jar:<hbase_home>/lib/zookeeper-3.4.3.jar:<hbase_home>/lib/protobuf-java-2.4.0a.jar
./bin/pig <your script>

The best way to use HBaseStorage is to install the hbase client locally, so
they're present in classpath automatically. Then, you don't have to add
them to PIG_CLASSPATH.

2) pig-0.10.0.jar

Can you also make sure that you use pig-0.10.0-withouthadoop.jar instead of
pig-0.10.0.jar? Pig.jar embeds hbase-0.90, so you will run into
a compatibility issue if you run it against hbase-0.94.

Thanks,
Cheolsoo

On Thu, Oct 25, 2012 at 8:57 PM, Manu S <ma...@gmail.com> wrote:

> Hi Cheolsoo,
>
> Please find the log
>
> [...]

Re: Pig + Hbase integration

Posted by Manu S <ma...@gmail.com>.
Hi Cheolsoo,

Please find the log

On Thu, Oct 25, 2012 at 10:19 PM, Cheolsoo Park <ch...@cloudera.com> wrote:

> Hi Manu,
>
> Can you provide the output of
> 'cat /export/home/hadoop/devel/pig/pig_1351175108325.log' ?
>
>

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:477)
        at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:507)
        at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:791)
        at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:780)
        at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:4583)
        at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:6225)
        at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1335)
        at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
        at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
        at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:555)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.WritableByteArrayComparable
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 28 more
================================================================================
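
The "Caused by" line at the bottom pins down the exact missing class. Since jars are plain zip archives, unzip -l can confirm whether any registered jar actually bundles it. A sketch (the jar/ directory name mirrors the register statements in the script and is an assumption about the local layout):

```shell
#!/bin/sh
# Convert the dotted class name from the "Caused by" line into the
# path-style entry name stored inside a jar.
CLASS='org.apache.hadoop.hbase.filter.WritableByteArrayComparable'
ENTRY="$(echo "$CLASS" | tr '.' '/').class"
echo "searching jars for $ENTRY"

# Scan the registered jars (skips cleanly if the directory is absent).
for jar in jar/*.jar; do
  [ -e "$jar" ] || continue
  unzip -l "$jar" 2>/dev/null | grep -q "$ENTRY" && echo "found in $jar"
done
:
```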




> Thanks,
> Cheolsoo
>
> On Thu, Oct 25, 2012 at 7:44 AM, Manu S <ma...@gmail.com> wrote:
> > [...]


Re: Pig + Hbase integration

Posted by Cheolsoo Park <ch...@cloudera.com>.
Hi Manu,

Can you provide the output of
'cat /export/home/hadoop/devel/pig/pig_1351175108325.log' ?

Thanks,
Cheolsoo

On Thu, Oct 25, 2012 at 7:44 AM, Manu S <ma...@gmail.com> wrote:
> [...]