Posted to user@pig.apache.org by Corbin Hoenes <co...@tynt.com> on 2010/12/15 18:38:04 UTC

Re: HBaseStorage in pig 0.8

PIG-1769 has been created; sorry, I lost track of this :(

On Nov 22, 2010, at 2:30 PM, Dmitriy Ryaboy wrote:

> Hm, good point. Can you create a JIRA for this?
> 
> On Mon, Nov 22, 2010 at 1:16 PM, Corbin Hoenes <co...@tynt.com> wrote:
> 
>> One comment on the HBaseStorage store func.  In our load statement we
>> are allowed to prefix the table name with "hbase://" but when we call
>> store it throws an exception unless we remove hbase:// from the table
>> name:
>> 
>> this works:
>> store raw into 'piggytest2' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
>> anchor2:field1a anchor2:field2a');
>> 
>> this won't:
>> store raw into 'hbase://piggytest2'
>> 
>> Exception:
>> Caused by: java.lang.IllegalArgumentException:
>> java.net.URISyntaxException: Relative path in absolute URI:
>> hbase://piggytest2_logs
>> 
>> Would be nice to be able to prefix the store with hbase:// as well.
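The "Relative path in absolute URI" failure is standard java.net.URI behavior: once a scheme is present, the path component must be empty or absolute, so a bare table name used as the path is rejected. A minimal sketch of that rule (the class and method names below are made up for illustration; this only demonstrates the URI check, not Pig's internal path handling):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriPathDemo {
    // Returns the parse-failure reason, or null if the URI was accepted.
    static String reasonFor(String scheme, String path) {
        try {
            new URI(scheme, null, path, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // A bare table name as the path of an "hbase:" URI is rejected,
        // matching the error quoted above.
        System.out.println(reasonFor("hbase", "piggytest2_logs"));
        // An absolute path is accepted without complaint.
        System.out.println(reasonFor("hbase", "/piggytest2_logs"));
    }
}
```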
>> 
>> 
>> On Mon, Nov 22, 2010 at 12:10 PM, Dmitriy Ryaboy <dv...@gmail.com>
>> wrote:
>>> 
>>> Why is it connecting to localhost?
>>> Sounds like you don't have the appropriate config files on the path.
>>> Hm, maybe we should serialize those in the constructor so that you don't
>>> have to have them on the JT classpath (I have them on the JT classpath so
>>> this never came up). Can you confirm that this is the problem?
>>> 
>>> D
>>> 
>>> On Fri, Nov 19, 2010 at 10:33 PM, Corbin Hoenes <co...@tynt.com> wrote:
>>> 
>>>> Hey Jeff,
>>>> 
>>>> It wasn't starting a job, but I got a bit further by registering the pig8
>>>> jar in my pig script.  It seemed to have a bunch of dependencies (Google
>>>> common collections, ZooKeeper, etc.) built into that jar.
>>>> 
>>>> Now I am seeing this in the web UI logs:
>>>> 2010-11-19 23:19:44,200 INFO org.apache.zookeeper.ClientCnxn: Attempting
>>>> connection to server localhost/127.0.0.1:2181
>>>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Exception
>>>> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@65efb4be
>>>> java.net.ConnectException: Connection refused
>>>>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>       at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:885)
>>>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown input
>>>> java.nio.channels.ClosedChannelException
>>>>       at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:951)
>>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>>>> 2010-11-19 23:19:44,201 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>>>> exception during shutdown output
>>>> java.nio.channels.ClosedChannelException
>>>>       at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>>       at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>>       at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>>>>       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
>>>> 2010-11-19 23:19:44,303 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper:
>>>> Failed to create /hbase -- check quorum servers, currently=localhost:2181
>>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>> KeeperErrorCode = ConnectionLoss for /hbase
>>>> Looks like it doesn't know where my hbase/conf/hbase-site.xml file is? Not
>>>> sure how this would get passed to the HBaseStorage class?
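On the question of how HBaseStorage finds hbase-site.xml: the HBase client builds its configuration by resource lookup on the classpath, and when no hbase-site.xml is reachable it falls back to defaults, which is why ZooKeeper is being contacted at localhost:2181. A rough sketch of that lookup in plain Java (the class and method names here are illustrative, not HBase's actual API):

```java
import java.net.URL;

public class ConfLookupDemo {
    // HBase-style config discovery is conceptually a classpath resource
    // lookup like this generic check.
    static URL findResource(String name) {
        return Thread.currentThread().getContextClassLoader().getResource(name);
    }

    public static void main(String[] args) {
        URL cfg = findResource("hbase-site.xml");
        if (cfg == null) {
            // No hbase-site.xml reachable from the classpath: the client
            // falls back to built-in defaults such as localhost:2181.
            System.out.println("hbase-site.xml not found; defaults apply");
        } else {
            System.out.println("using config at " + cfg);
        }
    }
}
```

This is why adding $HBASE_HOME/conf to the classpath of whichever JVM actually runs HBaseStorage (client or task) changes the behavior.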
>>>> 
>>>> On Nov 19, 2010, at 5:09 PM, Jeff Zhang wrote:
>>>> 
>>>>> Does the mapreduce job start? Could you check the logs on the hadoop side?
>>>>> 
>>>>> 
>>>>> On Sat, Nov 20, 2010 at 7:56 AM, Corbin Hoenes <co...@tynt.com> wrote:
>>>>>> We are trying to use the HBaseStorage LoadFunc in pig 0.8 and getting an
>>>>>> exception.
>>>>>> 
>>>>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable
>>>>>> to open iterator for alias raw
>>>>>> at org.apache.pig.PigServer.openIterator(PigServer.java:754)
>>>>>> at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>>>>>> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>>>>>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>>>>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>>>>>> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>>>>>> at org.apache.pig.Main.run(Main.java:465)
>>>>>> at org.apache.pig.Main.main(Main.java:107)
>>>>>> Caused by: java.io.IOException: Couldn't retrieve job.
>>>>>> at org.apache.pig.PigServer.store(PigServer.java:818)
>>>>>> at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>>>>>> ... 7 more
>>>>>> 
>>>>>> 
>>>>>> Other jobs seem to work.
>>>>>> 
>>>>>> What are the requirements for getting hbase storage to work?
>>>>>> 
>>>>>> This is what I am doing:
>>>>>> 1 - added hbase config and hadoop config to my PIG_CLASSPATH
>>>>>> 2 - ran this pig script:
>>>>>> 
>>>>>> REGISTER ../lib/hbase-0.20.6.jar
>>>>>> 
>>>>>> raw = LOAD 'hbase://piggytest' USING
>>>>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content:field1
>>>>>> anchor:field1a anchor:field2a') as (content_field1, anchor_field1a,
>>>>>> anchor_field2a);
>>>>>> 
>>>>>> dump raw;
>>>>>> 
>>>>>> ---
>>>>>> what else am I missing?
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Best Regards
>>>>> 
>>>>> Jeff Zhang