Posted to user@pig.apache.org by byambajargal <by...@gmail.com> on 2011/04/25 14:32:39 UTC
How to store data into hbase by using Pig
Hello guys
I am running the Cloudera distribution CDH3u0 on my cluster with Pig and HBase.
I can read data from HBase using the following Pig query:
my_data = LOAD 'hbase://table1' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1'); dump my_data
but when I try to store data into HBase the same way, the job fails:
store my_data into 'hbase://table2' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
table1 and table2 have the same structure and the same column.
The table I have:
hbase(main):029:0* scan 'table1'
ROW COLUMN+CELL
row1 column=cf:1, timestamp=1303731834050, value=value1
row2 column=cf:1, timestamp=1303731849901, value=value2
row3 column=cf:1, timestamp=1303731858637, value=value3
3 row(s) in 0.0470 seconds
thanks
Byambajargal
Re: How to store data into hbase by using Pig
Posted by byambajargal <by...@gmail.com>.
Thank you Dmitriy
I have tried what you suggested:
my_data = LOAD 'hbase://table1' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey');
store my_data into 'hbase://table2' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1','-loadKey');
The first part of the relation now reads the data with the row key, but I still cannot store data into HBase.
When I run the second part I get the following error message in the log file:
Pig Stack Trace
---------------
ERROR 2017: Internal error creating job configuration.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias 1
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1569)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:523)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:868)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
    at org.apache.pig.Main.run(Main.java:465)
    at org.apache.pig.Main.main(Main.java:107)
Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:673)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:256)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:147)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
    at org.apache.pig.PigServer.execute(PigServer.java:1190)
    at org.apache.pig.PigServer.access$100(PigServer.java:128)
    at org.apache.pig.PigServer$Graph.execute(PigServer.java:1517)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1564)
    ... 8 more
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hbase://table2_logs
    at org.apache.hadoop.fs.Path.initialize(Path.java:148)
    at org.apache.hadoop.fs.Path.<init>(Path.java:71)
    at org.apache.hadoop.fs.Path.<init>(Path.java:45)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:476)
    ... 16 more
Caused by: java.net.URISyntaxException: Relative path in absolute URI: hbase://table2_logs
    at java.net.URI.checkPath(URI.java:1787)
    at java.net.URI.<init>(URI.java:735)
    at org.apache.hadoop.fs.Path.initialize(Path.java:145)
    ... 19 more
================================================================================
thank you for your help
On 4/25/11 18:26, Dmitriy Ryaboy wrote:
> The first element of the relation you store must be the row key. You aren't
> loading the row key, so load> store isn't working.
> Try
> my_data = LOAD 'hbase://table1' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
Re: How to store data into hbase by using Pig
Posted by byambajargal <by...@gmail.com>.
I have just removed 'hbase://' from the second part and it works fine.
thanks
byambajargal
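
For the record, the combination that worked can be sketched as a minimal, self-contained script (table names as used in this thread; note that the hbase:// scheme stays on the LOAD side but is dropped from the STORE target):

    my_data = LOAD 'hbase://table1'
              USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey');
    STORE my_data INTO 'table2'
          USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');

Judging from the stack trace earlier in the thread, Pig 0.8 derives a job log path by appending _logs to the store location, and 'hbase://table2_logs' is not a valid filesystem path; dropping the scheme from the STORE target avoids that URISyntaxException.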
On 4/25/11 18:26, Dmitriy Ryaboy wrote:
> The first element of the relation you store must be the row key. You aren't
> loading the row key, so load> store isn't working.
> Try
> my_data = LOAD 'hbase://table1' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
Re: How to store data into hbase by using Pig
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
1)
2011-04-27 10:29:32,953 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201104251150_0071
2011-04-27 10:29:32,954 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://haisen11:50030/jobdetails.jsp?jobid=job_201104251150_0071
2011-04-27 10:29:52,654 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201104251150_0071 has failed! Stop running all dependent jobs
^^^ Look at the job error logs.
2)
generate $0, $2 -- there is no $2, you only loaded two columns ($0 and $1).
Those are the ones you're going to be wanting.
3) loadKey, as the name implies, only applies to loading data, not to
storing it. It doesn't hurt anything to have it there, but it's not actually
doing anything.
D
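
Putting points 2) and 3) together, the store side of the script under discussion could be sketched as follows. This is a hypothetical sketch: it assumes table2 already exists in HBase with a column family named cf, and that /passwd really is a colon-delimited passwd file, so that $0 is the user name and $2 the numeric uid:

    A = LOAD '/passwd' USING PigStorage(':');
    -- the first field of the stored relation becomes the HBase row key
    B = FOREACH A GENERATE $0 AS id, $2 AS value;
    STORE B INTO 'table2'
          USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a');

Note that -loadKey is omitted from the STORE, since per point 3) it only affects loading.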
On Wed, Apr 27, 2011 at 1:38 AM, byambajargal <by...@gmail.com>wrote:
> Hello
>
> I am using Pig version 0.8.0
>
>
> A = load '/passwd' using PigStorage(':');B = foreach A generate $0 as id,
> $2 as value;dump B;
>
> The result of the first part is:
>
> (twilli,6259)
> (saamodt,6260)
> (hailu268,6261)
> (oddsen,6262)
> (neuhaus,6263)
> (zoila,6264)
> (elinmn,6265)
> (diego,6266)
> (fsudmann,6267)
> (yanliang,6268)
> (nestor,6269)
>
> As I understand it, the problem is in the second part:
>
>
> store B into 'table2' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');
>
> I suspect the problem is with the row key; I am not sure how it
> manages the row key.
> What I want is for the first item to be the row key and the second
> item to be the column of the HBase table.
>
> when i run the query i have got the following result on my task tracker:
>
> grunt> A = load '/passwd' using PigStorage(':');B = foreach A generate $0
> as id, $2 as value;store B into 'table2' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey');
> 2011-04-27 10:29:29,785 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
> script: UNKNOWN
> 2011-04-27 10:29:29,785 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011
> 22:27 GMT
> 2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:host.name=haisen10.ux.uis.no
> 2011-04-27 10:29:29,913 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:java.version=1.6.0_23
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:java.vendor=Sun Microsystems Inc.
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:java.home=/opt/jdk1.6.0_23/jre
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client
> environment:java.class.path=/etc/hbase/conf:/usr/lib/pig/bin/../conf:/opt/jdk/lib/tools.jar:/usr/lib/pig/bin/../pig-0.8.0-cdh3u0-core.jar:/usr/lib/pig/bin/../build/pig-*-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/ant-contrib-1.0b3.jar:/usr/lib/pig/bin/../lib/automaton.jar:/usr/lib/pig/bin/../build/ivy/lib/Pig/*.jar:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.LICENSE.txt:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jdiff:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/jsp-2.1:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/kfs-0.2.LICENSE.txt:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/etc/hbase/conf::/usr/lib/hadoop/conf
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client
> environment:java.library.path=/opt/jdk1.6.0_23/jre/lib/amd64/server:/opt/jdk1.6.0_23/jre/lib/amd64:/opt/jdk1.6.0_23/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:java.io.tmpdir=/tmp
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:java.compiler=<NA>
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:os.name=Linux
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:os.arch=amd64
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:os.version=2.6.18-194.32.1.el5.centos.plus
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:user.name=haisen
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:user.home=/home/ekstern/haisen
> 2011-04-27 10:29:29,914 [main] INFO org.apache.zookeeper.ZooKeeper -
> Client environment:user.dir=/import/br1raid6a1c1/haisen
> 2011-04-27 10:29:29,915 [main] INFO org.apache.zookeeper.ZooKeeper -
> Initiating client connection, connectString=haisen11:2181
> sessionTimeout=180000 watcher=hconnection
> 2011-04-27 10:29:29,923 [main-SendThread()] INFO
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> haisen11/152.94.1.130:2181
> 2011-04-27 10:29:29,926 [main-SendThread(haisen11:2181)] INFO
> org.apache.zookeeper.ClientCnxn - Socket connection established to
> haisen11/152.94.1.130:2181, initiating session
> 2011-04-27 10:29:29,936 [main-SendThread(haisen11:2181)] INFO
> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
> haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340177, negotiated
> timeout = 40000
> 2011-04-27 10:29:29,972 [main] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Lookedup root region location,
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@67f31652;
> hsa=haisen10.ux.uis.no:60020
> 2011-04-27 10:29:30,018 [main] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
> 2011-04-27 10:29:30,020 [main] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cache hit for row <> in tableName .META.: location server
> haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
> 2011-04-27 10:29:30,024 [main] DEBUG
> org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
> row=table2,,00000000000000 for max=10 rows
> 2011-04-27 10:29:30,028 [main] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cached location for
> table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is
> haisen6.ux.uis.no:60020
> 2011-04-27 10:29:30,030 [main] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cache hit for row <> in tableName table2: location server
> haisen6.ux.uis.no:60020, location region name
> table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
> 2011-04-27 10:29:30,031 [main] INFO
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table
> instance for table2
> 2011-04-27 10:29:30,068 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: B:
> Store(table2:org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a','-loadKey'))
> - scope-6 Operator Key: scope-6)
> 2011-04-27 10:29:30,085 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
> File concatenation threshold: 100 optimistic? false
> 2011-04-27 10:29:30,122 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size before optimization: 1
> 2011-04-27 10:29:30,122 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size after optimization: 1
> 2011-04-27 10:29:30,187 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added
> to the job
> 2011-04-27 10:29:30,204 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2011-04-27 10:29:31,684 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Setting up single store job
> 2011-04-27 10:29:31,709 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 1 map-reduce job(s) waiting for submission.
> 2011-04-27 10:29:32,059 [Thread-7] INFO org.apache.zookeeper.ZooKeeper -
> Initiating client connection, connectString=haisen11:2181
> sessionTimeout=180000 watcher=hconnection
> 2011-04-27 10:29:32,060 [Thread-7-SendThread()] INFO
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> haisen11/152.94.1.130:2181
> 2011-04-27 10:29:32,061 [Thread-7-SendThread(haisen11:2181)] INFO
> org.apache.zookeeper.ClientCnxn - Socket connection established to
> haisen11/152.94.1.130:2181, initiating session
> 2011-04-27 10:29:32,063 [Thread-7-SendThread(haisen11:2181)] INFO
> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
> haisen11/152.94.1.130:2181, sessionid = 0x12f8c18a1340178, negotiated
> timeout = 40000
> 2011-04-27 10:29:32,070 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Lookedup root region location,
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@1f248f2b;
> hsa=haisen10.ux.uis.no:60020
> 2011-04-27 10:29:32,074 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cached location for .META.,,1.1028785192 is haisen10.ux.uis.no:60020
> 2011-04-27 10:29:32,074 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cache hit for row <> in tableName .META.: location server
> haisen10.ux.uis.no:60020, location region name .META.,,1.1028785192
> 2011-04-27 10:29:32,076 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
> row=table2,,00000000000000 for max=10 rows
> 2011-04-27 10:29:32,080 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cached location for
> table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e. is
> haisen6.ux.uis.no:60020
> 2011-04-27 10:29:32,081 [Thread-7] DEBUG
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
> - Cache hit for row <> in tableName table2: location server
> haisen6.ux.uis.no:60020, location region name
> table2,,1303809998908.0a8a5a1a398c449de8f29a2cf082f30e.
> 2011-04-27 10:29:32,082 [Thread-7] INFO
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table
> instance for table2
> 2011-04-27 10:29:32,102 [Thread-7] INFO
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths
> to process : 1
> 2011-04-27 10:29:32,102 [Thread-7] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> paths to process : 1
> 2011-04-27 10:29:32,110 [Thread-7] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
> paths (combined) to process : 1
> 2011-04-27 10:29:32,211 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 0% complete
> 2011-04-27 10:29:32,953 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - HadoopJobId: job_201104251150_0071
> 2011-04-27 10:29:32,954 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - More information at:
> http://haisen11:50030/jobdetails.jsp?jobid=job_201104251150_0071
> 2011-04-27 10:29:52,654 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - job job_201104251150_0071 has failed! Stop running all dependent jobs
> 2011-04-27 10:29:52,666 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 100% complete
> 2011-04-27 10:29:52,674 [main] ERROR
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2011-04-27 10:29:52,677 [main] INFO org.apache.pig.tools.pigstats.PigStats
> - Script Statistics:
>
> HadoopVersion PigVersion UserId StartedAt FinishedAt
> Features
> 0.20.2-cdh3u0 0.8.0-cdh3u0 haisen 2011-04-27 10:29:30 2011-04-27
> 10:29:52 UNKNOWN
>
> Failed!
>
> Failed Jobs:
> JobId Alias Feature Message Outputs
> job_201104251150_0071 A,B MAP_ONLY Message: Job failed! Error
> - NA table2,
>
> Input(s):
> Failed to read data from "/passwd"
>
> Output(s):
> Failed to produce result in "table2"
>
> Counters:
> Total records written : 0
> Total bytes written : 0
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_201104251150_0071
>
>
> 2011-04-27 10:29:52,677 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Failed!
>
>
> thank you
>
> Byambajargal
>
>
> On 4/27/11 06:07, Bill Graham wrote:
>
>> What version of Pig are you running and what errors are you seeing on
>> the task trackers?
>>
>> On Tue, Apr 26, 2011 at 4:46 AM, byambajargal<by...@gmail.com>
>> wrote:
>>
>>> Hello ...
>>> I have a question for you
>>>
>>> I am running a Pig job that reads from HDFS and simply stores into
>>> HBase. When I start the job, the first part works fine but the
>>> second part fails.
>>> Could you give me a direction on how to move data from HDFS to HBase?
>>>
>>>
>>> A = load '/passwd' using PigStorage(':');B = foreach A generate $0 as
>>> id,
>>> $2 as value;dump B;
>>> store B into 'table2' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage( 'cf:a' , '-loadKey');
>>>
>>> thank you for your help
>>>
>>> Byambajargal
>>>
>>>
>
instance for table2
2011-04-27 10:29:32,102 [Thread-7] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input
paths to process : 1
2011-04-27 10:29:32,102 [Thread-7] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths to process : 1
2011-04-27 10:29:32,110 [Thread-7] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths (combined) to process : 1
2011-04-27 10:29:32,211 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-04-27 10:29:32,953 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201104251150_0071
2011-04-27 10:29:32,954 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://haisen11:50030/jobdetails.jsp?jobid=job_201104251150_0071
2011-04-27 10:29:52,654 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job job_201104251150_0071 has failed! Stop running all dependent jobs
2011-04-27 10:29:52,666 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2011-04-27 10:29:52,674 [main] ERROR
org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-04-27 10:29:52,677 [main] INFO
org.apache.pig.tools.pigstats.PigStats - Script Statistics:
HadoopVersion   PigVersion     UserId  StartedAt            FinishedAt           Features
0.20.2-cdh3u0   0.8.0-cdh3u0   haisen  2011-04-27 10:29:30  2011-04-27 10:29:52  UNKNOWN

Failed!

Failed Jobs:
JobId                  Alias  Feature   Message                          Outputs
job_201104251150_0071  A,B    MAP_ONLY  Message: Job failed! Error - NA  table2,
Input(s):
Failed to read data from "/passwd"
Output(s):
Failed to produce result in "table2"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201104251150_0071
2011-04-27 10:29:52,677 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Failed!
Thank you,
Byambajargal
On 4/27/11 06:07, Bill Graham wrote:
> What version of Pig are you running and what errors are you seeing on
> the task trackers?
>
> On Tue, Apr 26, 2011 at 4:46 AM, byambajargal<by...@gmail.com> wrote:
>> Hello ...
>> I have a question for you
>>
>> I am running a Pig job, shown below, that reads from HDFS and stores into HBase.
>> When I start the job, the first part works fine but the second part fails.
>> Could you give me a direction on how to move data from HDFS to HBase?
>>
>>
>> A = LOAD '/passwd' USING PigStorage(':');
>> B = FOREACH A GENERATE $0 AS id, $2 AS value;
>> DUMP B;
>> STORE B INTO 'table2' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a', '-loadKey');
>>
>> thank you for your help
>>
>> Byambajargal
>>
>>
>>
>> On 4/25/11 18:26, Dmitriy Ryaboy wrote:
>>> The first element of the relation you store must be the row key. You aren't
>>> loading the row key, so load > store isn't working.
>>> Try
>>> my_data = LOAD 'hbase://table1' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
>>>
>>> On Mon, Apr 25, 2011 at 5:32 AM,
>>> byambajargal<by...@gmail.com>wrote:
>>>
>>>> Hello guys
>>>>
>>>> I am running the Cloudera distribution CDH3u0 on my cluster with Pig and
>>>> HBase.
>>>> I can read data from HBase using the following Pig query:
>>>>
>>>> my_data = LOAD 'hbase://table1' using
>>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1') ;dump my_data
>>>>
>>>> but when I try to store data into HBase the same way, the job fails.
>>>>
>>>> store my_data into 'hbase://table2' using
>>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
>>>>
>>>> table1 and table2 have the same structure and the same column.
>>>>
>>>>
>>>> The table I have:
>>>>
>>>> hbase(main):029:0* scan 'table1'
>>>> ROW COLUMN+CELL
>>>> row1 column=cf:1, timestamp=1303731834050, value=value1
>>>> row2 column=cf:1, timestamp=1303731849901, value=value2
>>>> row3 column=cf:1, timestamp=1303731858637, value=value3
>>>> 3 row(s) in 0.0470 seconds
>>>>
>>>>
>>>> thanks
>>>>
>>>> Byambajargal
>>>>
>>>>
>>>>
>>
Re: How to store data into hbase by using Pig
Posted by Bill Graham <bi...@gmail.com>.
What version of Pig are you running and what errors are you seeing on
the task trackers?
On Tue, Apr 26, 2011 at 4:46 AM, byambajargal <by...@gmail.com> wrote:
> Hello ...
> I have a question for you
>
> I am running a Pig job, shown below, that reads from HDFS and stores into HBase.
> When I start the job, the first part works fine but the second part fails.
> Could you give me a direction on how to move data from HDFS to HBase?
>
>
> A = LOAD '/passwd' USING PigStorage(':');
> B = FOREACH A GENERATE $0 AS id, $2 AS value;
> DUMP B;
> STORE B INTO 'table2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a', '-loadKey');
>
> thank you for your help
>
> Byambajargal
>
>
>
> On 4/25/11 18:26, Dmitriy Ryaboy wrote:
>>
>> The first element of the relation you store must be the row key. You aren't
>> loading the row key, so load > store isn't working.
>> Try
>> my_data = LOAD 'hbase://table1' using
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
>>
>> On Mon, Apr 25, 2011 at 5:32 AM,
>> byambajargal<by...@gmail.com>wrote:
>>
>>> Hello guys
>>>
>>> I am running the Cloudera distribution CDH3u0 on my cluster with Pig and
>>> HBase.
>>> I can read data from HBase using the following Pig query:
>>>
>>> my_data = LOAD 'hbase://table1' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1') ;dump my_data
>>>
>>> but when I try to store data into HBase the same way, the job fails.
>>>
>>> store my_data into 'hbase://table2' using
>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
>>>
>>> table1 and table2 have the same structure and the same column.
>>>
>>>
>>> The table I have:
>>>
>>> hbase(main):029:0* scan 'table1'
>>> ROW COLUMN+CELL
>>> row1 column=cf:1, timestamp=1303731834050, value=value1
>>> row2 column=cf:1, timestamp=1303731849901, value=value2
>>> row3 column=cf:1, timestamp=1303731858637, value=value3
>>> 3 row(s) in 0.0470 seconds
>>>
>>>
>>> thanks
>>>
>>> Byambajargal
>>>
>>>
>>>
>
>
Re: How to store data into hbase by using Pig
Posted by byambajargal <by...@gmail.com>.
Hello,
I have a question for you.
I am running a Pig job, shown below, that reads from HDFS and stores into HBase.
When I start the job, the first part works fine but the second part fails.
Could you give me a direction on how to move data from HDFS to HBase?
A = LOAD '/passwd' USING PigStorage(':');
B = FOREACH A GENERATE $0 AS id, $2 AS value;
DUMP B;
STORE B INTO 'table2' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a', '-loadKey');
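[Editor's note: a minimal sketch of this HDFS-to-HBase flow, assuming table2 already exists with column family cf. The Pig documentation describes '-loadKey' as a load-side option only, so it is dropped on STORE here; on STORE the first field of each tuple is taken as the HBase row key.]

```pig
-- Sketch only: assumes an existing HBase table 'table2' with family 'cf'.
A = LOAD '/passwd' USING PigStorage(':');
-- The first generated field (id) becomes the HBase row key on STORE.
B = FOREACH A GENERATE $0 AS id, $2 AS value;
-- Remaining fields map positionally to the listed columns ('cf:a').
STORE B INTO 'table2' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:a');
```

If the job still fails with this shape, the task tracker logs are the place to look for the underlying error.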
thank you for your help
Byambajargal
On 4/25/11 18:26, Dmitriy Ryaboy wrote:
> The first element of the relation you store must be the row key. You aren't
> loading the row key, so load > store isn't working.
> Try
> my_data = LOAD 'hbase://table1' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
>
> On Mon, Apr 25, 2011 at 5:32 AM, byambajargal<by...@gmail.com>wrote:
>
>> Hello guys
>>
>> I am running the Cloudera distribution CDH3u0 on my cluster with Pig and HBase.
>> I can read data from HBase using the following Pig query:
>>
>> my_data = LOAD 'hbase://table1' using
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1') ;dump my_data
>>
>> but when I try to store data into HBase the same way, the job fails.
>>
>> store my_data into 'hbase://table2' using
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
>>
>> table1 and table2 have the same structure and the same column.
>>
>>
>> The table I have:
>>
>> hbase(main):029:0* scan 'table1'
>> ROW COLUMN+CELL
>> row1 column=cf:1, timestamp=1303731834050, value=value1
>> row2 column=cf:1, timestamp=1303731849901, value=value2
>> row3 column=cf:1, timestamp=1303731858637, value=value3
>> 3 row(s) in 0.0470 seconds
>>
>>
>> thanks
>>
>> Byambajargal
>>
>>
>>
Re: How to store data into hbase by using Pig
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
The first element of the relation you store must be the row key. You aren't
loading the row key, so load > store isn't working.
Try
my_data = LOAD 'hbase://table1' using
org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey') ;
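[Editor's note: putting this suggestion together, a minimal round-trip sketch, assuming both tables exist with column family cf. '-loadKey' prepends the row key as the first field of each loaded tuple, and on STORE the first field is used as the destination row key, so the option is not needed on the store side.]

```pig
-- Load table1 rows with the row key prepended as the first field.
raw = LOAD 'hbase://table1'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1', '-loadKey');
-- On STORE, the first field is the row key; the rest map to 'cf:1'.
STORE raw INTO 'hbase://table2'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
```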
On Mon, Apr 25, 2011 at 5:32 AM, byambajargal <by...@gmail.com>wrote:
>
> Hello guys
>
> I am running the Cloudera distribution CDH3u0 on my cluster with Pig and HBase.
> I can read data from HBase using the following Pig query:
>
> my_data = LOAD 'hbase://table1' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1') ;dump my_data
>
> but when I try to store data into HBase the same way, the job fails.
>
> store my_data into 'hbase://table2' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:1');
>
> table1 and table2 have the same structure and the same column.
>
>
> The table I have:
>
> hbase(main):029:0* scan 'table1'
> ROW COLUMN+CELL
> row1 column=cf:1, timestamp=1303731834050, value=value1
> row2 column=cf:1, timestamp=1303731849901, value=value2
> row3 column=cf:1, timestamp=1303731858637, value=value3
> 3 row(s) in 0.0470 seconds
>
>
> thanks
>
> Byambajargal
>
>
>