Posted to user@pig.apache.org by marko <mb...@gmail.com> on 2013/03/11 13:00:16 UTC
Writing to HBase
Hello!
I successfully read from an HBase table using:
table = load 'hbase://temp' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:c1, cf:c2',
    '-loadKey true') as (key:chararray, c1:bytearray, c2:bytearray);
I used a UDF to parse the column data and convert it from bytearrays
into doubles.
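Roughly like this (myudfs.BytesToDouble is just a stand-in name for my
own conversion UDF, not something shipped with Pig):

register 'myudfs.jar';
parsed = foreach table generate key,
    myudfs.BytesToDouble(c1) as c1:double,
    myudfs.BytesToDouble(c2) as c2:double;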
I do some processing and manage to dump the results:
dump results;
which prints:
((product1-20131231-20100101,1.5,1.5))
((product2-20131231-20100101,2.5,2.5))
However, I cannot write these results into a newly created empty HBase
table:
store results into 'hbase://results' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');
I have also tried store results into 'results' using ... (i.e. without
the hbase:// prefix), but it doesn't help.
I am using pig-0.11.0.
I suspect I should do some sort of cast back into bytearrays using a
UDF, mirroring the parsing I did when reading the table.
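That is, something along these lines, with a second UDF that mirrors
the one used for loading (DoubleToBytes is again just an illustrative
name, and the field names key/res1/res2 are assumptions about the
schema of results):

prepared = foreach results generate key,
    myudfs.DoubleToBytes(res1),
    myudfs.DoubleToBytes(res2);
store prepared into 'hbase://results' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');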
This is the exception I get:
java.io.IOException: java.lang.IllegalArgumentException: No columns to insert
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:470)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Caused by: java.lang.IllegalArgumentException: No columns to insert
        at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:970)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:763)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:749)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:123)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
        at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:885)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:588)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:468)
Thanks for any suggestions,
Marko
Re: Writing to HBase
Posted by marko <mb...@gmail.com>.
I flattened it and now it gets written into HBase properly.
Thanks.
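For the record, the fix amounted to un-nesting the single tuple so
that the row key and the two values become top-level fields before the
store (a sketch of what I ran, not the exact script):

flat = foreach results generate flatten($0);
store flat into 'hbase://results' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:res1, cf:res2');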
On 03/11/2013 01:19 PM, Aaron Zimmerman wrote:
> From your dump results it looks like the only value on the relation is a tuple. So it is trying to use that tuple as the row key and there are no other fields to use as columns.
Re: Writing to HBase
Posted by Aaron Zimmerman <az...@sproutsocial.com>.
From your dump output it looks like the only field in the relation is a tuple. HBaseStorage treats the first field as the row key, so it is trying to use that whole tuple as the key, and there are no remaining fields to write as columns.
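A quick way to confirm is describe; based on your dump I would expect
something along these lines (the exact schema is a guess):

describe results;
-- prints something like: results: {(chararray, double, double)}
-- i.e. a single tuple field per row, with nothing alongside it

If that is the case, flattening the tuple before the store gives
HBaseStorage a top-level key plus two column fields.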