Posted to user@pig.apache.org by Kevin Lion <kl...@ubikod.com> on 2012/03/06 14:58:49 UTC

Re: Using merge join from a HBaseStorage

Hello,

I've made a patch for this issue. You can find all the details about it here:
https://issues.apache.org/jira/browse/PIG-2495

Kevin
Capptain.com - Pilot your apps!





2012/1/24 Kevin Lion <kl...@ubikod.com>

> Hi,
>
> To improve the performance of my computation, I would like to use a merge
> join between two tables. I wrote this code to do that:
>
> pigServer.registerQuery("start_sessions = LOAD
>> hbase://startSession.localhost USING HBaseStorage('meta:sid meta:infoid
>> meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray,
>> imei:chararray, start:long);");
>> pigServer.registerQuery("end_sessions = LOAD hbase://endSession.localhost
>> USING HBaseStorage('meta:sid meta:timestamp meta:locid') AS (sid:chararray,
>> end:long, locid:chararray);");
>> pigServer.registerQuery("sessions = JOIN start_sessions BY sid,
>> end_sessions BY sid USING 'merge';");
>> pigServer.deleteFile(Analytics.getOutputFilePath("sessions"));
>> pigServer.store("sessions", Analytics.getOutputFilePath("sessions"),
>> "BinStorage");
>>
>
>
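For reference, the queries embedded in the Java above correspond roughly to the following standalone Pig Latin script (a sketch only; aliases, table names, and column names are copied unchanged from the original, and the comment about sorting reflects Pig's documented merge-join precondition, not something stated in this thread):

```pig
start_sessions = LOAD 'hbase://startSession.localhost'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'meta:sid meta:infoid meta:imei meta:timestamp')
    AS (sid:chararray, infoid:chararray, imei:chararray, start:long);

end_sessions = LOAD 'hbase://endSession.localhost'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'meta:sid meta:timestamp meta:locid')
    AS (sid:chararray, end:long, locid:chararray);

-- USING 'merge' assumes both inputs are already sorted on the join key (sid).
-- HBase does store rows sorted by row key, but the loader must also support
-- Pig's ordered-load contract for the merge join to work.
sessions = JOIN start_sessions BY sid, end_sessions BY sid USING 'merge';

STORE sessions INTO 'sessions' USING BinStorage();
```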
> During the first job of the computation, I got this error:
>
> java.io.IOException: Could create instance of class
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage$1, while attempting to
>> de-serialize it. (no default constructor ?)
>>     at
>> org.apache.pig.data.BinInterSedes.readWritable(BinInterSedes.java:235)
>>     at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:336)
>>     at org.apache.pig.data.BinInterSedes.readDatum(BinInterSedes.java:251)
>>     at
>> org.apache.pig.data.BinInterSedes.addColsToTuple(BinInterSedes.java:556)
>>     at org.apache.pig.data.BinSedesTuple.readFields(BinSedesTuple.java:64)
>>     at
>> org.apache.pig.impl.io.PigNullableWritable.readFields(PigNullableWritable.java:114)
>>     at
>> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>>     at
>> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>>     at
>> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:113)
>>     at
>> org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>>     at
>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: java.lang.InstantiationException:
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage$1
>>     at java.lang.Class.newInstance0(Class.java:340)
>>     at java.lang.Class.newInstance(Class.java:308)
>>     at
>> org.apache.pig.data.BinInterSedes.readWritable(BinInterSedes.java:231)
>>     ... 13 more
>>
>
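The trace shows `BinInterSedes.readWritable` calling `Class.newInstance()` on the anonymous inner class `HBaseStorage$1` and getting an `InstantiationException`. The mechanism can be reproduced in isolation: an anonymous inner class declared in an instance context has only a synthetic constructor that takes the enclosing instance, so reflective instantiation via a no-arg constructor fails. A minimal sketch (the class `Demo` and its helper are hypothetical names, not Pig code):

```java
// Demonstrates why deserializing an anonymous inner class fails:
// Class.newInstance() needs a nullary constructor, but the compiler
// gives Demo$1 only a constructor taking the enclosing Demo instance.
public class Demo {
    // Anonymous inner class; the compiler names it Demo$1.
    private final Runnable anon = new Runnable() {
        @Override public void run() { }
    };

    // Mimics what BinInterSedes.readWritable attempts reflectively.
    static boolean canInstantiate(String className) {
        try {
            return Class.forName(className).newInstance() != null;
        } catch (ReflectiveOperationException e) {
            return false; // e.g. InstantiationException: no default constructor
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate("Demo"));   // prints true
        System.out.println(canInstantiate("Demo$1")); // prints false
    }
}
```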
> I have no idea about the origin of the problem. Does anyone have any idea?
>
> Thanks,
> Kevin
>
>