Posted to dev@hive.apache.org by jaydeep vishwakarma <ja...@mkhoj.com> on 2010/12/23 10:48:13 UTC

Issue with map join

Hi,

I am trying to run some MAPJOIN queries. When I place a single
table in the MAPJOIN hint it works fine, but when I run the same query
with two tables in the MAPJOIN hint it fails. Can anyone tell me what
the problem could be? Here is the error log from the job tracker.


java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5","_col87":"2010-12-16-00","_col89":"China","_col91":"2010-12-15-20"}
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":"5","_col87":"2010-12-16-00","_col89":"China","_col91":"2010-12-15-20"}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:417)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:153)
        ... 4 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:177)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:400)
        ... 5 more

Regards,
Jaydeep

The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.

Re: Issue with map join

Posted by jaydeep vishwakarma <ja...@mkhoj.com>.
I checked the map join behaviour. When I use more than one table in a map join without a SerDe, it works perfectly fine, but it does not work with a SerDe for more than one table. I checked the code and found that the class MapJoinOperator has a joinKeys variable, which is null when I use a SerDe with MAPJOIN. Is there any way to tell MAPJOIN to use the SerDe? Or has MAPJOIN not been implemented for more than one table with a SerDe?

Regards,
Jaydeep
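
For readers unfamiliar with the mechanism under discussion, here is a minimal conceptual sketch in Java of what a map-side join over two small tables does: the small tables are held in memory as hash maps and each row of the large table is probed against them. This is not Hive's MapJoinOperator code and says nothing about the cause of the NullPointerException above; the class name, method name, and sample values ("APAC", "India") are hypothetical and exist only for illustration. It does show the shape of the query in this thread, where the second lookup uses a key produced by the first.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual map-side join over two small tables; NOT Hive's implementation. */
public class MapJoinSketch {

    /**
     * Streams foo rows (each reduced to its carrier id) past two in-memory
     * lookup tables and counts the joined rows per "regionName,country".
     */
    public static Map<String, Long> join(List<Integer> fooCarrierIds,
                                         Map<Integer, String> carrierIdToCountry,
                                         Map<String, String> countryToRegionName) {
        Map<String, Long> counts = new HashMap<>();
        for (Integer carrierId : fooCarrierIds) {
            // First probe: foo's carrier id against the carrier table.
            String country = carrierIdToCountry.get(carrierId);
            if (country == null) {
                continue; // inner join: no matching carrier, drop the row
            }
            // Second probe: the key comes from the first join's result.
            String regionName = countryToRegionName.get(country);
            if (regionName == null) {
                continue; // inner join: no matching region, drop the row
            }
            counts.merge(regionName + "," + country, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the carrier and region tables.
        Map<Integer, String> carrier = new HashMap<>();
        carrier.put(5, "China");
        carrier.put(7, "India");
        Map<String, String> region = new HashMap<>();
        region.put("China", "APAC");

        List<Integer> foo = Arrays.asList(5, 5, 7, 9);
        System.out.println(join(foo, carrier, region));
    }
}
```

In this sketch, rows whose key is missing from either lookup table are skipped explicitly; dereferencing the result of a failed lookup without such a check is one generic way a null can surface in join code.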


Re: Issue with map join

Posted by jaydeep vishwakarma <ja...@mkhoj.com>.
Hi Namit,

Here is the query I am trying to run:

select /*+ MAPJOIN(region, carrier) */
       region.name, carrier.country, count(foo.column_1)
from foo
join carrier on (foo.column_10 = carrier.id)
join region on (carrier.country = region.country)
where foo.rq_dt = '2010-12-16-00'
  and carrier.rq_dt = '2010-12-15-20'
  and region.rq_dt = '2010-12-15-20'
group by region.name, carrier.country;


Here are the schemas of the tables. The first two tables contain barely
600 and 200 rows in each partition, but the foo table has about 30 million
rows in each partition and also uses a SerDe.

create table carrier
(
id int,
country STRING,
name STRING
)PARTITIONED BY(rq_dt STRING);

create table region
(
id int,
country STRING,
name STRING
)PARTITIONED BY(rq_dt STRING);





create table foo
(
column_1 STRING,
column_2 STRING,
column_3 STRING,
column_4 STRING,
column_5 STRING,
column_6 STRING,
column_7 STRING,
column_8 STRING,
column_9 STRING,
column_10 INT,
column_11 INT,
column_12 STRING,
column_13 INT,
column_14 STRING,
column_15 BIGINT,
column_16 BIGINT,
column_17 BIGINT,
column_18 BIGINT,
column_19 BIGINT,
column_20 BIGINT,
column_21 BIGINT,
column_22 BIGINT,
column_23 BIGINT,
column_24 BIGINT,
column_25 BIGINT,
column_26 STRING,
column_27 STRING,
column_28 STRING,
column_29 STRING,
column_30 STRING,
column_31 STRING,
column_32 STRING,
column_33 STRING,
column_34 STRING,
column_35 INT,
column_36 STRING,
column_37 STRING,
column_38 STRING,
column_39 STRING,
column_40 STRING,
column_41 STRING,
column_42 STRING,
column_43 STRING,
column_44 STRING,
column_45 STRING,
column_46 STRING,
column_47 STRING,
column_48 STRING,
column_49 STRING,
column_50 STRING,
column_51 STRING,
column_52 STRING,
column_53 STRING,
column_54 STRING,
column_55 STRING,
column_56 STRING,
column_57 INT,
column_58 STRING,
column_59 STRING,
column_60 STRING,
column_61 STRING,
column_62 STRING,
column_63 STRING,
column_64 STRING,
column_65 STRING,
column_66 STRING,
column_67 STRING,
column_68 STRING,
column_69 STRING,
column_70 STRING,
column_71 STRING,
column_72 STRING,
column_73 STRING,
column_74 STRING,
column_75 STRING,
column_76 STRING,
column_77 STRING,
column_78 STRING,
column_79 STRING,
column_80 STRING,
column_81 STRING,
column_82 STRING,
column_83 STRING,
column_84 STRING,
column_85 STRING,
column_86 STRING,
column_87 STRING
)
PARTITIONED BY(rq_dt STRING)
ROW FORMAT SERDE 'com.inmobi.dw.datastore.hive.MySerDe'
STORED AS SEQUENCEFILE;

Thanks,
Jaydeep

On Friday 24 December 2010 12:23 AM, Namit Jain wrote:
> Can you send the exact query along with the schema of the tables ?

Re: Issue with map join

Posted by Namit Jain <nj...@fb.com>.
Can you send the exact query along with the schema of the tables?

