You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Wang Haihua (JIRA)" <ji...@apache.org> on 2018/04/28 17:55:00 UTC

[jira] [Commented] (HIVE-18144) Runtime type inference error when join three table for different column type

    [ https://issues.apache.org/jira/browse/HIVE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457720#comment-16457720 ] 

Wang Haihua commented on HIVE-18144:
------------------------------------

cc [~pxiong] could you please see this again?

> Runtime type inference error when join three table for different column type 
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-18144
>                 URL: https://issues.apache.org/jira/browse/HIVE-18144
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.1, 2.2.0
>            Reporter: Wang Haihua
>            Assignee: Wang Haihua
>            Priority: Major
>         Attachments: HIVE-18144.1.patch, HIVE-18144.2.patch, HIVE-18144.3.patch
>
>
> Union operation with three or more table, which has different column types, may cause type inference error when Task execution.
> E.g, e.g. t1(with column int) union all t2(with column int) union all t3(with column bigint), finally should be {{bigint}},
> RowSchema of union t1 with t2, we call {{leftOp}}, should be int, then leftOp union t3 should finally be bigint.
> This mean RowSchema of leftOp would be {{bigint}} instead of {{int}}
> However we see in {{SemanticAnalyzer.java}}, leftOp RowSchema is finally {{int}} which was wrong: 
> {code}
> (_col0: int|{t01-subquery1}diff_long_type,_col1: int|{t01-subquery1}id2,_col2: bigint|{t01-subquery1}id3)}}
> {code}
> Impacted code  in SemanticAnalyzer.java:
> {code}
>       if(!(leftOp instanceof UnionOperator)) {
>         Operator oldChild = leftOp;
>         leftOp = (Operator) leftOp.getParentOperators().get(0);
>         leftOp.removeChildAndAdoptItsChildren(oldChild);
>       }
>       // make left a child of right
>       List<Operator<? extends OperatorDesc>> child =
>           new ArrayList<Operator<? extends OperatorDesc>>();
>       child.add(leftOp);
>       rightOp.setChildOperators(child);
>       List<Operator<? extends OperatorDesc>> parent = leftOp
>           .getParentOperators();
>       parent.add(rightOp);
>       UnionDesc uDesc = ((UnionOperator) leftOp).getConf();
>       // Here we should set RowSchema of leftOp to unionoutRR's, or else the RowSchema of leftOp is wrong.
>       // leftOp.setSchema(new RowSchema(unionoutRR.getColumnInfos()));
>       uDesc.setNumInputs(uDesc.getNumInputs() + 1);
>       return putOpInsertMap(leftOp, unionoutRR);
> {code}
> Operation for reproduceļ¼š
> {code}
> create table test_union_different_type(id bigint, id2 bigint, id3 bigint, name string);
> set hive.auto.convert.join=true;
> insert overwrite table test_union_different_type select 1, 2, 3, "test_union_different_type";
> select
>   t01.diff_long_type as diff_long_type,
>   t01.id2 as id2,
>   t00.id as id,
>   t01.id3 as id3
> from test_union_different_type t00
> left join
>   (
>     select 1 as diff_long_type, 30 as id2, id3 from test_union_different_type
>     union ALL
>     select 2 as diff_long_type, 20 as id2, id3 from test_union_different_type
>     union ALL
>     select id as diff_long_type, id2, 30 as id3 from test_union_different_type
>   ) t01
> on t00.id = t01.diff_long_type
> ;
> {code}
> Stack trace:
> {code}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:465)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
>   ... 9 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
>   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36)
>   at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:239)
>   at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
>   at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
>   at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
>   at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:714)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
>   at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:647)
>   at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:660)
>   at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:663)
>   at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:759)
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:452)
>   ... 13 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)