Posted to issues@hive.apache.org by "Wang Haihua (JIRA)" <ji...@apache.org> on 2018/04/28 17:55:00 UTC
[jira] [Commented] (HIVE-18144) Runtime type inference error when
join three table for different column type
[ https://issues.apache.org/jira/browse/HIVE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457720#comment-16457720 ]
Wang Haihua commented on HIVE-18144:
------------------------------------
cc [~pxiong] could you please take another look at this?
> Runtime type inference error when join three table for different column type
> -----------------------------------------------------------------------------
>
> Key: HIVE-18144
> URL: https://issues.apache.org/jira/browse/HIVE-18144
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.1.1, 2.2.0
> Reporter: Wang Haihua
> Assignee: Wang Haihua
> Priority: Major
> Attachments: HIVE-18144.1.patch, HIVE-18144.2.patch, HIVE-18144.3.patch
>
>
> A union of three or more tables whose columns have different types may cause a type inference error at task execution time.
> E.g., for t1 (int column) union all t2 (int column) union all t3 (bigint column), the final column type should be {{bigint}}.
> The RowSchema of t1 union t2, which we call {{leftOp}}, is int; once leftOp is unioned with t3, it should become bigint.
> This means the RowSchema of leftOp should then be {{bigint}} instead of {{int}}.
> However, in {{SemanticAnalyzer.java}} the RowSchema of leftOp ends up as {{int}}, which is wrong:
> {code}
> (_col0: int|{t01-subquery1}diff_long_type,_col1: int|{t01-subquery1}id2,_col2: bigint|{t01-subquery1}id3)}}
> {code}
> Impacted code in SemanticAnalyzer.java:
> {code}
> if (!(leftOp instanceof UnionOperator)) {
>   Operator oldChild = leftOp;
>   leftOp = (Operator) leftOp.getParentOperators().get(0);
>   leftOp.removeChildAndAdoptItsChildren(oldChild);
> }
> // make left a child of right
> List<Operator<? extends OperatorDesc>> child =
>     new ArrayList<Operator<? extends OperatorDesc>>();
> child.add(leftOp);
> rightOp.setChildOperators(child);
> List<Operator<? extends OperatorDesc>> parent = leftOp
>     .getParentOperators();
> parent.add(rightOp);
> UnionDesc uDesc = ((UnionOperator) leftOp).getConf();
> // Here we should set RowSchema of leftOp to unionoutRR's, or else the RowSchema of leftOp is wrong.
> // leftOp.setSchema(new RowSchema(unionoutRR.getColumnInfos()));
> uDesc.setNumInputs(uDesc.getNumInputs() + 1);
> return putOpInsertMap(leftOp, unionoutRR);
> {code}
> Steps to reproduce:
> {code}
> create table test_union_different_type(id bigint, id2 bigint, id3 bigint, name string);
> set hive.auto.convert.join=true;
> insert overwrite table test_union_different_type select 1, 2, 3, "test_union_different_type";
> select
> t01.diff_long_type as diff_long_type,
> t01.id2 as id2,
> t00.id as id,
> t01.id3 as id3
> from test_union_different_type t00
> left join
> (
> select 1 as diff_long_type, 30 as id2, id3 from test_union_different_type
> union ALL
> select 2 as diff_long_type, 20 as id2, id3 from test_union_different_type
> union ALL
> select id as diff_long_type, id2, 30 as id3 from test_union_different_type
> ) t01
> on t00.id = t01.diff_long_type
> ;
> {code}
> Stack trace:
> {code}
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:465)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
> at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
> ... 9 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
> at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36)
> at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:239)
> at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
> at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:714)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:647)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:660)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:663)
> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:759)
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:452)
> ... 13 more
> {code}
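
For readers who want the mechanism in isolation, the following is a minimal sketch of the stale-schema problem described in the issue. It does not use Hive classes: {{ColType}}, {{UnionOp}}, {{widen}} and {{addBranch}} are hypothetical stand-ins invented for illustration, and the "wider type wins" rule is a simplification of Hive's real type resolution. The column types are the ones from the reproduction query (diff_long_type, id2, id3).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionSchemaSketch {
    // Hypothetical stand-in for Hive primitive types, ordered narrow to wide.
    enum ColType { INT, BIGINT }

    // Simplified common-type rule: the wider of the two types wins.
    static ColType widen(ColType a, ColType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    // Hypothetical stand-in for a union operator that caches its output schema.
    static class UnionOp {
        List<ColType> schema;
        int numInputs;

        UnionOp(List<ColType> schema, int numInputs) {
            this.schema = new ArrayList<>(schema);
            this.numInputs = numInputs;
        }
    }

    // Adds one more branch to an existing union and returns the widened
    // output schema. When refreshSchema is false, the operator keeps the
    // schema computed for its earlier branches (the behaviour reported here).
    static List<ColType> addBranch(UnionOp op, List<ColType> branch, boolean refreshSchema) {
        List<ColType> unionOut = new ArrayList<>();
        for (int i = 0; i < branch.size(); i++) {
            unionOut.add(widen(op.schema.get(i), branch.get(i)));
        }
        op.numInputs++;
        if (refreshSchema) {
            op.schema = unionOut; // analogous to the commented-out setSchema call
        }
        return unionOut;
    }

    public static void main(String[] args) {
        // Schema after the first two branches: (int, int, bigint).
        List<ColType> firstTwo = Arrays.asList(ColType.INT, ColType.INT, ColType.BIGINT);
        // Third branch: id, id2 are bigint; the literal 30 is int.
        List<ColType> third = Arrays.asList(ColType.BIGINT, ColType.BIGINT, ColType.INT);

        UnionOp stale = new UnionOp(firstTwo, 2);
        addBranch(stale, third, false);
        System.out.println(stale.schema); // [INT, INT, BIGINT]

        UnionOp fixed = new UnionOp(firstTwo, 2);
        addBranch(fixed, third, true);
        System.out.println(fixed.schema); // [BIGINT, BIGINT, BIGINT]
    }
}
```

In the stale case the operator still advertises {{int}} for the first column while bigint values flow through it, which is consistent with the LongWritable to IntWritable ClassCastException in the stack trace above.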
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)