Posted to issues@hive.apache.org by "Wang Haihua (JIRA)" <ji...@apache.org> on 2017/11/24 17:54:00 UTC

[jira] [Updated] (HIVE-18144) Runtime type error when join three table for different column type

     [ https://issues.apache.org/jira/browse/HIVE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wang Haihua updated HIVE-18144:
-------------------------------
    Description: 
Union operations over three or more tables whose corresponding columns have different types may cause a type inference error during task execution.

E.g. for t1(with column int) union all t2(with column int) union t3(with column bigint), the final column type should be {{bigint}}.

The RowSchema of leftOp and rightOp should be the union RowResolver's.

Otherwise, the RowSchema of leftOp, i.e. the result of t1(int) union all t2(int), would be int.

leftOp RowSchema
(_col0: int|{t01-subquery1}diff_long_type,_col1: int|{t01-subquery1}id2,_col2: bigint|{t01-subquery1}id3)

unionoutRowSchema
(_col0: bigint|{t01}diff_long_type,_col1: bigint|{t01}id2,_col2: bigint|{t01}id3)
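
To make the two schemas above concrete, here is a minimal sketch of column-wise type widening across the three branches of the reproduction query. It is an illustration only: {{ColType}}, {{commonType}} and {{unionSchema}} are hypothetical names, not Hive classes or methods, and the type ladder is reduced to int and bigint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionTypeSketch {
    // Hypothetical stand-in for column types, declared from narrower to wider.
    enum ColType { INT, BIGINT }

    // The common type of two columns is the wider of the two.
    static ColType commonType(ColType a, ColType b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // Column-wise common types of two union branches with the same column count.
    static List<ColType> unionSchema(List<ColType> left, List<ColType> right) {
        if (left.size() != right.size()) {
            throw new IllegalArgumentException("union branches must have the same column count");
        }
        List<ColType> out = new ArrayList<>();
        for (int i = 0; i < left.size(); i++) {
            out.add(commonType(left.get(i), right.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Branches of the reproduction query: (diff_long_type, id2, id3).
        List<ColType> b1 = Arrays.asList(ColType.INT, ColType.INT, ColType.BIGINT);
        List<ColType> b2 = Arrays.asList(ColType.INT, ColType.INT, ColType.BIGINT);
        List<ColType> b3 = Arrays.asList(ColType.BIGINT, ColType.BIGINT, ColType.INT);

        List<ColType> firstTwo = unionSchema(b1, b2);   // schema after the first union
        List<ColType> all = unionSchema(firstTwo, b3);  // schema after the second union

        System.out.println("first two branches: " + firstTwo); // [INT, INT, BIGINT]
        System.out.println("all three branches: " + all);      // [BIGINT, BIGINT, BIGINT]
    }
}
```

The first result matches the leftOp RowSchema and the second matches unionoutRowSchema. Any operator that keeps the first schema after the third branch is attached would describe bigint data as int.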


Affected code:
{code}

      if(!(leftOp instanceof UnionOperator)) {
        Operator oldChild = leftOp;
        leftOp = (Operator) leftOp.getParentOperators().get(0);
        leftOp.removeChildAndAdoptItsChildren(oldChild);
      }

      // make left a child of right
      List<Operator<? extends OperatorDesc>> child =
          new ArrayList<Operator<? extends OperatorDesc>>();
      child.add(leftOp);
      rightOp.setChildOperators(child);

      List<Operator<? extends OperatorDesc>> parent = leftOp
          .getParentOperators();
      parent.add(rightOp);

      UnionDesc uDesc = ((UnionOperator) leftOp).getConf();
      leftOp.setSchema(new RowSchema(unionoutRR.getColumnInfos()));
      uDesc.setNumInputs(uDesc.getNumInputs() + 1);
      return putOpInsertMap(leftOp, unionoutRR);

{code}

Steps to reproduce:

{code}
create table test_union_different_type(id bigint, id2 bigint, id3 bigint, name string);
set hive.auto.convert.join=true;
insert overwrite table test_union_different_type select 1, 2, 3, "test_union_different_type";
select
  t01.diff_long_type as diff_long_type,
  t01.id2 as id2,
  t00.id as id,
  t01.id3 as id3
from test_union_different_type t00
left join
  (
    select 1 as diff_long_type, 30 as id2, id3 from test_union_different_type
    union ALL
    select 2 as diff_long_type, 20 as id2, id3 from test_union_different_type
    union ALL
    select id as diff_long_type, id2, 30 as id3 from test_union_different_type
  ) t01
on t00.id = t01.diff_long_type
;

{code}

Stack trace:

{code}

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:169)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"id2":null,"id3":null,"name":null}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
  ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:465)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
  at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
  ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
  at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36)
  at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:239)
  at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
  at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
  at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
  at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:714)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:879)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:647)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:660)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:663)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:759)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:452)
  ... 13 more

{code}
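
The innermost cause in the trace is a cast to IntWritable applied to a LongWritable value. The following sketch reproduces that failure mode in plain Java, under the assumption that the reader is chosen from a plan-time schema that says int while the row actually carries a bigint. {{IntBox}}, {{LongBox}} and {{readAsInt}} are hypothetical stand-ins, not Hadoop or Hive classes.

```java
public class WritableCastSketch {
    // Hypothetical stand-in for an int-valued writable.
    static final class IntBox {
        final int value;
        IntBox(int value) { this.value = value; }
    }

    // Hypothetical stand-in for a long-valued writable.
    static final class LongBox {
        final long value;
        LongBox(long value) { this.value = value; }
    }

    // Mimics a reader selected for an "int" column: it casts without checking.
    static int readAsInt(Object columnValue) {
        return ((IntBox) columnValue).value;
    }

    // Returns true when reading the value as int fails with ClassCastException.
    static boolean failsWithClassCast(Object columnValue) {
        try {
            readAsInt(columnValue);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Schema and data agree: the read succeeds.
        System.out.println(failsWithClassCast(new IntBox(1)));   // false
        // Schema says int, data is bigint: the cast fails at runtime.
        System.out.println(failsWithClassCast(new LongBox(1L))); // true
    }
}
```

The mismatch is invisible at plan time and only surfaces when a row reaches an operator that trusts the stale schema.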

  was:
For a union of three tables, the RowSchema of leftOp and rightOp should be the union RowResolver's.

Otherwise, e.g. for t1(int) union all t2(int) union t3(bigint), the final type should be bigint, but the RowSchema of leftOp, i.e. the result of t1(int) union all t2(int), would be int.



> Runtime type error when join three table for different column type
> ------------------------------------------------------------------
>
>                 Key: HIVE-18144
>                 URL: https://issues.apache.org/jira/browse/HIVE-18144
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.1.1, 2.2.0
>            Reporter: Wang Haihua
>            Assignee: Wang Haihua



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)