Posted to issues@hive.apache.org by "Pengcheng Xiong (JIRA)" <ji...@apache.org> on 2015/04/24 20:11:38 UTC

[jira] [Commented] (HIVE-10455) CBO (Calcite Return Path): Different data types at Reducer before JoinOp

    [ https://issues.apache.org/jira/browse/HIVE-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511455#comment-14511455 ] 

Pengcheng Xiong commented on HIVE-10455:
----------------------------------------

[~jcamachorodriguez], as per [~jpullokkaran]'s request, could you please review the patch? Thanks!

> CBO (Calcite Return Path): Different data types at Reducer before JoinOp
> ------------------------------------------------------------------------
>
>                 Key: HIVE-10455
>                 URL: https://issues.apache.org/jira/browse/HIVE-10455
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>             Fix For: 1.2.0
>
>         Attachments: HIVE-10455.01.patch
>
>
> The following error occurred for cbo_subq_not_in.q:
> {code}
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from x1x128x0x0x1 with properties {columns=reducesinkkey0, serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, serialization.sort.order=+, columns.types=double}
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> {code}
> An easier way to reproduce it is:
> {code}
> set hive.cbo.enable=true;
> set hive.exec.check.crossproducts=false;
> set hive.stats.fetch.column.stats=true;
> set hive.auto.convert.join=false;
> select p_size, src.key
> from 
> part join src
> on p_size=key;
> {code}
> As you can see, p_size is an integer while src.key is a string, so both should be cast to double before they are joined. When the return path is off, this cast happens before the Join, at the ReduceSink (RS). When the return path is on, however, the cast is treated as an expression inside the Join. As a result, the reducer receives keys of different types from the two join branches and throws the exception.
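>
> To make the intended coercion concrete, here is a sketch of the same query with both join keys cast to double explicitly. It is untested, and whether it avoids the error when the return path is on is an assumption rather than something verified in this issue.
> {code}
> set hive.cbo.enable=true;
> -- Cast both keys to double by hand so that neither join branch
> -- relies on an implicit conversion.
> select p_size, src.key
> from
> part join src
> on cast(p_size as double) = cast(src.key as double);
> {code}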



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)