You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Aihua Xu (JIRA)" <ji...@apache.org> on 2016/03/30 23:08:26 UTC

[jira] [Updated] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

     [ https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-11880:
----------------------------
    Description: 
   For UNION ALL , when an union operator is constant column (such as '0L', BIGINT Type)  and its corresponding column has incompatible type (such as INT type). 
  Query with filter condition on type incompatible column on this UNION ALL  will cause IndexOutOfBoundsException.

 Such as TPC-H table "orders",in  the following query:
 Type of 'orders'.'o_custkey' is INT normally,  while  the type of corresponding constant column  "0" is BIGINT( `0L AS `o_custkey` ). 
 This following query (with filter "type incompatible column 'o_custkey' ")  will fail  with  java.lang.IndexOutOfBoundsException : 

{code}
set hive.cbo.enable=false;
set hive.ppd.remove.duplicatefilters=true;
CREATE TABLE `orders`(
  `o_orderkey` int, 
  `o_custkey` int, 
  `o_orderstatus` string, 
  `o_totalprice` double, 
  `o_orderdate` string, 
  `o_orderpriority` string, 
  `o_clerk` string, 
  `o_shippriority` int, 
  `o_comment` string);

SELECT o_orderkey
FROM   (
              SELECT `o_orderkey` ,
                     `o_custkey`
              FROM   `orders`
              UNION ALL
              SELECT `o_orderkey`,
                     0L  AS `o_custkey`
              FROM   `orders`) `oo`
WHERE  o_custkey<10;
{code}

  was:
   For UNION ALL , when an union operator is constant column (such as '0L', BIGINT Type)  and its corresponding column has incompatible type (such as INT type). 
  Query with filter condition on type incompatible column on this UNION ALL  will cause IndexOutOfBoundsException.

 Such as TPC-H table "orders",in  the following query:
 Type of 'orders'.'o_custkey' is INT normally,  while  the type of corresponding constant column  "0" is BIGINT( `0L AS `o_custkey` ). 
 This query (with filter "type incompatible column 'o_custkey' ")  will fail  with  java.lang.IndexOutOfBoundsException : 
{code}
SELECT Count(1)
FROM   (
              SELECT `o_orderkey` ,
                     `o_custkey`
              FROM   `orders`
              UNION ALL
              SELECT `o_orderkey`,
                     0L  AS `o_custkey`
              FROM   `orders`) `oo`
WHERE  o_custkey<100000 limit 4 ;
{code}
When 
{code}
set hive.ppd.remove.duplicatefilters=true
{code}


> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column 
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11880
>                 URL: https://issues.apache.org/jira/browse/HIVE-11880
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 1.2.1
>            Reporter: WangMeng
>            Assignee: WangMeng
>         Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, HIVE-11880.03.patch, HIVE-11880.04.patch
>
>
>    For UNION ALL , when an union operator is constant column (such as '0L', BIGINT Type)  and its corresponding column has incompatible type (such as INT type). 
>   Query with filter condition on type incompatible column on this UNION ALL  will cause IndexOutOfBoundsException.
>  Such as TPC-H table "orders",in  the following query:
>  Type of 'orders'.'o_custkey' is INT normally,  while  the type of corresponding constant column  "0" is BIGINT( `0L AS `o_custkey` ). 
>  This following query (with filter "type incompatible column 'o_custkey' ")  will fail  with  java.lang.IndexOutOfBoundsException : 
> {code}
> set hive.cbo.enable=false;
> set hive.ppd.remove.duplicatefilters=true;
> CREATE TABLE `orders`(
>   `o_orderkey` int, 
>   `o_custkey` int, 
>   `o_orderstatus` string, 
>   `o_totalprice` double, 
>   `o_orderdate` string, 
>   `o_orderpriority` string, 
>   `o_clerk` string, 
>   `o_shippriority` int, 
>   `o_comment` string);
> SELECT o_orderkey
> FROM   (
>               SELECT `o_orderkey` ,
>                      `o_custkey`
>               FROM   `orders`
>               UNION ALL
>               SELECT `o_orderkey`,
>                      0L  AS `o_custkey`
>               FROM   `orders`) `oo`
> WHERE  o_custkey<10;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)