Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2019/04/25 00:07:00 UTC

[jira] [Updated] (IMPALA-8110) Parquet stat filtering does not handle narrowed int types correctly

     [ https://issues.apache.org/jira/browse/IMPALA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-8110:
----------------------------------
    Priority: Critical  (was: Major)

> Parquet stat filtering does not handle narrowed int types correctly
> -------------------------------------------------------------------
>
>                 Key: IMPALA-8110
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8110
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Priority: Critical
>              Labels: correctness, parquet
>
> Impala can read int32 Parquet columns as TINYINT/SMALLINT SQL columns. If a value does not fit into the range of the 8/16-bit signed integer, it overflows, e.g. writing 128 as int32 and then rereading it as int8 returns -128. This is normal as far as I understand, but min/max stat filtering does not handle this case correctly:
> create table tnarrow (i int) stored as parquet;
> insert into tnarrow values (1), (201); 
> alter table tnarrow change column i i tinyint;
> set PARQUET_READ_STATISTICS=0;
> select * from tnarrow where i < 0;
> -> returns 1 row: -55
> set PARQUET_READ_STATISTICS=1;
> select * from tnarrow where i < 0;
> -> returns 0 rows
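
The mismatch in the repro above can be illustrated outside Impala. The following Python sketch is not Impala code; it assumes the narrowing behaves like two's-complement truncation and that the row-group min/max statistics are the int32 values as written. All function names are hypothetical, and the "safer" check is only one possible conservative approach, not the fix chosen for this issue.

```python
def narrow_to_int8(value):
    """Wrap an integer into the signed 8-bit range (two's-complement truncation)."""
    return (value + 128) % 256 - 128


def naive_can_skip_less_than(stat_min, bound):
    """Naive min/max check for `col < bound`: skip when even the minimum is >= bound."""
    return stat_min >= bound


def fits_int8(value):
    """True if the value is representable as a signed 8-bit integer."""
    return -128 <= value <= 127


def safer_can_skip_less_than(stat_min, stat_max, bound):
    """Hypothetical guard: trust the stats only if both fit the narrowed type."""
    if not (fits_int8(stat_min) and fits_int8(stat_max)):
        return False  # stats describe int32 values, not what the reader will see
    return naive_can_skip_less_than(stat_min, bound)


stored = [1, 201]                                # values written as int32
stat_min, stat_max = min(stored), max(stored)    # statistics as written (assumption)

# What a TINYINT reader sees, and which rows satisfy `i < 0`.
rows_read_as_tinyint = [narrow_to_int8(v) for v in stored]
matching = [v for v in rows_read_as_tinyint if v < 0]

print("rows matching i < 0:", len(matching))
print("naive filter skips row group:", naive_can_skip_less_than(stat_min, 0))
print("guarded filter skips row group:",
      safer_can_skip_less_than(stat_min, stat_max, 0))
```

In this sketch the naive check skips the row group because the stored minimum (1) is not below 0, even though one narrowed value is negative; the guarded check declines to skip because the stored maximum does not fit in 8 bits.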



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
