Posted to issues-all@impala.apache.org by "guojingfeng (Jira)" <ji...@apache.org> on 2020/07/14 09:52:00 UTC

[jira] [Updated] (IMPALA-9952) Parquet with lz4 ColumnIndex filter error

     [ https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

guojingfeng updated IMPALA-9952:
--------------------------------
    Description: 
When reading a Parquet file compressed with the lz4 codec, the following error occurred:

 
{code:java}
// code placeholder

{code}
 

Corresponding source code:

 
{code:java}
I0714 16:05:28.720537 1061963 status.cc:126] 7941d4598e5dd5a4:45b99456000002c0] Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @           0xbf4ef9
    @          0x1748c41
    @          0x174e170
    @          0x1750e58
    @          0x17519f0
    @          0x1748559
    @          0x1510b41
    @          0x1512c8f
    @          0x137488a
    @          0x1375759
    @          0x1b48a19
    @     0x7f34509f5e24
    @     0x7f344d5ed35c

I0714 16:11:48.835763 1075838 runtime-state.cc:207] 8c43203adb2d4fc8:0478df9b000002c0] Error from query 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
I0714 16:11:48.893784 1075820 status.cc:126] 8c43203adb2d4fc8:0478df9b0000018b] Top level rows aren't in sync during page filtering in file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @           0xbf4ef9
    @          0x1749104
    @          0x17494cc
    @          0x1751aee
    @          0x1748559
    @          0x1510b41
    @          0x1512c8f
    @          0x137488a
    @          0x1375759
    @          0x1b48a19
    @     0x7f34509f5e24
    @     0x7f344d5ed35c
{code}

  was:
When reading a Parquet file compressed with the lz4 codec, the following error occurred:
{code:java}
I0714 16:05:28.720537 1061963 status.cc:126] 7941d4598e5dd5a4:45b99456000002c0] Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @           0xbf4ef9
    @          0x1748c41
    @          0x174e170
    @          0x1750e58
    @          0x17519f0
    @          0x1748559
    @          0x1510b41
    @          0x1512c8f
    @          0x137488a
    @          0x1375759
    @          0x1b48a19
    @     0x7f34509f5e24
    @     0x7f344d5ed35c

I0714 16:11:48.835763 1075838 runtime-state.cc:207] 8c43203adb2d4fc8:0478df9b000002c0] Error from query 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
I0714 16:11:48.893784 1075820 status.cc:126] 8c43203adb2d4fc8:0478df9b0000018b] Top level rows aren't in sync during page filtering in file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
    @           0xbf4ef9
    @          0x1749104
    @          0x17494cc
    @          0x1751aee
    @          0x1748559
    @          0x1510b41
    @          0x1512c8f
    @          0x137488a
    @          0x1375759
    @          0x1b48a19
    @     0x7f34509f5e24
    @     0x7f344d5ed35c
{code}


> Parquet with lz4 ColumnIndex filter error
> -----------------------------------------
>
>                 Key: IMPALA-9952
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9952
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>            Reporter: guojingfeng
>            Priority: Major
>
> When reading a Parquet file compressed with the lz4 codec, the following error occurred:
>  
> {code:java}
> // code placeholder
> {code}
>  
> Corresponding source code:
>  
> {code:java}
> I0714 16:05:28.720537 1061963 status.cc:126] 7941d4598e5dd5a4:45b99456000002c0] Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
>     @           0xbf4ef9
>     @          0x1748c41
>     @          0x174e170
>     @          0x1750e58
>     @          0x17519f0
>     @          0x1748559
>     @          0x1510b41
>     @          0x1512c8f
>     @          0x137488a
>     @          0x1375759
>     @          0x1b48a19
>     @     0x7f34509f5e24
>     @     0x7f344d5ed35c
> I0714 16:11:48.835763 1075838 runtime-state.cc:207] 8c43203adb2d4fc8:0478df9b000002c0] Error from query 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
> I0714 16:11:48.893784 1075820 status.cc:126] 8c43203adb2d4fc8:0478df9b0000018b] Top level rows aren't in sync during page filtering in file hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq.
>     @           0xbf4ef9
>     @          0x1749104
>     @          0x17494cc
>     @          0x1751aee
>     @          0x1748559
>     @          0x1510b41
>     @          0x1512c8f
>     @          0x137488a
>     @          0x1375759
>     @          0x1b48a19
>     @     0x7f34509f5e24
>     @     0x7f344d5ed35c
> {code}


