Posted to issues@impala.apache.org by "Zsombor Fedor (JIRA)" <ji...@apache.org> on 2018/09/24 10:23:00 UTC

[jira] [Created] (IMPALA-7612) Parquet file with no rows in it causing WARNING in explain

Zsombor Fedor created IMPALA-7612:
-------------------------------------

             Summary: Parquet file with no rows in it causing WARNING in explain
                 Key: IMPALA-7612
                 URL: https://issues.apache.org/jira/browse/IMPALA-7612
             Project: IMPALA
          Issue Type: New Feature
          Components: Frontend
    Affects Versions: Impala 2.12.0
            Reporter: Zsombor Fedor


An empty Parquet file (one with no rows in it) causes a warning in the explain output:
{code:java}
WARNING: The following tables have potentially corrupt table statistics. Drop and re-compute statistics to resolve this problem. {code}
The warning is still shown even after running
{code:java}
compute stats tp;{code}
because the plan reports a non-empty file:
{code:java}
partitions=1/1 files=1 size=220B{code}
but numRows = 0, so the row count looks inconsistent with the file size.
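
The condition described above can be sketched as follows. This is only an illustration of the suspected heuristic (zero rows combined with a non-zero size), not Impala's actual frontend code; the class name CorruptStatsSketch and the method looksCorrupt are hypothetical.

```java
public class CorruptStatsSketch {
    // Hypothetical helper, assumed for illustration only: stats are flagged
    // as "potentially corrupt" when the row count is 0 but the files on
    // disk have a non-zero total size.
    public static boolean looksCorrupt(long numRows, long totalBytes) {
        return numRows == 0 && totalBytes > 0;
    }

    public static void main(String[] args) {
        // The case from this report: one 220-byte Parquet file with no rows.
        System.out.println(looksCorrupt(0, 220)); // prints "true"
        // A table with no data at all would not be flagged.
        System.out.println(looksCorrupt(0, 0));   // prints "false"
        // A table with rows and data would not be flagged.
        System.out.println(looksCorrupt(5, 220)); // prints "false"
    }
}
```

Under this assumption, a valid Parquet file that contains only metadata (footer and schema) and no rows would always trigger the warning, even with freshly computed statistics.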

A simple reproduction:
{code:java}
create table tp (a int);{code}
Create an empty .csv file.

Create a Parquet file from the CSV with a simple MapReduce job:

[https://github.com/tomwhite/hadoop-book/blob/master/ch13-parquet/src/main/java/TextToParquetWithAvro.java]

using the following schema:
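{code:java}
// Fragment from the job class. The opening Schema.Parser call is assumed
// from the trailing ");" and is not part of the original snippet.
Schema schema = new Schema.Parser().parse(
 "{\n" +
 " \"type\": \"record\",\n" +
 " \"name\": \"tp\",\n" +
 " \"doc\": \"Avro schema for table tp\",\n" +
 " \"fields\":\n" +
 " [\n" +
 " {\"name\": \"a\", \"type\": \"int\"}\n" +
 " ]\n" +
 "}\n");{code}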


