Posted to dev@parquet.apache.org by "Xinli Shang (Jira)" <ji...@apache.org> on 2019/10/26 01:54:00 UTC

[jira] [Created] (PARQUET-1685) Truncate the stored min and max for String statistics to reduce the footer size

Xinli Shang created PARQUET-1685:
------------------------------------

             Summary: Truncate the stored min and max for String statistics to reduce the footer size 
                 Key: PARQUET-1685
                 URL: https://issues.apache.org/jira/browse/PARQUET-1685
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-mr
    Affects Versions: 1.10.1
            Reporter: Xinli Shang
            Assignee: Xinli Shang
             Fix For: 1.12.0


Iceberg has a cool feature that truncates the stored min and max statistics to minimize the metadata size. We can borrow that approach and truncate them in Parquet as well, to reduce the size of the footer or even the page header. Here is the code in Iceberg: [https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/util/UnicodeUtil.java].
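
To make the idea concrete, here is a minimal sketch of how such truncation could work. It is not the Iceberg UnicodeUtil code and not an existing parquet-mr API: the class name, the method names and the maxCodePoints parameter are all hypothetical. It assumes the values are compared in Unicode code point order, which matches unsigned byte-wise comparison of their UTF-8 encoding. The min can simply be cut to a prefix, because a prefix never sorts after the original. The max needs its last kept code point incremented so that the result still sorts at or above the original.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of parquet-mr or Iceberg.
final class StringStatsTruncation {

    // Lower bound: a prefix of the value never sorts after the value itself.
    static String truncateMin(String value, int maxCodePoints) {
        if (value.codePointCount(0, value.length()) <= maxCodePoints) {
            return value; // already short enough
        }
        // offsetByCodePoints avoids cutting a surrogate pair in half.
        return value.substring(0, value.offsetByCodePoints(0, maxCodePoints));
    }

    // Upper bound: keep at most maxCodePoints code points, then increment the
    // last one that can be incremented. Returns null when no such bound exists
    // (assumption: the caller then keeps the original, untruncated max).
    static String truncateMax(String value, int maxCodePoints) {
        int[] cps = value.codePoints().toArray();
        if (cps.length <= maxCodePoints) {
            return value; // already short enough
        }
        for (int i = maxCodePoints - 1; i >= 0; i--) {
            int next = cps[i] + 1;
            // Surrogates are not valid standalone code points; skip past them.
            if (next >= Character.MIN_SURROGATE && next <= Character.MAX_SURROGATE) {
                next = Character.MAX_SURROGATE + 1;
            }
            if (next <= Character.MAX_CODE_POINT) {
                int[] out = Arrays.copyOf(cps, i + 1);
                out[i] = next;
                return new String(out, 0, out.length);
            }
            // cps[i] was U+10FFFF: drop it and try the previous position.
        }
        return null;
    }
}
```

For example, with maxCodePoints = 4 the value "parquet-footer" would be stored as min "parq" and max "parr". A reader that filters on these statistics still gets correct results, because the stored range always contains the real one; the filter is only slightly less selective.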

--
This message was sent by Atlassian Jira
(v8.3.4#803005)