Posted to dev@parquet.apache.org by "Gabor Szadovszky (JIRA)" <ji...@apache.org> on 2017/12/29 10:01:00 UTC

[jira] [Assigned] (PARQUET-386) Printing out the statistics of metadata in parquet-tools

     [ https://issues.apache.org/jira/browse/PARQUET-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Szadovszky reassigned PARQUET-386:
----------------------------------------

    Assignee: Gabor Szadovszky

> Printing out the statistics of metadata in parquet-tools
> --------------------------------------------------------
>
>                 Key: PARQUET-386
>                 URL: https://issues.apache.org/jira/browse/PARQUET-386
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Onur Soyer
>            Assignee: Gabor Szadovszky
>            Priority: Trivial
>             Fix For: 1.9.0
>
>
> While playing with "parquet-tools", I found that the column statistics are not printed when the following command is executed:
> $ java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema --detailed perf.1000.parquet
> The output for a row group looks like this:
> =====================================================================================================================
> row group 1: RC:747388 TS:134218473 OFFSET:4
> --------------------------------------------------------------------------------
> cust_key:  INT64 UNCOMPRESSED DO:0 FPO:4 SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> name:  BINARY UNCOMPRESSED DO:0 FPO:5979448 SZ:16443766/16443766/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> address:  BINARY UNCOMPRESSED DO:0 FPO:22423214 SZ:21716568/21716568/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> nation_key:  INT32 UNCOMPRESSED DO:0 FPO:44139782 SZ:2989697/2989697/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> phone:  BINARY UNCOMPRESSED DO:0 FPO:47129479 SZ:14201364/14201364/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> acctbal:  DOUBLE UNCOMPRESSED DO:0 FPO:61330843 SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> mktsegment:  BINARY UNCOMPRESSED DO:0 FPO:67310287 SZ:9714675/9714675/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> comment_col:  BINARY UNCOMPRESSED DO:0 FPO:77024962 SZ:57193515/57193515/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED
> =====================================================================================================================
> However, it would be great to also print the column statistics stored in the metadata.
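
To make the gap concrete, here is a minimal sketch that parses one column line of the quoted output into named fields. It is not part of parquet-tools; the function name parse_column_line and the dictionary keys are hypothetical, and the reading of the labels (SZ as compressed size / uncompressed size / ratio, VC as value count, DO and FPO as page offsets) is an assumption based on the sample output. Note that none of the parsed fields carries min/max or null-count statistics.

```python
import re

# Matches one column line as it appears in the quoted report, e.g.
# "cust_key:  INT64 UNCOMPRESSED DO:0 FPO:4 SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED"
COLUMN_LINE = re.compile(
    r"^(?P<name>\S+):\s+(?P<type>\S+)\s+(?P<codec>\S+)\s+"
    r"DO:(?P<do>\d+)\s+FPO:(?P<fpo>\d+)\s+"
    r"SZ:(?P<size>\d+)/(?P<raw_size>\d+)/(?P<ratio>[\d.]+)\s+"
    r"VC:(?P<vc>\d+)\s+ENC:(?P<enc>\S+)$"
)


def parse_column_line(line):
    """Return a dict of the fields in one column line, or None if the line
    is not a column line (separator, row group header, and so on)."""
    # Drop the "> " quoting added by the mail client, if present.
    text = line.lstrip("> ").strip()
    m = COLUMN_LINE.match(text)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "type": m.group("type"),
        "codec": m.group("codec"),
        # Assumed meanings of the labels; see the note above the block.
        "dictionary_offset": int(m.group("do")),
        "first_data_page_offset": int(m.group("fpo")),
        "compressed_size": int(m.group("size")),
        "uncompressed_size": int(m.group("raw_size")),
        "ratio": float(m.group("ratio")),
        "value_count": int(m.group("vc")),
        "encodings": m.group("enc").split(","),
    }


if __name__ == "__main__":
    sample = ("> cust_key:  INT64 UNCOMPRESSED DO:0 FPO:4 "
              "SZ:5979444/5979444/1.00 VC:747388 ENC:PLAIN,RLE,BIT_PACKED")
    print(parse_column_line(sample))
```

Every label in the sample line maps to size, offset, count, or encoding information, which is why the report asks for the statistics to be printed as well.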



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)