Posted to issues@orc.apache.org by "Owen O'Malley (Jira)" <ji...@apache.org> on 2021/10/01 06:41:00 UTC
[jira] [Assigned] (ORC-1017) Create a new tool that summarizes the size of a file by column
[ https://issues.apache.org/jira/browse/ORC-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley reassigned ORC-1017:
----------------------------------
> Create a new tool that summarizes the size of a file by column
> --------------------------------------------------------------
>
> Key: ORC-1017
> URL: https://issues.apache.org/jira/browse/ORC-1017
> Project: ORC
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Priority: Major
>
> I want a tool that summarizes how the space inside an ORC file is used. In particular, it should report the space used by each column, the indexes, the file footer, and the stripe footers.
> The output for orc_split_elim_new.orc is:
> {quote}
> Percent  Bytes/Row  Name
>   46.79       0.04  subtype
>   17.49       0.02  _file_footer
>   16.57       0.02  _index
>    7.01       0.01  decimal1
>    5.05       0.00  _stripe_footer
>    2.84       0.00  string1
>    2.59       0.00  ts
>    1.67       0.00  userid
> {quote}
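
For illustration, the arithmetic behind a report of this shape can be sketched as follows. This is not the proposed tool's implementation: the function names summarize_sizes and format_report are hypothetical, the byte counts are made up, and the sketch assumes the per-name byte totals and the row count have already been gathered from the file's metadata. Percent is taken as each entry's share of the summed bytes, and Bytes/Row as its bytes divided by the number of rows.

```python
from typing import Dict, List, Tuple


def summarize_sizes(bytes_by_name: Dict[str, int],
                    row_count: int) -> List[Tuple[float, float, str]]:
    """Return (percent, bytes_per_row, name) tuples, largest share first.

    Hypothetical helper: assumes byte totals per column or section were
    collected elsewhere.
    """
    total = sum(bytes_by_name.values())
    if total <= 0 or row_count <= 0:
        return []
    rows = [
        (100.0 * size / total, size / row_count, name)
        for name, size in bytes_by_name.items()
    ]
    # Sort by descending percent; break ties by name for stable output.
    rows.sort(key=lambda r: (-r[0], r[2]))
    return rows


def format_report(rows: List[Tuple[float, float, str]]) -> str:
    """Render the rows under a Percent / Bytes/Row / Name header."""
    lines = ["Percent  Bytes/Row  Name"]
    for percent, per_row, name in rows:
        lines.append(f"{percent:7.2f}  {per_row:9.2f}  {name}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up byte counts for illustration only; not from any real file.
    example = {
        "col_a": 600,
        "_index": 250,
        "_file_footer": 100,
        "_stripe_footer": 50,
    }
    print(format_report(summarize_sizes(example, row_count=1000)))
```

Sorting by descending share puts the largest consumers of space at the top, matching the ordering of the sample output in the description.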
--
This message was sent by Atlassian Jira
(v8.3.4#803005)