Posted to dev@orc.apache.org by "Owen O'Malley (Jira)" <ji...@apache.org> on 2021/10/01 06:41:00 UTC

[jira] [Created] (ORC-1017) Create a new tool that summarizes the size of a file by column

Owen O'Malley created ORC-1017:
----------------------------------

             Summary: Create a new tool that summarizes the size of a file by column
                 Key: ORC-1017
                 URL: https://issues.apache.org/jira/browse/ORC-1017
             Project: ORC
          Issue Type: Improvement
            Reporter: Owen O'Malley
            Assignee: Owen O'Malley


I want a tool that summarizes how the space inside an ORC file is used. In particular, it should report the space taken by each column, the indexes, the file footer, and the stripe footers.

The output for orc_split_elim_new.orc is:

{quote}
Percent  Bytes/Row  Name
  46.79       0.04  subtype
  17.49       0.02  _file_footer
  16.57       0.02  _index
   7.01       0.01  decimal1
   5.05       0.00  _stripe_footer
   2.84       0.00  string1
   2.59       0.00  ts
   1.67       0.00  userid
{quote}
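
The two numeric columns above can be derived from per-entry byte counts and the file's row count. The following is a minimal sketch of that arithmetic only, not the tool's actual implementation: the function names, the input dictionary, and the byte counts are all hypothetical, and how the byte counts are gathered from an ORC file is not shown.

```python
def summarize_sizes(sizes, row_count):
    """Return (percent, bytes_per_row, name) tuples, largest first.

    `sizes` maps a name (a column, or a pseudo-entry such as "_index")
    to the number of bytes attributed to it. Both arguments are
    assumptions of this sketch, not part of any ORC API.
    """
    total = sum(sizes.values())
    if total == 0 or row_count == 0:
        return []
    rows = [
        (100.0 * nbytes / total, nbytes / row_count, name)
        for name, nbytes in sizes.items()
    ]
    # Largest share of the file first, matching the ordering in the report.
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows


def format_report(rows):
    """Render the rows as a fixed-width text table."""
    lines = ["Percent  Bytes/Row  Name"]
    for percent, bytes_per_row, name in rows:
        lines.append(f"{percent:7.2f}  {bytes_per_row:9.2f}  {name}")
    return "\n".join(lines)


# Hypothetical byte counts, invented purely for illustration.
example_sizes = {"col_a": 600, "_index": 300, "_file_footer": 100}
report = format_report(summarize_sizes(example_sizes, row_count=1000))
print(report)
```

Treating the indexes and footers as ordinary named entries alongside the columns keeps the percentages summing to the whole file, which is what lets the report show overhead and data side by side.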



