Posted to issues@arrow.apache.org by "Lawrence Chan (JIRA)" <ji...@apache.org> on 2018/03/10 00:49:00 UTC

[jira] [Created] (ARROW-2296) Add num_rows to file footer

Lawrence Chan created ARROW-2296:
------------------------------------

             Summary: Add num_rows to file footer
                 Key: ARROW-2296
                 URL: https://issues.apache.org/jira/browse/ARROW-2296
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Lawrence Chan


Maybe I'm overlooking something, but I don't see anything on the API surface to get the number of rows in an Arrow file without reading all the record batches.

I'd like to propose that we add `num_rows` as a field to the footer so it's easy to query without reading the whole file.

Meanwhile, before we get that added to the official format fbs, it would be nice to have a method that iterates over the record batch headers and sums up the lengths without reading the actual record batch body.
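
To illustrate the header-only idea, here is a minimal sketch over a made-up toy layout. It is NOT Arrow's actual IPC format or API: the 8-byte header (row count plus body size) and the names `write_batch` and `count_rows` are assumptions for illustration only. The point is that each header carries the batch length, so the body can be skipped with a seek.

```python
import io
import struct

# Hypothetical toy header (not Arrow's flatbuffer metadata):
# row count and body size, both little-endian uint32.
HEADER = struct.Struct("<II")


def write_batch(f, num_rows, body):
    """Write one toy batch: fixed-size header followed by the raw body."""
    f.write(HEADER.pack(num_rows, len(body)))
    f.write(body)


def count_rows(f):
    """Sum the row counts from every header, seeking past each body."""
    total = 0
    while True:
        raw = f.read(HEADER.size)
        if not raw:
            break  # clean end of file
        if len(raw) < HEADER.size:
            raise ValueError("truncated batch header")
        num_rows, body_len = HEADER.unpack(raw)
        total += num_rows
        f.seek(body_len, io.SEEK_CUR)  # skip the body without reading it
    return total


buf = io.BytesIO()
write_batch(buf, 3, b"x" * 24)
write_batch(buf, 5, b"y" * 40)
buf.seek(0)
total_rows = count_rows(buf)
print(total_rows)
```

In a real Arrow file the footer already lists the location of each record batch, so a method like this would presumably jump to each batch's metadata directly rather than scan from the start.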



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)