Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/26 08:50:26 UTC

[GitHub] [iceberg] openinx opened a new issue #2870: Add includeColumnStats option in FindFiles API

openinx opened a new issue #2870:
URL: https://github.com/apache/iceberg/issues/2870


   We want to select data files from an Apache Iceberg table using the [FindFiles](https://github.com/apache/iceberg/blob/90225d6c9413016d611e2ce5eff37db1bc1b4fc5/core/src/main/java/org/apache/iceberg/FindFiles.java) API and commit those files in another Iceberg transaction. However, the DataFiles returned by FindFiles do not seem to carry column stats such as:
   
   * value_counts
   * null_value_counts
   * nan_value_counts
   * lower_bounds
   * upper_bounds
   * record_count
   
   As a result, we still have to open the Parquet files and fill in the stats from the Parquet reader. I suggest adding a new option named `includeColumnStats` to the FindFiles API, so that we don't have to go through the low-level API to fill in the stats by hand.
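
To illustrate the proposal, here is a minimal, self-contained sketch of how such an option could behave. It is a toy model, not Iceberg code: `FindFilesSketch`, `FileEntry`, and `Builder` are hypothetical stand-ins, and only the option name `includeColumnStats` comes from the suggestion above. The assumption is that stats are dropped by default and kept only when the caller opts in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: every type here is a hypothetical stand-in, not an Iceberg class.
class FindFilesSketch {

  // Simplified stand-in for a data file entry; valueCounts is null when stats are dropped.
  static final class FileEntry {
    private final String path;
    private final Map<Integer, Long> valueCounts;

    FileEntry(String path, Map<Integer, Long> valueCounts) {
      this.path = path;
      this.valueCounts = valueCounts;
    }

    String path() {
      return path;
    }

    Map<Integer, Long> valueCounts() {
      return valueCounts;
    }

    // Returns a copy that keeps the path but carries no column stats.
    FileEntry withoutStats() {
      return new FileEntry(path, null);
    }
  }

  // Hypothetical builder exposing the proposed includeColumnStats option.
  static final class Builder {
    private final List<FileEntry> entries;
    private boolean includeColumnStats = false; // assumed default: stats are dropped

    Builder(List<FileEntry> entries) {
      this.entries = entries;
    }

    Builder includeColumnStats() {
      this.includeColumnStats = true;
      return this;
    }

    // Collects the entries, stripping stats unless the caller opted in.
    List<FileEntry> collect() {
      List<FileEntry> result = new ArrayList<>();
      for (FileEntry entry : entries) {
        result.add(includeColumnStats ? entry : entry.withoutStats());
      }
      return result;
    }
  }

  public static void main(String[] args) {
    List<FileEntry> entries = List.of(new FileEntry("data/a.parquet", Map.of(1, 3L)));

    // Default: the returned entry has no stats, so the caller would have to re-read the file.
    System.out.println(new Builder(entries).collect().get(0).valueCounts());

    // With the option: the stats travel with the entry and can be reused directly.
    System.out.println(new Builder(entries).includeColumnStats().collect().get(0).valueCounts());
  }
}
```

With an option like this, the caller opts in once on the builder and can pass the returned entries on to the next commit without touching the Parquet files again.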


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on issue #2870: Add includeColumnStats option in FindFiles API

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #2870:
URL: https://github.com/apache/iceberg/issues/2870#issuecomment-889126183


   Closed via #2875 




[GitHub] [iceberg] openinx closed issue #2870: Add includeColumnStats option in FindFiles API

Posted by GitBox <gi...@apache.org>.
openinx closed issue #2870:
URL: https://github.com/apache/iceberg/issues/2870


   

