Posted to issues@iceberg.apache.org by "dramaticlly (via GitHub)" <gi...@apache.org> on 2023/03/23 23:52:02 UTC

[GitHub] [iceberg] dramaticlly opened a new issue, #7189: Refactor the planning in PartitionTable

dramaticlly opened a new issue, #7189:
URL: https://github.com/apache/iceberg/issues/7189

   ### Feature Request / Improvement
   
   Based on @szehon-ho's comment in #6661, made while we were trying to add delete file stats to the partition table.
   
   https://github.com/apache/iceberg/pull/6661#discussion_r1132981630
   
   Today, the partition table uses a one-of-a-kind `ManifestGroup.planFiles()` / `FileScanTask` path to read the list of data files and aggregate partition-level stats such as record count and file count.
   
   Szehon proposed refactoring this to use ManifestReader: plan the manifest reads first, then the data file reads. This lets us read both data manifests and delete manifests in a coherent way and avoids keeping a large hash set.
   - For data files: we can use `ManifestReader.read()`
   - For delete files: we can use `ManifestReader.readDeleteManifest()`
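
A minimal sketch of the idea, not the actual Iceberg implementation: if data manifests and delete manifests are both read entry by entry, the partition-level stats can be accumulated in a single pass keyed by partition. All types and names below (`Manifest`, `Entry`, `Stats`, `aggregate`) are hypothetical stand-ins for illustration and are assumptions, not Iceberg classes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration only; these are NOT Iceberg classes.
class PartitionStatsSketch {
  enum Content { DATA, DELETES }

  // One file entry as it might be read from a manifest.
  static final class Entry {
    final String partition;
    final long recordCount;

    Entry(String partition, long recordCount) {
      this.partition = partition;
      this.recordCount = recordCount;
    }
  }

  // One manifest: its content type plus the file entries it lists.
  static final class Manifest {
    final Content content;
    final List<Entry> entries;

    Manifest(Content content, List<Entry> entries) {
      this.content = content;
      this.entries = entries;
    }
  }

  // Running totals for a single partition.
  static final class Stats {
    long dataRecordCount;
    int dataFileCount;
    long deleteRecordCount;
    int deleteFileCount;
  }

  // Walk every manifest once and fold each entry into its partition's totals.
  static Map<String, Stats> aggregate(List<Manifest> manifests) {
    Map<String, Stats> byPartition = new TreeMap<>();
    for (Manifest manifest : manifests) {
      for (Entry entry : manifest.entries) {
        Stats stats = byPartition.computeIfAbsent(entry.partition, k -> new Stats());
        if (manifest.content == Content.DATA) {
          stats.dataRecordCount += entry.recordCount;
          stats.dataFileCount++;
        } else {
          stats.deleteRecordCount += entry.recordCount;
          stats.deleteFileCount++;
        }
      }
    }
    return byPartition;
  }
}
```

In this sketch the only state kept is one small `Stats` object per partition, and data and delete entries go through the same loop.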
   
   
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho closed issue #7189: Refactor the planning in PartitionTable

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho closed issue #7189: Refactor the planning in PartitionTable
URL: https://github.com/apache/iceberg/issues/7189

