Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/18 08:04:18 UTC

[GitHub] [iceberg] liubo1022126 opened a new issue #3916: Core: Orc data file not support to shouldRollToNewFile

liubo1022126 opened a new issue #3916:
URL: https://github.com/apache/iceberg/issues/3916


   @openinx @rdblue 
   
   I sort data within partitions by a column to improve read performance, for example `insert overwrite tableA partition(pt='20220118') select id,name,age from tableA where pt='20220118' order by id;`. The table is configured with 'write.format.default'='orc' and 'write.target-file-size-bytes'='134217728'.
   
   However, each partition ends up with only one large data file. I found that [ORC files do not currently support checking the target file size before the file is closed](https://github.com/apache/iceberg/pull/1213#discussion_r459197243).
   Because there is only one large data file in every partition, I can't filter data files at planning time as described in https://iceberg.apache.org/#performance/#data-filtering.
   
   So if I want to use the ORC file format, how can I roll to a new file?
   
   By the way, a Flink streaming job rolls to a new file at each checkpoint. How is that different from a batch job? Why can't a batch job roll?
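
To illustrate why rolling matters for sorted data, here is a minimal sketch of a size-based rolling writer and of file pruning by per-file min/max values. It is not Iceberg's implementation: `RollingWriter`, `FileStats`, `files_matching`, and the `rows_divisor` parameter are hypothetical names, and the sketch assumes a fixed size per row so that the current file size can be estimated before the file is closed (the issue above is that this size is not available for an open ORC file).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    """Per-file summary kept after a file is closed (hypothetical)."""
    min_id: int
    max_id: int
    row_count: int


class RollingWriter:
    """Writes row ids and rolls to a new file once a target size is reached.

    Assumption: every row occupies `row_bytes` bytes, so the size of the
    open file can be estimated as row_count * row_bytes.
    """

    def __init__(self, target_file_size_bytes: int, row_bytes: int,
                 rows_divisor: int = 1000):
        self.target = target_file_size_bytes
        self.row_bytes = row_bytes
        self.rows_divisor = rows_divisor  # check the size only every N rows
        self.closed_files: List[FileStats] = []
        self._current: List[int] = []

    def write(self, row_id: int) -> None:
        self._current.append(row_id)
        if self._should_roll():
            self._close_current()

    def _should_roll(self) -> bool:
        n = len(self._current)
        return n % self.rows_divisor == 0 and n * self.row_bytes >= self.target

    def _close_current(self) -> None:
        if self._current:
            self.closed_files.append(
                FileStats(min(self._current), max(self._current),
                          len(self._current)))
            self._current = []

    def close(self) -> List[FileStats]:
        self._close_current()
        return self.closed_files


def files_matching(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Keep only files whose [min_id, max_id] range overlaps [lo, hi]."""
    return [f for f in files if f.max_id >= lo and f.min_id <= hi]


def write_sorted(ids, target_file_size_bytes: int) -> List[FileStats]:
    """Write already-sorted ids with 10-byte rows, checking every 50 rows."""
    writer = RollingWriter(target_file_size_bytes, row_bytes=10, rows_divisor=50)
    for i in ids:
        writer.write(i)
    return writer.close()
```

With sorted input and rolling enabled, each closed file covers a narrow id range, so a filter on `id` can skip most files. Without rolling, the single file spans the whole range and nothing can be skipped.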


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] liubo1022126 commented on issue #3916: Core: Orc data file not support to shouldRollToNewFile

Posted by GitBox <gi...@apache.org>.
liubo1022126 commented on issue #3916:
URL: https://github.com/apache/iceberg/issues/3916#issuecomment-1016164217


   > There's a PR to add this that I just started reviewing yesterday: #3784
   
   That's great. This ability is necessary.
   
   




[GitHub] [iceberg] rdblue commented on issue #3916: Core: Orc data file not support to shouldRollToNewFile

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #3916:
URL: https://github.com/apache/iceberg/issues/3916#issuecomment-1015947073


   There's a PR to add this that I just started reviewing yesterday: https://github.com/apache/iceberg/pull/3784




[GitHub] [iceberg] liubo1022126 removed a comment on issue #3916: Core: Orc data file not support to shouldRollToNewFile

Posted by GitBox <gi...@apache.org>.
liubo1022126 removed a comment on issue #3916:
URL: https://github.com/apache/iceberg/issues/3916#issuecomment-1016164217


   > There's a PR to add this that I just started reviewing yesterday: #3784
   
   That's great. This ability is necessary.
   
   




[GitHub] [iceberg] liubo1022126 commented on issue #3916: Core: Orc data file not support to shouldRollToNewFile

Posted by GitBox <gi...@apache.org>.
liubo1022126 commented on issue #3916:
URL: https://github.com/apache/iceberg/issues/3916#issuecomment-1016165145


   > There's a PR to add this that I just started reviewing yesterday: #3784
   
   That's great and necessary ^^

