Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/27 06:33:36 UTC

[GitHub] [iceberg] hililiwei commented on pull request #3784: ORC:ORC supports rolling writers.

hililiwei commented on pull request #3784:
URL: https://github.com/apache/iceberg/pull/3784#issuecomment-1022891787


   > @hililiwei, can you describe how you're estimating the size of data that is buffered in memory for ORC? I think a description to explain to reviewers would help.
   
   For a file that is still being written, the size is estimated in three parts:
   1. Size of data that has already been written to a `stripe`. This value is obtained by summing the `offset` and `length` of the last stripe of the `writer`.
   2. Size of data that has been submitted to the `writer` but has not yet been written to a stripe. When the OrcFileAppender is created, the `treeWriter` is obtained through reflection, and its `estimateMemory` is used to estimate how much memory is in use.
   3. Size of data that has not yet been submitted to the `writer`, that is, the size of the buffer. The default maximum buffer size is used here.
   
   The sum of these three values is the estimated data size.
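
As a rough illustration of the arithmetic only, the three parts can be combined as in the sketch below. This is not the code from the PR: the class name `OrcLengthEstimate`, the method `estimate`, and all parameter names are hypothetical, and the inputs are assumed to have been read from the ORC writer beforehand (the last stripe's `offset` and `length`, the result of the tree writer's `estimateMemory`, and the default maximum buffer size in bytes).

```java
// Hypothetical helper; names are illustrative and not taken from the PR.
final class OrcLengthEstimate {
    private OrcLengthEstimate() {}

    /**
     * Combines the three parts described above into one estimate.
     *
     * @param lastStripeOffset  offset of the last stripe already written (0 if none)
     * @param lastStripeLength  length of the last stripe already written (0 if none)
     * @param treeWriterMemory  memory reported by the tree writer's estimateMemory
     * @param maxBufferBytes    assumed default maximum size of the not-yet-submitted buffer
     */
    static long estimate(long lastStripeOffset,
                         long lastStripeLength,
                         long treeWriterMemory,
                         long maxBufferBytes) {
        // Part 1: bytes already flushed to stripes end where the last stripe ends.
        long flushedBytes = lastStripeOffset + lastStripeLength;
        // Parts 2 and 3: data held by the writer plus data still in the buffer.
        return flushedBytes + treeWriterMemory + maxBufferBytes;
    }
}
```

Because part 3 uses the maximum buffer size rather than the actual buffered amount, an estimate built this way tends to err on the high side.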
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


