Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/22 08:21:32 UTC

[GitHub] [iceberg] openinx commented on issue #1643: Flink RowDataIterator should support reuseObject flag

openinx commented on issue #1643:
URL: https://github.com/apache/iceberg/issues/1643#issuecomment-714321255


   In most Flink use cases, `FlinkInputFormat` reads a record and emits it to the downstream operator, which means the `RowData` is serialized and then sent as bytes to the next operator. So I think it's right to set `reuse=true` by default; that saves a lot of object allocation in the JVM.
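
   To illustrate why reuse is safe when each record is serialized right away, and why it is not when references are held, here is a minimal sketch. It does not use Flink: `Row`, `serialize`, `emitSerialized`, and `bufferReferences` are hypothetical stand-ins, not Iceberg or Flink APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseAliasingSketch {

  /** Hypothetical stand-in for a mutable RowData with a single field. */
  static final class Row {
    long value;
  }

  /** Stand-in for serialization: snapshots the row's current state into an immutable String. */
  static String serialize(Row row) {
    return Long.toString(row.value);
  }

  /** Serializes each record immediately, mirroring the serialize-then-send step described above. */
  static List<String> emitSerialized(int count) {
    Row reused = new Row();
    List<String> out = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      reused.value = i;            // overwrite the single reused instance
      out.add(serialize(reused));  // state is captured before the next overwrite
    }
    return out;
  }

  /** Buffers references to the reused object instead; every entry aliases the same instance. */
  static List<Row> bufferReferences(int count) {
    Row reused = new Row();
    List<Row> out = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      reused.value = i;
      out.add(reused);             // no copy: all entries end up showing the last value
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(emitSerialized(3));       // prints [0, 1, 2]
    List<Row> buffered = bufferReferences(3);
    System.out.println(buffered.get(0).value);   // prints 2
  }
}
```

   The second method shows the hazard for any consumer that holds on to records across calls to `next()`: without a copy, earlier records are silently overwritten.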
   
   In your batch-read case, the ideal approach would be to allocate a fixed-size array that is reused for every batch read and, when reading a given record, to pass the corresponding element of that array to the reader as the object to reuse. But from reading the code, it seems that `newAvroIterable`, `newParquetIterable`, and `newOrcIterable` return an Iterable whose iterator's `next()` method has no way to accept a `reused` instance. So we have to copy each `RowData` from the iterator into the fixed-size array, but we can still reuse the array itself to avoid allocating too many short-lived (young-generation) objects.
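
   A rough sketch of the copy-into-a-reused-array fallback follows. It is only an illustration under stated assumptions: `Row`, `ReusingIterator`, and `Batch` are hypothetical stand-ins written for this example, not the actual `RowData` or Iceberg reader classes, and a real implementation would copy `RowData` through whatever copy mechanism Flink provides.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class BatchReuseSketch {

  /** Hypothetical stand-in for RowData: one mutable long field. */
  static final class Row {
    long value;

    /** Copies the contents of src into this instance without allocating. */
    void copyFrom(Row src) {
      this.value = src.value;
    }
  }

  /** Mimics a reader with reuse enabled: every next() returns the SAME mutated instance. */
  static final class ReusingIterator implements Iterator<Row> {
    private final Row reused = new Row();
    private final long count;
    private long emitted = 0;

    ReusingIterator(long count) {
      this.count = count;
    }

    @Override
    public boolean hasNext() {
      return emitted < count;
    }

    @Override
    public Row next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      reused.value = emitted++;
      return reused;
    }
  }

  /** Fixed-size batch whose slots are allocated once and overwritten on each fill. */
  static final class Batch {
    final Row[] rows;
    int size;

    Batch(int capacity) {
      rows = new Row[capacity];
      for (int i = 0; i < capacity; i++) {
        rows[i] = new Row();
      }
    }

    /** Copies up to rows.length records from the iterator; returns how many were copied. */
    int fill(Iterator<Row> it) {
      size = 0;
      while (size < rows.length && it.hasNext()) {
        rows[size++].copyFrom(it.next());
      }
      return size;
    }
  }

  public static void main(String[] args) {
    Batch batch = new Batch(4);
    Iterator<Row> it = new ReusingIterator(10);
    while (batch.fill(it) > 0) {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < batch.size; i++) {
        sb.append(batch.rows[i].value).append(' ');
      }
      System.out.println(sb.toString().trim());
    }
  }
}
```

   With a capacity of 4 and 10 records, the loop fills three batches (4, 4, and 2 rows) while allocating only the four slot objects up front; the per-record cost is a field copy rather than a new object.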
   
   Does that make sense?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org