Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/04/09 14:49:44 UTC

[GitHub] [flink] zenfenan edited a comment on issue #11474: FLINK-10114: Add ORC BulkWriter support for StreamingFileSink

URL: https://github.com/apache/flink/pull/11474#issuecomment-611569646
 
 
   @kl0u Hey Kostas. Thanks for the suggestions. I agree with the improvements to `Vectorizer`. I had actually already made local changes to handle the `VectorizedRowBatch` as you suggested, but your point about keeping the schema in one place is a nice one (+1), so I will include that as well. However, do you really think we need to add `addUserMetadata()` to the `Vectorizer`?
   
   @kl0u @JingsongLi 
   I got a reply from the ORC community about the better way to handle a `VectorizedRowBatch` [1]. They suggested that it actually makes little difference, but I think it would be better to go with a single instance of `VectorizedRowBatch` and handle it within the lifecycle functions of `BulkWriter`. Thoughts?
   
   [1] https://lists.apache.org/thread.html/r7c1eb6199c2834bb60ff6e8bb2ac7cb66d8912b86164ac7a6f3aaaf4%40%3Cuser.orc.apache.org%3E
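
   For illustration, the single-instance approach could look roughly like the sketch below. It is a minimal, dependency-free outline, not the code in this PR: `RowBatch` and `RecordingWriter` are hypothetical stand-ins for ORC's `VectorizedRowBatch` and the underlying ORC writer, and `SingleBatchBulkWriter` only mirrors the `addElement` / `flush` / `finish` lifecycle of `BulkWriter`. The assumption is that one batch is allocated up front, written out and reset whenever it fills, and written one last time on `flush()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ORC's VectorizedRowBatch: a fixed-capacity row buffer.
final class RowBatch {
    final String[] rows;
    int size = 0;

    RowBatch(int maxSize) { rows = new String[maxSize]; }

    int getMaxSize() { return rows.length; }

    void reset() { size = 0; }
}

// Hypothetical stand-in for the underlying ORC writer; it records what it receives.
final class RecordingWriter {
    final List<List<String>> writtenBatches = new ArrayList<>();
    boolean closed = false;

    void addRowBatch(RowBatch batch) {
        // Copy the rows, because the caller reuses the same batch instance.
        List<String> copy = new ArrayList<>();
        for (int i = 0; i < batch.size; i++) {
            copy.add(batch.rows[i]);
        }
        writtenBatches.add(copy);
    }

    void close() { closed = true; }
}

// Mirrors the addElement/flush/finish lifecycle while reusing a single batch.
final class SingleBatchBulkWriter {
    private final RowBatch batch;          // allocated once, reused for every write
    private final RecordingWriter writer;

    SingleBatchBulkWriter(RecordingWriter writer, int maxBatchSize) {
        this.writer = writer;
        this.batch = new RowBatch(maxBatchSize);
    }

    void addElement(String element) {
        batch.rows[batch.size++] = element;
        if (batch.size == batch.getMaxSize()) {
            writeAndReset();               // batch is full: hand it over and reuse it
        }
    }

    void flush() {
        if (batch.size > 0) {
            writeAndReset();               // write any rows still buffered
        }
    }

    void finish() {
        flush();
        writer.close();
    }

    private void writeAndReset() {
        writer.addRowBatch(batch);
        batch.reset();
    }
}

class SingleBatchDemo {
    public static void main(String[] args) {
        RecordingWriter writer = new RecordingWriter();
        SingleBatchBulkWriter bulkWriter = new SingleBatchBulkWriter(writer, 2);
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            bulkWriter.addElement(s);
        }
        bulkWriter.finish();
        System.out.println(writer.writtenBatches); // prints [[a, b], [c, d], [e]]
    }
}
```

   With a batch size of 2 and five elements, two full batches are written during `addElement` and the remaining row goes out in `finish()`, so no second batch instance is ever created.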

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services