Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/04/01 17:25:50 UTC

[GitHub] [orc] sv2000 commented on pull request #674: ORC-777: Make the vectorized row batch size configurable in MR record…

sv2000 commented on pull request #674:
URL: https://github.com/apache/orc/pull/674#issuecomment-812057013


   > > @pgaref IIUC, what you are proposing is to have an overloaded constructor in OrcMapreduceRecordReader/Writer that accepts rowBatchSize as an argument (like we do in this PR), but leave it to the caller to override the getRecordWriter() method, which in turn calls the overloaded constructor. This way we don't need to introduce a separate config. LMK if I am missing anything.
   > 
   > Hey @sv2000 -- yes you are on point. Just want to avoid having an extra configuration here as it could confuse users.
   > If we can introduce a new constructor for Reader/Writer and release the change (on 1.6 or 1.7) it could sort out the issue.
   > Thoughts?
   
   @pgaref I am OK with the proposed change. One potential downside, though, is that every user who needs to configure the row batch size will have to override the getRecordWriter() method. But that may be a cost worth paying to avoid the confusion of introducing a new config.
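
   A rough sketch of what this would mean for a caller, under stated assumptions: the classes below are hypothetical stand-ins, not the real OrcMapreduceRecordWriter or OrcOutputFormat, and the default of 1024 and the override value of 4096 are picked only for illustration. It just shows the shape of the pattern: an overloaded constructor that takes rowBatchSize, and a subclass that overrides getRecordWriter() to call it.

```java
// Hypothetical stand-in for a record writer; NOT the real ORC class.
class SketchRecordWriter {
    // Assumed default batch size, used when the caller does not pass one.
    static final int DEFAULT_ROW_BATCH_SIZE = 1024;

    private final int rowBatchSize;

    // Existing-style constructor: keeps the default behaviour.
    SketchRecordWriter() {
        this(DEFAULT_ROW_BATCH_SIZE);
    }

    // Overloaded constructor: the caller chooses the row batch size.
    SketchRecordWriter(int rowBatchSize) {
        if (rowBatchSize <= 0) {
            throw new IllegalArgumentException("rowBatchSize must be positive");
        }
        this.rowBatchSize = rowBatchSize;
    }

    int getRowBatchSize() {
        return rowBatchSize;
    }
}

// Hypothetical stand-in for the output format; NOT the real ORC class.
class SketchOutputFormat {
    // Default path: no extra config, default batch size.
    SketchRecordWriter getRecordWriter() {
        return new SketchRecordWriter();
    }
}

// What each user who wants a different batch size would have to write.
class LargeBatchOutputFormat extends SketchOutputFormat {
    @Override
    SketchRecordWriter getRecordWriter() {
        // 4096 is an arbitrary illustrative value.
        return new SketchRecordWriter(4096);
    }
}

class RowBatchSizeSketch {
    public static void main(String[] args) {
        System.out.println(new SketchOutputFormat().getRecordWriter().getRowBatchSize());
        System.out.println(new LargeBatchOutputFormat().getRecordWriter().getRowBatchSize());
    }
}
```

   The subclass is the per-user cost mentioned above: no new config key, but one small override for every job that needs a non-default size.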


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org