Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2020/09/09 09:51:56 UTC

[GitHub] [parquet-mr] panthony edited a comment on pull request #470: PARQUET-869: Configurable record counts for block size checks

panthony edited a comment on pull request #470:
URL: https://github.com/apache/parquet-mr/pull/470#issuecomment-682457362


   Same here: we have one or two columns whose values vary widely in size (from a few KB up to 10 MB), and we often hit an OutOfMemoryError because the writer does not check the size of the buffered rows in time.
   
   Being able to adjust the check frequency would be a huge help 👍
   
   I have a [rebased branch](https://github.com/cogniteev/parquet-mr/tree/PARQUET-869-configurable-row-group-min-max-record-check) against master if anyone is interested.
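
To illustrate the failure mode described above, here is a minimal, self-contained simulation. It is a simplified model and not the parquet-mr code: the class `BlockSizeCheckSketch`, the method `peakBufferedBytes`, and the parameters `minCheck` and `maxCheck` are hypothetical names, and the row sizes and bounds are illustrative values. It assumes the writer only inspects the buffered size at certain record counts and schedules the next check from the average row size seen so far, so a run of small rows can push the next check far past the point where large rows arrive.

```java
public class BlockSizeCheckSketch {

    /**
     * Simulates buffering rows and flushing only when a size check runs and
     * finds the buffer at or above targetBytes. Returns the peak buffered bytes.
     * minCheck and maxCheck (hypothetical) bound how many records may pass
     * between two consecutive checks.
     */
    static long peakBufferedBytes(long[] rowSizes, long targetBytes, int minCheck, int maxCheck) {
        long buffered = 0;
        long peak = 0;
        long recordCount = 0;
        long nextCheck = minCheck;
        for (long size : rowSizes) {
            buffered += size;
            recordCount++;
            peak = Math.max(peak, buffered);
            if (recordCount < nextCheck) {
                continue; // no size check for this record
            }
            if (buffered >= targetBytes) {
                // Flush the block and start over.
                buffered = 0;
                recordCount = 0;
                nextCheck = minCheck;
            } else {
                // Estimate how many records fill a block from the average so far,
                // then check again halfway there, clamped to the configured bounds.
                long avgRowSize = Math.max(1, buffered / recordCount);
                long estimatedRecords = targetBytes / avgRowSize;
                long halfway = (recordCount + estimatedRecords) / 2;
                nextCheck = Math.min(Math.max(minCheck, halfway), recordCount + maxCheck);
            }
        }
        return peak;
    }

    /** Illustrative workload: 100 rows of 1 KB followed by 200 rows of 10 MB. */
    static long[] sampleRows() {
        long[] rows = new long[300];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = i < 100 ? 1024L : 10L * 1024 * 1024;
        }
        return rows;
    }

    public static void main(String[] args) {
        long target = 128L * 1024 * 1024; // illustrative 128 MB block size
        System.out.println("loose checks peak: " + peakBufferedBytes(sampleRows(), target, 100, 10000));
        System.out.println("tight checks peak: " + peakBufferedBytes(sampleRows(), target, 1, 1));
    }
}
```

With loose bounds, the small rows at the start lead the model to postpone the next check by thousands of records, so every large row that follows is buffered unchecked. With tight bounds, the buffer is flushed within one row of reaching the target, at the cost of checking far more often.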

