Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/08 02:45:14 UTC

[GitHub] [arrow] kanga333 opened a new pull request #8391: ARROW-10227: [Ruby] Use a table size as the default for parquet chunk_size

kanga333 opened a new pull request #8391:
URL: https://github.com/apache/arrow/pull/8391


   A chunk_size that is too small splits the table into many small row groups, which bloats the Parquet file's metadata and leads to poor read performance. This change sets the default chunk_size to the table size (its number of rows), so that one file is written as one row_group.
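
To illustrate the reasoning above, here is a minimal sketch in plain Ruby of how the number of row groups grows as chunk_size shrinks. It assumes that each row group holds at most chunk_size rows; `row_group_count` is a hypothetical helper written for this illustration, not part of the Red Parquet API, and 1024 is only an example value.

```ruby
# Hypothetical helper: how many row groups a table of n_rows rows is split
# into, assuming each row group holds at most chunk_size rows.
def row_group_count(n_rows, chunk_size)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?

  # Integer ceiling division: a partially filled last chunk still counts.
  (n_rows + chunk_size - 1) / chunk_size
end

n_rows = 1_000_000

# A small chunk_size (example value) yields many row groups, each of which
# carries its own metadata in the file footer.
small_chunks = row_group_count(n_rows, 1024)

# Using the table size as chunk_size yields a single row group.
whole_table = row_group_count(n_rows, n_rows)

puts "chunk_size=1024: #{small_chunks} row groups"
puts "chunk_size=#{n_rows}: #{whole_table} row group"
```

Under that assumption, the per-row-group metadata scales with the first number, which is why using the table size as the default keeps the file to a single row group.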
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8391: ARROW-10227: [Ruby] Use a table size as the default for parquet chunk_size

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8391:
URL: https://github.com/apache/arrow/pull/8391#issuecomment-705297244


   https://issues.apache.org/jira/browse/ARROW-10227





[GitHub] [arrow] kou closed pull request #8391: ARROW-10227: [Ruby] Use a table size as the default for parquet chunk_size

Posted by GitBox <gi...@apache.org>.
kou closed pull request #8391:
URL: https://github.com/apache/arrow/pull/8391


   

