Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/20 18:59:47 UTC

[GitHub] [arrow-rs] tustvold opened a new issue #1213: ArrowWriter Row Group Byte Size Limit

tustvold opened a new issue #1213:
URL: https://github.com/apache/arrow-rs/issues/1213


   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   Currently `ArrowWriter` uses `max_row_group_size` as a row count limit. Whilst this is significantly simpler to implement, it is at odds with other Arrow implementations that use a byte threshold.
   
   **Describe the solution you'd like**
   
   Any or all of:
   
   * Clearly document what `max_row_group_size` is used for and how it differs from the other size quantities in `WriterProperties`
   * Assess whether the `DEFAULT_MAX_ROW_GROUP_SIZE` of `128 * 1024 * 1024` makes sense, given that it counts rows (roughly 134 million) rather than bytes
   * Add functionality to flush based on a byte threshold instead of, or in addition to, the current row threshold
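
As a rough illustration of the last option, here is a minimal sketch of a flush decision that honours both a row limit and an optional byte limit. It assumes the caller can supply an estimated byte size for each batch. `FlushPolicy`, `RowGroupBuffer`, `push_batch` and `flush` are hypothetical names invented for this example and are not part of the existing `ArrowWriter` or `WriterProperties` API.

```rust
/// Hypothetical flush policy (not an existing parquet crate type).
/// `max_rows` mirrors the current row count limit; `max_bytes` is the
/// proposed optional byte threshold.
#[derive(Debug, Clone, Copy)]
struct FlushPolicy {
    max_rows: usize,
    max_bytes: Option<usize>,
}

/// Hypothetical running totals for the row group currently being buffered.
#[derive(Debug, Default)]
struct RowGroupBuffer {
    rows: usize,
    bytes: usize,
}

impl RowGroupBuffer {
    /// Record a batch and report whether the row group should now be flushed.
    /// `bytes` is an estimate supplied by the caller (an assumption of this sketch).
    fn push_batch(&mut self, rows: usize, bytes: usize, policy: &FlushPolicy) -> bool {
        self.rows += rows;
        self.bytes += bytes;
        let rows_hit = self.rows >= policy.max_rows;
        let bytes_hit = policy.max_bytes.map_or(false, |limit| self.bytes >= limit);
        rows_hit || bytes_hit
    }

    /// Reset the totals and return what was buffered as (rows, bytes).
    fn flush(&mut self) -> (usize, usize) {
        let flushed = (self.rows, self.bytes);
        *self = Self::default();
        flushed
    }
}

fn main() {
    // Illustrative limits only: 1M rows or 128 MiB, whichever is reached first.
    let policy = FlushPolicy {
        max_rows: 1_000_000,
        max_bytes: Some(128 * 1024 * 1024),
    };
    let mut buffer = RowGroupBuffer::default();

    // Simulate batches of 8192 rows at an estimated 1 KiB per row (8 MiB each).
    for i in 0..20 {
        if buffer.push_batch(8192, 8192 * 1024, &policy) {
            let (rows, bytes) = buffer.flush();
            println!("batch {i}: flushed {rows} rows, {bytes} bytes");
        }
    }
}
```

With these illustrative numbers the byte limit is reached long before the row limit, which is the situation a rows-only threshold cannot express. Setting `max_bytes` to `None` reproduces the current rows-only behaviour.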
   

