Posted to github@arrow.apache.org by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/06/09 04:49:15 UTC

[GitHub] [arrow] mapleFU commented on a diff in pull request #34435: GH-34410: [Python] Allow chunk sizes larger than the default to be used

mapleFU commented on code in PR #34435:
URL: https://github.com/apache/arrow/pull/34435#discussion_r1223840611


##########
python/pyarrow/_parquet.pyx:
##########
@@ -1597,6 +1597,15 @@ cdef shared_ptr[WriterProperties] _create_writer_properties(
         props.encryption(
             (<FileEncryptionProperties>encryption_properties).unwrap())
 
+    # For backwards compatibility reasons we cap the maximum row group size
+    # at 64Mi rows.  This could be changed in the future, though it would be
+    # a breaking change.
+    #
+    # The user can always specify a smaller row group size (and the default
+    # is smaller) when calling write_table.  If the call to write_table uses
+    # a size larger than this then it will be latched to this value.
+    props.max_row_group_length(64*1024*1024)

Review Comment:
   does this cause the bug in https://github.com/apache/arrow/issues/35859#issuecomment-1583614922 ?
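   For context, the behavior the diff's comment describes can be sketched in plain Python. This is a hypothetical illustration of the clamping logic, not the actual Arrow C++ implementation; the function name `effective_row_group_length` is invented for this sketch.

   ```python
   # Sketch of the cap described in the diff: a user-requested row group
   # size larger than 64Mi rows is latched down to the cap, while smaller
   # (including default) sizes pass through unchanged.
   MAX_ROW_GROUP_LENGTH = 64 * 1024 * 1024  # 64Mi rows, per the diff above

   def effective_row_group_length(requested: int) -> int:
       """Return the row group length the writer would actually use."""
       return min(requested, MAX_ROW_GROUP_LENGTH)

   print(effective_row_group_length(1_000_000))          # below the cap: unchanged
   print(effective_row_group_length(100 * 1024 * 1024))  # above the cap: clamped
   ```

   Under this sketch, a `write_table` call asking for 100Mi-row groups would silently produce 64Mi-row groups, which is the kind of surprise the linked issue appears to discuss.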



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org