Posted to github@arrow.apache.org by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/06/25 03:17:20 UTC

[GitHub] [arrow] mapleFU commented on issue #36280: [Python][Parquet] Allow `write_batch` to directly write batch

mapleFU commented on issue #36280:
URL: https://github.com/apache/arrow/issues/36280#issuecomment-1605841418

   @jorisvandenbossche Hi Joris, I'd like to add this interface in Python. Currently the Python code just wraps the batch in a table and calls `write_table`:
   
   ```python
       def write_batch(self, batch, row_group_size=None):
           """
           Write RecordBatch to the Parquet file.
   
           Parameters
           ----------
           batch : RecordBatch
           row_group_size : int, default None
               Maximum number of rows in written row group. If None, the
               row group size will be the minimum of the RecordBatch
               size and 1024 * 1024.  If set larger than 64Mi then 64Mi
               will be used instead.
           """
           table = pa.Table.from_batches([batch], batch.schema)
           self.write_table(table, row_group_size)
   ```
   
   Should I just change it to use `write_record_batch`, or add an argument to control this so that the previous behavior is not broken?
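
To make the second option concrete, here is a minimal sketch of what an opt-in argument could look like. It uses a stand-in class rather than the real `ParquetWriter`, and the `direct` keyword, the `SketchWriter` name, and the Python-level `write_record_batch` method are all hypothetical names chosen for illustration, not existing pyarrow API. How `row_group_size` should interact with the direct path is left open.

```python
class SketchWriter:
    """Stand-in for a Parquet writer; it only records which path was taken."""

    def __init__(self):
        self.calls = []

    def write_table(self, batches, row_group_size=None):
        # Existing path: the batch has been wrapped in a table-like container.
        self.calls.append(("write_table", row_group_size))

    def write_record_batch(self, batch):
        # Hypothetical direct path that skips the table conversion.
        self.calls.append(("write_record_batch", None))

    def write_batch(self, batch, row_group_size=None, direct=False):
        # `direct` is an assumed keyword; it defaults to False so that
        # existing callers keep the current write_table behavior.
        if direct:
            self.write_record_batch(batch)
        else:
            self.write_table([batch], row_group_size)


writer = SketchWriter()
writer.write_batch("batch-0", row_group_size=10)  # old behavior
writer.write_batch("batch-1", direct=True)        # opt-in direct write
```

With a default of `False`, nothing changes for existing users; the trade-off is one more keyword on the public method compared with simply switching the implementation.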

