Posted to github@arrow.apache.org by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/06/25 03:17:20 UTC
[GitHub] [arrow] mapleFU commented on issue #36280: [Python][Parquet] Allow `write_batch` to directly write batch
mapleFU commented on issue #36280:
URL: https://github.com/apache/arrow/issues/36280#issuecomment-1605841418
@jorisvandenbossche Hi Joris, I'd like to add this interface in Python. Currently the Python code just converts the batch to a table and calls `write_table`:
```python
    def write_batch(self, batch, row_group_size=None):
        """
        Write RecordBatch to the Parquet file.

        Parameters
        ----------
        batch : RecordBatch
        row_group_size : int, default None
            Maximum number of rows in written row group. If None, the
            row group size will be the minimum of the RecordBatch
            size and 1024 * 1024. If set larger than 64Mi then 64Mi
            will be used instead.
        """
        # Wrap the single batch in a Table, then reuse the table writer.
        table = pa.Table.from_batches([batch], batch.schema)
        self.write_table(table, row_group_size)
```
Should I just change it to call `write_record_batch`, or add an argument to control this so the previous behavior isn't broken?