Posted to github@arrow.apache.org by "mapleFU (via GitHub)" <gi...@apache.org> on 2023/06/26 13:31:04 UTC
[GitHub] [arrow] mapleFU commented on pull request #36286: GH-36280: [Python][Parquet] Export C++ WriteRecordBatch in Python API
mapleFU commented on PR #36286:
URL: https://github.com/apache/arrow/pull/36286#issuecomment-1607482752
Hi, I'd like to add an interface for this in Python. Previously, the Python code just used `write_table`:
```
def write_batch(self, batch, row_group_size=None):
    """
    Write RecordBatch to the Parquet file.

    Parameters
    ----------
    batch : RecordBatch
    row_group_size : int, default None
        Maximum number of rows in written row group. If None, the
        row group size will be the minimum of the RecordBatch
        size and 1024 * 1024. If set larger than 64Mi then 64Mi
        will be used instead.
    """
    table = pa.Table.from_batches([batch], batch.schema)
    self.write_table(table, row_group_size)
```
Should I just change it to call `write_record_batch`, or add an argument that controls which path is used, so that the previous behavior is not broken? @pitrou @jorisvandenbossche
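To make the second option concrete, here is a rough sketch of what an opt-in argument could look like. It is only an illustration under assumptions: `SketchWriter` is a stand-in class rather than the real writer, the `use_record_batch` flag and the Python-level `write_record_batch` method are hypothetical names, and strings are used in place of real RecordBatch objects so the snippet runs without pyarrow.

```python
class SketchWriter:
    """Stand-in for the Parquet writer; it only records which path each call takes."""

    def __init__(self):
        self.calls = []

    def write_table(self, table, row_group_size=None):
        # Existing path: the batch has already been wrapped in a table.
        self.calls.append(("write_table", row_group_size))

    def write_record_batch(self, batch):
        # Hypothetical binding for the C++ WriteRecordBatch.
        self.calls.append(("write_record_batch", None))

    def write_batch(self, batch, row_group_size=None, use_record_batch=False):
        # `use_record_batch` is a hypothetical flag; the default keeps the old behavior.
        if use_record_batch:
            # Opt-in path; row_group_size is not forwarded in this sketch.
            self.write_record_batch(batch)
        else:
            # Placeholder for pa.Table.from_batches([batch], batch.schema).
            table = [batch]
            self.write_table(table, row_group_size)


writer = SketchWriter()
writer.write_batch("batch-0", row_group_size=1024)    # old path, unchanged
writer.write_batch("batch-1", use_record_batch=True)  # new opt-in path
print(writer.calls)
```

With a default of `False`, existing callers would keep going through `write_table`, and only callers that pass the flag would reach the new path. How `row_group_size` should interact with the new path is left open here.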