Posted to user@arrow.apache.org by Yifei Yang <yi...@eng.ucsd.edu> on 2020/08/07 18:10:09 UTC

[C++] Question about TableBatchReader

Hello,

I'm using TableBatchReader to process tables. Is there any way to set the
batch size, i.e. the number of tuples in a record batch? Sometimes a batch
contains only tens of tuples, which slows down processing a lot. I tried
TableBatchReader::set_chunksize(), but it made no difference. Thanks!

Re: [C++] Question about TableBatchReader

Posted by Wes McKinney <we...@gmail.com>.
TableBatchReader doesn't do any array concatenation -- it only
iterates through each chunk of the table that can be represented in a
RecordBatch. When the columns of the table have different chunk
layouts, this can result in small batches.
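
To make the effect of chunk layouts concrete, here is a minimal sketch that
models the behaviour described above using only the standard library. It is
not Arrow code: ChunkLayout and BatchSizes are hypothetical names introduced
for illustration, and each column is represented only by the lengths of its
chunks. The max_chunksize parameter is assumed to act as an upper bound, as
set_chunksize() is documented to do, so it can shrink a batch but never
enlarge one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical model, not part of Arrow: a column is described only by the
// lengths of its chunks, e.g. {100, 100} is a 200-row column in two chunks.
using ChunkLayout = std::vector<int64_t>;

// Returns the row count of each batch when every batch must lie inside a
// single chunk of every column and no chunks are concatenated.
// max_chunksize is an upper bound only; it never makes a batch larger.
std::vector<int64_t> BatchSizes(
    const std::vector<ChunkLayout>& columns,
    int64_t max_chunksize = std::numeric_limits<int64_t>::max()) {
  std::vector<int64_t> sizes;
  if (columns.empty()) return sizes;
  assert(max_chunksize > 0);

  // All columns of a table must have the same total number of rows.
  const int64_t num_rows =
      std::accumulate(columns[0].begin(), columns[0].end(), int64_t{0});
  for (const ChunkLayout& col : columns) {
    assert(std::accumulate(col.begin(), col.end(), int64_t{0}) == num_rows);
  }

  // Per-column cursor: current chunk and offset inside that chunk.
  std::vector<std::size_t> chunk_index(columns.size(), 0);
  std::vector<int64_t> chunk_offset(columns.size(), 0);

  int64_t position = 0;
  while (position < num_rows) {
    int64_t length = std::min(max_chunksize, num_rows - position);
    for (std::size_t i = 0; i < columns.size(); ++i) {
      // Step past chunks that are used up (or empty).
      while (chunk_offset[i] == columns[i][chunk_index[i]]) {
        ++chunk_index[i];
        chunk_offset[i] = 0;
      }
      // A batch can reach only to the nearest chunk boundary of any column.
      length = std::min(length, columns[i][chunk_index[i]] - chunk_offset[i]);
    }
    for (std::size_t i = 0; i < columns.size(); ++i) {
      chunk_offset[i] += length;
    }
    sizes.push_back(length);
    position += length;
  }
  return sizes;
}
```

In this model, two 200-row columns chunked as {100, 100} and {30, 170} yield
batches of 30, 70 and 100 rows, and raising max_chunksize changes nothing
because the chunk boundaries are the limiting factor. If both columns consist
of a single 200-row chunk, the result is one 200-row batch. That suggests one
possible workaround, offered here as an assumption rather than something
stated in this thread: consolidate the chunks first (Arrow's
Table::CombineChunks() exists for that purpose) and then read batches from
the combined table.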