Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/20 06:54:22 UTC
[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner
zhixingheyi-tian commented on a change in pull request #11763:
URL: https://github.com/apache/arrow/pull/11763#discussion_r772117408
##########
File path: cpp/src/arrow/dataset/file_orc.cc
##########
@@ -85,24 +85,20 @@ class OrcScanTask : public ScanTask {
included_fields.push_back(name);
}
+ std::shared_ptr<RecordBatchReader> recordBatchReader;
+ reader->NextStripeReader(scan_options.batch_size, included_fields, &recordBatchReader);
Review comment:
@jorisvandenbossche
Our recent testing found that NextStripeReader reads only the first stripe of large files. I will fix this as soon as possible.
Is anyone already working on this issue? If so, we can discuss and solve it together.
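To make the symptom concrete, here is a minimal, self-contained sketch of the difference between calling a per-stripe reader once and calling it until the file is exhausted. It does not use the Arrow API: MockOrcReader, MockStripeBatchReader, and both counting functions are hypothetical stand-ins, and the "return a null reader when no stripes remain" convention is an assumption of the mock rather than a statement about how the real reader behaves.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reader over a single stripe (not the Arrow API).
class MockStripeBatchReader {
 public:
  MockStripeBatchReader(int64_t rows, int64_t batch_size)
      : remaining_(rows), batch_size_(batch_size) {}

  // Returns the number of rows in the next batch, or 0 once the stripe is done.
  int64_t ReadNext() {
    const int64_t n = std::min(remaining_, batch_size_);
    remaining_ -= n;
    return n;
  }

 private:
  int64_t remaining_;
  int64_t batch_size_;
};

// Hypothetical stand-in for a file reader that hands out one stripe per call.
class MockOrcReader {
 public:
  explicit MockOrcReader(std::vector<int64_t> stripe_rows)
      : stripe_rows_(std::move(stripe_rows)) {}

  // Assumption of this mock: returns nullptr when every stripe has been consumed.
  std::unique_ptr<MockStripeBatchReader> NextStripeReader(int64_t batch_size) {
    if (next_ >= stripe_rows_.size()) return nullptr;
    return std::make_unique<MockStripeBatchReader>(stripe_rows_[next_++], batch_size);
  }

 private:
  std::vector<int64_t> stripe_rows_;
  std::size_t next_ = 0;
};

// Mirrors the reported symptom: one call, so only the first stripe is read.
int64_t CountRowsFirstStripeOnly(MockOrcReader& reader, int64_t batch_size) {
  int64_t total = 0;
  auto stripe = reader.NextStripeReader(batch_size);
  if (!stripe) return total;
  while (int64_t n = stripe->ReadNext()) total += n;
  return total;
}

// Keeps requesting stripe readers until none are left, then drains each one.
int64_t CountRowsAllStripes(MockOrcReader& reader, int64_t batch_size) {
  int64_t total = 0;
  while (auto stripe = reader.NextStripeReader(batch_size)) {
    while (int64_t n = stripe->ReadNext()) total += n;
  }
  return total;
}
```

With stripes of 100, 250, and 30 rows and a batch size of 64, the single-call version counts 100 rows while the looping version counts all 380. A fix in the scan task would presumably need the same outer loop over stripes.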
Thanks