
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/20 06:54:22 UTC

[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner

zhixingheyi-tian commented on a change in pull request #11763:
URL: https://github.com/apache/arrow/pull/11763#discussion_r772117408



##########
File path: cpp/src/arrow/dataset/file_orc.cc
##########
@@ -85,24 +85,20 @@ class OrcScanTask : public ScanTask {
           included_fields.push_back(name);
         }
 
+        std::shared_ptr<RecordBatchReader> recordBatchReader;
+        reader->NextStripeReader(scan_options.batch_size, included_fields, &recordBatchReader);

Review comment:
       @jorisvandenbossche 
   
   Our recent testing found that NextStripeReader only reads the first stripe of large files. I will fix this as soon as possible.
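
To illustrate the symptom, here is a minimal, self-contained sketch. It does not use the real Arrow classes: `MockOrcReader`, `StripeBatches`, and both `CountRows...` helpers are hypothetical stand-ins, and the sketch assumes a reader that covers exactly one stripe per `NextStripeReader` call and returns a null pointer once the file is exhausted. Under that assumption, a single call (as in the diff above) sees only the first stripe, while a loop sees every stripe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-stripe batch reader (not the Arrow class).
struct StripeBatches {
  std::vector<std::int64_t> batch_rows;  // rows in each batch of one stripe
};

// Hypothetical stand-in for an ORC file reader. Assumption: each call to
// NextStripeReader covers exactly one stripe and returns nullptr once all
// stripes have been handed out.
class MockOrcReader {
 public:
  explicit MockOrcReader(std::vector<std::int64_t> stripe_rows)
      : stripe_rows_(std::move(stripe_rows)) {}

  std::shared_ptr<StripeBatches> NextStripeReader(std::int64_t batch_size) {
    assert(batch_size > 0);
    if (next_stripe_ >= stripe_rows_.size()) return nullptr;
    auto out = std::make_shared<StripeBatches>();
    std::int64_t remaining = stripe_rows_[next_stripe_++];
    // Split the stripe into batches of at most batch_size rows.
    while (remaining > 0) {
      const std::int64_t n = std::min(batch_size, remaining);
      out->batch_rows.push_back(n);
      remaining -= n;
    }
    return out;
  }

 private:
  std::vector<std::int64_t> stripe_rows_;
  std::size_t next_stripe_ = 0;
};

// Total number of rows across all batches of one stripe.
std::int64_t SumRows(const StripeBatches& stripe) {
  std::int64_t total = 0;
  for (std::int64_t n : stripe.batch_rows) total += n;
  return total;
}

// Mirrors the pattern in the diff: one call, so only the first stripe is read.
std::int64_t CountRowsSingleCall(MockOrcReader& reader, std::int64_t batch_size) {
  auto stripe = reader.NextStripeReader(batch_size);
  return stripe ? SumRows(*stripe) : 0;
}

// Keeps requesting stripe readers until the mock reports exhaustion.
std::int64_t CountRowsAllStripes(MockOrcReader& reader, std::int64_t batch_size) {
  std::int64_t total = 0;
  while (auto stripe = reader.NextStripeReader(batch_size)) {
    total += SumRows(*stripe);
  }
  return total;
}
```

With a mock file of three stripes holding 5, 7, and 3 rows, `CountRowsSingleCall` returns 5 and `CountRowsAllStripes` returns 15. Whether the real fix should loop like this or drive the scan some other way is still open.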
   
   Is anyone already working on this issue? We can discuss and solve it together.
   
   Thanks



