Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/12 06:04:49 UTC

[GitHub] [arrow] JkSelf commented on a diff in pull request #14151: ARROW-11776: [C++][Java] Support parquet write from scanner to file

JkSelf commented on code in PR #14151:
URL: https://github.com/apache/arrow/pull/14151#discussion_r993030111


##########
java/dataset/src/main/java/org/apache/arrow/dataset/file/JniWrapper.java:
##########
@@ -45,4 +46,21 @@ private JniWrapper() {
    */
   public native long makeFileSystemDatasetFactory(String uri, int fileFormat);
 
+  /**
+   * Write all record batches in a {@link NativeRecordBatchIterator} into files. This internally
+   * depends on C++ write API: FileSystemDataset::Write.
+   *
+   * @param itr iterator to be used for writing
+   * @param schema serialized schema of output files
+   * @param fileFormat target file format (ID)
+   * @param uri target file uri
+   * @param partitionColumns columns used to partition output files
+   * @param maxPartitions maximum partitions to be included in written files
+   * @param baseNameTemplate file name template used to make partitions. E.g. "dat_{i}", i is current partition
+   *                         ID around all written files.
+   */
+  public native void writeFromScannerToFile(CRecordBatchIterator itr, long schema_address,

Review Comment:
   We already use ArrowArrayStream to share data with the native side. Here, CRecordBatchIterator only iterates over the ArrowArrayStream object. I have renamed CRecordBatchIterator to CArrowArrayStreamIterator. Sorry for the misleading name.


