You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/25 14:33:27 UTC

[GitHub] [arrow] lidavidm commented on a diff in pull request #10883: ARROW-7272: [C++][Java][Dataset] JNI bridge between RecordBatch and VectorSchemaRoot

lidavidm commented on code in PR #10883:
URL: https://github.com/apache/arrow/pull/10883#discussion_r857687429


##########
java/dataset/src/main/java/org/apache/arrow/dataset/scanner/ScanTask.java:
##########
@@ -29,14 +29,7 @@
 public interface ScanTask extends AutoCloseable {

Review Comment:
   N.B. we should eventually remove this class entirely as the corresponding interface no longer exists in C++ ARROW-15745



##########
cpp/src/jni/dataset/jni_wrapper.cc:
##########
@@ -458,25 +416,21 @@ Java_org_apache_arrow_dataset_jni_JniWrapper_getSchemaFromScanner(JNIEnv* env, j
 /*
  * Class:     org_apache_arrow_dataset_jni_JniWrapper
  * Method:    nextRecordBatch
- * Signature: (J)Lorg/apache/arrow/dataset/jni/NativeRecordBatchHandle;
+ * Signature: (JJ)Z
  */
-JNIEXPORT jobject JNICALL Java_org_apache_arrow_dataset_jni_JniWrapper_nextRecordBatch(
-    JNIEnv* env, jobject, jlong scanner_id) {
+JNIEXPORT jboolean JNICALL Java_org_apache_arrow_dataset_jni_JniWrapper_nextRecordBatch(
+    JNIEnv* env, jobject, jlong scanner_id, jlong struct_array) {
   JNI_METHOD_START
   std::shared_ptr<DisposableScannerAdaptor> scanner_adaptor =
       RetrieveNativeInstance<DisposableScannerAdaptor>(scanner_id);
 
   std::shared_ptr<arrow::RecordBatch> record_batch =
       JniGetOrThrow(scanner_adaptor->Next());
   if (record_batch == nullptr) {
-    return nullptr;  // stream ended
+    return false;  // stream ended
   }
-  std::shared_ptr<arrow::Schema> schema = record_batch->schema();
-  jobjectArray field_array =
-      env->NewObjectArray(schema->num_fields(), record_batch_handle_field_class, nullptr);
-
-  std::vector<std::shared_ptr<arrow::Buffer>> buffers;
-  for (int i = 0; i < schema->num_fields(); ++i) {
+  std::vector<std::shared_ptr<arrow::Array>> offset_zeroed_arrays;
+  for (int i = 0; i < record_batch->num_columns(); ++i) {
     // TODO: If the array has an offset then we need to de-offset the array

Review Comment:
   Is this TODO still relevant? It seems it's not a TODO anymore but an explanatory note



##########
java/dataset/src/test/java/org/apache/arrow/dataset/TestDataset.java:
##########
@@ -56,15 +62,31 @@ protected RootAllocator rootAllocator() {
   protected List<ArrowRecordBatch> collectResultFromFactory(DatasetFactory factory, ScanOptions options) {
     final Dataset dataset = factory.finish();
     final Scanner scanner = dataset.newScan(options);
-    final List<ArrowRecordBatch> ret = stream(scanner.scan())
-        .flatMap(t -> stream(t.execute()))
-        .collect(Collectors.toList());
     try {

Review Comment:
   Seems this should actually be a try-with-resources or a try-finally?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org