Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2019/05/06 14:10:42 UTC

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3193: [CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK

URL: https://github.com/apache/carbondata/pull/3193#discussion_r281201809
 
 

 ##########
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java
 ##########
 @@ -94,6 +96,43 @@ public T readNextRow() throws IOException, InterruptedException {
     return currentReader.getCurrentValue();
   }
 
+  /**
+   * Carbon reader will fill the arrow vector after reading the carbondata files.
+   * This arrow byte[] can be used to create arrow table and used for in memory analytics
+   *
+   * Note: create a reader at blocklet level, so that arrow byte[] will not exceed INT_MAX
+   *
+   * @param carbonSchema
+   * @return
+   * @throws Exception
+   */
+  public byte[] readArrowBatch(Schema carbonSchema) throws Exception {
+    ArrowConverter arrowConverter = new ArrowConverter(carbonSchema, 10000);
+    while (hasNext()) {
+      arrowConverter.addToArrowBuffer(readNextBatchRow());
+    }
+    return arrowConverter.toSerializeArray();
+  }
+
+  /**
+   * Carbon reader will fill the arrow vector after reading carbondata files.
+   * Here unsafe memory address will be returned instead of byte[],
+   * so that this address can be sent across java to python or c modules and
+   * can directly read the content from this unsafe memory
 
 Review comment:
   No limitation here, since the unsafe memory is freed after use.
