Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/07/26 17:19:37 UTC

[GitHub] [pinot] klsince commented on issue #9084: Batch Ingestion from Delta Table

klsince commented on issue #9084:
URL: https://github.com/apache/pinot/issues/9084#issuecomment-1195761958

   I read a bit about the Delta Standalone library. A simple flow could look like the one below: open the Delta table with the library and loop through all of its records. The library also supports data filtering, which could be exposed as an advanced option for data ingestion.
   
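   ```
   import io.delta.standalone.DeltaLog;
   import io.delta.standalone.Snapshot;
   import io.delta.standalone.data.CloseableIterator;
   import io.delta.standalone.data.RowRecord;
   import org.apache.hadoop.conf.Configuration;
   
   DeltaLog log = DeltaLog.forTable(new Configuration(), "/data/sales");
   // update() refreshes the log and returns the latest snapshot of the table
   Snapshot snapshot = log.update();
   
   // open() iterates over the records in the snapshot; try-with-resources closes the iterator
   try (CloseableIterator<RowRecord> dataIter = snapshot.open()) {
       while (dataIter.hasNext()) {
           // We get a delta record here, and can convert to pinot GenericRow as far as I can tell
           RowRecord row = dataIter.next();
           int year = row.getInt("year");
           String customer = row.getString("customer");
           float totalCost = row.getFloat("total_cost");
       }
   }
   ```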
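
   The per-record conversion could be sketched roughly as below. This is only a minimal sketch under stated assumptions: `Row`, `ColumnType`, `toFields`, `readAll` and `rowOf` are hypothetical names, `Row` is a stand-in for the `RowRecord` getters so the snippet runs without Delta on the classpath, and a plain `Map` stands in for Pinot's `GenericRow`.

   ```java
   import java.util.ArrayList;
   import java.util.Iterator;
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;

   class DeltaToPinotSketch {
       // Hypothetical stand-in for the RowRecord getters used in the loop above.
       interface Row {
           boolean isNullAt(String field);
           int getInt(String field);
           String getString(String field);
           float getFloat(String field);
       }

       // Hypothetical column types; a real reader would derive them from the Delta table schema.
       enum ColumnType { INT, STRING, FLOAT }

       // Copies one row into a name -> value map, which stands in for a Pinot GenericRow here.
       static Map<String, Object> toFields(Row row, Map<String, ColumnType> columns) {
           Map<String, Object> fields = new LinkedHashMap<>();
           for (Map.Entry<String, ColumnType> column : columns.entrySet()) {
               String name = column.getKey();
               if (row.isNullAt(name)) {
                   fields.put(name, null); // keep nulls explicit instead of calling a primitive getter
                   continue;
               }
               switch (column.getValue()) {
                   case INT: fields.put(name, row.getInt(name)); break;
                   case STRING: fields.put(name, row.getString(name)); break;
                   case FLOAT: fields.put(name, row.getFloat(name)); break;
               }
           }
           return fields;
       }

       // Drains the iterator and converts each row, mirroring the while loop above.
       static List<Map<String, Object>> readAll(Iterator<Row> rows, Map<String, ColumnType> columns) {
           List<Map<String, Object>> out = new ArrayList<>();
           while (rows.hasNext()) {
               out.add(toFields(rows.next(), columns));
           }
           return out;
       }

       // In-memory test double backed by a map, only for trying the sketch out.
       static Row rowOf(Map<String, Object> values) {
           return new Row() {
               public boolean isNullAt(String field) { return values.get(field) == null; }
               public int getInt(String field) { return (Integer) values.get(field); }
               public String getString(String field) { return (String) values.get(field); }
               public float getFloat(String field) { return (Float) values.get(field); }
           };
       }
   }
   ```

   In an actual reader, the map would presumably be replaced by a `GenericRow` filled through `putValue(name, value)`, and the column list would come from the table schema rather than being hard-coded.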


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

