Posted to commits@pinot.apache.org by "fritzb (via GitHub)" <gi...@apache.org> on 2023/07/26 18:00:17 UTC

[GitHub] [pinot] fritzb commented on issue #6768: DBT Pinot support?

fritzb commented on issue #6768:
URL: https://github.com/apache/pinot/issues/6768#issuecomment-1652264822

   I have a simplified DBT use case that I believe could be a first step toward involving Pinot in DBT, and I'd like to start a discussion about whether it is feasible as a short project. The idea is to run the DBT DAG mostly in Trino/Iceberg and, as the last step in DBT, convert the final materialized table from Iceberg to Pinot. This narrows the project down to converting the final materialized table, in Parquet format, into a Pinot table.
   
   The workflow is as follows:
   
   1. Create the model(s) in DBT with the data lake as the output, using Trino/Iceberg.
   2. The very last stage of the DAG is a materialized table that powers the Business Intelligence dashboards. The current output is Iceberg (via Trino/Iceberg).
   3. This is where Pinot comes in, to make the dashboards fast. We aim to convert the last materialized table from Iceberg to a Pinot table. Currently this is done manually: because Pinot does not support Iceberg ingestion, we copy the Iceberg table into plain Parquet files, delete and re-create the schema+table in Pinot, and then ingest the Parquet files into the Pinot table by pointing the ingestion config at an S3 location.
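
To make the manual delete-and-re-create part of step 3 concrete, here is a minimal sketch of the order of Pinot controller REST calls it implies. It only builds the request plan and sends nothing. The function name `plan_table_refresh` is hypothetical, and I am assuming an offline table whose schema has the same name as the table. The table is dropped before the schema on the assumption that the controller refuses to delete a schema that a table still uses.

```python
from urllib.parse import quote


def plan_table_refresh(table_name, schema, table_config):
    """Return the ordered controller requests for a drop-and-recreate refresh.

    Each entry is (HTTP method, path, JSON body or None). Nothing is sent here;
    the caller decides how to execute the plan against the controller.
    Assumption: the schema name is the same as the table name.
    """
    name = quote(table_name, safe="")
    return [
        # Drop the offline table first, then the schema it references.
        ("DELETE", f"/tables/{name}?type=offline", None),
        ("DELETE", f"/schemas/{name}", None),
        # Re-create the schema, then the table that uses it.
        ("POST", "/schemas", schema),
        ("POST", "/tables", table_config),
    ]
```

Ingesting the Parquet files from S3 is left out of the plan on purpose, since how to trigger that step is the open question here.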
   
   @xiangfu0 I was wondering if Step 3 could be automated as a DBT connector by:
   
   1. Detecting the new schema of the materialized table and automatically creating the Pinot schema+table (offline table?).
   2. Inserting the rows of the materialized table into the Pinot table. If inserts via Trino are not supported, can we instruct Pinot to ingest Parquet from an S3 location using an SQL statement? That way, I could write this ingestion instruction as a DBT SQL model.
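
As a rough illustration of the schema-detection part, the sketch below turns a list of (column name, Trino type) pairs into a Pinot schema document. The type mapping is my own assumption and is deliberately incomplete, `build_pinot_schema` and `metric_columns` are hypothetical names, and treating every timestamp column as an epoch-millis date-time field is a simplification rather than a recommendation.

```python
# Assumed mapping from Trino type names to Pinot data types (illustrative, incomplete).
TRINO_TO_PINOT = {
    "boolean": "BOOLEAN",
    "integer": "INT",
    "bigint": "LONG",
    "real": "FLOAT",
    "double": "DOUBLE",
    "varchar": "STRING",
    "timestamp": "TIMESTAMP",
}


def build_pinot_schema(table_name, columns, metric_columns=()):
    """Build a Pinot schema dict from (column name, Trino type) pairs.

    Columns listed in metric_columns become metric fields, timestamp columns
    become date-time fields, and everything else becomes a dimension.
    """
    schema = {
        "schemaName": table_name,
        "dimensionFieldSpecs": [],
        "metricFieldSpecs": [],
        "dateTimeFieldSpecs": [],
    }
    for name, trino_type in columns:
        # Strip parameters such as varchar(32) or timestamp(6).
        base = trino_type.lower().split("(")[0].strip()
        if base not in TRINO_TO_PINOT:
            raise ValueError(f"no Pinot mapping assumed for Trino type {trino_type!r}")
        pinot_type = TRINO_TO_PINOT[base]
        if pinot_type == "TIMESTAMP":
            schema["dateTimeFieldSpecs"].append({
                "name": name,
                "dataType": "TIMESTAMP",
                "format": "1:MILLISECONDS:EPOCH",
                "granularity": "1:MILLISECONDS",
            })
        elif name in metric_columns:
            schema["metricFieldSpecs"].append({"name": name, "dataType": pinot_type})
        else:
            schema["dimensionFieldSpecs"].append({"name": name, "dataType": pinot_type})
    return schema
```

A connector could get the column list from the Iceberg table metadata through Trino and post the resulting document to the Pinot controller. Which columns count as metrics would probably have to come from the DBT model config.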


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

