You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/07/18 23:10:09 UTC

[GitHub] [superset] manojrustagi79 opened a new issue, #20758: Connecting superset to delta lake table

manojrustagi79 opened a new issue, #20758:
URL: https://github.com/apache/superset/issues/20758

   Hi 
   I have implemented delta lake on premise and created some delta tables. Now I want to read the data from delta lake and create the dashboard in Apache superset. I have seen one library which can connect Apache superset with cloud delta lake . Can you please let me know how to connect Apache superset with delta lake? T
   
   Thanks 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] alitrack commented on issue #20758: Connecting superset to delta lake table

Posted by GitBox <gi...@apache.org>.
alitrack commented on issue #20758:
URL: https://github.com/apache/superset/issues/20758#issuecomment-1211540895

    @LittleYmada  @manojrustagi79 
   
   I have two solutions for this,
   
   ###  [DuckDB version](https://mp.weixin.qq.com/s?__biz=MzU1NTg2ODQ5Nw==&mid=2247487993&idx=1&sn=0068a1f79c386f76a3cb14bb79637c9d&chksm=fbcc99c7ccbb10d1be9143be8f1af35f994ff8414ac1b550763603f6e182bb43a443baa9bc8e&token=1997429354&lang=zh_CN#rd)
   
   - install packages
   ```bash
   pip install duckdb-engine deltalake
   # if want to support partition, needs duckdb-0.4.1.dev1057 at least.
   pip install -U --pre duckdb 
   ```
   - superset-config.py
   
   ```python 
   # activate Jinja templating
   FEATURE_FLAGS = {
      # . . . . .
     "ENABLE_TEMPLATE_PROCESSING": True
   }
   
   def delta_table(table_name):
       from deltalake import DeltaTable
       dt = DeltaTable(table_name)
       return dt.file_uris()
   
   JINJA_CONTEXT_ADDONS = {
       'delta_table': delta_table
   }
   ```
   - in SQL Lab
   ```sql
   SELECT * from read_parquet({{delta_table('/Users/steven/data/iris')}})
   
   ```
   want to support partitons,
   ```sql
   SELECT * from parquet_scan({{delta_table('/Users/steven/data/iris1')}},HIVE_PARTITIONING=true)
   
   ```
   
   ![image](https://user-images.githubusercontent.com/20972179/184063966-42797a1d-ac9a-4206-92dd-b5e3b5580ee7.png)
   
   
   ### [PostgreSQL version ](https://mp.weixin.qq.com/s?__biz=MzU1NTg2ODQ5Nw==&mid=2247487935&idx=1&sn=cd893d1f5ce5acc7cce1f65fa4984d8c&chksm=fbcc9981ccbb1097a5b61c2298536401186d072395935c82ea32402b31b379f2a0d1888d4b26&token=1997429354&lang=zh_CN#rd)
   
   tested on Postgres 14 and PL/Python (Python3.9)
   
   - install packages
   ```bash
   pip install deltalake typing-extensions
   ```
   - plpython function 
   ```sql
   create or replace function read_iris(iris_path text)
   returns table(sepal_length REAL,sepal_width REAL
                 ,petal_length REAL,petal_width REAL
                 ,species text)
   as $$
   from deltalake import DeltaTable
   dt = DeltaTable(iris_path)
   df= dt.to_pandas()
   return df.values.tolist()
   
   $$ language plpython3u;
   ```
   - test function 
   ```sql
   select * from read_iris('/Users/steven/data/iris');
   ```
   ![image](https://user-images.githubusercontent.com/20972179/184064222-ca9be4c2-89e5-45ec-864a-2cc0386ca3ef.png)
   
   
   - then you can create view 
   
   ```
   create or replace view iris as
   select * from read_iris('/Users/steven/data/iris');
   ```
   - or materialized view 
   ```
   create   materialized view iris1 as
   select * from read_iris('/Users/steven/data/iris');
   ```
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] LittleYmada commented on issue #20758: Connecting superset to delta lake table

Posted by GitBox <gi...@apache.org>.
LittleYmada commented on issue #20758:
URL: https://github.com/apache/superset/issues/20758#issuecomment-1211517217

   @manojrustagi79  have you solve this problem? I have similar requests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] mrigbysmoove commented on issue #20758: Connecting superset to delta lake table

Posted by GitBox <gi...@apache.org>.
mrigbysmoove commented on issue #20758:
URL: https://github.com/apache/superset/issues/20758#issuecomment-1188838464

   You'd need a SQLAlchemy-compatible way of reading Delta tables. Either the Spark Thrift server (with the pyhive driver) or Presto may work for on-prem?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] krishnamenon22 commented on issue #20758: Connecting superset to delta lake table

Posted by GitBox <gi...@apache.org>.
krishnamenon22 commented on issue #20758:
URL: https://github.com/apache/superset/issues/20758#issuecomment-1205813929

   @manojrustagi79 do you have a detail here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] manojrustagi79 closed issue #20758: Connecting superset to delta lake table

Posted by GitBox <gi...@apache.org>.
manojrustagi79 closed issue #20758: Connecting superset to delta lake table
URL: https://github.com/apache/superset/issues/20758


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org