Posted to github@arrow.apache.org by "wjones127 (via GitHub)" <gi...@apache.org> on 2023/06/22 04:45:16 UTC
[GitHub] [arrow-datafusion-python] wjones127 commented on issue #414: Show documentation how to use Delta table
wjones127 commented on issue #414:
URL: https://github.com/apache/arrow-datafusion-python/issues/414#issuecomment-1602003194
> I think Delta rust is using Datafusion internally
There are three senses in which we integrate with DataFusion:
1. We use DataFusion components inside our own functions.
2. We have a plugin for Rust DataFusion, but it can only be used from Rust.
3. We can export PyArrow datasets, which datafusion-python can read.
It's only the third one that applies to this library.
> I could not find any documentation though how to use Delta table with Python datafusion
Our integration with Python DataFusion works the same way as our DuckDB integration: create a PyArrow dataset, register it with DataFusion, and query it as needed.
```python
from datafusion import SessionContext
from deltalake import DeltaTable

# Create a DataFusion context
ctx = SessionContext()

# Open the Delta table and expose it as a PyArrow dataset
delta_table = DeltaTable("path/to/your/table")

# register_dataset takes the table name first, then the dataset
ctx.register_dataset("my_table", delta_table.to_pyarrow_dataset())

# Query the registered table with SQL
df = ctx.sql("SELECT * FROM my_table")
```