You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/30 18:03:07 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue, #2659: DataFrame TableProvider Circular Reference

tustvold opened a new issue, #2659:
URL: https://github.com/apache/arrow-datafusion/issues/2659

   **Describe the bug**
   
   Currently `DataFrame` implements `TableProvider` allowing it to be registered on a `SessionContext`, this will often be the same `SessionContext` from which its `Arc<RwLock<SessionState>>` came. For example
   
   ```
   #[tokio::test]
       async fn register_table() -> Result<()> {
           let df = test_table().await?.select_columns(&["c1", "c12"])?;
           let ctx = SessionContext::new();
           let df_impl = Arc::new(DataFrame::new(ctx.state.clone(), &df.plan.clone()));
       
           // register a dataframe as a table
           ctx.register_table("test_table", df_impl.clone())?;
   ```
   
   This will result in a circular reference that will prevent destruction of the SchemaProvider and the DataFrame.
   
   **To Reproduce**
   
   Inspect code
   
   **Expected behavior**
   
   It should not be possible to introduce circular dependencies. On a more holistic level, I'm not entirely sure what the purpose of this API is. Perhaps it could be removed and replaced with a combination of `DataFrame::to_logical_plan` and `ViewTable` (once #2657 is fixed)
   
   **Additional context**
   
   https://github.com/apache/arrow-datafusion/issues/2658
   
   FYI @yjshen @xudong963 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] tustvold commented on issue #2659: DataFrame TableProvider Circular Reference

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #2659:
URL: https://github.com/apache/arrow-datafusion/issues/2659#issuecomment-1142143439

   I was thinking more a `DataFrame::to_table_provider` method, the fact `ViewTable` currently requires a `SessionContext` in its constructor is fixed by #2660 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] matthewmturner commented on issue #2659: DataFrame TableProvider Circular Reference

Posted by GitBox <gi...@apache.org>.
matthewmturner commented on issue #2659:
URL: https://github.com/apache/arrow-datafusion/issues/2659#issuecomment-1141629687

   Sounds good to me - is your idea to add a method like a `ctx.register_dataframe` method that would convert the `DataFrame` to a logical plan and register as `ViewTable`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] matthewmturner commented on issue #2659: DataFrame TableProvider Circular Reference

Posted by GitBox <gi...@apache.org>.
matthewmturner commented on issue #2659:
URL: https://github.com/apache/arrow-datafusion/issues/2659#issuecomment-1141425501

   FYI some additional context on this is https://github.com/apache/arrow-datafusion/issues/1698 and https://github.com/apache/arrow-datafusion/pull/1699.  Basically the idea was to enable registering a `DataFrame` as a table i.e. with a name.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] tustvold commented on issue #2659: DataFrame TableProvider Circular Reference

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #2659:
URL: https://github.com/apache/arrow-datafusion/issues/2659#issuecomment-1141431525

   What did you think about this?
   
   > Perhaps it could be removed and replaced with a combination of DataFrame::to_logical_plan and ViewTable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] tustvold closed issue #2659: DataFrame TableProvider Circular Reference

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #2659: DataFrame TableProvider Circular Reference
URL: https://github.com/apache/arrow-datafusion/issues/2659


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org