Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/03/02 19:46:04 UTC

[GitHub] [arrow-datafusion] alamb opened a new pull request, #5455: chore: Remove references from SessionState from physical_plan

alamb opened a new pull request, #5455:
URL: https://github.com/apache/arrow-datafusion/pull/5455

   # Which issue does this PR close?
   Part of https://github.com/apache/arrow-datafusion/issues/1754
   
   # Rationale for this change
   I am trying to extract the `physical_plan` code into its own crate. `SessionState` lives in `datafusion-core`, so `physical_plan` can't refer back to it.
   
   # What changes are included in this PR?
   Change some code to take a `TaskContext` rather than a `SessionState` (the `SessionState` was only being used to create a `TaskContext`).
   
   
   # Are these changes tested?
   Covered by existing tests
   
   # Are there any user-facing changes?
   
   I don't think anyone uses these APIs (they are helpers used by `SessionContext`), but I think they are technically public.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5455: chore: Remove references from SessionState from physical_plan

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5455:
URL: https://github.com/apache/arrow-datafusion/pull/5455#discussion_r1123625469


##########
datafusion/core/src/physical_plan/file_format/parquet.rs:
##########
@@ -803,6 +802,7 @@ mod tests {
     use crate::datasource::file_format::test_util::scan_format;
     use crate::datasource::listing::{FileRange, PartitionedFile};
     use crate::datasource::object_store::ObjectStoreUrl;
+    use crate::execution::context::SessionState;

Review Comment:
   This is for tests, which I think are ok to depend on datafusion core.



##########
datafusion/core/src/physical_plan/file_format/csv.rs:
##########
@@ -280,7 +280,7 @@ impl FileOpener for CsvOpener {
 }
 
 pub async fn plan_to_csv(
-    state: &SessionState,
+    task_ctx: Arc<TaskContext>,

Review Comment:
   The point of the PR is to remove the use of `SessionState` and hoist the creation of `TaskContext` into `SessionContext`.
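
A rough sketch of what "hoisting" means here, using only the standard library. The structs below are simplified, hypothetical stand-ins that share names with the DataFusion types but are not the real definitions; `write_partitions` and the field names are invented for illustration. The idea is that the lower layer only ever sees an `Arc<TaskContext>`, while the upper layer (which owns the `SessionState`) is the one place that builds it.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the session-level state (NOT the real DataFusion type).
struct SessionState {
    session_id: String,
    batch_size: usize,
}

// Hypothetical stand-in for the per-task context (NOT the real DataFusion type).
struct TaskContext {
    session_id: String,
    batch_size: usize,
}

// The conversion lives next to the upper layer, which knows about both types.
impl From<&SessionState> for TaskContext {
    fn from(state: &SessionState) -> Self {
        TaskContext {
            session_id: state.session_id.clone(),
            batch_size: state.batch_size,
        }
    }
}

// Lower layer: takes the task context directly and never names `SessionState`,
// so it could be moved to a crate that does not depend on the upper layer.
fn write_partitions(task_ctx: Arc<TaskContext>, partitions: usize) -> Vec<String> {
    (0..partitions)
        .map(|i| {
            format!(
                "{}:part-{}:batch={}",
                task_ctx.session_id, i, task_ctx.batch_size
            )
        })
        .collect()
}

// Upper layer: owns the state and creates the task context exactly once.
struct SessionContext {
    state: SessionState,
}

impl SessionContext {
    fn write_csv(&self, partitions: usize) -> Vec<String> {
        let task_ctx = Arc::new(TaskContext::from(&self.state));
        write_partitions(task_ctx, partitions)
    }
}
```

With this shape, a caller only needs `SessionContext::write_csv`; the dependency arrow points from the upper layer down to the helper and never back up.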
   
   



##########
datafusion/core/src/physical_plan/file_format/csv.rs:
##########
@@ -300,8 +300,7 @@ pub async fn plan_to_csv(
         let path = fs_path.join(filename);
         let file = fs::File::create(path)?;
         let mut writer = csv::Writer::new(file);
-        let task_ctx = Arc::new(TaskContext::from(state));
-        let stream = plan.execute(i, task_ctx)?;
+        let stream = plan.execute(i, task_ctx.clone())?;

Review Comment:
   Note that `TaskContext` is simply a clone of the state on `SessionState`: https://github.com/apache/arrow-datafusion/blob/a95e0ec2fd929aae1c2f67148243eb4825d81a3b/datafusion/core/src/execution/context.rs#L2157-L2173
   
   so making it once and cloning is probably better than making multiple `TaskContext`s (each of which holds a bunch of `Arc`s)
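
To make the trade-off concrete, here is a small standard-library sketch. `TaskContext` below is a hypothetical two-field stand-in, not the real DataFusion struct, and both helper functions are invented for illustration. Cloning the outer `Arc` bumps a single reference count and shares one allocation; rebuilding the context per partition allocates a new struct each time and bumps the count on every inner `Arc`.

```rust
use std::sync::Arc;

// Hypothetical stand-in: a context that holds a couple of shared handles.
struct TaskContext {
    config: Arc<Vec<String>>,
    runtime: Arc<String>,
}

// Build once, then hand each partition a clone of the outer `Arc`.
// Every returned handle points at the same `TaskContext` allocation.
fn contexts_by_cloning(ctx: &Arc<TaskContext>, n: usize) -> Vec<Arc<TaskContext>> {
    (0..n).map(|_| Arc::clone(ctx)).collect()
}

// Build a fresh context per partition: one new allocation per call,
// plus one reference-count increment for each inner `Arc`.
fn contexts_by_rebuilding(
    config: &Arc<Vec<String>>,
    runtime: &Arc<String>,
    n: usize,
) -> Vec<Arc<TaskContext>> {
    (0..n)
        .map(|_| {
            Arc::new(TaskContext {
                config: Arc::clone(config),
                runtime: Arc::clone(runtime),
            })
        })
        .collect()
}
```

Either way the partitions see the same configuration; the difference is only in how much bookkeeping happens per partition.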





[GitHub] [arrow-datafusion] alamb commented on pull request #5455: chore: Remove references from SessionState from physical_plan

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #5455:
URL: https://github.com/apache/arrow-datafusion/pull/5455#issuecomment-1454888418

   @metesynnada / @mustafasrepo might you have time to review this change? I also think this may touch code that @metesynnada is working on for INSERT / COPY TO.




[GitHub] [arrow-datafusion] alamb merged pull request #5455: chore: Remove references from SessionState from physical_plan

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5455:
URL: https://github.com/apache/arrow-datafusion/pull/5455




[GitHub] [arrow-datafusion] mustafasrepo commented on pull request #5455: chore: Remove references from SessionState from physical_plan

Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo commented on PR #5455:
URL: https://github.com/apache/arrow-datafusion/pull/5455#issuecomment-1457632751

   LGTM! Thanks for this PR.

