You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/26 23:55:21 UTC

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #778: JOIN conditions are order dependent

alamb commented on a change in pull request #778:
URL: https://github.com/apache/arrow-datafusion/pull/778#discussion_r677019621



##########
File path: datafusion/src/logical_plan/builder.rs
##########
@@ -282,14 +282,28 @@ impl LogicalPlanBuilder {
             ));
         }
 
-        let left_keys: Vec<Column> = left_keys
-            .into_iter()
-            .map(|c| c.into().normalize(&self.plan))
-            .collect::<Result<_>>()?;
-        let right_keys: Vec<Column> = right_keys
-            .into_iter()
-            .map(|c| c.into().normalize(right))
-            .collect::<Result<_>>()?;
+        let (left_keys, right_keys): (Vec<Result<Column>>, Vec<Result<Column>>) =

Review comment:
       I don't fully understand this code -- if `LogicalPlanBuilder::join` is called it has two inputs, a left (`self`) and a right (`right`) so the idea that `left_keys` might contain references to `right` is confusing to me.
   
   I wonder if there might be a better place to fix up the errors -- perhaps  somewhere further up in the call stack thatn  `LogicalPlanBuilder::join` -- for example `extract_join_keys` in the SQL planner (especially since this bug seems related to SQL planning)
   
   https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/sql/planner.rs#L372-L380

##########
File path: datafusion/tests/sql.rs
##########
@@ -1754,51 +1779,70 @@ async fn equijoin_and_unsupported_condition() -> Result<()> {
 #[tokio::test]
 async fn left_join() -> Result<()> {
     let mut ctx = create_join_context("t1_id", "t2_id")?;
-    let sql = "SELECT t1_id, t1_name, t2_name FROM t1 LEFT JOIN t2 ON t1_id = t2_id ORDER BY t1_id";
-    let actual = execute(&mut ctx, sql).await;
+    let equivalent_sql = [

Review comment:
       I wonder if this test case and those below it add any additional coverage compared to 
   
   ```rust
   async fn equijoin_multiple_condition_ordering() -> Result<()> {
   ```
   
   Like are there bugs that would cause `equijoin_multiple_condition_ordering` to pass but the others to fail?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org