You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "avantgardnerio (via GitHub)" <gi...@apache.org> on 2023/02/07 18:30:22 UTC

[GitHub] [arrow-datafusion] avantgardnerio opened a new pull request, #5216: Replace placeholders in ScalarSubqueries

avantgardnerio opened a new pull request, #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216

   # Which issue does this PR close?
   
   Closes #5215.
   
   # Rationale for this change
   
   Described in issue.
   
   # What changes are included in this PR?
   
   Replace parameters in scalar subqueries.
   
   # Are these changes tested?
   
   With an integration test, yes.
   
   # Are there any user-facing changes?
   
   Scalar subqueries with parameters should work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216#discussion_r1100515076


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -710,31 +711,39 @@ impl LogicalPlan {
         param_values: &[ScalarValue],
     ) -> Result<Expr, DataFusionError> {
         rewrite_expr(expr, |expr| {
-            if let Expr::Placeholder { id, data_type } = &expr {
-                // convert id (in format $1, $2, ..) to idx (0, 1, ..)
-                let idx = id[1..].parse::<usize>().map_err(|e| {
-                    DataFusionError::Internal(format!(
-                        "Failed to parse placeholder id: {e}"
-                    ))
-                })? - 1;
-                // value at the idx-th position in param_values should be the value for the placeholder
-                let value = param_values.get(idx).ok_or_else(|| {
-                    DataFusionError::Internal(format!(
-                        "No value found for placeholder with id {id}"
-                    ))
-                })?;
-                // check if the data type of the value matches the data type of the placeholder
-                if Some(value.get_datatype()) != *data_type {
-                    return Err(DataFusionError::Internal(format!(
-                        "Placeholder value type mismatch: expected {:?}, got {:?}",
-                        data_type,
-                        value.get_datatype()
-                    )));
+            match &expr {
+                Expr::Placeholder { id, data_type } => {
+                    // convert id (in format $1, $2, ..) to idx (0, 1, ..)
+                    let idx = id[1..].parse::<usize>().map_err(|e| {
+                        DataFusionError::Internal(format!(
+                            "Failed to parse placeholder id: {e}"
+                        ))
+                    })? - 1;
+                    // value at the idx-th position in param_values should be the value for the placeholder
+                    let value = param_values.get(idx).ok_or_else(|| {
+                        DataFusionError::Internal(format!(
+                            "No value found for placeholder with id {id}"
+                        ))
+                    })?;
+                    // check if the data type of the value matches the data type of the placeholder
+                    if Some(value.get_datatype()) != *data_type {
+                        return Err(DataFusionError::Internal(format!(
+                            "Placeholder value type mismatch: expected {:?}, got {:?}",
+                            data_type,
+                            value.get_datatype()
+                        )));
+                    }
+                    // Replace the placeholder with the value
+                    Ok(Expr::Literal(value.clone()))
                 }
-                // Replace the placeholder with the value
-                Ok(Expr::Literal(value.clone()))
-            } else {
-                Ok(expr)
+                Expr::ScalarSubquery(qry) => {

Review Comment:
   I was thinking that the recursion into ScalarSubquery would happen as part of the ExprRewriter itself -- aka https://github.com/apache/arrow-datafusion/blob/e222bd627b6e7974133364fed4600d74b4da6811/datafusion/expr/src/expr_rewriter.rs#L132 
   
   ```rust
    Expr::ScalarSubquery(_) => self.clone(), 
   ```
   Instead of `self.clone()` it would be something like you have in this PR
   
   ```rust
    Expr::ScalarSubquery(qry) => self.clone(), 
        let rewritten_subquery = somehow_rewrite_all_exprs_in_query(rewriter)
        Expr::ScalarSubquery(rewritten_subquery)
   }
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216#discussion_r1099270660


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -710,31 +711,39 @@ impl LogicalPlan {
         param_values: &[ScalarValue],
     ) -> Result<Expr, DataFusionError> {
         rewrite_expr(expr, |expr| {
-            if let Expr::Placeholder { id, data_type } = &expr {
-                // convert id (in format $1, $2, ..) to idx (0, 1, ..)
-                let idx = id[1..].parse::<usize>().map_err(|e| {
-                    DataFusionError::Internal(format!(
-                        "Failed to parse placeholder id: {e}"
-                    ))
-                })? - 1;
-                // value at the idx-th position in param_values should be the value for the placeholder
-                let value = param_values.get(idx).ok_or_else(|| {
-                    DataFusionError::Internal(format!(
-                        "No value found for placeholder with id {id}"
-                    ))
-                })?;
-                // check if the data type of the value matches the data type of the placeholder
-                if Some(value.get_datatype()) != *data_type {
-                    return Err(DataFusionError::Internal(format!(
-                        "Placeholder value type mismatch: expected {:?}, got {:?}",
-                        data_type,
-                        value.get_datatype()
-                    )));
+            match &expr {
+                Expr::Placeholder { id, data_type } => {
+                    // convert id (in format $1, $2, ..) to idx (0, 1, ..)
+                    let idx = id[1..].parse::<usize>().map_err(|e| {
+                        DataFusionError::Internal(format!(
+                            "Failed to parse placeholder id: {e}"
+                        ))
+                    })? - 1;
+                    // value at the idx-th position in param_values should be the value for the placeholder
+                    let value = param_values.get(idx).ok_or_else(|| {
+                        DataFusionError::Internal(format!(
+                            "No value found for placeholder with id {id}"
+                        ))
+                    })?;
+                    // check if the data type of the value matches the data type of the placeholder
+                    if Some(value.get_datatype()) != *data_type {
+                        return Err(DataFusionError::Internal(format!(
+                            "Placeholder value type mismatch: expected {:?}, got {:?}",
+                            data_type,
+                            value.get_datatype()
+                        )));
+                    }
+                    // Replace the placeholder with the value
+                    Ok(Expr::Literal(value.clone()))
                 }
-                // Replace the placeholder with the value
-                Ok(Expr::Literal(value.clone()))
-            } else {
-                Ok(expr)
+                Expr::ScalarSubquery(qry) => {

Review Comment:
   👍 
   
   I double checked this recursion doesn't happen automatically during the walk
   
   https://github.com/apache/arrow-datafusion/blob/e222bd627b6e7974133364fed4600d74b4da6811/datafusion/expr/src/expr_rewriter.rs#L132
   
   
   Though now that I write that, perhaps it should 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio commented on code in PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216#discussion_r1100327305


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -710,31 +711,39 @@ impl LogicalPlan {
         param_values: &[ScalarValue],
     ) -> Result<Expr, DataFusionError> {
         rewrite_expr(expr, |expr| {
-            if let Expr::Placeholder { id, data_type } = &expr {
-                // convert id (in format $1, $2, ..) to idx (0, 1, ..)
-                let idx = id[1..].parse::<usize>().map_err(|e| {
-                    DataFusionError::Internal(format!(
-                        "Failed to parse placeholder id: {e}"
-                    ))
-                })? - 1;
-                // value at the idx-th position in param_values should be the value for the placeholder
-                let value = param_values.get(idx).ok_or_else(|| {
-                    DataFusionError::Internal(format!(
-                        "No value found for placeholder with id {id}"
-                    ))
-                })?;
-                // check if the data type of the value matches the data type of the placeholder
-                if Some(value.get_datatype()) != *data_type {
-                    return Err(DataFusionError::Internal(format!(
-                        "Placeholder value type mismatch: expected {:?}, got {:?}",
-                        data_type,
-                        value.get_datatype()
-                    )));
+            match &expr {
+                Expr::Placeholder { id, data_type } => {
+                    // convert id (in format $1, $2, ..) to idx (0, 1, ..)
+                    let idx = id[1..].parse::<usize>().map_err(|e| {
+                        DataFusionError::Internal(format!(
+                            "Failed to parse placeholder id: {e}"
+                        ))
+                    })? - 1;
+                    // value at the idx-th position in param_values should be the value for the placeholder
+                    let value = param_values.get(idx).ok_or_else(|| {
+                        DataFusionError::Internal(format!(
+                            "No value found for placeholder with id {id}"
+                        ))
+                    })?;
+                    // check if the data type of the value matches the data type of the placeholder
+                    if Some(value.get_datatype()) != *data_type {
+                        return Err(DataFusionError::Internal(format!(
+                            "Placeholder value type mismatch: expected {:?}, got {:?}",
+                            data_type,
+                            value.get_datatype()
+                        )));
+                    }
+                    // Replace the placeholder with the value
+                    Ok(Expr::Literal(value.clone()))
                 }
-                // Replace the placeholder with the value
-                Ok(Expr::Literal(value.clone()))
-            } else {
-                Ok(expr)
+                Expr::ScalarSubquery(qry) => {

Review Comment:
   Where are you suggesting the recursion goes? `self.expressions()`? I'm happy to file a new PR to do it a better way...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] ursabot commented on pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216#issuecomment-1422839508

   Benchmark runs are scheduled for baseline = f0c67193a3d18ff1d94f9dd55bfb1715e5473bf1 and contender = b4cf60acb00294baeed6574b6b08a4963204c23f. b4cf60acb00294baeed6574b6b08a4963204c23f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c8a10df6b6204837b304b7e34cc6c965...0199adc487cd4593b9341c9ab8da0db7/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/1fcffe5021494f838a301038f1258abf...9c9d71d6b3934357a9d9439d48fbdba9/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/6ea1eb07a5b94706a3bf138781bb3d26...31d53adc6f1046469d497505edc67960/)
   [Skipped :warning: Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/ac95625ae40145a1826d9ff617d7ece7...6367ae34bbc445b8ac1bc4be9876a475/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio commented on PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216#issuecomment-1421398431

   @NGA-TRAN PTAL as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio merged pull request #5216: Replace placeholders in ScalarSubqueries

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio merged PR #5216:
URL: https://github.com/apache/arrow-datafusion/pull/5216


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org