You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/07/28 14:43:16 UTC

[GitHub] [arrow-datafusion] alamb opened a new pull request, #7126: Fix panic in filter predicate

alamb opened a new pull request, #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes https://github.com/apache/arrow-datafusion/issues/7125
   
   # Rationale for this change
   Panic in query isn't good
   
   # What changes are included in this PR?
   
   Fix the check if a predicate is supported for range analysis 
   
   # Are these changes tested?
   yes
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #7126: Fix panic in filter predicate

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#discussion_r1277648998


##########
datafusion/core/src/physical_plan/filter.rs:
##########
@@ -191,11 +190,8 @@ impl ExecutionPlan for FilterExec {
     fn statistics(&self) -> Statistics {
         let predicate = self.predicate();
 
-        if let Some(binary) = predicate.as_any().downcast_ref::<BinaryExpr>() {
-            let columns = collect_columns(predicate);
-            if !is_operator_supported(binary.op()) || columns.is_empty() {
-                return Statistics::default();
-            }
+        if !check_support(predicate) {

Review Comment:
   There is an existing function that does the correctly recursive check



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on pull request #7126: fix: Fix panic in filter predicate

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#issuecomment-1655886649

   @berkaysynnada do you perhaps have time to review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #7126: fix: Fix panic in filter predicate

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#discussion_r1277990307


##########
datafusion/core/src/physical_plan/filter.rs:
##########
@@ -191,11 +190,8 @@ impl ExecutionPlan for FilterExec {
     fn statistics(&self) -> Statistics {
         let predicate = self.predicate();
 
-        if let Some(binary) = predicate.as_any().downcast_ref::<BinaryExpr>() {
-            let columns = collect_columns(predicate);
-            if !is_operator_supported(binary.op()) || columns.is_empty() {
-                return Statistics::default();
-            }
+        if !check_support(predicate) {

Review Comment:
   I don't know its purpose -- given that all the tests still pass I think the answer is no, but it would be good to get a second opinion on that



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya merged pull request #7126: fix: Fix panic in filter predicate

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya merged PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #7126: fix: Fix panic in filter predicate

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#discussion_r1277793507


##########
datafusion/core/src/physical_plan/filter.rs:
##########
@@ -191,11 +190,8 @@ impl ExecutionPlan for FilterExec {
     fn statistics(&self) -> Statistics {
         let predicate = self.predicate();
 
-        if let Some(binary) = predicate.as_any().downcast_ref::<BinaryExpr>() {
-            let columns = collect_columns(predicate);
-            if !is_operator_supported(binary.op()) || columns.is_empty() {
-                return Statistics::default();
-            }
+        if !check_support(predicate) {

Review Comment:
   Do we need to keep `columns.is_empty()` check?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on a diff in pull request #7126: fix: Fix panic in filter predicate

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on code in PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#discussion_r1278065861


##########
datafusion/core/src/physical_plan/filter.rs:
##########
@@ -191,11 +190,8 @@ impl ExecutionPlan for FilterExec {
     fn statistics(&self) -> Statistics {
         let predicate = self.predicate();
 
-        if let Some(binary) = predicate.as_any().downcast_ref::<BinaryExpr>() {
-            let columns = collect_columns(predicate);
-            if !is_operator_supported(binary.op()) || columns.is_empty() {
-                return Statistics::default();
-            }
+        if !check_support(predicate) {

Review Comment:
   I guess it means if the predicate doesn't involve any column, we cannot do selectivity check so just return default?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] berkaysynnada commented on pull request #7126: fix: Fix panic in filter predicate

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on PR #7126:
URL: https://github.com/apache/arrow-datafusion/pull/7126#issuecomment-1656592421

   > @berkaysynnada do you perhaps have time to review this PR?
   
   LGTM, thank you for the fix. Actually, it's good that you've removed the default return in the case that there is no column in the predicate. We don't need to provide a default return for those with 0 selective deterministic filters. Now, the calculations will yield a 0 selectivity result for these cases, instead of returning a default (as in the instance of 1>2).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org