You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "mustafasrepo (via GitHub)" <gi...@apache.org> on 2023/06/16 06:40:27 UTC

[GitHub] [arrow-datafusion] mustafasrepo opened a new pull request, #6695: Replace supports_bounded_execution with supports_retract_batch

mustafasrepo opened a new pull request, #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #.
   
   # Rationale for this change
   With the changes in the [#6671](https://github.com/apache/arrow-datafusion/pull/6671), `supports_retract_batch` method is introduced to the `Accumulator` trait. With the introduction of  `supports_retract_batch` method, `supports_bounded_execution` method is no longer necessary for the `AggregateExpr` trait. (Similar to the case we have move `supports_bounded_execution` trait from `BuiltinWindowFunctionExpr` to `PartitionEvalautor` trait.) 
   
   This PR removes `supports_bounded_execution` method from `AggregateExpr` and moves its functionality to `supports_retract_batch` method in the Accumulator` trait for existing accumulators.
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   # What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6695: Replace supports_bounded_execution with supports_retract_batch

Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo commented on code in PR #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695#discussion_r1232319053


##########
datafusion/physical-expr/src/window/partition_evaluator.rs:
##########
@@ -248,7 +248,7 @@ pub trait PartitionEvaluator: Debug + Send {
     ///   (3,4),
     /// ]
     /// ```
-    fn evaluate_with_rank_all(
+    fn evaluate_all_with_rank(

Review Comment:
   I think, `evaluate_all_with_rank` is better name. As part of this PR, I changed the method from `evaluate_with_rank_all` to `evaluate_all_with_rank`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6695: Replace supports_bounded_execution with supports_retract_batch

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695#discussion_r1232417920


##########
datafusion/physical-expr/src/aggregate/mod.rs:
##########
@@ -96,12 +96,6 @@ pub trait AggregateExpr: Send + Sync + Debug + PartialEq<dyn Any> {
         false
     }
 
-    /// Specifies whether this aggregate function can run using bounded memory.

Review Comment:
   THis is nice to have moved this logic entirely to the accumulator 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6695: Replace supports_bounded_execution with supports_retract_batch

Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo commented on code in PR #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695#discussion_r1231892829


##########
datafusion/physical-expr/src/window/aggregate.rs:
##########
@@ -155,8 +155,7 @@ impl WindowExpr for PlainAggregateWindowExpr {
     }
 
     fn uses_bounded_memory(&self) -> bool {
-        self.aggregate.supports_bounded_execution()
-            && !self.window_frame.end_bound.is_unbounded()
+        !self.window_frame.end_bound.is_unbounded()

Review Comment:
   After thinking through this logic. Actually, as long as `end_bound` is not bounded (not `UNBOUNDED FOLLOWING` such as in the form `N FOLLOWING`). We can produce results without waiting for the whole data to come (If accumulator do not support `retract_batch` method. We wouldn't be able to run queries in the form `M PRECEDING and N FOLLOWING`, in this case we will give an error anyway.). Hence here we do not need to check for `self.aggregate.supports_bounded_execution()` (`acc.supports_retract_batch()` method with the new API.) 



##########
datafusion/physical-expr/src/window/sliding_aggregate.rs:
##########
@@ -139,8 +139,7 @@ impl WindowExpr for SlidingAggregateWindowExpr {
     }
 
     fn uses_bounded_memory(&self) -> bool {
-        self.aggregate.supports_bounded_execution()
-            && !self.window_frame.end_bound.is_unbounded()
+        !self.window_frame.end_bound.is_unbounded()

Review Comment:
   Similar to the case above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #6695: Replace supports_bounded_execution with supports_retract_batch

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6695: Replace supports_bounded_execution with supports_retract_batch

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6695:
URL: https://github.com/apache/arrow-datafusion/pull/6695#discussion_r1232417392


##########
datafusion/physical-expr/src/window/aggregate.rs:
##########
@@ -155,8 +155,7 @@ impl WindowExpr for PlainAggregateWindowExpr {
     }
 
     fn uses_bounded_memory(&self) -> bool {
-        self.aggregate.supports_bounded_execution()
-            && !self.window_frame.end_bound.is_unbounded()
+        !self.window_frame.end_bound.is_unbounded()

Review Comment:
   This makes sense to me



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org