Posted to github@arrow.apache.org by "metesynnada (via GitHub)" <gi...@apache.org> on 2023/04/12 11:53:33 UTC

[GitHub] [arrow-datafusion] metesynnada commented on a diff in pull request #5937: Streaming Memory Reservation in SHJ

metesynnada commented on code in PR #5937:
URL: https://github.com/apache/arrow-datafusion/pull/5937#discussion_r1164022590


##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -1339,6 +1370,11 @@ fn combine_two_batches(
 }
 
 impl SymmetricHashJoinStream {
+    fn size(&self) -> usize {

Review Comment:
   Since `HashJoinExec` did not include those features, I tried to replicate the same behavior in previous commits. I have now added the required memory measurements, however. If this implementation is acceptable, we can proceed with it.



##########
datafusion/core/src/physical_plan/joins/symmetric_hash_join.rs:
##########
@@ -1442,6 +1478,9 @@ impl SymmetricHashJoinStream {
                     // Combine results:
                     let result =
                         combine_two_batches(&self.schema, equal_result, anti_result)?;
+                    let capacity = self.size();
+                    self.metrics.stream_memory_usage.set(capacity);
+                    self.reservation.lock().resize(capacity);

Review Comment:
   I added a function called `try_resize`, which calls `try_grow` internally.
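
As a rough illustration of that shape, here is a self-contained sketch of a fallible resize that delegates growth to `try_grow`. `Pool`, `Reservation`, their fields, and the `String` error type are hypothetical stand-ins chosen for this example, not DataFusion's actual memory pool API; the implementation in this PR may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical bounded memory pool shared by several reservations.
#[derive(Debug)]
struct Pool {
    limit: usize,
    used: AtomicUsize,
}

/// Hypothetical per-operator reservation drawn from a `Pool`.
#[derive(Debug)]
struct Reservation {
    pool: Arc<Pool>,
    size: usize,
}

impl Reservation {
    fn new(pool: Arc<Pool>) -> Self {
        Self { pool, size: 0 }
    }

    /// Grow by `additional` bytes; fail without side effects if the
    /// pool limit would be exceeded.
    fn try_grow(&mut self, additional: usize) -> Result<(), String> {
        let limit = self.pool.limit;
        self.pool
            .used
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |used| {
                used.checked_add(additional).filter(|total| *total <= limit)
            })
            .map_err(|used| {
                format!(
                    "cannot reserve {} bytes: {} of {} already used",
                    additional, used, limit
                )
            })?;
        self.size += additional;
        Ok(())
    }

    /// Return up to `amount` bytes to the pool; never fails.
    fn shrink(&mut self, amount: usize) {
        let amount = amount.min(self.size);
        self.pool.used.fetch_sub(amount, Ordering::SeqCst);
        self.size -= amount;
    }

    /// Set the reservation to exactly `capacity` bytes. Growth goes
    /// through `try_grow` and can fail; shrinking always succeeds.
    fn try_resize(&mut self, capacity: usize) -> Result<(), String> {
        if capacity > self.size {
            self.try_grow(capacity - self.size)
        } else {
            self.shrink(self.size - capacity);
            Ok(())
        }
    }
}
```

With a fallible resize like this, the call site in the hunk above could propagate the error with `?` instead of growing unconditionally, so the stream would fail cleanly when the pool is exhausted. A failed `try_resize` leaves the previous reservation size unchanged.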


