Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/24 15:28:28 UTC

[GitHub] [arrow-datafusion] Dandandan commented on a diff in pull request #4355: fix bug: right anti join with filter

Dandandan commented on code in PR #4355:
URL: https://github.com/apache/arrow-datafusion/pull/4355#discussion_r1031636126


##########
datafusion/core/src/physical_plan/joins/hash_join.rs:
##########
@@ -884,32 +904,27 @@ fn build_join_indexes(
             for (row, hash_value) in hash_values.iter().enumerate() {
                 // Get the hash and find it in the build index
 
-                // For every item on the left and right we check if it doesn't match
+                // For every item on the left and right we check if it matches
                 // This possibly contains rows with hash collisions,
                 // So we have to check here whether rows are equal or not
-                // We only produce one row if there is no match
-                let matches = left.0.get(*hash_value, |(hash, _)| *hash_value == *hash);
-                let mut no_match = true;
-                match matches {
-                    Some((_, indices)) => {
-                        for &i in indices {
-                            // Check hash collisions
-                            if equal_rows(
-                                i as usize,
-                                row,
-                                &left_join_values,
-                                &keys_values,
-                                *null_equals_null,
-                            )? {
-                                no_match = false;
-                                break;
-                            }
+                // After get all matched right size and left size, it may be filtered by the join_filter
+                // After filter, use the bitmap and get the unmatched right size row
+                if let Some((_, indices)) =
+                    left.0.get(*hash_value, |(hash, _)| *hash_value == *hash)
+                {
+                    for &i in indices {
+                        // Check hash collisions
+                        if equal_rows(
+                            i as usize,
+                            row,
+                            &left_join_values,
+                            &keys_values,
+                            *null_equals_null,
+                        )? {
+                            left_indices.append(i);
+                            right_indices.append(row as u32);
                         }
                     }

Review Comment:
   Can't this be combined with the other cases such as `JoinType::Inner`? The code looks the same now.
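
   To illustrate the suggestion, here is a minimal sketch of what sharing one match arm could look like. It is not the DataFusion code: `JoinType` is a two-variant stand-in, and `toy_hash`, `build_index`, `matched_pairs`, and `unmatched_right` are hypothetical helpers that use plain `i64` keys and a `HashMap` instead of the real build-side table. The join filter step is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for DataFusion's `JoinType`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum JoinType {
    Inner,
    RightAnti,
}

/// Deliberately weak hash (an assumption for this sketch) so that
/// different keys collide and the equality check below matters.
pub fn toy_hash(key: i64) -> u64 {
    key.rem_euclid(4) as u64
}

/// Build side: map each hash to the left row indices that produced it.
pub fn build_index(left_keys: &[i64]) -> HashMap<u64, Vec<u32>> {
    let mut index: HashMap<u64, Vec<u32>> = HashMap::new();
    for (i, key) in left_keys.iter().enumerate() {
        index.entry(toy_hash(*key)).or_default().push(i as u32);
    }
    index
}

/// Collect (left, right) index pairs for rows whose keys are equal.
pub fn matched_pairs(
    join_type: JoinType,
    build: &HashMap<u64, Vec<u32>>,
    left_keys: &[i64],
    right_keys: &[i64],
) -> (Vec<u32>, Vec<u32>) {
    let mut left_indices = Vec::new();
    let mut right_indices = Vec::new();
    match join_type {
        // Both variants need the same matched pairs, so they share one arm.
        JoinType::Inner | JoinType::RightAnti => {
            for (row, key) in right_keys.iter().enumerate() {
                if let Some(indices) = build.get(&toy_hash(*key)) {
                    for &i in indices {
                        // Check hash collisions by comparing the actual keys.
                        if left_keys[i as usize] == *key {
                            left_indices.push(i);
                            right_indices.push(row as u32);
                        }
                    }
                }
            }
        }
    }
    (left_indices, right_indices)
}

/// Right anti join post-processing: right rows that never matched.
pub fn unmatched_right(right_len: usize, right_indices: &[u32]) -> Vec<u32> {
    let mut seen = vec![false; right_len];
    for &r in right_indices {
        seen[r as usize] = true;
    }
    (0..right_len as u32).filter(|&r| !seen[r as usize]).collect()
}
```

   In this sketch the anti-join-specific work moves to a separate step (`unmatched_right`) that runs on the matched pairs, which is why the index-building arm can be identical for both join types.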


