You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/02 11:03:12 UTC
[GitHub] [arrow-datafusion] Dandandan opened a new pull request #238: Fix left join unmatched rows
Dandandan opened a new pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238
# Which issue does this PR close?
Closes #235
TODO: add test for left join with multiple batches.
# Rationale for this change
Fixes behavior of left join with regard to multiple batches on the right side and 0 right side batches.
Also performance-wise likely is faster as it avoids keeping generating / keeping / indexing into a hash set for each batch.
# What changes are included in this PR?
We add a `Vec<bool>` to keep track of left-side rows that didn't match with the right side.
# Are there any user-facing changes?
Should be
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Dandandan merged pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
Dandandan merged pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#discussion_r624753793
##########
File path: datafusion/src/physical_plan/hash_join.rs
##########
@@ -1025,14 +1053,42 @@ impl Stream for HashJoinStream {
);
self.num_input_batches += 1;
self.num_input_rows += batch.num_rows();
- if let Ok(ref batch) = result {
+ if let Ok((ref batch, ref left_side)) = result {
self.join_time += start.elapsed().as_millis() as usize;
self.num_output_batches += 1;
self.num_output_rows += batch.num_rows();
+
+ if self.join_type == JoinType::Left {
Review comment:
Thanks @houqp I agree, updated the code!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830794080
# [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#238](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ab760af) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.30%`.
> The diff coverage is `96.42%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/238/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #238 +/- ##
==========================================
+ Coverage 76.46% 76.76% +0.30%
==========================================
Files 135 134 -1
Lines 23250 23248 -2
==========================================
+ Hits 17777 17847 +70
+ Misses 5473 5401 -72
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [datafusion/src/physical\_plan/hash\_utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX3V0aWxzLnJz) | `100.00% <ø> (ø)` | |
| [datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX2pvaW4ucnM=) | `87.21% <96.42%> (+1.27%)` | :arrow_up: |
| [ballista/rust/core/src/client.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9jbGllbnQucnM=) | `0.00% <0.00%> (ø)` | |
| [ballista/rust/executor/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvbWFpbi5ycw==) | `0.00% <0.00%> (ø)` | |
| [ballista/rust/scheduler/src/main.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL21haW4ucnM=) | `0.00% <0.00%> (ø)` | |
| [ballista/rust/executor/src/execution\_loop.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZXhlY3V0aW9uX2xvb3AucnM=) | `0.00% <0.00%> (ø)` | |
| [ballista/rust/executor/src/flight\_service.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvZmxpZ2h0X3NlcnZpY2UucnM=) | `0.00% <0.00%> (ø)` | |
| [...ust/core/src/execution\_plans/unresolved\_shuffle.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9leGVjdXRpb25fcGxhbnMvdW5yZXNvbHZlZF9zaHVmZmxlLnJz) | `50.00% <0.00%> (ø)` | |
| [ballista/rust/executor/src/lib.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9leGVjdXRvci9zcmMvbGliLnJz) | | |
| [ballista/rust/scheduler/src/lib.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9zY2hlZHVsZXIvc3JjL2xpYi5ycw==) | `21.13% <0.00%> (+1.62%)` | :arrow_up: |
| ... and [1 more](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...ab760af](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830794080
# [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#238](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (141c708) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.04%`.
> The diff coverage is `98.41%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/238/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #238 +/- ##
==========================================
+ Coverage 76.46% 76.50% +0.04%
==========================================
Files 135 135
Lines 23250 23303 +53
==========================================
+ Hits 17777 17829 +52
- Misses 5473 5474 +1
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [datafusion/src/physical\_plan/hash\_utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX3V0aWxzLnJz) | `100.00% <ø> (ø)` | |
| [datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX2pvaW4ucnM=) | `87.09% <98.41%> (+1.15%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...141c708](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830794080
# [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#238](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (638e604) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.06%`.
> The diff coverage is `98.73%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/238/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #238 +/- ##
==========================================
+ Coverage 76.46% 76.52% +0.06%
==========================================
Files 135 135
Lines 23250 23319 +69
==========================================
+ Hits 17777 17845 +68
- Misses 5473 5474 +1
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [datafusion/src/physical\_plan/hash\_utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX3V0aWxzLnJz) | `100.00% <ø> (ø)` | |
| [datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX2pvaW4ucnM=) | `87.45% <98.73%> (+1.51%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...638e604](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830794080
# [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#238](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2b4fe02) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.06%`.
> The diff coverage is `98.73%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/238/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #238 +/- ##
==========================================
+ Coverage 76.46% 76.52% +0.06%
==========================================
Files 135 135
Lines 23250 23319 +69
==========================================
+ Hits 17777 17845 +68
- Misses 5473 5474 +1
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [datafusion/src/physical\_plan/hash\_utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX3V0aWxzLnJz) | `100.00% <ø> (ø)` | |
| [datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX2pvaW4ucnM=) | `87.45% <98.73%> (+1.51%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...2b4fe02](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Dandandan commented on pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830870451
Merging this when it's green
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #238: Fix left join unmatched rows
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#issuecomment-830794080
# [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#238](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3d7a01b) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/c945b03f3a459a5c15f481f9d52819df56e1090c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c945b03) will **increase** coverage by `0.04%`.
> The diff coverage is `98.41%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/238/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #238 +/- ##
==========================================
+ Coverage 76.46% 76.50% +0.04%
==========================================
Files 135 135
Lines 23250 23303 +53
==========================================
+ Hits 17777 17829 +52
- Misses 5473 5474 +1
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [datafusion/src/physical\_plan/hash\_utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX3V0aWxzLnJz) | `100.00% <ø> (ø)` | |
| [datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi90ZXN0cy9zcWwucnM=) | `99.88% <ø> (ø)` | |
| [datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/238/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvcGh5c2ljYWxfcGxhbi9oYXNoX2pvaW4ucnM=) | `87.09% <98.41%> (+1.15%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c945b03...3d7a01b](https://codecov.io/gh/apache/arrow-datafusion/pull/238?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] houqp commented on a change in pull request #238: Fix Left join implementation is incorrect for 0 or multiple batches on the right side
Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #238:
URL: https://github.com/apache/arrow-datafusion/pull/238#discussion_r624740133
##########
File path: datafusion/src/physical_plan/hash_join.rs
##########
@@ -1025,14 +1053,42 @@ impl Stream for HashJoinStream {
);
self.num_input_batches += 1;
self.num_input_rows += batch.num_rows();
- if let Ok(ref batch) = result {
+ if let Ok((ref batch, ref left_side)) = result {
self.join_time += start.elapsed().as_millis() as usize;
self.num_output_batches += 1;
self.num_output_rows += batch.num_rows();
+
+ if self.join_type == JoinType::Left {
Review comment:
nit, I think it would be better to do an explicit match on the join type enum here to exhaust all cases. that way, when we add outer join in the future, the compiler will remind us to update the check here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org