Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/12 10:06:54 UTC
[GitHub] [arrow] ovr opened a new pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
ovr opened a new pull request #9174:
URL: https://github.com/apache/arrow/pull/9174
Introduce support in DataFusion for GROUP BY on boolean values. The boolean type in Rust (`bool`) implements the `Eq` and `Hash` traits, which allows us to use it in GroupByScalar.
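To illustrate why `Eq` and `Hash` matter here, the following is a minimal sketch and not DataFusion's actual code: `GroupKey` and `count_by_flag` are hypothetical names standing in for a group-by key enum and an aggregation loop. Because `bool` implements both traits, an enum with a boolean variant can derive them and be used directly as a `HashMap` key.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a group-by key type; this is NOT the real
// GroupByScalar definition, only an illustration of the trait requirement.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum GroupKey {
    Boolean(bool),
    Int64(i64),
    Utf8(String),
}

/// Counts rows per boolean group, roughly what
/// `SELECT flag, COUNT(*) FROM t GROUP BY flag` computes.
fn count_by_flag(flags: &[bool]) -> HashMap<GroupKey, usize> {
    let mut counts = HashMap::new();
    for &flag in flags {
        // Deriving Eq + Hash on GroupKey only compiles because every
        // variant's payload (including bool) implements Eq + Hash.
        *counts.entry(GroupKey::Boolean(flag)).or_insert(0) += 1;
    }
    counts
}
```

Calling `count_by_flag(&[true, false, true, true])` yields a map with two groups, one per distinct boolean value.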
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761564205
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=h1) Report
> Merging [#9174](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=desc) (3db6560) into [master](https://codecov.io/gh/apache/arrow/commit/1393188e1aa1b3d59993ce7d4ade7f7ac8570959?el=desc) (1393188) will **decrease** coverage by `0.00%`.
> The diff coverage is `85.71%`.
```diff
@@            Coverage Diff             @@
##           master    #9174      +/-   ##
==========================================
- Coverage   81.61%   81.60%   -0.01%
==========================================
  Files         215      215
  Lines       51867    51885      +18
==========================================
+ Hits        42329    42343      +14
- Misses       9538     9542       +4
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/datafusion/src/physical\_plan/group\_scalar.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2dyb3VwX3NjYWxhci5ycw==) | `68.00% <0.00%> (-2.84%)` | :arrow_down: |
| [...ust/datafusion/src/physical\_plan/hash\_aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfYWdncmVnYXRlLnJz) | `84.91% <100.00%> (+0.11%)` | :arrow_up: |
| [rust/datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfam9pbi5ycw==) | `84.78% <100.00%> (+0.07%)` | :arrow_up: |
| [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.84% <100.00%> (+<0.01%)` | :arrow_up: |
| [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `94.86% <0.00%> (-0.20%)` | :arrow_down: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=footer). Last update [eaa7b7a...3db6560](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
[GitHub] [arrow] Dandandan commented on a change in pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#discussion_r555958013
##########
File path: rust/datafusion/src/physical_plan/hash_join.rs
##########
@@ -447,6 +447,11 @@ pub(crate) fn create_key(
                 // store the string value
                 vec.extend_from_slice(value.as_bytes());
             }
+            DataType::Boolean => {
+                let array = col.as_any().downcast_ref::<BooleanArray>().unwrap();
+                let x: u8 = if array.value(row) { 1 } else { 0 };
Review comment:
This could probably also use `array.value(row) as u8`?
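As a small sketch of the suggestion above: Rust defines the cast `true as u8` as `1` and `false as u8` as `0`, so the explicit branch and the cast produce the same byte. The function names below are hypothetical and exist only for this comparison; `push_bool_key` is an assumed helper, not part of `create_key`.

```rust
/// Explicit branch, as written in the diff under review.
fn bool_to_u8_branch(value: bool) -> u8 {
    if value { 1 } else { 0 }
}

/// Cast form suggested in the review comment.
fn bool_to_u8_cast(value: bool) -> u8 {
    value as u8
}

/// Hypothetical helper: appends one key byte per boolean to a byte buffer.
fn push_bool_key(key: &mut Vec<u8>, value: bool) {
    key.push(value as u8);
}
```

Both forms are equivalent in behavior; the cast is simply shorter.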
----------------------------------------------------------------
[GitHub] [arrow] alamb closed pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
alamb closed pull request #9174:
URL: https://github.com/apache/arrow/pull/9174
----------------------------------------------------------------
[GitHub] [arrow] github-actions[bot] commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-758549335
https://issues.apache.org/jira/browse/ARROW-11220
----------------------------------------------------------------
[GitHub] [arrow] alamb commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761773418
I apologize for the delay in merging Rust PRs -- the 3.0 release is being finalized now and we are planning to minimize entropy by postponing merging changes not critical for the release until the process is complete. I hope the process is complete in the next few days. There is more [discussion](https://lists.apache.org/list.html?dev@arrow.apache.org) on the mailing list.
----------------------------------------------------------------
[GitHub] [arrow] ovr commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
ovr commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761564117
@alamb Rebased, and I added a test. Thanks
----------------------------------------------------------------
[GitHub] [arrow] codecov-io commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761564205
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=h1) Report
> Merging [#9174](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=desc) (03985bb) into [master](https://codecov.io/gh/apache/arrow/commit/1393188e1aa1b3d59993ce7d4ade7f7ac8570959?el=desc) (1393188) will **decrease** coverage by `0.01%`.
> The diff coverage is `0.00%`.
```diff
@@            Coverage Diff             @@
##           master    #9174      +/-   ##
==========================================
- Coverage   81.61%   81.59%   -0.02%
==========================================
  Files         215      215
  Lines       51867    51876       +9
==========================================
  Hits        42329    42329
- Misses       9538     9547       +9
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/datafusion/src/physical\_plan/group\_scalar.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2dyb3VwX3NjYWxhci5ycw==) | `68.00% <0.00%> (-2.84%)` | :arrow_down: |
| [...ust/datafusion/src/physical\_plan/hash\_aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfYWdncmVnYXRlLnJz) | `84.14% <0.00%> (-0.66%)` | :arrow_down: |
| [rust/datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfam9pbi5ycw==) | `84.07% <0.00%> (-0.64%)` | :arrow_down: |
----------------------------------------------------------------
[GitHub] [arrow] alamb commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761140983
I think this PR needs a rebase and perhaps an end-to-end test for grouping (as you did on https://github.com/apache/arrow/pull/9175) and it will be ready to go. Thanks again @ovr !
----------------------------------------------------------------
[GitHub] [arrow] alamb commented on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-764600978
Thanks again for the contribution @ovr
----------------------------------------------------------------
[GitHub] [arrow] codecov-io edited a comment on pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#issuecomment-761564205
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=h1) Report
> Merging [#9174](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=desc) (9eb40b5) into [master](https://codecov.io/gh/apache/arrow/commit/1393188e1aa1b3d59993ce7d4ade7f7ac8570959?el=desc) (1393188) will **increase** coverage by `0.00%`.
> The diff coverage is `86.36%`.
```diff
@@           Coverage Diff           @@
##           master    #9174   +/-   ##
=======================================
  Coverage   81.61%   81.61%
=======================================
  Files         215      215
  Lines       51867    51886   +19
=======================================
+ Hits        42329    42345   +16
- Misses       9538     9541    +3
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9174?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/datafusion/src/physical\_plan/group\_scalar.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2dyb3VwX3NjYWxhci5ycw==) | `68.00% <0.00%> (-2.84%)` | :arrow_down: |
| [...ust/datafusion/src/physical\_plan/hash\_aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfYWdncmVnYXRlLnJz) | `84.91% <100.00%> (+0.11%)` | :arrow_up: |
| [rust/datafusion/src/physical\_plan/hash\_join.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2hhc2hfam9pbi5ycw==) | `84.82% <100.00%> (+0.11%)` | :arrow_up: |
| [rust/datafusion/tests/sql.rs](https://codecov.io/gh/apache/arrow/pull/9174/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3Rlc3RzL3NxbC5ycw==) | `99.84% <100.00%> (+<0.01%)` | :arrow_up: |
----------------------------------------------------------------
[GitHub] [arrow] ovr commented on a change in pull request #9174: ARROW-11220: [Rust] Implement GROUP BY support for Boolean
Posted by GitBox <gi...@apache.org>.
ovr commented on a change in pull request #9174:
URL: https://github.com/apache/arrow/pull/9174#discussion_r558883490
##########
File path: rust/datafusion/src/physical_plan/hash_join.rs
##########
@@ -447,6 +447,11 @@ pub(crate) fn create_key(
                 // store the string value
                 vec.extend_from_slice(value.as_bytes());
             }
+            DataType::Boolean => {
+                let array = col.as_any().downcast_ref::<BooleanArray>().unwrap();
+                let x: u8 = if array.value(row) { 1 } else { 0 };
Review comment:
Thanks
----------------------------------------------------------------