You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/29 13:08:07 UTC

[GitHub] [arrow-rs] HaoYang670 opened a new pull request, #1763: Minor: Clean up the code of MutableArrayData

HaoYang670 opened a new pull request, #1763:
URL: https://github.com/apache/arrow-rs/pull/1763

   Signed-off-by: remzi <13...@gmail.com>
   
   # Which issue does this PR close?
   
   None.
   
   # Rationale for this change
    Make the code cleaner.
   
   # What changes are included in this PR?
   1. use macro to avoid duplicate code.
   2. use short circuit to calculate `use_null`
   ...
   
   # Are there any user-facing changes?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r885351369


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   > Aah yes, was a long day and brain was fried πŸ˜…
   
   Haha, summer is coming!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r884269233


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };

Review Comment:
   I guess there was a bug here, because the input value `use_nulls` was covered.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r884269233


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };

Review Comment:
   I guess there was a bug here, because the input value `use_nulls` was dismissed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] codecov-commenter commented on pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#issuecomment-1140448983

   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1763](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e3d25c6) into [master](https://codecov.io/gh/apache/arrow-rs/commit/6e6a9e148c0f50ed1e0eeebd667245e67d7826b6?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6e6a9e1) will **increase** coverage by `0.00%`.
   > The diff coverage is `45.00%`.
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1763   +/-   ##
   =======================================
     Coverage   83.48%   83.48%           
   =======================================
     Files         196      196           
     Lines       55923    55907   -16     
   =======================================
   - Hits        46686    46676   -10     
   + Misses       9237     9231    -6     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Ξ” | |
   |---|---|---|
   | [arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/1763/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9tb2QucnM=) | `87.41% <45.00%> (+0.56%)` | :arrow_up: |
   | [arrow/src/datatypes/datatype.rs](https://codecov.io/gh/apache/arrow-rs/pull/1763/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9kYXRhdHlwZS5ycw==) | `65.42% <0.00%> (-0.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Ξ” = absolute <relative> (impact)`, `ΓΈ = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6e6a9e1...e3d25c6](https://codecov.io/gh/apache/arrow-rs/pull/1763?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r885193911


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   > I think this is incorrect, as described above use_nulls is a hint, and should be false if the only source of nulls are the arras themselves.
   
   Thank you @tustvold for your review. I am still a little confused about it. 
   This is the truth table in my mind: 
   | the input value of `use_nulls` | `arrays.any(null_count > 0)` | final value of `use_nulls`|
   | -- | -- | -- |
   | true | true | true |
   | true | false | true |
   | false | true | true |
   | false | false | false |
   
   Although I not very clear about the whole logic of the function `with_capacities`, for `use_nulls` I think this is the logic.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r884269233


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };

Review Comment:
   I guess there was a bug here, because the input value `use_nulls` was dismissed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r885059534


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   I think this is incorrect, as described above use_nulls is a hint, and should be false if the only source of nulls are the arras themselves.
   
   To explain why, you might have a number of source arrays that don't contain any nulls, but then call extend_nulls.
   
   If it helps I tried to clean this up a bit in https://github.com/apache/arrow-rs/pull/1225 but abandoned it for lack of time - if you wanted to pick it up again...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r885336269


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   Aah yes, was a long day and brain was fried πŸ˜…



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r884269267


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   Short circuit here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org