Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/24 14:34:30 UTC

[GitHub] [arrow-rs] HaoYang670 opened a new pull request, #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

HaoYang670 opened a new pull request, #1739:
URL: https://github.com/apache/arrow-rs/pull/1739

   Signed-off-by: remzi <13...@gmail.com>
   
   # Which issue does this PR close?
   
   
   Closes #1737.
   
   # Rationale for this change
    1. More user-friendly: the `Option` type of `null_bit_buffer` is exposed in the builder's signature.
    2. Possibly faster: avoids some unnecessary pattern matching.
    3. Fewer `mut` bindings.
   
   # What changes are included in this PR?
   Change the type of the `buf` parameter of `ArrayDataBuilder::null_bit_buffer` from `Buffer` to `Option<Buffer>`.
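
A minimal sketch of what this signature change means for call sites. It does not use the real arrow-rs types: `Buffer` and `SketchBuilder` below are hypothetical stand-ins, and only the shape of `null_bit_buffer` (taking `Option<Buffer>`) follows the PR description.

```rust
/// Hypothetical stand-in for arrow's `Buffer`; for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Buffer(Vec<u8>);

/// Hypothetical, heavily simplified builder; not the real `ArrayDataBuilder`.
#[derive(Debug, Default)]
struct SketchBuilder {
    len: usize,
    null_bit_buffer: Option<Buffer>,
}

impl SketchBuilder {
    fn len(mut self, len: usize) -> Self {
        self.len = len;
        self
    }

    /// Accepts `Option<Buffer>`, so a caller can pass `None` inside the
    /// method chain instead of conditionally reassigning a `mut` builder.
    fn null_bit_buffer(mut self, buf: Option<Buffer>) -> Self {
        self.null_bit_buffer = buf;
        self
    }
}

/// The whole builder is configured in one expression, with no `mut` binding.
fn build(len: usize, null_count: usize, bitmap: Buffer) -> SketchBuilder {
    SketchBuilder::default()
        .len(len)
        .null_bit_buffer(if null_count > 0 { Some(bitmap) } else { None })
}
```

With the previous `Buffer`-typed parameter, the same logic needed a `let mut builder` followed by an `if null_count > 0 { builder = ... }` block, as the removed lines in the diffs in this thread show.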
   
   # Are there any user-facing changes?
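   Yes: callers of `ArrayDataBuilder::null_bit_buffer` must now pass an `Option<Buffer>` instead of a `Buffer`.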
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r881326194


##########
arrow/src/array/builder.rs:
##########
@@ -630,12 +630,15 @@ impl BooleanBuilder {
         let len = self.len();
         let null_bit_buffer = self.bitmap_builder.finish();
         let null_count = len - null_bit_buffer.count_set_bits();
-        let mut builder = ArrayData::builder(DataType::Boolean)
+        let builder = ArrayData::builder(DataType::Boolean)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer);
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   This could be written as `(null_count > 0).then(|| null_bit_buffer)`.
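
For readers unfamiliar with `bool::then`, here is a small sketch of the equivalence being suggested. `Buffer` is a hypothetical stand-in for arrow's type, and the function names are made up for illustration.

```rust
/// Hypothetical stand-in for arrow's `Buffer`; for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Buffer(Vec<u8>);

/// Explicit `if`/`else` form, as in the diff under review.
fn with_if_else(null_count: usize, null_bit_buffer: Buffer) -> Option<Buffer> {
    if null_count > 0 {
        Some(null_bit_buffer)
    } else {
        None
    }
}

/// Same result using `bool::then`, which returns `Some(f())` when the
/// boolean is `true` and `None` otherwise.
fn with_then(null_count: usize, null_bit_buffer: Buffer) -> Option<Buffer> {
    (null_count > 0).then(|| null_bit_buffer)
}
```

Both functions return `None` when `null_count` is zero and `Some(buffer)` otherwise; the second simply avoids spelling out the branches.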





[GitHub] [arrow-rs] jhorstmann commented on pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#issuecomment-1139112019

   Fully agree, this looks much more fluent than before.




[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r881430187


##########
arrow/src/array/builder.rs:
##########
@@ -630,12 +630,15 @@ impl BooleanBuilder {
         let len = self.len();
         let null_bit_buffer = self.bitmap_builder.finish();
         let null_count = len - null_bit_buffer.count_set_bits();
-        let mut builder = ArrayData::builder(DataType::Boolean)
+        let builder = ArrayData::builder(DataType::Boolean)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer);
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   Updated!





[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883191079


##########
arrow/src/compute/kernels/string.rs:
##########
@@ -74,15 +74,11 @@ pub fn string_concat<Offset: OffsetSizeTrait>(
         output_offsets.append(Offset::from_usize(output_values.len()).unwrap());
     }
 
-    let mut builder =

Review Comment:
   Updated!





[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883377367


##########
arrow/src/array/builder.rs:
##########
@@ -829,12 +828,15 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
                 .as_ref()
                 .map(|b| b.count_set_bits())
                 .unwrap_or(len);
-        let mut builder = ArrayData::builder(T::DATA_TYPE)
+        let builder = ArrayData::builder(T::DATA_TYPE)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer.unwrap());
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   Missed a `.then` opportunity here, FWIW.





[GitHub] [arrow-rs] HaoYang670 commented on pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#issuecomment-1136611225

   cc @alamb @tustvold @viirya 
   Please help to review. Thank you!




[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883377367


##########
arrow/src/array/builder.rs:
##########
@@ -829,12 +828,15 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
                 .as_ref()
                 .map(|b| b.count_set_bits())
                 .unwrap_or(len);
-        let mut builder = ArrayData::builder(T::DATA_TYPE)
+        let builder = ArrayData::builder(T::DATA_TYPE)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer.unwrap());
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   Missed a `.then` opportunity here, FWIW. The same applies to the line below.





[GitHub] [arrow-rs] tustvold merged pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739




[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
HaoYang670 commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883425152


##########
arrow/src/array/builder.rs:
##########
@@ -829,12 +828,15 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
                 .as_ref()
                 .map(|b| b.count_set_bits())
                 .unwrap_or(len);
-        let mut builder = ArrayData::builder(T::DATA_TYPE)
+        let builder = ArrayData::builder(T::DATA_TYPE)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer.unwrap());
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   `.then` is not suitable here because the type of `null_bit_buffer` is already `Option<Buffer>`. Using `.then` would produce an `Option<Option<Buffer>>`.
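
A short sketch of the type mismatch described here, using a hypothetical `Buffer` stand-in rather than the real arrow type. When the captured value is itself an `Option`, `bool::then` wraps it a second time.

```rust
/// Hypothetical stand-in for arrow's `Buffer`; for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Buffer(Vec<u8>);

/// `bool::then` wraps the closure's return value in `Some`, so an
/// `Option<Buffer>` input yields a nested `Option<Option<Buffer>>`.
fn nested_with_then(
    null_count: usize,
    null_bit_buffer: Option<Buffer>,
) -> Option<Option<Buffer>> {
    (null_count > 0).then(|| null_bit_buffer)
}

/// The explicit `if`/`else` keeps a single level of `Option`.
fn flat_with_if_else(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Buffer> {
    if null_count > 0 {
        null_bit_buffer
    } else {
        None
    }
}
```

A method that expects `Option<Buffer>` would reject the nested value at compile time, which is why the `if`/`else` form was kept at this call site.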





[GitHub] [arrow-rs] codecov-commenter commented on pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#issuecomment-1136062138

   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1739](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1b2e9ef) into [master](https://codecov.io/gh/apache/arrow-rs/commit/ca1d85f746099c79b43700496042ed567d95c6cc?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ca1d85f) will **increase** coverage by `0.00%`.
   > The diff coverage is `95.02%`.
   
   > :exclamation: Current head 1b2e9ef differs from pull request most recent head d07f2c7. Consider uploading reports for the commit d07f2c7 to get more accurate results
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #1739   +/-   ##
   =======================================
     Coverage   83.31%   83.31%           
   =======================================
     Files         196      196           
     Lines       55961    55959    -2     
   =======================================
     Hits        46625    46625           
   + Misses       9336     9334    -2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [integration-testing/src/lib.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aW50ZWdyYXRpb24tdGVzdGluZy9zcmMvbGliLnJz) | `0.00% <0.00%> (ø)` | |
   | [arrow/src/compute/kernels/filter.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9maWx0ZXIucnM=) | `88.33% <83.33%> (ø)` | |
   | [arrow/src/ipc/reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2lwYy9yZWFkZXIucnM=) | `88.76% <94.23%> (+0.03%)` | :arrow_up: |
   | [arrow/src/array/array.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5LnJz) | `89.67% <100.00%> (ø)` | |
   | [arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2JpbmFyeS5ycw==) | `93.27% <100.00%> (-0.03%)` | :arrow_down: |
   | [arrow/src/array/array\_dictionary.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2RpY3Rpb25hcnkucnM=) | `91.77% <100.00%> (-0.15%)` | :arrow_down: |
   | [arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2xpc3QucnM=) | `96.16% <100.00%> (ø)` | |
   | [arrow/src/array/array\_map.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X21hcC5ycw==) | `84.81% <100.00%> (ø)` | |
   | [arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3ByaW1pdGl2ZS5ycw==) | `95.06% <100.00%> (ø)` | |
   | [arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3N0cmluZy5ycw==) | `97.72% <100.00%> (-0.01%)` | :arrow_down: |
   | ... and [24 more](https://codecov.io/gh/apache/arrow-rs/pull/1739/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ca1d85f...d07f2c7](https://codecov.io/gh/apache/arrow-rs/pull/1739?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [arrow-rs] alamb commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883016153


##########
arrow/src/ipc/reader.rs:
##########
@@ -287,24 +287,22 @@ fn create_primitive_array(
     let array_data = match data_type {
         Utf8 | Binary | LargeBinary | LargeUtf8 => {
             // read 3 buffers
-            let mut builder = ArrayData::builder(data_type.clone())
+            ArrayData::builder(data_type.clone())

Review Comment:
   👍 The new pattern certainly looks nicer, in my opinion.



##########
arrow/src/compute/kernels/string.rs:
##########
@@ -74,15 +74,11 @@ pub fn string_concat<Offset: OffsetSizeTrait>(
         output_offsets.append(Offset::from_usize(output_values.len()).unwrap());
     }
 
-    let mut builder =

Review Comment:
   Since this code has been moved, this PR now has a conflict, sadly.





[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1739: Rewrite `ArrayDataBuilder::null_bit_buffer`

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1739:
URL: https://github.com/apache/arrow-rs/pull/1739#discussion_r883431142


##########
arrow/src/array/builder.rs:
##########
@@ -829,12 +828,15 @@ impl<T: ArrowPrimitiveType> PrimitiveBuilder<T> {
                 .as_ref()
                 .map(|b| b.count_set_bits())
                 .unwrap_or(len);
-        let mut builder = ArrayData::builder(T::DATA_TYPE)
+        let builder = ArrayData::builder(T::DATA_TYPE)
             .len(len)
-            .add_buffer(self.values_builder.finish());
-        if null_count > 0 {
-            builder = builder.null_bit_buffer(null_bit_buffer.unwrap());
-        }
+            .add_buffer(self.values_builder.finish())
+            .null_bit_buffer(if null_count > 0 {

Review Comment:
   Previously this called `.unwrap()`, which is probably safe given the null count, although I can't remember what `NullArray`s do. Either way, you could then just call `.flatten()`. Not a big deal :grin:
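
A sketch of the `.flatten()` idea, again with a hypothetical `Buffer` stand-in and made-up function names. `Option::flatten` collapses the `Option<Option<Buffer>>` that `bool::then` yields back into an `Option<Buffer>`. The `Option::filter` variant is not from the thread; it is shown only as another standard-library way to get the same result.

```rust
/// Hypothetical stand-in for arrow's `Buffer`; for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Buffer(Vec<u8>);

/// `then` produces `Option<Option<Buffer>>`; `flatten` removes one level.
fn then_flatten(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Buffer> {
    (null_count > 0).then(|| null_bit_buffer).flatten()
}

/// Alternative (not suggested in the thread): keep the buffer only when
/// there is at least one null.
fn filter_by_count(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Buffer> {
    null_bit_buffer.filter(|_| null_count > 0)
}
```

Unlike the earlier `.unwrap()`, neither form panics if the buffer is `None` while `null_count` is positive; both simply return `None`.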


