Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/30 20:08:19 UTC

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1763: Minor: Clean up the code of MutableArrayData

tustvold commented on code in PR #1763:
URL: https://github.com/apache/arrow-rs/pull/1763#discussion_r885059534


##########
arrow/src/array/transform/mod.rs:
##########
@@ -394,28 +364,30 @@ impl<'a> MutableArrayData<'a> {
     /// a [Capacities] variant is not yet supported.
     pub fn with_capacities(
         arrays: Vec<&'a ArrayData>,
-        mut use_nulls: bool,
+        use_nulls: bool,
         capacities: Capacities,
     ) -> Self {
         let data_type = arrays[0].data_type();
         use crate::datatypes::*;
 
         // if any of the arrays has nulls, insertions from any array requires setting bits
         // as there is at least one array with nulls.
-        if arrays.iter().any(|array| array.null_count() > 0) {
-            use_nulls = true;
-        };
+        let use_nulls = use_nulls | arrays.iter().any(|array| array.null_count() > 0);

Review Comment:
   I think this is incorrect. As described above, use_nulls is a hint, and it should be false if the only source of nulls is the arrays themselves.
   
   To explain why, you might have a number of source arrays that don't contain any nulls, but then call extend_nulls.
   
   If it helps, I tried to clean this up a bit in https://github.com/apache/arrow-rs/pull/1225 but abandoned it for lack of time, in case you want to pick it up again.
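
To make the extend_nulls case above concrete, here is a minimal, self-contained sketch. It does not use the arrow crate: `ToyMutable`, its fields, and the error returned by its `extend_nulls` are hypothetical stand-ins that only model when a validity bitmap gets allocated. They are not how MutableArrayData is actually implemented.

```rust
/// Hypothetical stand-in for `MutableArrayData` (not arrow's API).
/// A `None` slot in a source slice models a null value.
pub struct ToyMutable<'a> {
    pub sources: Vec<&'a [Option<i32>]>,
    pub values: Vec<i32>,
    /// `Some` only when a validity bitmap is being tracked.
    pub validity: Option<Vec<bool>>,
}

impl<'a> ToyMutable<'a> {
    /// `use_nulls` is the caller's hint that nulls may be appended later.
    pub fn new(sources: Vec<&'a [Option<i32>]>, use_nulls: bool) -> Self {
        // Nulls already present in the sources are detected here,
        // independently of the hint.
        let sources_have_nulls = sources
            .iter()
            .any(|source| source.iter().any(|slot| slot.is_none()));
        let track_validity = use_nulls || sources_have_nulls;
        Self {
            sources,
            values: Vec::new(),
            validity: if track_validity { Some(Vec::new()) } else { None },
        }
    }

    /// Copies `sources[index][start..end]` into the output.
    pub fn extend(&mut self, index: usize, start: usize, end: usize) {
        let source: &'a [Option<i32>] = self.sources[index];
        for slot in &source[start..end] {
            // Null slots get a placeholder value of 0.
            self.values.push(slot.unwrap_or(0));
            if let Some(validity) = self.validity.as_mut() {
                validity.push(slot.is_some());
            }
        }
    }

    /// Appends `len` nulls. In this toy model (an assumption, not arrow's
    /// behaviour) it returns an error when no bitmap was allocated.
    pub fn extend_nulls(&mut self, len: usize) -> Result<(), &'static str> {
        match self.validity.as_mut() {
            Some(validity) => {
                self.values.extend(std::iter::repeat(0).take(len));
                validity.extend(std::iter::repeat(false).take(len));
                Ok(())
            }
            None => Err("no validity bitmap was allocated; pass use_nulls = true"),
        }
    }
}
```

In this model, a caller whose sources contain no nulls but who intends to call `extend_nulls` has to pass `use_nulls = true`, because nothing in the sources would otherwise cause the bitmap to be allocated. A caller whose only nulls come from the sources can pass `false` and still get a bitmap.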


