You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "felipecrv (via GitHub)" <gi...@apache.org> on 2023/07/19 22:50:51 UTC

[GitHub] [arrow] felipecrv opened a new pull request, #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

felipecrv opened a new pull request, #36780:
URL: https://github.com/apache/arrow/pull/36780

   ### Rationale for this change
   
   It fixes the issue described in #36776 
   
   ### What changes are included in this PR?
   
   The fix and a simplifications of the interaction between `ListArrayFromArrays` and `CleanOffsets` which is rather involved.
   
   ### Are these changes tested?
   
   Yes. I added a test that reproduces the issue before adding the fixes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] conbench-apache-arrow[bot] commented on pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "conbench-apache-arrow[bot] (via GitHub)" <gi...@apache.org>.
conbench-apache-arrow[bot] commented on PR #36780:
URL: https://github.com/apache/arrow/pull/36780#issuecomment-1658940659

   After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 16806a1486f644093b9578243cc55ac051c60056.
   
   There were no benchmark performance regressions. 🎉
   
   The [full Conbench report](https://github.com/apache/arrow/runs/15493976025) has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] felipecrv commented on a diff in pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on code in PR #36780:
URL: https://github.com/apache/arrow/pull/36780#discussion_r1272872929


##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -233,6 +233,26 @@ class TestListArray : public ::testing::Test {
                                                  expected->null_bitmap()));
   }
 
+  void TestFromArraysWithSlicedNullBitmap() {
+    std::vector<offset_type> offsets = {-1, -1, 0, 1, 1, 3};

Review Comment:
   Added to the two versions of tests I added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] felipecrv commented on a diff in pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on code in PR #36780:
URL: https://github.com/apache/arrow/pull/36780#discussion_r1272383758


##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -233,6 +233,26 @@ class TestListArray : public ::testing::Test {
                                                  expected->null_bitmap()));
   }
 
+  void TestFromArraysWithSlicedNullBitmap() {

Review Comment:
   I don't think so. I will add one.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou merged pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou merged PR #36780:
URL: https://github.com/apache/arrow/pull/36780


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] felipecrv commented on pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "felipecrv (via GitHub)" <gi...@apache.org>.
felipecrv commented on PR #36780:
URL: https://github.com/apache/arrow/pull/36780#issuecomment-1642866454

   @pitrou review this commit by commit as the whole change might be confusing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #36780: GH-36776: [C++] Make ListArray::FromArrays() handle sliced offsets Arrays containing nulls

Posted by "pitrou (via GitHub)" <gi...@apache.org>.
pitrou commented on code in PR #36780:
URL: https://github.com/apache/arrow/pull/36780#discussion_r1272304079


##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -233,6 +233,26 @@ class TestListArray : public ::testing::Test {
                                                  expected->null_bitmap()));
   }
 
+  void TestFromArraysWithSlicedNullBitmap() {
+    std::vector<offset_type> offsets = {-1, -1, 0, 1, 1, 3};

Review Comment:
   Perhaps we should spice things up by ensuring that the first sliced offset is non-zero?
   For example by adding a second check with `offsets_w_nulls->Slice(3, 3)`.



##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -233,6 +233,26 @@ class TestListArray : public ::testing::Test {
                                                  expected->null_bitmap()));
   }
 
+  void TestFromArraysWithSlicedNullBitmap() {
+    std::vector<offset_type> offsets = {-1, -1, 0, 1, 1, 3};
+    std::vector<bool> offsets_w_nulls_is_valid = {true, true, true, false, true, true};
+
+    std::shared_ptr<Array> offsets_w_nulls;
+    ArrayFromVector<OffsetType, offset_type>(offsets_w_nulls_is_valid, offsets,
+                                             &offsets_w_nulls);
+
+    auto type = std::make_shared<T>(int32());
+    auto expected = std::dynamic_pointer_cast<ArrayType>(
+        ArrayFromJSON(type, "[[0], null, [0, null]]"));
+    auto values = expected->values();
+
+    // Apply an offset to the offsets array with nulls (GH-36776)
+    auto sliced_offsets = offsets_w_nulls->Slice(2, 4);
+    ASSERT_OK_AND_ASSIGN(auto result,
+                         ArrayType::FromArrays(*sliced_offsets, *values, pool_));
+    AssertArraysEqual(*result, *expected);

Review Comment:
   ```suggestion
       ASSERT_OK(result->ValidateFull());
       AssertArraysEqual(*result, *expected);
   ```



##########
cpp/src/arrow/array/array_list_test.cc:
##########
@@ -233,6 +233,26 @@ class TestListArray : public ::testing::Test {
                                                  expected->null_bitmap()));
   }
 
+  void TestFromArraysWithSlicedNullBitmap() {

Review Comment:
   For the record, do we already have a test with sliced offsets but no validity bitmap?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org