You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/03 21:17:12 UTC

[GitHub] [arrow] wjones127 opened a new pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults [WIP]

wjones127 opened a new pull request #11855:
URL: https://github.com/apache/arrow/pull/11855


   The segfault only happened in Debug build (which is why some weren't able to repro) because it was from a failing DCHECK on this line:
   
   https://github.com/apache/arrow/blob/e734856676a00335f3dfe79899a995d87286f5a9/cpp/src/arrow/array/array_nested.cc#L193
   
   The LHS of the DCHECK is missing field names, while the RHS has them.
   
   The basic problem is that MapBuilder doesn't preserve field names when creating the array.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763309088



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       Not for this PR (maybe for a new JIRA), but it seems we are not very consistent in this. For example for list type it seems to matter:
   
   ```
   In [6]: l1 = pa.list_(pa.field("val", pa.int64()))
   
   In [7]: l2 = pa.list_(pa.int64())
   
   In [8]: l1
   Out[8]: ListType(list<val: int64>)
   
   In [9]: l2
   Out[9]: ListType(list<item: int64>)
   
   In [10]: l1 == l2
   Out[10]: False
   ```
   
   Although for a map type, the name of the key field is maybe less important than the values for a list array




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r771708123



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -330,7 +330,14 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
   std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
+    // Key and Item builder may update types, but they don't contain the field names,
+    // so we need to reconstruct the type. (See ARROW-13735.)
+    return std::make_shared<MapType>(
+        field(entries_name_,
+              struct_({field(key_name_, key_builder_->type(), false),
+                       field(item_name_, item_builder_->type())}),

Review comment:
       What if items is not nullable?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r762366744



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -329,9 +329,7 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   /// the Map to be built as a list of struct values.
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
-  std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
-  }
+  std::shared_ptr<DataType> type() const override { return type_; }

Review comment:
       Note that the `map` construction I deleted was already used to set the type when it isn't explicitly passed.
   
   https://github.com/apache/arrow/blob/e734856676a00335f3dfe79899a995d87286f5a9/cpp/src/arrow/array/builder_nested.cc#L52-L55




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r770046849



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -329,9 +329,7 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   /// the Map to be built as a list of struct values.
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
-  std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
-  }
+  std::shared_ptr<DataType> type() const override { return type_; }

Review comment:
       Missed this. That makes sense. I think the problem is that those types don't have field names, so maybe the solution is to dynamically add back the field names within this method. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r777643434



##########
File path: cpp/src/arrow/array/array_list_test.cc
##########
@@ -717,6 +717,29 @@ TEST_F(TestMapArray, BuildingStringToInt) {
   ASSERT_ARRAYS_EQUAL(*actual, expected);
 }
 
+TEST_F(TestMapArray, BuildingWithFieldNames) {
+  // Builder should preserve field names in output Array
+  ASSERT_OK_AND_ASSIGN(auto map_type,
+                       MapType::Make(field("some_entries",
+                                           struct_({field("some_key", int16(), false),
+                                                    field("some_value", int16())}),
+                                           false)));
+
+  auto key_builder = std::make_shared<Int16Builder>();
+  auto item_builder = std::make_shared<Int16Builder>();
+  MapBuilder map_builder(default_memory_pool(), key_builder, item_builder, map_type);
+
+  std::shared_ptr<Array> actual;
+  ASSERT_OK(map_builder.Append());
+  ASSERT_OK(key_builder->AppendValues({0, 1, 2, 3, 4, 5}));
+  ASSERT_OK(item_builder->AppendValues({1, 1, 2, 3, 5, 8}));
+  ASSERT_OK(map_builder.AppendNull());
+  ASSERT_OK(map_builder.Finish(&actual));

Review comment:
       Added a simple test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763381855



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       According to the spec:
   
   https://github.com/apache/arrow/blob/85b276dd13512d6676a0bc2dc703a0f9b6934236/format/Schema.fbs#L112-L115
   
   I don't see anything about the names for a list. IMO seems like a piece of metadata most people want to ignore. I'll create a JIRA to discuss that.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r770819730



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -329,9 +329,7 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   /// the Map to be built as a list of struct values.
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
-  std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
-  }
+  std::shared_ptr<DataType> type() const override { return type_; }

Review comment:
       @pitrou I've updated this part to rebuild the type with the child builders and the saved field names.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #11855:
URL: https://github.com/apache/arrow/pull/11855


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763146334



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       Would `assert arr.type == t` capture possible incorrect names as well?

##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")

Review comment:
       ```suggestion
   def test_map_labelled():
       #  ARROW-13735
       t = pa.map_(pa.field("name", "string", nullable=False), "int64")
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-1005379375


   Benchmark runs are scheduled for baseline = 7235698e7f7da021e6fa64201f69bf77d1625aac and contender = 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1. 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c1809ae52313432faf1eb6dec8b8e0e0...267baedc3d9d41f3a3a3ef263ff9187e/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/dc02bde6ec3a436083b6c736108e5bc2...12b49a67ff1d4420aab5273b8e68eb09/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/d7a151ed090942af8f21647e8576302d...14aa1ce144a84dc392d264fa451596e6/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults [WIP]

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-985833919






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763385419



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       https://issues.apache.org/jira/browse/ARROW-14999




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot commented on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-1005379375


   Benchmark runs are scheduled for baseline = 7235698e7f7da021e6fa64201f69bf77d1625aac and contender = 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1. 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c1809ae52313432faf1eb6dec8b8e0e0...267baedc3d9d41f3a3a3ef263ff9187e/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/dc02bde6ec3a436083b6c736108e5bc2...12b49a67ff1d4420aab5273b8e68eb09/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/d7a151ed090942af8f21647e8576302d...14aa1ce144a84dc392d264fa451596e6/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763269450



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       I tried and it does not. I think that is intentional, given this test in C++:
   
   https://github.com/apache/arrow/blob/85b276dd13512d6676a0bc2dc703a0f9b6934236/cpp/src/arrow/type_test.cc#L1232-L1238




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wjones127 commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
wjones127 commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r762366744



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -329,9 +329,7 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   /// the Map to be built as a list of struct values.
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
-  std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
-  }
+  std::shared_ptr<DataType> type() const override { return type_; }

Review comment:
       Note that the `map` construction I deleted is already used to set the type when it isn't explicitly passed.
   
   https://github.com/apache/arrow/blob/e734856676a00335f3dfe79899a995d87286f5a9/cpp/src/arrow/array/builder_nested.cc#L52-L55




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r763309088



##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -891,6 +891,14 @@ def test_list_from_arrays(list_array_type, list_type_factory):
         result.validate(full=True)
 
 
+def test_map_labelled():
+    t = pa.map_(pa.field("name", "string", nullable=False), "int64")
+    arr = pa.array([[('a', 1), ('b', 2)], [('c', 3)]], type=t)
+    assert arr.type.key_field == pa.field("name", pa.utf8(), nullable=False)

Review comment:
       For for this PR (maybe for a new JIRA), but it seems we are not very consistent in this. For example for list type it seems to matter:
   
   ```
   In [6]: l1 = pa.list_(pa.field("val", pa.int64()))
   
   In [7]: l2 = pa.list_(pa.int64())
   
   In [8]: l1
   Out[8]: ListType(list<val: int64>)
   
   In [9]: l2
   Out[9]: ListType(list<item: int64>)
   
   In [10]: l1 == l2
   Out[10]: False
   ```
   
   Although for a map type, the name of the key field is maybe less important than the values for a list array




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r765050118



##########
File path: cpp/src/arrow/array/builder_nested.h
##########
@@ -329,9 +329,7 @@ class ARROW_EXPORT MapBuilder : public ArrayBuilder {
   /// the Map to be built as a list of struct values.
   ArrayBuilder* value_builder() const { return list_builder_->value_builder(); }
 
-  std::shared_ptr<DataType> type() const override {
-    return map(key_builder_->type(), item_builder_->type(), keys_sorted_);
-  }
+  std::shared_ptr<DataType> type() const override { return type_; }

Review comment:
       Hmm... unfortunately, I don't think this is right. Some child builders can actually change type over time (for example the index width of a dictionary may increase depending on the dictionary size). So we have to keep the dynamic computation in the `type()` method.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-1005379375


   Benchmark runs are scheduled for baseline = 7235698e7f7da021e6fa64201f69bf77d1625aac and contender = 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1. 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c1809ae52313432faf1eb6dec8b8e0e0...267baedc3d9d41f3a3a3ef263ff9187e/)
   [Failed] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/dc02bde6ec3a436083b6c736108e5bc2...12b49a67ff1d4420aab5273b8e68eb09/)
   [Failed :arrow_down:0.31% :arrow_up:0.13%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/d7a151ed090942af8f21647e8576302d...14aa1ce144a84dc392d264fa451596e6/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#discussion_r776812941



##########
File path: cpp/src/arrow/array/array_list_test.cc
##########
@@ -717,6 +717,29 @@ TEST_F(TestMapArray, BuildingStringToInt) {
   ASSERT_ARRAYS_EQUAL(*actual, expected);
 }
 
+TEST_F(TestMapArray, BuildingWithFieldNames) {
+  // Builder should preserve field names in output Array
+  ASSERT_OK_AND_ASSIGN(auto map_type,
+                       MapType::Make(field("some_entries",
+                                           struct_({field("some_key", int16(), false),
+                                                    field("some_value", int16())}),
+                                           false)));
+
+  auto key_builder = std::make_shared<Int16Builder>();
+  auto item_builder = std::make_shared<Int16Builder>();
+  MapBuilder map_builder(default_memory_pool(), key_builder, item_builder, map_type);
+
+  std::shared_ptr<Array> actual;
+  ASSERT_OK(map_builder.Append());
+  ASSERT_OK(key_builder->AppendValues({0, 1, 2, 3, 4, 5}));
+  ASSERT_OK(item_builder->AppendValues({1, 1, 2, 3, 5, 8}));
+  ASSERT_OK(map_builder.AppendNull());
+  ASSERT_OK(map_builder.Finish(&actual));

Review comment:
       Can you add a test for `map_builder.type()`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-1005379375


   Benchmark runs are scheduled for baseline = 7235698e7f7da021e6fa64201f69bf77d1625aac and contender = 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1. 31a07be1d9dc2f7c9720cc0fdcd7f083d947aba1 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c1809ae52313432faf1eb6dec8b8e0e0...267baedc3d9d41f3a3a3ef263ff9187e/)
   [Failed] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/dc02bde6ec3a436083b6c736108e5bc2...12b49a67ff1d4420aab5273b8e68eb09/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/d7a151ed090942af8f21647e8576302d...14aa1ce144a84dc392d264fa451596e6/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #11855: ARROW-13735: [C++][Python] Creating a Map array with non-default field names segfaults

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #11855:
URL: https://github.com/apache/arrow/pull/11855#issuecomment-1004896133


   Thanks for the update! I rebased and will merge if CI is green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org