Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/01 13:23:32 UTC

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12300: ARROW-15253: [Python] Error in to_pandas for empty dataframe with index with extension type

jorisvandenbossche commented on a change in pull request #12300:
URL: https://github.com/apache/arrow/pull/12300#discussion_r796589838



##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -4082,6 +4082,18 @@ def test_array_to_pandas():
         # tm.assert_series_equal(result, expected)
 
 
+def test_roundtrip_empty_table_with_intervalrange_index():

Review comment:
       ```suggestion
   def test_roundtrip_empty_table_with_extension_dtype_index():
   ```
   
   The issue is not specific to IntervalDtype: it affects any extension dtype that defines `__from_arrow__` (the interval type is one example of this in pandas).
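
To make the point about the test name concrete, here is a minimal sketch of the property that matters. The two classes are hypothetical stand-ins, not pandas dtypes, and `defines_from_arrow` is an illustrative helper, not a pyarrow function. It only assumes that the relevant check is whether the dtype exposes a `__from_arrow__` attribute.

```python
class PlainDtype:
    """Hypothetical stand-in for an extension dtype with no Arrow hook."""


class ArrowAwareDtype:
    """Hypothetical stand-in for an extension dtype defining __from_arrow__."""

    def __from_arrow__(self, array):
        # A real dtype would build its extension array here; this stub
        # simply materialises the values as a list.
        return list(array)


def defines_from_arrow(dtype):
    """Return True if the dtype exposes the __from_arrow__ hook."""
    return hasattr(dtype, "__from_arrow__")


candidates = {"plain": PlainDtype(), "arrow_aware": ArrowAwareDtype()}
affected = [name for name, dtype in candidates.items() if defines_from_arrow(dtype)]
print(affected)
```

Any dtype for which the check is true goes down the same conversion path, which is why a name such as `..._with_extension_dtype_index` describes the test better than one that mentions intervals.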

##########
File path: python/pyarrow/pandas_compat.py
##########
@@ -822,8 +822,12 @@ def _get_extension_dtypes(table, columns_metadata, types_mapper=None):
 
     # infer the extension columns from the pandas metadata
     for col_meta in columns_metadata:
-        name = col_meta['name']
+        if col_meta['name']:
+            name = col_meta['name']
+        else:
+            name = col_meta['field_name']

Review comment:
   Maybe we can simply always use `col_meta["field_name"]`?
   I don't think there is a case where that would be incorrect: `field_name` should always map to the name in the Arrow table, while `col_meta["name"]` doesn't always match it exactly.
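
A small sketch comparing the two ways of choosing the name, using hand-written metadata entries rather than a real table. The entries are assumptions for illustration: the unnamed-index entry follows the usual `__index_level_0__` convention, and the third entry is a made-up mismatch that exists only to show how the two approaches can diverge. Both helper functions are hypothetical.

```python
# Hypothetical entries shaped like the "columns" list in the pandas metadata.
columns_metadata = [
    {"name": "a", "field_name": "a"},
    # Unnamed index: "name" is None, but the Arrow field still has a name.
    {"name": None, "field_name": "__index_level_0__"},
    # Made-up mismatch, purely illustrative.
    {"name": "label", "field_name": "label_field"},
]

# Field names of the (imaginary) Arrow table that the metadata describes.
table_field_names = {"a", "__index_level_0__", "label_field"}


def names_with_fallback(columns_metadata):
    """Selection as in the diff: use 'name', fall back to 'field_name'."""
    return [meta["name"] if meta["name"] else meta["field_name"]
            for meta in columns_metadata]


def names_from_field_name(columns_metadata):
    """Selection as suggested in the review: always use 'field_name'."""
    return [meta["field_name"] for meta in columns_metadata]


fallback = names_with_fallback(columns_metadata)
direct = names_from_field_name(columns_metadata)

# Names that do not correspond to any field in the Arrow table.
print([n for n in fallback if n not in table_field_names])
print([n for n in direct if n not in table_field_names])
```

Under these assumptions the fallback handles the unnamed index but still produces a name that is missing from the table for the mismatched entry, whereas always reading `field_name` cannot miss by construction.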



