Posted to issues@arrow.apache.org by "Fabien Aulaire (Jira)" <ji...@apache.org> on 2022/05/14 21:54:00 UTC

[jira] [Created] (ARROW-16580) [pyarrow][pandas conversion] Multiindex levels are not preserved after a from_pandas/to_pandas

Fabien Aulaire created ARROW-16580:
--------------------------------------

             Summary: [pyarrow][pandas conversion]  Multiindex levels are not preserved after a from_pandas/to_pandas
                 Key: ARROW-16580
                 URL: https://issues.apache.org/jira/browse/ARROW-16580
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 8.0.0
            Reporter: Fabien Aulaire


Hello,

I'm not sure this is the right place to report this issue, but here is what I saw when I converted a DataFrame with MultiIndex columns to a Table and then converted it back to pandas:

 
{code:python}
import pandas as pd
import pyarrow

# pandas version '1.4.1'
# pyarrow version '8.0.0'

df = pd.DataFrame([[100,300, 400], [200,500, 600]], columns=['Toyota', 'Ford', 'Alfa'])
concatenated = pd.concat([df, df*2], axis=1, keys=['foo', 'bar'])
concatenated.columns.names = ['l1', 'l2']

table = pyarrow.Table.from_pandas(concatenated)
from_table_df = table.to_pandas()

from_table_df.columns.levels  # FrozenList([['bar', 'foo'], ['Alfa', 'Ford', 'Toyota']])
concatenated.columns.levels   # FrozenList([['foo', 'bar'], ['Toyota', 'Ford', 'Alfa']])
{code}
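The order of the column levels is not preserved after the round trip: in this example the levels come back in alphabetical order (['bar', 'foo'] instead of ['foo', 'bar']).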
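To help narrow this down, here is a pandas-only sketch (no pyarrow involved) that rebuilds the same columns from their tuples. It is only an assumption that the round trip reconstructs the columns in a comparable way; the name `rebuilt` is just for illustration. It shows that `MultiIndex.from_tuples` yields sorted levels while the column labels and their order stay the same, so the difference may be limited to the `levels` attribute:

```python
import pandas as pd

df = pd.DataFrame([[100, 300, 400], [200, 500, 600]],
                  columns=["Toyota", "Ford", "Alfa"])
concatenated = pd.concat([df, df * 2], axis=1, keys=["foo", "bar"])
concatenated.columns.names = ["l1", "l2"]

original = concatenated.columns

# Rebuild the MultiIndex from its tuples (assumption: this mimics what a
# round trip through another format might do; it is not pyarrow's code).
rebuilt = pd.MultiIndex.from_tuples(list(original), names=original.names)

# The column labels and their order are identical...
same_labels = list(rebuilt) == list(original)
same_index = rebuilt.equals(original)

# ...but from_tuples stores each level's values in sorted order.
rebuilt_levels = [list(level) for level in rebuilt.levels]

print(same_labels, same_index)
print(rebuilt_levels)
```

If the same holds for `from_table_df`, then comparing `list(from_table_df.columns)` with `list(concatenated.columns)` would tell whether the actual column order is affected or only the internal level ordering.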

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)