Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/12/08 15:12:00 UTC
[jira] [Commented] (ARROW-18394) [CI][Python] Nightly python pandas jobs using latest or upstream_devel fail
[ https://issues.apache.org/jira/browse/ARROW-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644867#comment-17644867 ]
Joris Van den Bossche commented on ARROW-18394:
-----------------------------------------------
The failure shown in the issue description appears to be a regression on the pandas side. I opened an issue there to discuss it further: https://github.com/pandas-dev/pandas/issues/50127
> [CI][Python] Nightly python pandas jobs using latest or upstream_devel fail
> --------------------------------------------------------------------------
>
> Key: ARROW-18394
> URL: https://issues.apache.org/jira/browse/ARROW-18394
> Project: Apache Arrow
> Issue Type: Bug
> Components: Continuous Integration, Python
> Reporter: Raúl Cumplido
> Assignee: Joris Van den Bossche
> Priority: Critical
> Labels: Nightly
> Fix For: 11.0.0
>
>
> Currently the following jobs fail:
> |test-conda-python-3.8-pandas-nightly|https://github.com/ursacomputing/crossbow/actions/runs/3532562061/jobs/5927065343|
> |test-conda-python-3.9-pandas-upstream_devel|https://github.com/ursacomputing/crossbow/actions/runs/3532562477/jobs/5927066168|
> with:
> {code:java}
> _________________ test_roundtrip_with_bytes_unicode[columns0] __________________
>
> columns = [b'foo']
>
> @pytest.mark.parametrize('columns', ([b'foo'], ['foo']))
> def test_roundtrip_with_bytes_unicode(columns):
> df = pd.DataFrame(columns=columns)
> table1 = pa.Table.from_pandas(df)
> > table2 = pa.Table.from_pandas(table1.to_pandas())
>
> opt/conda/envs/arrow/lib/python3.8/site-packages/pyarrow/tests/test_pandas.py:2867:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> pyarrow/array.pxi:830: in pyarrow.lib._PandasConvertible.to_pandas
> ???
> pyarrow/table.pxi:3908: in pyarrow.lib.Table._to_pandas
> ???
> opt/conda/envs/arrow/lib/python3.8/site-packages/pyarrow/pandas_compat.py:819: in table_to_blockmanager
> columns = _deserialize_column_index(table, all_columns, column_indexes)
> opt/conda/envs/arrow/lib/python3.8/site-packages/pyarrow/pandas_compat.py:935: in _deserialize_column_index
> columns = _reconstruct_columns_from_metadata(columns, column_indexes)
> opt/conda/envs/arrow/lib/python3.8/site-packages/pyarrow/pandas_compat.py:1154: in _reconstruct_columns_from_metadata
> level = level.astype(dtype)
> opt/conda/envs/arrow/lib/python3.8/site-packages/pandas/core/indexes/base.py:1029: in astype
> return Index(new_values, name=self.name, dtype=new_values.dtype, copy=False)
> opt/conda/envs/arrow/lib/python3.8/site-packages/pandas/core/indexes/base.py:518: in __new__
> klass = cls._dtype_to_subclass(arr.dtype)
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> cls = <class 'pandas.core.indexes.base.Index'>, dtype = dtype('S3')
>
> @final
> @classmethod
> def _dtype_to_subclass(cls, dtype: DtypeObj):
> # Delay import for perf. https://github.com/pandas-dev/pandas/pull/31423
>
> if isinstance(dtype, ExtensionDtype):
> if isinstance(dtype, DatetimeTZDtype):
> from pandas import DatetimeIndex
>
> return DatetimeIndex
> elif isinstance(dtype, CategoricalDtype):
> from pandas import CategoricalIndex
>
> return CategoricalIndex
> elif isinstance(dtype, IntervalDtype):
> from pandas import IntervalIndex
>
> return IntervalIndex
> elif isinstance(dtype, PeriodDtype):
> from pandas import PeriodIndex
>
> return PeriodIndex
>
> return Index
>
> if dtype.kind == "M":
> from pandas import DatetimeIndex
>
> return DatetimeIndex
>
> elif dtype.kind == "m":
> from pandas import TimedeltaIndex
>
> return TimedeltaIndex
>
> elif dtype.kind == "f":
> from pandas.core.api import Float64Index
>
> return Float64Index
> elif dtype.kind == "u":
> from pandas.core.api import UInt64Index
>
> return UInt64Index
> elif dtype.kind == "i":
> from pandas.core.api import Int64Index
>
> return Int64Index
>
> elif dtype.kind == "O":
> # NB: assuming away MultiIndex
> return Index
>
> elif issubclass(
> dtype.type, (str, bool, np.bool_, complex, np.complex64, np.complex128)
> ):
> return Index
>
> > raise NotImplementedError(dtype)
> E NotImplementedError: |S3
>
> opt/conda/envs/arrow/lib/python3.8/site-packages/pandas/core/indexes/base.py:595: NotImplementedError
> {code}
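The traceback quoted above ends in a dispatch on the NumPy dtype kind, and the fixed-width bytes dtype ('S3', kind "S") matches none of the branches. The sketch below is a simplified, hypothetical model of that dispatch, not pandas code: the function name and the lookup table are made up for illustration, the class names are returned as plain strings, and the str/bool/complex type check from the quoted source is approximated by the kind codes "U", "b" and "c".

```python
# Simplified, hypothetical model of the dtype-kind dispatch quoted in the
# traceback. This is NOT pandas code; names and return values are
# illustrative only.

_KIND_TO_INDEX = {
    "M": "DatetimeIndex",
    "m": "TimedeltaIndex",
    "f": "Float64Index",
    "u": "UInt64Index",
    "i": "Int64Index",
    "O": "Index",
    # The quoted code checks for str, bool and complex scalar types; here
    # that check is approximated by the corresponding NumPy kind codes.
    "U": "Index",
    "b": "Index",
    "c": "Index",
}


def index_class_name_for_kind(kind: str) -> str:
    """Return the index class name for a dtype kind, or raise if unhandled."""
    try:
        return _KIND_TO_INDEX[kind]
    except KeyError:
        # Mirrors the final `raise NotImplementedError(dtype)` in the quoted
        # code: kinds without a branch (such as "S" for bytes) end up here.
        raise NotImplementedError(kind) from None


if __name__ == "__main__":
    for kind in ("O", "U", "S"):
        try:
            print(kind, "->", index_class_name_for_kind(kind))
        except NotImplementedError as exc:
            print(kind, "-> NotImplementedError:", exc)
```

In this model the object and unicode kinds resolve to a plain Index, while the bytes kind falls through to the error, which matches the shape of the failure reported for the [b'foo'] parametrization.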