Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/09/16 17:03:18 UTC
[GitHub] [beam] TheNeuralBit opened a new issue, #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype
TheNeuralBit opened a new issue, #23276:
URL: https://github.com/apache/beam/issues/23276
### What happened?
See https://github.com/apache/beam/pull/22587#discussion_r964644584
```
object_class_col.astype(pd.CategoricalDtype(...)).str.get_dummies()
WontImplementError: astype(dtype='category') is not supported because the type of the output column depends on the data. Please use pd.CategoricalDtype with explicit categories instead.
For more information see https://s.apache.org/dataframe-non-deferred-columns.
```
This is particularly bad because it raises the same error even when the user takes the suggested action (using a CategoricalDtype instead of "category").
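
For context, here is a minimal plain-pandas sketch (no Beam involved) of why explicit categories matter. The series values and category names are made up for illustration. With `astype("category")` the categories are inferred from the data, whereas an explicit `pd.CategoricalDtype` fixes them up front, which is what a deferred operation would need:

```python
import pandas as pd

# Illustrative data; these values are assumptions, not from the issue.
values = pd.Series(["dog", "cat", "dog"])

# Categories inferred from the data: the result type depends on the values.
inferred = values.astype("category")

# Categories declared up front: the result type is known before seeing any data.
explicit_dtype = pd.CategoricalDtype(categories=["cat", "dog", "bird"])
explicit = values.astype(explicit_dtype)

print(list(inferred.dtype.categories))
print(list(explicit.dtype.categories))
```

The explicit form keeps "bird" as a category even though it never appears in the data, so the output columns of a later `get_dummies` call would not depend on the input.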
### Issue Priority
Priority: 2
### Issue Component
Component: dsl-dataframe
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] TheNeuralBit closed issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype
Posted by GitBox <gi...@apache.org>.
TheNeuralBit closed issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype
URL: https://github.com/apache/beam/issues/23276
[GitHub] [beam] TheNeuralBit commented on issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #23276:
URL: https://github.com/apache/beam/issues/23276#issuecomment-1249591254
From https://github.com/apache/beam/pull/22587#discussion_r972161089
> Interesting, apparently instances of CategoricalDtype are considered equal to the string "category": https://github.com/pandas-dev/pandas/blob/54347fe684e0f7844bf407b1fb958a5269646825/pandas/core/dtypes/dtypes.py#L366
>
> The aim in our check was to avoid the case where users indicate astype("category") and rely on pandas to resolve the categories, since we need explicit categories. We should be able to find another way to check this, but it will have to be another bugfix.
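
The equality behavior described above can be reproduced in plain pandas. The sketch below also shows one possible way to tell the two cases apart; `has_explicit_categories` is a hypothetical helper written for illustration and is not necessarily the check Beam adopted:

```python
import pandas as pd


def has_explicit_categories(dtype) -> bool:
    """Hypothetical helper (not Beam's code): True only for a
    CategoricalDtype instance that carries its own categories."""
    return isinstance(dtype, pd.CategoricalDtype) and dtype.categories is not None


explicit = pd.CategoricalDtype(categories=["a", "b"])
unspecified = pd.CategoricalDtype()

# A CategoricalDtype instance compares equal to the string "category",
# so a check like `dtype == "category"` also matches the explicit case.
print(explicit == "category")

# Checking the type and its categories separates the cases.
print(has_explicit_categories(explicit))
print(has_explicit_categories(unspecified))
print(has_explicit_categories("category"))
```

Only the first `has_explicit_categories` call is expected to pass, since the bare string and the dtype without categories both leave the categories to be resolved from the data.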