Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/09/16 17:03:18 UTC

[GitHub] [beam] TheNeuralBit opened a new issue, #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype

TheNeuralBit opened a new issue, #23276:
URL: https://github.com/apache/beam/issues/23276

   ### What happened?
   
   See https://github.com/apache/beam/pull/22587#discussion_r964644584
   
   ```
   object_class_col.astype(pd.CategoricalDtype(...)).str.get_dummies()
   
   WontImplementError: astype(dtype='category') is not supported because the type of the output column depends on the data. Please use pd.CategoricalDtype with explicit categories instead.
   For more information see https://s.apache.org/dataframe-non-deferred-columns.
   ```
   
   This is particularly bad since it raises the same error even when the user takes the suggested action (using a CategoricalDtype instead of "category").
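
The rationale given in the error message (the output type depends on the data) can be seen in plain pandas. The following is a minimal sketch with made-up data; the series `a` and `b` and the category names are illustrative assumptions, not taken from the original report.

```python
import pandas as pd

# Two hypothetical chunks of the same logical column.
a = pd.Series(["cat", "dog"])
b = pd.Series(["dog", "emu"])

# astype("category") infers the categories from whatever data it sees,
# so each chunk ends up with a different dtype.
inferred_a = a.astype("category").dtype
inferred_b = b.astype("category").dtype

# An explicit CategoricalDtype fixes the categories up front,
# so the resulting dtype no longer depends on the data.
explicit = pd.CategoricalDtype(categories=["cat", "dog", "emu"])
fixed_a = a.astype(explicit).dtype
fixed_b = b.astype(explicit).dtype

print(list(inferred_a.categories), list(inferred_b.categories))
print(fixed_a == fixed_b)
```

With explicit categories, both chunks share one dtype, which is the situation the error message asks the user to set up.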
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: dsl-dataframe


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit closed issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype

Posted by GitBox <gi...@apache.org>.
TheNeuralBit closed issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype
URL: https://github.com/apache/beam/issues/23276




[GitHub] [beam] TheNeuralBit commented on issue #23276: [Bug]: DeferredSeries.astype rejects CategoricalDtype

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #23276:
URL: https://github.com/apache/beam/issues/23276#issuecomment-1249591254

   From https://github.com/apache/beam/pull/22587#discussion_r972161089
   
   > Interesting, apparently instances of CategoricalDtype are considered equal to the string "category": https://github.com/pandas-dev/pandas/blob/54347fe684e0f7844bf407b1fb958a5269646825/pandas/core/dtypes/dtypes.py#L366
   > 
   > The aim in our check was to avoid the case where users indicate astype("category") and rely on pandas to resolve the categories, since we need explicit categories. We should be able to find another way to check this, but it will have to be another bugfix.
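
The equality behavior described in the quote is easy to reproduce in plain pandas. The sketch below also shows one possible way to distinguish the two cases; `has_explicit_categories` is a hypothetical helper written for illustration and is not the check Beam uses or the fix that was eventually applied.

```python
import pandas as pd

explicit = pd.CategoricalDtype(categories=["a", "b"])

# A CategoricalDtype instance compares equal to the string "category",
# so an equality test cannot tell the two apart.
print(explicit == "category")  # True


def has_explicit_categories(dtype) -> bool:
    """Hypothetical helper: True only for a CategoricalDtype instance
    whose categories were set explicitly."""
    return isinstance(dtype, pd.CategoricalDtype) and dtype.categories is not None


print(has_explicit_categories(explicit))               # True
print(has_explicit_categories("category"))             # False
print(has_explicit_categories(pd.CategoricalDtype()))  # False
```

Checking the type and the `categories` attribute, rather than comparing for equality, rejects both the bare string and a `CategoricalDtype()` created without categories.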

