Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/07/19 17:21:02 UTC

[jira] [Commented] (BEAM-12367) SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time

    [ https://issues.apache.org/jira/browse/BEAM-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383468#comment-17383468 ] 

Beam JIRA Bot commented on BEAM-12367:
--------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.


> SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time
> ----------------------------------------------------------------------------------------
>
>                 Key: BEAM-12367
>                 URL: https://issues.apache.org/jira/browse/BEAM-12367
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-dataframe, sdk-py-core
>            Reporter: Brian Hulette
>            Priority: P2
>              Labels: dataframe-api, stale-P2
>
> SeriesGroupBy.corr should raise an error at construction time because it needs multiple Series:
> {code}
> In [4]: df.groupby('A').B.corr()
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-4-d760b6077290> in <module>
> ----> 1 df.groupby('A').B.corr()
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
>     815                 return self.apply(curried)
>     816 
> --> 817             return self._python_apply_general(curried, self._obj_with_exclusions)
>     818 
>     819         wrapper.__name__ = name
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
>     926             data after applying f
>     927         """
> --> 928         keys, values, mutated = self.grouper.apply(f, data, self.axis)
>     929 
>     930         return self._wrap_applied_output(
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
>     236             # group might be modified
>     237             group_axes = group.axes
> --> 238             res = f(group)
>     239             if not _is_indexed_like(res, group_axes, axis):
>     240                 mutated = True
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
>     804 
>     805             def curried(x):
> --> 806                 return f(x, *args, **kwargs)
>     807 
>     808             # preserve the name so we can detect it when calling plot methods,
> TypeError: corr() missing 1 required positional argument: 'other'
> {code}
> However, pandas does not raise this error when the call is made on an empty dataset (perhaps an upstream bug), so we don't raise it during proxy generation. As a result, the operation does not fail until the pipeline is running.
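
One plausible reading of the traceback above is that the TypeError comes from calling the curried function once per group, so an empty dataset with no groups never triggers the call. The sketch below illustrates that mechanism in plain Python only. It does not use pandas or Beam, and `corr`, `apply_per_group` and `try_apply` are hypothetical stand-ins, not real library functions.

```python
def corr(series, other):
    """Hypothetical stand-in for a method that requires a second Series."""
    raise NotImplementedError("body is irrelevant for this illustration")


def apply_per_group(groups, f):
    """Call f once per group, like a much simplified groupby apply loop."""
    return {key: f(values) for key, values in groups.items()}


def try_apply(groups):
    """Apply corr without 'other'; return the TypeError message, or None."""
    try:
        apply_per_group(groups, lambda values: corr(values))
    except TypeError as exc:
        return str(exc)
    return None


# No groups: corr is never called, so the missing argument goes unnoticed.
empty_result = try_apply({})

# One group: corr is called without 'other' and the TypeError surfaces.
non_empty_result = try_apply({"a": [1.0, 2.0]})

print("empty input:    ", empty_result)
print("non-empty input:", non_empty_result)
```

If this reading is right, a check that relies on running the operation against an empty proxy cannot catch the missing argument. Catching it at construction time would need an explicit check rather than relying on the proxy run.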



--
This message was sent by Atlassian Jira
(v8.3.4#803005)