Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/07/19 17:21:02 UTC
[jira] [Commented] (BEAM-12367) SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time
[ https://issues.apache.org/jira/browse/BEAM-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383468#comment-17383468 ]
Beam JIRA Bot commented on BEAM-12367:
--------------------------------------
This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.
Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.
> SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time
> ----------------------------------------------------------------------------------------
>
> Key: BEAM-12367
> URL: https://issues.apache.org/jira/browse/BEAM-12367
> Project: Beam
> Issue Type: Bug
> Components: dsl-dataframe, sdk-py-core
> Reporter: Brian Hulette
> Priority: P2
> Labels: dataframe-api, stale-P2
>
> SeriesGroupBy.corr should raise an error at construction time because it needs multiple Series:
> {code}
> In [4]: df.groupby('A').B.corr()
> ---------------------------------------------------------------------------
> TypeError Traceback (most recent call last)
> <ipython-input-4-d760b6077290> in <module>
> ----> 1 df.groupby('A').B.corr()
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
> 815 return self.apply(curried)
> 816
> --> 817 return self._python_apply_general(curried, self._obj_with_exclusions)
> 818
> 819 wrapper.__name__ = name
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
> 926 data after applying f
> 927 """
> --> 928 keys, values, mutated = self.grouper.apply(f, data, self.axis)
> 929
> 930 return self._wrap_applied_output(
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
> 236 # group might be modified
> 237 group_axes = group.axes
> --> 238 res = f(group)
> 239 if not _is_indexed_like(res, group_axes, axis):
> 240 mutated = True
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
> 804
> 805 def curried(x):
> --> 806 return f(x, *args, **kwargs)
> 807
> 808 # preserve the name so we can detect it when calling plot methods,
> TypeError: corr() missing 1 required positional argument: 'other'
> {code}
> However, pandas does not raise this TypeError when the same call is made on an empty dataset (perhaps an upstream bug), so it is not raised during proxy generation either. As a result, the call does not fail until the pipeline is running.
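
One plausible reading of the traceback above is that the missing-argument error only surfaces inside the per-group call (`res = f(group)`), so a dataset with no groups never triggers it. The sketch below illustrates that mechanism in plain Python. It is an assumption about the cause, not pandas or Beam code: `corr`, `apply_per_group`, and `fails_with_type_error` are hypothetical stand-ins.

```python
# Hypothetical stand-ins illustrating the suspected mechanism.
# None of these names come from pandas or Beam.

def corr(series, other):
    """Stand-in for a method that requires a second argument, 'other'."""
    return 0.0


def apply_per_group(groups, func):
    """Call func once per group, like the per-group loop in the traceback."""
    return [func(group) for group in groups]


def fails_with_type_error(groups):
    """Return True if applying corr over the groups raises TypeError."""
    try:
        apply_per_group(groups, corr)
    except TypeError:
        return True
    return False


empty_proxy_groups = []          # an empty dataset has no groups to iterate
real_groups = [[1.0, 2.0, 3.0]]  # at run time there is at least one group

print(fails_with_type_error(empty_proxy_groups))  # False: corr is never called
print(fails_with_type_error(real_groups))         # True: 'other' is missing
```

If this reading is right, an empty input cannot be relied on to surface the error at construction time, and an explicit argument check before the per-group call would be needed instead.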
--
This message was sent by Atlassian Jira
(v8.3.4#803005)