Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 20:06:13 UTC

[GitHub] [beam] damccorm opened a new issue, #20895: SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time

damccorm opened a new issue, #20895:
URL: https://github.com/apache/beam/issues/20895

   SeriesGroupBy.corr should raise an error at construction time because it needs multiple Series:
   
   ```
   In [4]: df.groupby('A').B.corr()
   ---------------------------------------------------------------------------
   TypeError                                 Traceback (most recent call last)
   <ipython-input-4-d760b6077290> in <module>
   ----> 1 df.groupby('A').B.corr()

   ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
       815                 return self.apply(curried)
       816 
   --> 817             return self._python_apply_general(curried, self._obj_with_exclusions)
       818 
       819         wrapper.__name__ = name

   ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
       926             data after applying f
       927         """
   --> 928         keys, values, mutated = self.grouper.apply(f, data, self.axis)
       929 
       930         return self._wrap_applied_output(

   ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
       236             # group might be modified
       237             group_axes = group.axes
   --> 238             res = f(group)
       239             if not _is_indexed_like(res, group_axes, axis):
       240                 mutated = True

   ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
       804 
       805             def curried(x):
   --> 806                 return f(x, *args, **kwargs)
       807 
       808             # preserve the name so we can detect it when calling plot methods,

   TypeError: corr() missing 1 required positional argument: 'other'
   ```
   
   
   However, pandas does not raise this error when the same call is made on an empty dataset (perhaps an upstream bug), so we don't raise it during proxy generation either. As a result, the call does not fail until the pipeline is running.
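
The traceback suggests why empty input hides the problem: the missing `other` argument is only noticed inside `curried`, which runs once per group. The sketch below is a standard-library illustration of that mechanism, not pandas or Beam code. `corr`, `apply_per_group`, and `check_call` are hypothetical names, and `check_call` is only one possible way to surface the error eagerly, assuming the target function's signature can be inspected.

```python
import inspect


def corr(series, other):
    """Hypothetical stand-in for a method that needs a second Series."""
    return (series, other)


def apply_per_group(groups, func, *args, **kwargs):
    """Call func once per group, as a generic groupby wrapper might.

    With zero groups func is never called, so a bad call signature
    goes unnoticed.
    """
    return {key: func(values, *args, **kwargs) for key, values in groups.items()}


def check_call(func, *args, **kwargs):
    """Validate the arguments against func's signature without running it.

    The None placeholder stands for the per-group data supplied later.
    """
    inspect.signature(func).bind(None, *args, **kwargs)


# Empty input: no groups, no call, no error.
empty_result = apply_per_group({}, corr)

# Non-empty input: the missing argument surfaces only now.
try:
    apply_per_group({"a": [1, 2, 3]}, corr)
    late_error = None
except TypeError as exc:
    late_error = str(exc)

# Eager check: fails before any data is involved.
try:
    check_call(corr)
    eager_error = None
except TypeError as exc:
    eager_error = str(exc)
```

Here `empty_result` is an empty dict, while both `late_error` and `eager_error` name the missing `other` argument. The difference is that the eager check fails without touching any data, which is the behavior wanted at pipeline construction time.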
   
   Imported from Jira [BEAM-12367](https://issues.apache.org/jira/browse/BEAM-12367). Original Jira may contain additional context.
   Reported by: bhulette.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
