Posted to issues@beam.apache.org by "Ankur Goenka (Jira)" <ji...@apache.org> on 2021/07/15 18:16:00 UTC
[jira] [Commented] (BEAM-12531) ib.show does not handle deferred dataframe instances
[ https://issues.apache.org/jira/browse/BEAM-12531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381518#comment-17381518 ]
Ankur Goenka commented on BEAM-12531:
-------------------------------------
Is this a blocker for the 2.32 release?
If not, please change the fix version.
> ib.show does not handle deferred dataframe instances
> ----------------------------------------------------
>
> Key: BEAM-12531
> URL: https://issues.apache.org/jira/browse/BEAM-12531
> Project: Beam
> Issue Type: Bug
> Components: dsl-dataframe
> Affects Versions: 2.31.0
> Reporter: Brian Hulette
> Assignee: Sam Rohde
> Priority: P2
> Fix For: 2.32.0
>
> Time Spent: 6h
> Remaining Estimate: 0h
>
> When passed a deferred dataframe instance (e.g. {{ib.show(counts.nlargest(20, keep='all'))}}), ib.show calls len() and ends up raising a WontImplementError:
> {code}
> ---------------------------------------------------------------------------
> WontImplementError Traceback (most recent call last)
> <ipython-input-9-56c2dd81898d> in <module>
> ----> 1 ib.show(counts.nlargest(20, keep='all'))
> 2 frames
> /usr/local/lib/python3.7/dist-packages/apache_beam/runners/interactive/utils.py in run_within_progress_indicator(*args, **kwargs)
> 245 def run_within_progress_indicator(*args, **kwargs):
> 246 with ProgressIndicator('Processing...', 'Done.'):
> --> 247 return func(*args, **kwargs)
> 248
> 249 return run_within_progress_indicator
> /usr/local/lib/python3.7/dist-packages/apache_beam/runners/interactive/interactive_beam.py in show(include_window_info, visualize_data, n, duration, *pcolls)
> 441 else:
> 442 try:
> --> 443 flatten_pcolls.extend(iter(pcoll_container))
> 444 except TypeError:
> 445 raise ValueError(
> /usr/local/lib/python3.7/dist-packages/apache_beam/dataframe/frames.py in __len__(self)
> 695 "len(df) is not currently supported because it produces a non-deferred "
> 696 "result. Consider using df.length() instead.",
> --> 697 reason="non-deferred-result")
> 698
> 699 @property # type: ignore
> WontImplementError: len(df) is not currently supported because it produces a non-deferred result. Consider using df.length() instead.
> For more information see https://s.apache.org/dataframe-non-deferred-result.
> {code}
> We should support this case, or at least fail gracefully.
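One way to read "fail gracefully" is to recognize deferred dataframe objects before trying to iterate over them, and to turn any remaining failure into a clear error. The following is a minimal sketch under stated assumptions, not Beam's actual code. `FakeDeferredFrame`, `flatten_inputs`, and the local `WontImplementError` are hypothetical stand-ins, defined here so that the example runs without Beam installed.

```python
class WontImplementError(NotImplementedError):
    """Local stand-in for Beam's error type (assumption: defined here only
    so that this sketch is self-contained)."""


class FakeDeferredFrame:
    """Hypothetical stand-in for a deferred dataframe.

    It is indexable, so iter() accepts it, but it refuses len() and
    eager element access, as in the traceback above.
    """

    def __getitem__(self, key):
        raise WontImplementError("eager element access is not supported")

    def __len__(self):
        raise WontImplementError(
            "len(df) is not currently supported because it produces a "
            "non-deferred result.")


def flatten_inputs(containers, deferred_types=(FakeDeferredFrame,)):
    """Flatten the arguments of a show()-like call into one list.

    Deferred objects are kept as single items instead of being iterated.
    Anything else that cannot be iterated raises a ValueError with a
    readable message rather than an unrelated internal error.
    """
    flattened = []
    for container in containers:
        if isinstance(container, deferred_types):
            # Do not iterate: treat the deferred object as one input.
            flattened.append(container)
            continue
        try:
            flattened.extend(iter(container))
        except TypeError:
            raise ValueError(
                "Expected an iterable or a deferred dataframe, got %r"
                % type(container).__name__)
        except NotImplementedError as exc:
            # Covers objects that accept iter() but refuse len()/indexing.
            raise ValueError(
                "Cannot iterate over %r; pass it directly instead"
                % type(container).__name__) from exc
    return flattened
```

With this guard, a call such as `flatten_inputs([frame, [1, 2]])` keeps `frame` as a single entry. An object that slips past the type check still surfaces as a `ValueError` that names the offending type.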