You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Sam Rohde (Jira)" <ji...@apache.org> on 2021/07/08 23:53:00 UTC
[jira] [Assigned] (BEAM-12388) Improve caching experience on
InteractiveRunner with dataframes
[ https://issues.apache.org/jira/browse/BEAM-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Rohde reassigned BEAM-12388:
--------------------------------
Assignee: Sam Rohde
> Improve caching experience on InteractiveRunner with dataframes
> ---------------------------------------------------------------
>
> Key: BEAM-12388
> URL: https://issues.apache.org/jira/browse/BEAM-12388
> Project: Beam
> Issue Type: Improvement
> Components: runner-py-interactive
> Reporter: Sam Rohde
> Assignee: Sam Rohde
> Priority: P2
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Reusing the default label for to_pcollection when using the interactive runner results in caching errors when used with multiple pipelines:
>
>
> {{Traceback (most recent call last):
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_runner_test.py", line 389, in test_dataframes_with_multi_index_get_result
> pd.testing.assert_series_equal(df_expected, ib.collect(deferred_df, n=10))
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/utils.py", line 247, in run_within_progress_indicator
> return func(*args, **kwargs)
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_beam.py", line 579, in collect
> recording = recording_manager.record([pcoll], max_n=n, max_duration=duration)
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 433, in record
> self._watch(pcolls)
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py", line 306, in _watch
> for pcoll in to_pcollection(*watched_dataframes, always_return_tuple=True):
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/dataframe/convert.py", line 196, in to_pcollection
> new_results = {p: extract_input(p)
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 1086, in __ror__
> return self.transform.__ror__(pvalueish, self.label)
> File "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", line 587, in __ror__
> raise ValueError(
> ValueError: Mixing value from different pipelines not allowed.}}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)