Posted to issues@beam.apache.org by "Peter Kleinmann (Jira)" <ji...@apache.org> on 2022/04/18 05:54:00 UTC
[jira] [Commented] (BEAM-10708) InteractiveRunner cannot execute pipeline with cross-language transform
[ https://issues.apache.org/jira/browse/BEAM-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523537#comment-17523537 ]
Peter Kleinmann commented on BEAM-10708:
----------------------------------------
Just seconding Brian's comment.
I am a corporate developer stuck behind a firewall, with business users trying to evaluate Beam. The first thing we want to try is loading CSV files, running SQL on them offline, and writing the result to a file (it might sound pedestrian, but this is where we are coming from).
Of course, you could ask why offline, but getting a project approved and set up, and a bucket provisioned, is a weeks-long process in a bank if you haven't done it before.
> InteractiveRunner cannot execute pipeline with cross-language transform
> -----------------------------------------------------------------------
>
> Key: BEAM-10708
> URL: https://issues.apache.org/jira/browse/BEAM-10708
> Project: Beam
> Issue Type: Bug
> Components: cross-language
> Reporter: Brian Hulette
> Priority: P2
> Time Spent: 49h 40m
> Remaining Estimate: 0h
>
> The InteractiveRunner crashes when given a pipeline that includes a cross-language transform.
> Here's the example I tried to run in a jupyter notebook:
> {code:python}
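> import apache_beam as beam
> from apache_beam.runners.interactive import interactive_beam
> from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
> from apache_beam.transforms.sql import SqlTransform
>
> p = beam.Pipeline(InteractiveRunner())
> # SqlTransform is a cross-language transform; it is expanded outside the Python SDK.
> pc = (p | SqlTransform("""SELECT
>     CAST(1 AS INT) AS `id`,
>     CAST('foo' AS VARCHAR) AS `str`,
>     CAST(3.14 AS DOUBLE) AS `flt`"""))
> df = interactive_beam.collect(pc)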
> {code}
> The problem occurs when [pipeline_fragment.py|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L66] creates a copy of the pipeline by [writing it to proto and reading it back|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L120]. Reading it back fails because some of the pipeline is not written in Python.
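To illustrate the failure mode described above, here is a toy model of a "serialize, then rebuild" pipeline copy. It is only a sketch and not Beam's actual code: the registry `KNOWN_CONSTRUCTORS`, the URN strings, and the helpers `to_proto`, `from_proto`, and `copy_pipeline` are all hypothetical names, and JSON stands in for the real proto format. The assumption is that the reader can only rebuild steps it has a local Python constructor for.

```python
import json

# Hypothetical registry: the transform kinds a Python-only reader can rebuild.
KNOWN_CONSTRUCTORS = {
    "py:create": lambda payload: ("Create", payload),
    "py:map": lambda payload: ("Map", payload),
}


def to_proto(pipeline):
    """Serialize a list of (urn, payload) steps to a language-neutral string."""
    return json.dumps([{"urn": urn, "payload": payload} for urn, payload in pipeline])


def from_proto(serialized):
    """Rebuild the steps; raise for any step without a local constructor."""
    steps = []
    for step in json.loads(serialized):
        constructor = KNOWN_CONSTRUCTORS.get(step["urn"])
        if constructor is None:
            raise ValueError(f"cannot rebuild transform {step['urn']!r} in Python")
        steps.append(constructor(step["payload"]))
    return steps


def copy_pipeline(pipeline):
    """Copy a pipeline by writing it out and reading it back."""
    return from_proto(to_proto(pipeline))
```

In this model, copying a pipeline made only of `py:*` steps succeeds, while a pipeline containing a step with a foreign URN (for example a hypothetical `java:sql`) can be written out but raises `ValueError` when read back, which mirrors the shape of the problem reported here.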