Posted to issues@beam.apache.org by "Peter Kleinmann (Jira)" <ji...@apache.org> on 2022/04/18 05:54:00 UTC

[jira] [Commented] (BEAM-10708) InteractiveRunner cannot execute pipeline with cross-language transform

    [ https://issues.apache.org/jira/browse/BEAM-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523537#comment-17523537 ] 

Peter Kleinmann commented on BEAM-10708:
----------------------------------------

Just seconding Brian's comment.

I am a corporate developer stuck behind a firewall, with business users trying to evaluate Beam. The first thing we want to try is loading CSV files, running SQL on them offline, and writing the result to a file (this might sound pedestrian, but it is where we are coming from).

Of course, you could ask why offline, but getting a project approved and set up, along with a bucket, is actually a weeks-long process in a bank if you haven't done it before.

> InteractiveRunner cannot execute pipeline with cross-language transform
> -----------------------------------------------------------------------
>
>                 Key: BEAM-10708
>                 URL: https://issues.apache.org/jira/browse/BEAM-10708
>             Project: Beam
>          Issue Type: Bug
>          Components: cross-language
>            Reporter: Brian Hulette
>            Priority: P2
>          Time Spent: 49h 40m
>  Remaining Estimate: 0h
>
> The InteractiveRunner crashes when given a pipeline that includes a cross-language transform.
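> Here's the example I tried to run in a Jupyter notebook:
> {code:python}
> import apache_beam as beam
> from apache_beam.runners.interactive import interactive_beam
> from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
> from apache_beam.transforms.sql import SqlTransform
>
> p = beam.Pipeline(InteractiveRunner())
> pc = (p | SqlTransform("""SELECT
>             CAST(1 AS INT) AS `id`,
>             CAST('foo' AS VARCHAR) AS `str`,
>             CAST(3.14  AS DOUBLE) AS `flt`"""))
> df = interactive_beam.collect(pc)
> {code}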
> The problem occurs when [pipeline_fragment.py|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L66] creates a copy of the pipeline by [writing it to proto and reading it back|https://github.com/apache/beam/blob/dce1eb83b8d5137c56ac58568820c24bd8fda526/sdks/python/apache_beam/runners/interactive/pipeline_fragment.py#L120]. Reading it back fails because some of the pipeline is not written in Python.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)