Posted to issues@beam.apache.org by "Adrià Arcarons (Jira)" <ji...@apache.org> on 2022/03/23 08:55:00 UTC

[jira] (BEAM-13682) Pipeline fails to start when .withDynamicRead() is activated

    [ https://issues.apache.org/jira/browse/BEAM-13682 ]


    Adrià Arcarons deleted comment on BEAM-13682:
    ---------------------------------------

was (Author: JIRAUSER283783):
This looks like an issue on the Dataflow runner side. There is already an issue in Google's Public Issue Tracker (PIT): [https://issuetracker.google.com/issues/216497508]

IMO there is nothing that the Beam community can do at this point. Closing the issue.

 

> Pipeline fails to start when .withDynamicRead() is activated
> ------------------------------------------------------------
>
>                 Key: BEAM-13682
>                 URL: https://issues.apache.org/jira/browse/BEAM-13682
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.34.0, 2.35.0
>            Reporter: Adrià Arcarons
>            Priority: P2
>              Labels: stale-P2
>             Fix For: Not applicable
>
>         Attachments: Main.java
>
>
> We have a {+}Java{+}, {+}streaming{+} pipeline that consumes messages from Kafka and ingests them into BigQuery. We run this pipeline on Dataflow.
> Recently, we enabled the `.withDynamicRead()` option in our KafkaIO source to be able to auto-discover new partitions.
> As a result, the pipeline fails to start on Dataflow, with the following (not very informative) error:
> {noformat}
> "Workflow failed. Causes: Internal Issue (fc4aaa0289f1f666): 62242584:26". 
> {noformat}
> This error happens before the actual job graph can be generated.
> Google Support provided us with an internal Dataflow log message that helps narrow down the problem:
> {noformat}
> "invalid_argument: Duplicate side input name.  Side input names must be unique."
> {noformat}
> Some important pieces of information:
>  * The error only happens if the destination sink is BigQueryIO. We verified this by replacing BigQueryIO with TextIO, after which the pipeline works.
>  * The error doesn't happen when the pipeline is run with DirectRunner; it only happens with DataflowRunner.
>  * We have reproduced this error in Beam SDK versions 2.34 and 2.35.
>  
> I've written a small pipeline that reproduces the error. Please find it attached to the ticket [^Main.java].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)