Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2021/04/25 15:05:00 UTC

[jira] [Updated] (BEAM-12222) Dataflow translation sometimes fails on side inputs

     [ https://issues.apache.org/jira/browse/BEAM-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-12222:
-----------------------------------
    Status: Open  (was: Triage Needed)

> Dataflow translation sometimes fails on side inputs
> ---------------------------------------------------
>
>                 Key: BEAM-12222
>                 URL: https://issues.apache.org/jira/browse/BEAM-12222
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Kenneth Knowles
>            Assignee: Kenneth Knowles
>            Priority: P1
>             Fix For: 2.30.0
>
>
> I have identified a seemingly nondeterministic issue in Dataflow translation, where pipelines with side inputs are sometimes translated in the wrong order.
> {code}
> java.lang.NullPointerException: Unknown producer for value SimplePCollectionView{tag=Tag<org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.<init>:1221#4dca087078898728>} while translating step TfIdf.ComputeTfIdf/Combine.globally(Count)/ProduceDefault
> 	at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:1227)
> {code}
> Seen on https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2_PR/32/testReport/junit/org.apache.beam.examples.complete/TfIdfIT/testE2ETfIdf/ and also on other changes. I think the change under test is just triggering the nondeterministic problem, not causing it.
> So there is a lurking problem with side inputs overall.
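
For illustration only, the failure mode described above can be reproduced in a toy model: if a step that consumes a side input view is translated before the step that produces it, the producer lookup returns null. This is a minimal sketch under stated assumptions, not the Dataflow runner's code. The class `TranslationOrderSketch`, its `Step` type, and the step and value names are all hypothetical, and the report does not say what actually causes the wrong ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not the Dataflow runner's classes.
final class TranslationOrderSketch {

  /** A step, the values it consumes (side input views included), and the values it produces. */
  static final class Step {
    final String name;
    final List<String> inputs;
    final List<String> outputs;

    Step(String name, List<String> inputs, List<String> outputs) {
      this.name = name;
      this.inputs = inputs;
      this.outputs = outputs;
    }
  }

  /** Translates steps in the given order; throws if an input has no registered producer yet. */
  static List<String> translate(List<Step> steps) {
    Map<String, String> producers = new HashMap<>();
    List<String> translated = new ArrayList<>();
    for (Step step : steps) {
      // Every input must already have been produced by an earlier step.
      for (String input : step.inputs) {
        if (producers.get(input) == null) {
          throw new NullPointerException(
              "Unknown producer for value " + input + " while translating step " + step.name);
        }
      }
      // Register this step as the producer of its outputs.
      for (String output : step.outputs) {
        producers.put(output, step.name);
      }
      translated.add(step.name);
    }
    return translated;
  }

  public static void main(String[] args) {
    Step createView =
        new Step("CreateView", Collections.<String>emptyList(), Collections.singletonList("view"));
    Step produceDefault =
        new Step("ProduceDefault", Collections.singletonList("view"), Collections.singletonList("result"));

    // Producer translated first: the lookup succeeds.
    System.out.println(translate(Arrays.asList(createView, produceDefault)));

    // Consumer translated first: the lookup fails, analogous to the reported exception.
    try {
      translate(Arrays.asList(produceDefault, createView));
    } catch (NullPointerException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch the outcome depends only on the order in which the steps are visited, which is why a nondeterministic traversal order would make such a failure intermittent.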



--
This message was sent by Atlassian Jira
(v8.3.4#803005)