Posted to dev@beam.apache.org by Niel Markwick <ni...@google.com> on 2019/01/11 15:59:55 UTC

regression in 2.9.0 - FileIO.write with dynamic naming and DirectRunner.

I have found and narrowed down a regression in 2.9.0 (and 2.10.0/head)
where:

   - If you use DirectRunner (or TestPipeline, which uses DirectRunner)
   - AND you use FileIO.writeDynamic()
   - AND you have a side input to the Contextful.Fn
   - AND you do not limit the output to a single shard
   - THEN the pipeline fails with the exception below.


java.lang.IllegalStateException: All PCollectionViews that are consumed
must be written by some WriteView PTransform: Missing [<unnamed>
[RunnerPCollectionView]]

This is due to the DirectRunner's transform overrides, which rewrite FileIO
sinks to use runner-determined sharding
(see DirectRunner.java line 226
<https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226>
).

No idea why this occurs or why it started failing in 2.9.0...
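
For reference, a minimal sketch of a pipeline that matches the four
conditions listed above. The class name, sample data, output directory and
the choice of which function reads the side input are all assumptions made
for illustration; this is not the exact code from the failing pipeline, and
it needs the Beam Java SDK and the direct runner on the classpath.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Contextful;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Requirements;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;

// Hypothetical class name, used only for this sketch.
public class DynamicWriteSideInputSketch {
  public static void main(String[] args) {
    // With no runner specified, Beam falls back to the DirectRunner
    // when it is on the classpath.
    Pipeline p = Pipeline.create(PipelineOptionsFactory.create());

    // Side input: a single string that the output function reads.
    PCollectionView<String> tagView =
        p.apply("Tag", Create.of("tag")).apply(View.<String>asSingleton());

    // Illustrative input: "key,value" lines.
    PCollection<String> lines =
        p.apply("Lines", Create.of("a,1", "b,2", "a,3"));

    lines.apply(
        "WriteDynamic",
        FileIO.<String, String>writeDynamic()
            // Destination is the part before the comma.
            .by(line -> line.split(",")[0])
            .withDestinationCoder(StringUtf8Coder.of())
            // Contextful.Fn that declares and reads a side input.
            .via(
                Contextful.fn(
                    (String line, Contextful.Fn.Context c) ->
                        c.sideInput(tagView) + ":" + line,
                    Requirements.requiresSideInputs(tagView)),
                TextIO.sink())
            .to("/tmp/beam-6407-sketch") // assumed output directory
            .withNaming(key -> FileIO.Write.defaultNaming(key, ".txt")));
            // Deliberately no withNumShards(...): sharding is left to
            // the runner, which is the fourth condition above.

    p.run().waitUntilFinish();
  }
}
```

Per the conditions above, adding .withNumShards(1) to the write should
avoid the failure, since the output is then limited to a single shard.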

Raised https://issues.apache.org/jira/browse/BEAM-6407


Re: regression in 2.9.0 - FileIO.write with dynamic naming and DirectRunner.

Posted by Kenneth Knowles <ke...@apache.org>.
Thanks for the report! I moved the "Affects Version = 2.10.0" to "Fix
Version = 2.10.0" to see if we can get this fixed for the ongoing release.
I'll keep further commentary on the bug so it is in one place.
