Posted to issues@beam.apache.org by "David Moravek (Jira)" <ji...@apache.org> on 2019/11/07 14:28:00 UTC

[jira] [Created] (BEAM-8577) FileSystems may not have been initialized during ResourceId deserialization

David Moravek created BEAM-8577:
-----------------------------------

             Summary: FileSystems may not have been initialized during ResourceId deserialization
                 Key: BEAM-8577
                 URL: https://issues.apache.org/jira/browse/BEAM-8577
             Project: Beam
          Issue Type: Bug
          Components: runner-flink
    Affects Versions: 2.16.0
            Reporter: David Moravek
            Assignee: David Moravek
             Fix For: 2.17.0


- FileSystems are registered statically through the *FileSystems#setDefaultPipelineOptions* method.
- *#setDefaultPipelineOptions* is called either when deserializing SerializablePipelineOptions or when various Beam operators are opened.
- *FileIO#matchAll* is expanded using *Reshuffle.viaRandomKey()*.
- Reshuffle is implemented using *.rebalance*, which doesn't have a "RichFunction" lifecycle, so we need to find another way to register FileSystems, because deserialization may happen before other "rich operators" get executed on a particular task manager.

This results in random pipeline failures, because task assignment is not deterministic.
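
To make the ordering problem concrete, here is a minimal toy model in plain Java. It does not use Beam or Flink; ToyFileSystems and RegistrationOrderDemo are hypothetical names, and the static table only stands in for the state that *FileSystems#setDefaultPipelineOptions* would populate. The sketch assumes that resolving a resource fails when its scheme has not been registered yet.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for a statically registered file system table (not Beam's real class). */
final class ToyFileSystems {
    private static final Map<String, String> SCHEMES = new ConcurrentHashMap<>();

    /** Plays the role of FileSystems#setDefaultPipelineOptions: fills the static table. */
    static void setDefaultPipelineOptions() {
        SCHEMES.put("file", "LocalFileSystem");
        SCHEMES.put("s3", "S3FileSystem");
    }

    /** Simulates a fresh JVM on a task manager where nothing has been registered yet. */
    static void reset() {
        SCHEMES.clear();
    }

    /** Looks up the file system for a spec such as "s3://bucket/key". */
    static String resolve(String spec) {
        String scheme = spec.substring(0, spec.indexOf(':'));
        String fileSystem = SCHEMES.get(scheme);
        if (fileSystem == null) {
            throw new IllegalArgumentException("No file system registered for scheme " + scheme);
        }
        return fileSystem;
    }
}

final class RegistrationOrderDemo {
    /** Returns true if resolving a resource succeeds for the given ordering. */
    static boolean run(boolean richOperatorOpensFirst) {
        ToyFileSystems.reset();
        if (richOperatorOpensFirst) {
            // A "rich operator" was opened on this task manager before any element arrived.
            ToyFileSystems.setDefaultPipelineOptions();
        }
        try {
            ToyFileSystems.resolve("s3://bucket/key");
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("rich operator opened first: " + run(true));
        System.out.println("deserialization first: " + run(false));
    }
}
```

Which of the two orderings a task manager sees depends on task assignment, which is why the failure looks random.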

We can work around this by registering FileSystems during coder deserialization.
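
A rough sketch of that idea, again in plain Java rather than against the real Beam API: the decode path performs an idempotent registration before it resolves the resource, so it no longer depends on another operator having been opened first. SketchFileSystems and RegisteringResourceIdCoder are hypothetical names; in Beam the registration step would presumably be a call to *FileSystems#setDefaultPipelineOptions* with pipeline options carried by the coder, which is an assumption about the eventual fix, not a description of it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for the static file system registry (not Beam's real class). */
final class SketchFileSystems {
    private static final Set<String> SCHEMES = ConcurrentHashMap.newKeySet();

    /** Idempotent registration; safe to call on every decode. */
    static void register() {
        SCHEMES.add("file");
        SCHEMES.add("s3");
    }

    /** Simulates a task manager JVM where nothing has been registered yet. */
    static void reset() {
        SCHEMES.clear();
    }

    /** Validates that the scheme of the spec is known and returns the spec. */
    static String matchNewResource(String spec) {
        int idx = spec.indexOf("://");
        String scheme = idx < 0 ? "file" : spec.substring(0, idx);
        if (!SCHEMES.contains(scheme)) {
            throw new IllegalStateException("No file system registered for scheme " + scheme);
        }
        return spec;
    }
}

/** Hypothetical coder that registers file systems as part of decoding. */
final class RegisteringResourceIdCoder {
    static byte[] encode(String spec) {
        return spec.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] encoded) {
        // Register before touching the registry, so decoding does not rely on
        // another operator having initialized it on this task manager.
        SketchFileSystems.register();
        return SketchFileSystems.matchNewResource(new String(encoded, StandardCharsets.UTF_8));
    }
}

final class CoderWorkaroundDemo {
    public static void main(String[] args) {
        SketchFileSystems.reset(); // nothing registered, as on a fresh task manager
        byte[] bytes = RegisteringResourceIdCoder.encode("s3://bucket/key");
        System.out.println(RegisteringResourceIdCoder.decode(bytes));
    }
}
```

The cost of this approach is a registration call on every decoded element, so the registration itself has to be cheap and safe to repeat.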


