Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 18:35:06 UTC

[GitHub] [beam] kennknowles opened a new issue, #18430: SerializablePipelineOptions should not call FileSystems.setDefaultPipelineOptions.

kennknowles opened a new issue, #18430:
URL: https://github.com/apache/beam/issues/18430

   https://github.com/apache/beam/pull/3654 introduces SerializablePipelineOptions, which on deserialization calls FileSystems.setDefaultPipelineOptions.
   
   This is problematic and racy when the same process uses several SerializablePipelineOptions instances that carry different filesystem-related options: whichever instance is deserialized last overwrites the process-wide defaults.
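
   To make the hazard concrete, here is a minimal sketch that does not use Beam at all. `GlobalFsDefaults` and `SideEffectingOptions` are hypothetical stand-ins for `FileSystems` and `SerializablePipelineOptions`; the only assumption is that deserialization writes to a single process-wide slot.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a process-wide registry such as FileSystems.
final class GlobalFsDefaults {
  private static volatile String current = "unset";

  private GlobalFsDefaults() {}

  static void set(String options) {
    current = options;
  }

  static String get() {
    return current;
  }
}

// Hypothetical stand-in for an options wrapper that mutates global state
// as a side effect of being deserialized.
final class SideEffectingOptions implements Serializable {
  private static final long serialVersionUID = 1L;

  final String fsOptions;

  SideEffectingOptions(String fsOptions) {
    this.fsOptions = fsOptions;
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    // Hidden side effect: every deserialization overwrites the global defaults.
    GlobalFsDefaults.set(fsOptions);
  }
}

final class DeserializationSideEffectDemo {
  // Serializes and immediately deserializes a value, as a runner would when
  // shipping options to a worker.
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  public static void main(String[] args) throws Exception {
    roundTrip(new SideEffectingOptions("pipeline-A"));
    System.out.println(GlobalFsDefaults.get()); // prints "pipeline-A"
    roundTrip(new SideEffectingOptions("pipeline-B"));
    System.out.println(GlobalFsDefaults.get()); // prints "pipeline-B"
  }
}
```

   The sketch is single-threaded, so the overwrite is deterministic here. With two pipelines deserializing concurrently in one process, which value ends up in the global slot depends on timing.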
   
   The PR does this because the Flink and Apex runners were already doing it in their respective SerializablePipelineOptions-like classes (which the PR removes); the Spark runner was not, but probably should have been.
   
   I believe this is done so that the proper filesystem options are automatically available on workers wherever any kind of PipelineOptions is used. Instead, all three runners should pick a better place to initialize their workers and explicitly call FileSystems.setDefaultPipelineOptions there.
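
   One possible shape for the explicit alternative, again as a Beam-free sketch: `FsRegistry`, `PlainOptions` and `WorkerBootstrap` are hypothetical names, and where a real runner's worker startup hook lives is not specified here. The point is only that the options holder deserializes without side effects and the global setter is called from one visible place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the process-wide filesystem registry.
final class FsRegistry {
  private static volatile String defaults = null;

  private FsRegistry() {}

  static void setDefaults(String options) {
    defaults = options;
  }

  static String getDefaults() {
    return defaults;
  }
}

// Hypothetical options holder with default serialization and no side effects.
final class PlainOptions implements Serializable {
  private static final long serialVersionUID = 1L;

  final String fsOptions;

  PlainOptions(String fsOptions) {
    this.fsOptions = fsOptions;
  }
}

// Hypothetical worker entry point: the single place that touches global state.
final class WorkerBootstrap {
  private WorkerBootstrap() {}

  static void start(PlainOptions options) {
    FsRegistry.setDefaults(options.fsOptions);
  }
}

final class ExplicitInitDemo {
  // Serializes and immediately deserializes a PlainOptions instance.
  static PlainOptions roundTrip(PlainOptions value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (PlainOptions) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // Worker startup sets the defaults explicitly, exactly once.
    WorkerBootstrap.start(new PlainOptions("pipeline-A"));

    // Deserializing other options later leaves the global defaults alone.
    PlainOptions other = roundTrip(new PlainOptions("pipeline-B"));
    System.out.println(other.fsOptions);          // prints "pipeline-B"
    System.out.println(FsRegistry.getDefaults()); // prints "pipeline-A"
  }
}
```

   This still leaves a single global slot per process, so it does not by itself support two pipelines with different filesystem options in one JVM; it only removes the hidden write during deserialization.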
   
   It would be even better if FileSystems.setDefaultPipelineOptions didn't exist at all, but that's a topic for a separate JIRA.
   
   CC'ing runner contributors [~aljoscha] [~aviemzur] [~thw]
   
   Imported from Jira [BEAM-2712](https://issues.apache.org/jira/browse/BEAM-2712). Original Jira may contain additional context.
   Reported by: jkff.

