Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/19 19:39:51 UTC

[GitHub] [beam] gosforth opened a new issue, #21946: [Bug]: No way to read or write to file when running Beam in Flink

gosforth opened a new issue, #21946:
URL: https://github.com/apache/beam/issues/21946

   ### What happened?
   
   My code is (this is taken from Beam examples):
   
   ```
   def run():
       import apache_beam as beam
       from apache_beam.options.pipeline_options import PipelineOptions
   
       options = PipelineOptions([
           "--runner=FlinkRunner",
           "--flink_version=1.14",
           "--flink_master=localhost:8081",
           "--environment_config=localhost:50000"
       ])
   
       output_file_prefix = 'C:\\ApacheBeam\\output'
   
       with beam.Pipeline(options=options) as p:
           (p
            | 'Create file lines' >> beam.Create([
                'Each element must be a string.',
                'It writes one element per line.',
                'There are no guarantees on the line order.',
                'The data might be written into multiple files.',
            ])
            | 'Write to files' >> beam.io.WriteToText(output_file_prefix, file_name_suffix='.txt')
           )
   
   if __name__ == "__main__":
       run()
   ```
   
   But Flink is unable to read from or write to a file; the job fails with:
   
   `Caused by: java.lang.Exception: The user defined 'open()' method caused an exception: java.io.IOException: Cannot run program "docker": CreateProcess error=2, The system cannot find the file specified`
   
   According to the Beam documentation, this is how it should work: https://beam.apache.org/documentation/runners/flink/
   
   ### Issue Priority
   
   Priority: 1
   
   ### Issue Component
   
   Component: runner-flink


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] manuzhang commented on issue #21946: [Bug]: No way to read or write to file when running Beam in Flink

Posted by GitBox <gi...@apache.org>.
manuzhang commented on issue #21946:
URL: https://github.com/apache/beam/issues/21946#issuecomment-1176898855

   This is Beam's current design for supporting [portability](https://beam.apache.org/roadmap/portability/), that is, interoperability between SDKs and runners. There are [discussions on the mailing list](https://lists.apache.org/thread/ylvqb6ch6dftxw639gcbk1tqj5dpbgb5) about replacing the current approach with Wasm.



[GitHub] [beam] gosforth commented on issue #21946: [Bug]: No way to read or write to file when running Beam in Flink

Posted by GitBox <gi...@apache.org>.
gosforth commented on issue #21946:
URL: https://github.com/apache/beam/issues/21946#issuecomment-1176688641

   Running Docker just to open a file is like reaching for your left ear with your right hand.
   This is a code problem that needs to be fixed.
   



[GitHub] [beam] manuzhang commented on issue #21946: [Bug]: No way to read or write to file when running Beam in Flink

Posted by GitBox <gi...@apache.org>.
manuzhang commented on issue #21946:
URL: https://github.com/apache/beam/issues/21946#issuecomment-1175029463

   > You will need Docker to be installed in your execution environment.
   
   As the error shows, you need to install Docker to run a Python pipeline on the FlinkRunner.
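
For readers who hit the same `Cannot run program "docker"` error, here is a minimal sketch of a pre-flight check, not an official Beam recipe. It uses only the Python standard library to see whether a `docker` executable is on `PATH`. If none is found, it adds `--environment_type=LOOPBACK`, which the Beam Flink runner documentation describes as intended for local testing only. The helper names `docker_available` and `flink_pipeline_args` are hypothetical and not part of Beam, and whether LOOPBACK works for a given Windows setup is an assumption to verify.

```python
import shutil
from typing import List


def docker_available() -> bool:
    """Return True if a `docker` executable can be found on PATH."""
    return shutil.which("docker") is not None


def flink_pipeline_args(flink_master: str = "localhost:8081") -> List[str]:
    """Build FlinkRunner arguments, avoiding the Docker environment if Docker is absent.

    Hypothetical helper for illustration only.
    """
    args = [
        "--runner=FlinkRunner",
        f"--flink_master={flink_master}",
    ]
    if not docker_available():
        # Assumption: LOOPBACK is acceptable here because the pipeline is
        # submitted from the same machine that runs the Flink cluster.
        args.append("--environment_type=LOOPBACK")
    return args


if __name__ == "__main__":
    print(flink_pipeline_args())
```

The returned list could be passed to `PipelineOptions(...)` in place of the hard-coded list from the issue.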

