You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 16:22:36 UTC

[GitHub] [beam] damccorm opened a new issue, #20244: Stateful Dataflow runner?

damccorm opened a new issue, #20244:
URL: https://github.com/apache/beam/issues/20244

   Hi,
   
   I'm trying to use python portable DataflowRunner with a [BagStateSpec]([https://beam.apache.org/releases/pydoc/2.6.0/apache_beam.transforms.userstate.html)]. Though I encounter followiung issue:
   ```
   
   Traceback (most recent call last):
     File "/Users/leopold/.pyenv/versions/3.6.0/lib/python3.6/runpy.py",
   line 193, in _run_module_as_main
       "__main__", mod_spec)
     File "/Users/leopold/.pyenv/versions/3.6.0/lib/python3.6/runpy.py",
   line 85, in _run_code
       exec(code, run_globals)
     File "/Users/leopold/workspace/BenchmarkListingStreaming/listing_beam_pipeline/test_runner.py",
   line 49, in <module>
       run()
     File "/Users/leopold/workspace/BenchmarkListingStreaming/listing_beam_pipeline/test_runner.py",
   line 44, in run
       | 'write to file' >> WriteToText(known_args.output)
     File "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/pipeline.py",
   line 481, in __exit__
       self.run().wait_until_finish()
     File "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
   line 1449, in wait_until_finish
       (self.state, getattr(self._runner, 'last_error_msg', None)), self)
   apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException:
   Dataflow pipeline failed. State: FAILED, Error:
   Traceback (most recent call last):
     File "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py",
   line 648, in do_work
       work_executor.execute()
     File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py",
   line 176, in execute
       op.start()
     File "apache_beam/runners/worker/operations.py", line 649, in
   apache_beam.runners.worker.operations.DoOperation.start
     File "apache_beam/runners/worker/operations.py",
   line 651, in apache_beam.runners.worker.operations.DoOperation.start
     File "apache_beam/runners/worker/operations.py",
   line 652, in apache_beam.runners.worker.operations.DoOperation.start
     File "apache_beam/runners/worker/operations.py",
   line 261, in apache_beam.runners.worker.operations.Operation.start
     File "apache_beam/runners/worker/operations.py",
   line 266, in apache_beam.runners.worker.operations.Operation.start
     File "apache_beam/runners/worker/operations.py",
   line 597, in apache_beam.runners.worker.operations.DoOperation.setup
     File "apache_beam/runners/worker/operations.py",
   line 636, in apache_beam.runners.worker.operations.DoOperation.setup
     File "apache_beam/runners/common.py",
   line 866, in apache_beam.runners.common.DoFnRunner.__init__
   Exception: Requested execution of a stateful
   DoFn, but no user state context is available. This likely means that the current runner does not support
   the execution of stateful DoFns.
   
   ```
   
   I've also seen this issue in stackoverflow
   
   [https://stackoverflow.com/questions/55413690/does-google-dataflow-support-stateful-pipelines-developed-with-python-sdk](https://stackoverflow.com/questions/55413690/does-google-dataflow-support-stateful-pipelines-developed-with-python-sdk)
   
    
   
   Do you have any idea/ETA when this feature will be available with beam?
   
    
   
   Thanks!
   
    
   
   Imported from Jira [BEAM-9655](https://issues.apache.org/jira/browse/BEAM-9655). Original Jira may contain additional context.
   Reported by: leopold.boudard.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org