Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/26 00:12:29 UTC

[GitHub] [beam] TheNeuralBit opened a new issue, #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing

TheNeuralBit opened a new issue, #22440:
URL: https://github.com/apache/beam/issues/22440

   ### What happened?
   
   See https://ci-beam.apache.org/job/beam_LoadTests_Python_SideInput_Dataflow_Batch/654/console
   
   Job is failing with
   ```
   13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:00:32.836Z: JOB_MESSAGE_BASIC: Stopping **** pool...
   13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLING
   13:00:35 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.531Z: JOB_MESSAGE_DETAILED: Autoscaling: Resized **** pool from 10 to 0.
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.588Z: JOB_MESSAGE_BASIC: Worker pool stopped.
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.612Z: JOB_MESSAGE_DEBUG: Tearing down pending resources...
   13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLED
   13:01:15 ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs/<RegionId>/2022-07-25_09_00_14-1969075893707144978?project=<ProjectId>
   13:01:16 Traceback (most recent call last):
   13:01:16   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
   13:01:16     "__main__", mod_spec)
   13:01:16   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
   13:01:16     exec(code, run_globals)
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 216, in <module>
   13:01:16     SideInputTest().run()
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line 151, in run
   13:01:16     self.result.wait_until_finish(duration=self.timeout_ms)
   13:01:16   File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 1676, in wait_until_finish
   13:01:16     self)
   13:01:16 apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: CANCELLED, Error:
   13:01:16 None
   13:01:16 
   13:01:17 > Task :sdks:python:apache_beam:testing:load_tests:run FAILED
   ```
   
   Opening up the Dataflow console, we see the following errors in the worker logs:
   ```
   An exception was raised when trying to execute the workitem 8003945814083367836 : Traceback (most recent call last):
     File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
     File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
     File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
     File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 123, in process
     File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/sideinputs.py", line 114, in __iter__
       for wv in self._iterable:
     File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 180, in __iter__
       raise self.reader_exceptions.get()
     File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 130, in _reader_thread
       for value in reader:
     File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativefileio.py", line 204, in __iter__
       for record in self.read_next_block():
     File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativeavroio.py", line 362, in read_next_block
       fastavro_block = next(self._block_iterator)
     File "fastavro/_read.pyx", line 1051, in fastavro._read.file_reader.__next__
     File "fastavro/_read.pyx", line 953, in _iter_avro_blocks
     File "fastavro/_read.pyx", line 854, in fastavro._read.snappy_read_block
     File "fastavro/_read.pyx", line 856, in fastavro._read.snappy_read_block
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 112, in readinto
       data = self._downloader.get_range(start, end)
     File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 701, in get_range
       self._downloader.GetRange(start, end - 1)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 486, in GetRange
       response = self.__ProcessResponse(response)
     File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 424, in __ProcessResponse
       raise exceptions.HttpError.FromResponse(response)
   apitools.base.py.exceptions.HttpNotFoundError: HttpError accessing <https://www.googleapis.com/storage/v1/b/temp-storage-for-perf-tests/o/loadtests%2Fload-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494%2Ftmp-e6377a3786602f76-00003-of-00028.avro?alt=media&generation=1658765290656871>: response: <{'x-guploader-uploadid': 'ADPycds9lcszewL5vWbRPedbhn7ewfjlPPHltdw2rk8IwZa4aEcwdveOb1sNgE5JHOXCGd166jZ-q0raGSyH1mAAOj9ZEA', 'content-type': 'text/html; charset=UTF-8', 'date': 'Mon, 25 Jul 2022 20:00:32 GMT', 'vary': 'Origin, X-Origin', 'expires': 'Mon, 25 Jul 2022 20:00:32 GMT', 'cache-control': 'private, max-age=0', 'content-length': '168', 'server': 'UploadServer', 'status': '404'}>, content <No such object: temp-storage-for-perf-tests/loadtests/load-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494/tmp-e6377a3786602f76-00003-of-00028.avro>
   ```
   
   ### Issue Priority
   
   Priority: 1
   
   ### Issue Component
   
   Component: test-failures
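
When triaging a 404 like the one in the worker log above, it can help to pull the bucket, object name, and generation out of the URL in the `HttpNotFoundError` message. The following is a minimal sketch using only the Python standard library. The helper name `gcs_object_from_media_url` is hypothetical, and the sketch assumes the URL has the `/storage/v1/b/<bucket>/o/<object>` shape seen in the log.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def gcs_object_from_media_url(url):
    """Split a GCS media URL into (bucket, object_name, generation).

    Hypothetical helper: assumes a path of the form
    /storage/v1/b/<bucket>/o/<percent-encoded object name>.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # The bucket follows the "b" segment; the object follows the "o" segment.
    bucket = segments[segments.index("b") + 1]
    object_name = unquote(segments[segments.index("o") + 1])
    # "generation" is optional in the query string; None if absent.
    generation = parse_qs(parts.query).get("generation", [None])[0]
    return bucket, object_name, generation


# URL copied from the HttpNotFoundError in the worker log.
url = (
    "https://www.googleapis.com/storage/v1/b/temp-storage-for-perf-tests/o/"
    "loadtests%2Fload-tests-python-dataflow-batch-sideinput-9-0725100319."
    "1658764814.369494%2Ftmp-e6377a3786602f76-00003-of-00028.avro"
    "?alt=media&generation=1658765290656871"
)
bucket, object_name, generation = gcs_object_from_media_url(url)
print(bucket)
print(object_name)
print(generation)
```

The decoded object name matches the path reported in the `No such object` message, which points at a temporary Avro shard under the load test's own temp directory rather than a user-supplied input file.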


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] kennknowles commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1198439324

   Currently failing? (Do we have a tag for that? Would that potentially raise it to P0?)



[GitHub] [beam] kennknowles commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1246035966

   No longer failing.



[GitHub] [beam] TheNeuralBit commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1199695756

   Yes, currently failing. No, we don't seem to have a tag for that. I think P1 is appropriate since this is effectively a PostCommit.



[GitHub] [beam] kennknowles closed issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing

Posted by GitBox <gi...@apache.org>.
kennknowles closed issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
URL: https://github.com/apache/beam/issues/22440

