Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/26 00:12:29 UTC
[GitHub] [beam] TheNeuralBit opened a new issue, #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
TheNeuralBit opened a new issue, #22440:
URL: https://github.com/apache/beam/issues/22440
### What happened?
See https://ci-beam.apache.org/job/beam_LoadTests_Python_SideInput_Dataflow_Batch/654/console
The job is failing with:
```
13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:00:32.836Z: JOB_MESSAGE_BASIC: Stopping **** pool...
13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLING
13:00:35 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.531Z: JOB_MESSAGE_DETAILED: Autoscaling: Resized **** pool from 10 to 0.
13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.588Z: JOB_MESSAGE_BASIC: Worker pool stopped.
13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:2022-07-25T20:01:07.612Z: JOB_MESSAGE_DEBUG: Tearing down pending resources...
13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2022-07-25_09_00_14-1969075893707144978 is in state JOB_STATE_CANCELLED
13:01:15 ERROR:apache_beam.runners.dataflow.dataflow_runner:Console URL: https://console.cloud.google.com/dataflow/jobs/<RegionId>/2022-07-25_09_00_14-1969075893707144978?project=<ProjectId>
13:01:16 Traceback (most recent call last):
13:01:16 File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
13:01:16 "__main__", mod_spec)
13:01:16 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
13:01:16 exec(code, run_globals)
13:01:16 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 216, in <module>
13:01:16 SideInputTest().run()
13:01:16 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line 151, in run
13:01:16 self.result.wait_until_finish(duration=self.timeout_ms)
13:01:16 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 1676, in wait_until_finish
13:01:16 self)
13:01:16 apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: CANCELLED, Error:
13:01:16 None
13:01:16
13:01:17 > Task :sdks:python:apache_beam:testing:load_tests:run FAILED
```
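To see how the job moved through its states, the console output can be scanned for the `Job ... is in state ...` lines. The following is a minimal sketch, assuming the log format shown above; `job_state_transitions` is a hypothetical helper written for illustration, not part of Beam.

```python
import re

# Matches runner log lines of the form shown in the console output above:
#   "Job <job id> is in state JOB_STATE_<NAME>"
JOB_STATE_RE = re.compile(
    r"Job (?P<job_id>\S+) is in state (?P<state>JOB_STATE_\w+)")


def job_state_transitions(log_text):
    """Return (job_id, state) pairs in the order they appear in the log."""
    return [(m.group("job_id"), m.group("state"))
            for m in JOB_STATE_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "13:00:34 INFO:apache_beam.runners.dataflow.dataflow_runner:"
        "Job 2022-07-25_09_00_14-1969075893707144978 is in state "
        "JOB_STATE_CANCELLING\n"
        "13:01:08 INFO:apache_beam.runners.dataflow.dataflow_runner:"
        "Job 2022-07-25_09_00_14-1969075893707144978 is in state "
        "JOB_STATE_CANCELLED\n")
    for job_id, state in job_state_transitions(sample):
        print(job_id, state)
```

For the excerpt above this yields two transitions for the same job ID, `JOB_STATE_CANCELLING` followed by `JOB_STATE_CANCELLED`.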
Opening up the Dataflow console, we see the following errors in the worker logs:
```
An exception was raised when trying to execute the workitem 8003945814083367836 : Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 837, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 983, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_SideInput_Dataflow_Batch/src/sdks/python/apache_beam/testing/load_tests/sideinput_test.py", line 123, in process
File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/sideinputs.py", line 114, in __iter__
for wv in self._iterable:
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 180, in __iter__
raise self.reader_exceptions.get()
File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sideinputs.py", line 130, in _reader_thread
for value in reader:
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativefileio.py", line 204, in __iter__
for record in self.read_next_block():
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/nativeavroio.py", line 362, in read_next_block
fastavro_block = next(self._block_iterator)
File "fastavro/_read.pyx", line 1051, in fastavro._read.file_reader.__next__
File "fastavro/_read.pyx", line 953, in _iter_avro_blocks
File "fastavro/_read.pyx", line 854, in fastavro._read.snappy_read_block
File "fastavro/_read.pyx", line 856, in fastavro._read.snappy_read_block
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/filesystemio.py", line 112, in readinto
data = self._downloader.get_range(start, end)
File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/gcsio.py", line 701, in get_range
self._downloader.GetRange(start, end - 1)
File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 486, in GetRange
response = self.__ProcessResponse(response)
File "/usr/local/lib/python3.7/site-packages/apitools/base/py/transfer.py", line 424, in __ProcessResponse
raise exceptions.HttpError.FromResponse(response)
apitools.base.py.exceptions.HttpNotFoundError: HttpError accessing <https://www.googleapis.com/storage/v1/b/temp-storage-for-perf-tests/o/loadtests%2Fload-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494%2Ftmp-e6377a3786602f76-00003-of-00028.avro?alt=media&generation=1658765290656871>: response: <{'x-guploader-uploadid': 'ADPycds9lcszewL5vWbRPedbhn7ewfjlPPHltdw2rk8IwZa4aEcwdveOb1sNgE5JHOXCGd166jZ-q0raGSyH1mAAOj9ZEA', 'content-type': 'text/html; charset=UTF-8', 'date': 'Mon, 25 Jul 2022 20:00:32 GMT', 'vary': 'Origin, X-Origin', 'expires': 'Mon, 25 Jul 2022 20:00:32 GMT', 'cache-control': 'private, max-age=0', 'content-length': '168', 'server': 'UploadServer', 'status': '404'}>, content <No such object: temp-storage-for-perf-tests/loadtests/load-tests-python-dataflow-batch-sideinput-9-0725100319.1658764814.369494/tmp-e6377a3786602f76-00003-of-00028.avro>
```
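The URL in the `HttpNotFoundError` is percent-encoded, which makes the missing object hard to read. As a rough sketch (assuming the `/storage/v1/b/<bucket>/o/<object>` path layout seen in the error above; `parse_gcs_media_url` is a hypothetical helper, not a Beam or apitools function), it can be decoded with the standard library to recover the bucket, object name, and generation:

```python
from urllib.parse import parse_qs, unquote, urlsplit


def parse_gcs_media_url(url):
    """Split a GCS JSON API object URL into (bucket, object_name, generation).

    Assumes a path of the form /storage/v1/b/<bucket>/o/<encoded object name>,
    as in the error message above.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Expected: ['', 'storage', 'v1', 'b', <bucket>, 'o', <encoded object>]
    if len(segments) != 7 or segments[3] != "b" or segments[5] != "o":
        raise ValueError("unexpected GCS object URL layout: %s" % url)
    bucket = segments[4]
    object_name = unquote(segments[6])  # '%2F' becomes '/'
    generation = parse_qs(parts.query).get("generation", [None])[0]
    return bucket, object_name, generation


if __name__ == "__main__":
    url = (
        "https://www.googleapis.com/storage/v1/b/temp-storage-for-perf-tests"
        "/o/loadtests%2Fload-tests-python-dataflow-batch-sideinput-9-"
        "0725100319.1658764814.369494%2Ftmp-e6377a3786602f76-00003-of-"
        "00028.avro?alt=media&generation=1658765290656871")
    print(parse_gcs_media_url(url))
```

The decoded object name matches the path in the `No such object` message, i.e. one of the 28 temporary Avro shards under the job's `loadtests/` directory was no longer present when the worker tried to read it.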
### Issue Priority
Priority: 1
### Issue Component
Component: test-failures
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] kennknowles commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1198439324
Currently failing? (Do we have a tag for that? Would that potentially raise it to P0?)
[GitHub] [beam] kennknowles commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1246035966
No longer failing.
[GitHub] [beam] TheNeuralBit commented on issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22440:
URL: https://github.com/apache/beam/issues/22440#issuecomment-1199695756
Yes, currently failing. No, we don't seem to have a tag for that. I think P1 is appropriate since this is effectively a PostCommit.
[GitHub] [beam] kennknowles closed issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
Posted by GitBox <gi...@apache.org>.
kennknowles closed issue #22440: [Bug]: Python Batch Dataflow SideInput LoadTests failing
URL: https://github.com/apache/beam/issues/22440