Posted to github@beam.apache.org by "tvalentyn (via GitHub)" <gi...@apache.org> on 2023/05/18 17:28:59 UTC

[GitHub] [beam] tvalentyn opened a new issue, #26769: [Bug]: GCS IO publishes uninformative error message

tvalentyn opened a new issue, #26769:
URL: https://github.com/apache/beam/issues/26769

   ### What happened?
   
   When I ran a wordcount pipeline with its output set to a bucket in a different project (causing permission errors), I saw the following output in the console:
   
   ```
     File "apache_beam/runners/common.py", line 1731, in apache_beam.runners.common._OutputHandler.finish_bundle_outputs
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/iobase.py", line 1202, in finish_bundle
       self.writer.close(),
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/filebasedsink.py", line 434, in close
       self.sink.close(self.temp_handle)
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/textio.py", line 524, in close
       super().close(file_handle)
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/filebasedsink.py", line 167, in close
       file_handle.close()
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/filesystemio.py", line 215, in close
       self._uploader.finish()
     File "/usr/local/lib/python3.9/site-packages/apache_beam/io/gcp/gcsio.py", line 829, in finish
       self._upload_thread.last_error.message)  # pylint: disable=raising-bad-type
   AttributeError: 'HttpForbiddenError' object has no attribute 'message' [while running 'Write/Write/WriteImpl/WriteBundles-ptransform-58']
   ``` 
   
   It looks like not every possible error has a `message` attribute, and the code needs to account for that.
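A minimal sketch of reading an error description without assuming a `message` attribute. The helper `describe_error` and the `LegacyError` class are hypothetical names used only for illustration; they are not part of Beam or apitools.

```python
def describe_error(err):
    """Return a readable description of err, whether or not it has .message.

    Hypothetical helper for illustration; not part of Beam.
    """
    # Prefer an explicit .message attribute when the exception type defines one.
    message = getattr(err, "message", None)
    if message:
        return str(message)
    # Otherwise fall back to str(err), then to the class name if that is empty.
    return str(err) or type(err).__name__


class LegacyError(Exception):
    """Stand-in for an exception type that does define .message."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message
```

With a helper like this, an exception that lacks `message` still yields some text instead of raising `AttributeError` while the original failure is being reported.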
   
   Relevant PR: https://github.com/apache/beam/pull/24449 
   
   
   
   ### Issue Priority
   
   Priority: 2 (default / most bugs should be filed as P2)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [X] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1553382889

   cc: @BjornPrime who is working on GCS IO




[GitHub] [beam] cozos commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "cozos (via GitHub)" <gi...@apache.org>.
cozos commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1721086421

   @tvalentyn https://github.com/apache/beam/pull/28470




[GitHub] [beam] tvalentyn closed issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn closed issue #26769: [Bug]: GCS IO publishes uninformative error message
URL: https://github.com/apache/beam/issues/26769




[GitHub] [beam] liferoad commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "liferoad (via GitHub)" <gi...@apache.org>.
liferoad commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1701232920

   Note that https://github.com/apache/beam/pull/28079 will remove `apitools`. @BjornPrime FYI.




[GitHub] [beam] BjornPrime commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "BjornPrime (via GitHub)" <gi...@apache.org>.
BjornPrime commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1553537456

   The client swap should take care of this. I'll claim it for now and close it once the swap is merged.




[GitHub] [beam] BjornPrime commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "BjornPrime (via GitHub)" <gi...@apache.org>.
BjornPrime commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1553537908

   .take-issue




[GitHub] [beam] tvalentyn commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1703459831

   @cozos SGTM. You can raise a RuntimeError. As mentioned, this would hopefully be moot soon though.




[GitHub] [beam] cozos commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "cozos (via GitHub)" <gi...@apache.org>.
cozos commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1700422615

   This should probably be:
   
   ```py
   raise BeamGcsError("error while uploading file %s...") from self._upload_thread.last_error
   ```
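A runnable sketch of the exception-chaining idea suggested above. `UploadFailedError` and `finish_upload` are hypothetical names chosen for illustration, and `PermissionError` stands in for the real upload failure. The wrapper subclasses `RuntimeError` as one possible choice, not as what any PR actually implemented.

```python
class UploadFailedError(RuntimeError):
    """Hypothetical wrapper error; the name is an assumption, not a Beam class."""


def finish_upload(path, last_error):
    """Raise a descriptive error chained to the original failure, if any."""
    if last_error is not None:
        # "raise ... from" stores last_error on __cause__, so the original
        # exception object stays intact and appears in the traceback.
        raise UploadFailedError(
            "Error while uploading file %s" % path) from last_error


try:
    finish_upload(
        "gs://example-bucket/output.txt", PermissionError("403 Forbidden"))
except UploadFailedError as exc:
    wrapped = exc
```

Because the original exception is attached rather than re-constructed, callers can still inspect it through `wrapped.__cause__`.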




[GitHub] [beam] tvalentyn commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1721527407

   Thanks, @cozos !




[GitHub] [beam] cozos commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "cozos (via GitHub)" <gi...@apache.org>.
cozos commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1700422785

   @tvalentyn If you agree I can make a PR




[GitHub] [beam] cozos commented on issue #26769: [Bug]: GCS IO publishes uninformative error message

Posted by "cozos (via GitHub)" <gi...@apache.org>.
cozos commented on issue #26769:
URL: https://github.com/apache/beam/issues/26769#issuecomment-1700411569

   This code:
   
   ```
   raise type(self._upload_thread.last_error)(
             "Error while uploading file %s: %s",
             self._path,
             self._upload_thread.last_error.message if hasattr(self._upload_thread.last_error, "message") else "")  # pylint: disable=raising-bad-type
   ``` 
   
   This completely breaks Google's [HttpError](https://github.com/google/apitools/blob/master/apitools/base/py/exceptions.py#L50) for me, since it overrides `response`, which is expected to be a dictionary rather than a message.
   
   Calling `HttpError.status_code` results in this exception:
   
   ```
   File "/generate_dataset_docker_pybinary.runfiles/cruise_ws/cruise/mlp/cfs/projects/temporal_understanding/generate_dataset_docker_pybinary_exedir/apitools/base/py/exceptions.py", line 78, in status_code
       return int(self.response['status'])
   TypeError: string indices must be integers [while running 'Publish dataset/Write to big query/StoreArtifacts/WriteArtifact-ptransform-158']
   ```
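The failure described above can be reproduced in isolation. The sketch below uses `FakeHttpError`, a simplified stand-in written for this example; it is not the apitools class, and its three-argument constructor is an assumption. Only the `status_code` body is taken from the traceback above.

```python
class FakeHttpError(Exception):
    """Simplified stand-in for an HTTP error type; NOT the real apitools class."""

    def __init__(self, response, content, url):
        super().__init__("HttpError accessing <%s>" % url)
        self.response = response  # expected to be a dict of response headers
        self.content = content
        self.url = url

    @property
    def status_code(self):
        # Same expression as in the traceback above.
        return int(self.response["status"])


original = FakeHttpError(
    {"status": "403"}, b"Forbidden", "https://example.invalid/upload")

# Re-constructing the same type with message-style arguments puts a format
# string where the response dict is expected.
rebuilt = type(original)(
    "Error while uploading file %s: %s", "gs://example-bucket/out.txt", "")

try:
    rebuilt.status_code
    failure = None
except TypeError as exc:
    failure = exc  # indexing a str with "status" raises TypeError
```

The original instance still reports its status code, while the rebuilt one fails as soon as `status_code` is read, which matches the `TypeError` shown above.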

