Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 19:34:24 UTC

[GitHub] [beam] kennknowles opened a new issue, #18664: Ingesting json file ValidationError: Expected type

kennknowles opened a new issue, #18664:
URL: https://github.com/apache/beam/issues/18664

   Reading a JSON file from a GCS file pattern using Beam Python SDK 2.2.0 on Dataflow yields the following warning:
   
   ```
   
   Retry with exponential backoff: waiting for 4.21317187833 seconds before retrying report_completion_status because we caught exception: ValidationError: Expected type <type 'unicode'> for field name, found s05-s34-reify20-process-msecs (type <class 'apache_beam.utils.counters.CounterName'>)
   Traceback for above exception (most recent call last):
     File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 175, in wrapper
       return fun(*args, **kwargs)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 491, in report_completion_status
       exception_details=exception_details)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 299, in report_status
       work_executor=self._work_executor)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 316, in report_status
       append_counter(work_item_status, counter, tentative=not completed)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 43, in append_counter
       status_object, counter.name, kind, counter.accumulator, setter)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 95, in append_counter_update
       add_unstructured_name_and_kind(metric_update, metric_name, kind)
     File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 63, in add_unstructured_name_and_kind
       metric_update.nameAndKind.name = metric_name
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 973, in __setattr__
       object.__setattr__(self, name, value)
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1299, in __set__
       value = self.validate(value)
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1406, in validate
       return self.__validate(value, self.validate_element)
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1364, in __validate
       return validate_element(value)
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1549, in validate_element
       return super(StringField, self).validate_element(value)
     File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1346, in validate_element
       (self.type, name, value, type(value)))
   
   ```
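
The traceback above suggests that a `CounterName` object reached a message field that accepts only unicode text. The following is a minimal sketch of that kind of type mismatch. It is written for Python 3 and uses hypothetical stand-ins throughout (`CounterName`, `ValidationError`, and `validate_string_field` here are illustrative classes and helpers, not the real Beam, Dataflow worker, or apitools code). It only illustrates why an object that prints like a string can still be rejected by a strict type check, and is not a claim about the actual fix.

```python
class ValidationError(Exception):
    """Stand-in for the validation error raised by the message library."""


class CounterName(object):
    """Hypothetical structured counter name that prints like a plain string."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


def validate_string_field(field_name, value):
    """Accept only real text values, rejecting objects that merely print like text."""
    if not isinstance(value, str):
        raise ValidationError(
            "Expected type %s for field %s, found %s (type %s)"
            % (str, field_name, value, type(value)))
    return value


counter = CounterName("s05-s34-reify20-process-msecs")

# Assigning the structured object directly fails the type check.
try:
    validate_string_field("name", counter)
    rejected = False
    message = ""
except ValidationError as exc:
    rejected = True
    message = str(exc)

# Converting explicitly to text first passes validation.
accepted = validate_string_field("name", str(counter))
```

In this sketch the error message embeds the printed form of the value, which would explain why the log shows a readable counter name next to a non-string type.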
   
   
   The job does not fail; instead, it gets stuck trying to read the file. The warning above is logged on every retry.
   
   However, running the job with Beam Python SDK 2.1.1 works fine.
   
   
   Imported from Jira [BEAM-3403](https://issues.apache.org/jira/browse/BEAM-3403). Original Jira may contain additional context.
   Reported by: akashpatel.

