Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 17:24:06 UTC

[GitHub] [beam] kennknowles opened a new issue, #18327: Can't save datastore objects

kennknowles opened a new issue, #18327:
URL: https://github.com/apache/beam/issues/18327

   I can't seem to save my database objects using `WriteToDatastore`, as it errors out on a strange unicode issue when trying to write a batch. Stacktrace follows:
   
   ```
   File "apache_beam/runners/common.py", line 195, in apache_beam.runners.common.DoFnRunner.receive (apache_beam/runners/common.c:5142)
     self.process(windowed_value)
   File "apache_beam/runners/common.py", line 267, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:7201)
     self.reraise_augmented(exn)
   File "apache_beam/runners/common.py", line 279, in apache_beam.runners.common.DoFnRunner.reraise_augmented (apache_beam/runners/common.c:7590)
     raise type(exn), args, sys.exc_info()[2]
   File "apache_beam/runners/common.py", line 263, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:7090)
     self._dofn_simple_invoker(element)
   File "apache_beam/runners/common.py", line 198, in apache_beam.runners.common.DoFnRunner._dofn_simple_invoker (apache_beam/runners/common.c:5262)
     self._process_outputs(element, self.dofn_process(element.value))
   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py", line 354, in process
     self._flush_batch()
   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py", line 363, in _flush_batch
     helper.write_mutations(self._datastore, self._project, self._mutations)
   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py", line 187, in write_mutations
     commit(commit_request)
   File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 174, in wrapper
     return fun(*args, **kwargs)
   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py", line 185, in commit
     datastore.commit(req)
   File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", line 140, in commit
     datastore_pb2.CommitResponse)
   File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", line 199, in _call_method
     method='POST', body=payload, headers=headers)
   File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 631, in new_request
     redirections, connection_type)
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1609, in request
     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1351, in _request
     (response, content) = self._conn_request(conn, request_uri, method, body, headers)
   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1273, in _conn_request
     conn.request(method, request_uri, body, headers)
   File "/usr/lib/python2.7/httplib.py", line 1039, in request
     self._send_request(method, url, body, headers)
   File "/usr/lib/python2.7/httplib.py", line 1073, in _send_request
     self.endheaders(body)
   File "/usr/lib/python2.7/httplib.py", line 1035, in endheaders
     self._send_output(message_body)
   File "/usr/lib/python2.7/httplib.py", line 877, in _send_output
     msg += message_body
   TypeError: must be str, not unicode [while running 'write to datastore/Convert to Mutation']
   ```
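
The last frame fails while appending the request body to the already-built header block, which suggests that a text (unicode) value and a byte string are being mixed somewhere on the way to the HTTP layer. The trace does not show which value is the text one. As a minimal sketch of the general remedy, and not the httplib, httplib2, or Beam implementation, the snippet below normalizes both pieces to bytes before joining them. `ensure_bytes`, `build_request`, and the sample payload are hypothetical names and values chosen for illustration.

```python
TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def ensure_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding it first if it is text."""
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    return value


def build_request(header_block, body):
    """Join headers and body only after both are byte strings."""
    return ensure_bytes(header_block) + ensure_bytes(body)


# Hypothetical stand-in for a serialized protobuf request body.
payload = b"\x0a\x03abc"
request = build_request(u"POST /commit HTTP/1.1\r\n\r\n", payload)
```

Encoding at the boundary keeps the concatenation byte-only, whichever side originally carried the text value.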
   
   
   My code is basically:
   ```
           | 'convert from entity' >> beam.Map(ConvertFromEntity)
           | 'write to datastore' >> WriteToDatastore(client.project)
   ```
   
   
   Where `ConvertFromEntity` converts from a google.cloud.datastore object (which has a nice API/interface) into the underlying protobuf (which is what the beam gcp/datastore library expects):
   ```
   from google.cloud.datastore import helpers

   def ConvertFromEntity(entity):
       # Convert a google.cloud.datastore entity into its underlying protobuf.
       return helpers.entity_to_protobuf(entity)
   ```
   
   
   I assume `entity_to_protobuf` works correctly, since `google/cloud/datastore/batch.py` also uses it to write `entity_pb2.Entity` objects into `datastore_pb2.CommitRequest.mutations[n].upsert`:
   
   In batch.py: `put() -> _assign_entity_to_pb() -> entity_to_protobuf()`.
   
   In datastoreio.py: `WriteToDatastore -> DatastoreWriteFn.to_upsert_mutation -> _Mutate.DatastoreWriteFn -> helper.write_mutations`.
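
The traceback shows the write path buffering mutations and committing them as a batch (`process` calls `_flush_batch`, which calls `helper.write_mutations`), which is presumably why the failure appears during a flush rather than when an individual entity is converted. The sketch below illustrates that buffering pattern in plain Python. It is not Beam's actual code: `BatchingWriter`, `commit_fn`, `finish`, and the default batch size of 500 are assumptions made for illustration.

```python
class BatchingWriter(object):
    """Buffers items and hands them to commit_fn in fixed-size batches."""

    def __init__(self, commit_fn, batch_size=500):
        self._commit_fn = commit_fn      # called with a list of buffered items
        self._batch_size = batch_size    # hypothetical default, not Beam's
        self._pending = []

    def process(self, item):
        """Buffer one item and flush once the batch is full."""
        self._pending.append(item)
        if len(self._pending) >= self._batch_size:
            self._flush_batch()

    def finish(self):
        """Commit whatever is left once the input is exhausted."""
        if self._pending:
            self._flush_batch()

    def _flush_batch(self):
        # Any error raised by commit_fn surfaces here, for the whole batch.
        self._commit_fn(list(self._pending))
        self._pending = []


committed = []
writer = BatchingWriter(committed.append, batch_size=2)
for mutation in ["m1", "m2", "m3", "m4", "m5"]:
    writer.process(mutation)
writer.finish()
```

With a batch size of 2, the five sample mutations are committed as two full batches followed by one partial batch when `finish` is called.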
   
   Any idea what's going on here and why this doesn't work? Yes, I may have some Unicode in my objects, but it works in my App Engine DB/NDB usage. I will try skipping `WriteToDatastore` and putting unbatched entities with the datastore library directly, to see if that goes any better for me.
   
   Imported from Jira [BEAM-1800](https://issues.apache.org/jira/browse/BEAM-1800). Original Jira may contain additional context.
   Reported by: mlambert.

