Posted to issues@beam.apache.org by "Sam Le (Jira)" <ji...@apache.org> on 2021/10/14 21:15:00 UTC

[jira] [Commented] (BEAM-10958) WriteToPubsub with Protobuf message missing field

    [ https://issues.apache.org/jira/browse/BEAM-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429016#comment-17429016 ] 

Sam Le commented on BEAM-10958:
-------------------------------

Thanks, that has addressed the issue.

> WriteToPubsub with Protobuf message missing field
> -------------------------------------------------
>
>                 Key: BEAM-10958
>                 URL: https://issues.apache.org/jira/browse/BEAM-10958
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>    Affects Versions: 2.23.0, 2.24.0
>            Reporter: Sam Le
>            Priority: P3
>
> Hi, I am trying to test ordering_key (a new beta feature of Google Pubsub), but I seem to have hit a wall. The standard Google Protobuf message is described here:
> [https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#google.pubsub.v1.PubsubMessage]. So it seems that Apache Beam should have no trouble passing the new field to Pubsub as long as the message format is correct. That does not seem to be the case here. As far as I can tell from going through the code, I believe I would only need to change
> [https://github.com/apache/beam/blob/1a09a7576a4a0e22d32583c0c1c52b67970691d6/sdks/python/apache_beam/io/gcp/pubsub.py#L115] 
> {code:python}
> # Proposed change: also copy ordering_key onto the protobuf message
> def _to_proto_str(self):
>   msg = pubsub.types.pubsub_pb2.PubsubMessage()
>   msg.data = self.data
>   for key, value in iteritems(self.attributes):
>     msg.attributes[key] = value
>   msg.ordering_key = self.ordering_key
>   return msg.SerializeToString()
> {code}
> And then set with_attributes=True, so that WriteToPubSub would let me send a Protobuf message to Pubsub. However, at some point the ordering_key information is stripped, even though the custom attributes I set do come through. I suspect the dataflow_runner code, but it is really difficult to navigate the code in that area. I have gone as far as creating a new version of WriteToPubSub, but it does not help. The field is still missing from the message when it reaches the Google Pubsub topic.
> Could anyone point me in the right direction, please?
>  
> After digging a little deeper, I think the root cause is probably that Apache Beam depends on an old version of google-cloud-pubsub (1.7.0), probably in order to support Python 2. I am just not sure for how long. It seems that support for Python 2 will end after 2.24.0. However, the dependency on the older version of google-cloud-pubsub is still there for 2.25.0 ('google-cloud-pubsub>=0.39.0,<2').
>  
>  
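One way to check the two suspicions in the quoted description (the dependency pin and the missing field) is a small diagnostic script. The following is only a sketch: the helper names (parse_version, within_beam_pin, installed_pubsub_version, has_proto_field, pubsub_message_has_field) are hypothetical and not part of Beam or google-cloud-pubsub, the version parsing is a simplification rather than full PEP 440 handling, and the field check assumes the 1.x-style generated class reached through pubsub.types.pubsub_pb2, as in the snippet quoted above.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.7.0' into (1, 7, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def within_beam_pin(version_text):
    """True if the version satisfies the quoted pin '>=0.39.0,<2' (simplified)."""
    version = parse_version(version_text)
    return (0, 39, 0) <= version < (2,)


def installed_pubsub_version():
    """Installed google-cloud-pubsub version string, or None if not installed."""
    try:
        return metadata.version("google-cloud-pubsub")
    except metadata.PackageNotFoundError:
        return None


def has_proto_field(message_cls, field_name):
    """True if a generated protobuf message class declares the named field."""
    return field_name in message_cls.DESCRIPTOR.fields_by_name


def pubsub_message_has_field(field_name):
    """Check the installed PubsubMessage class; None if it cannot be inspected."""
    try:
        from google.cloud import pubsub  # third-party, may be absent

        # Assumption: 1.x-style layout, as used in the quoted snippet.
        message_cls = pubsub.types.pubsub_pb2.PubsubMessage
        return has_proto_field(message_cls, field_name)
    except (ImportError, AttributeError):
        return None


if __name__ == "__main__":
    version = installed_pubsub_version()
    print("google-cloud-pubsub version:", version)
    if version is not None:
        print("inside Beam pin >=0.39.0,<2:", within_beam_pin(version))
    print("PubsubMessage has ordering_key:", pubsub_message_has_field("ordering_key"))
```

If the last line reports False, the installed client's message class does not declare ordering_key at all, which would point to the dependency version rather than the runner. A result of None only means the class could not be inspected this way.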



--
This message was sent by Atlassian Jira
(v8.3.4#803005)