You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Sam Le (Jira)" <ji...@apache.org> on 2021/10/14 21:15:00 UTC
[jira] [Commented] (BEAM-10958) WriteToPubsub with Protobuf message
missing field
[ https://issues.apache.org/jira/browse/BEAM-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429016#comment-17429016 ]
Sam Le commented on BEAM-10958:
-------------------------------
tks, it has addressed the issue.
> WriteToPubsub with Protobuf message missing field
> -------------------------------------------------
>
> Key: BEAM-10958
> URL: https://issues.apache.org/jira/browse/BEAM-10958
> Project: Beam
> Issue Type: Improvement
> Components: io-py-gcp
> Affects Versions: 2.23.0, 2.24.0
> Reporter: Sam Le
> Priority: P3
>
> Hi, I am trying to test ordering_key (new beta feature of Google Pubsub). However, I seem to hit the wall here. Google Protobuff standard message has been described here
> [https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#google.pubsub.v1.PubsubMessage] . So, it seems that Apache Beam should have no issue to passing the new field to Pubsub if the format of the message is correct. It does not seem to be a case here. As far as, I have tried to go through the code, I believe that I would only need to change
> [https://github.com/apache/beam/blob/1a09a7576a4a0e22d32583c0c1c52b67970691d6/sdks/python/apache_beam/io/gcp/pubsub.py#L115]
> {code:java}
> //
> def _to_proto_str(self):
> msg = pubsub.types.pubsub_pb2.PubsubMessage()
> msg.data = self.data
> for key, value in iteritems(self.attributes):
> msg.attributes[key] = value
> msg.ordering_key = self.ordering_key
> return msg.SerializeToString()
> {code}
> And then set with_attributes=true. So, WriteToPubSub would allow me to send out a protobuff message to pubsub. However, at some point, the information (ordering_key) is tripped, although, I could set custom attributes. I guess it may be the dataflow_runner code, but it really difficult to navigate around the code in that area. I have gone as far as create a new version of write to WriteToPubSub, but it does not help. The field is still missing on the message when it hits Google Pubsub Topic.
> Could anyone point me to the right direction please?
>
> After digging a little bit deeper, I think the root reason probably because, Apache Beam depend on an old version of Pubsub (1.7.0) . Probably for support of Python 2. I am just not sure for how long? It seems that the support for Python will end after 2.24.0. However, the dependency for older version of google-cloud-pubsub is still there for 2.25.0 (
> |'google-cloud-pubsub>=0.39.0,<2',|
> | |
> )
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)