Posted to issues@beam.apache.org by "Valentyn Tymofieiev (Jira)" <ji...@apache.org> on 2022/05/19 10:14:00 UTC

[jira] [Commented] (BEAM-14482) Mismatch in event publish_time

    [ https://issues.apache.org/jira/browse/BEAM-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539449#comment-17539449 ] 

Valentyn Tymofieiev commented on BEAM-14482:
--------------------------------------------

Would you be interested in trying to find the offending commit using git bisect with your setup? You would need to install the Beam SDK from head (see: s.apache.org/beam-python-dev-wiki), with something like:

pip install -r ./sdks/python/build-requirements.txt
pip install -e ./sdks/python[gcp,test]

git bisect start
git checkout master
git bisect bad
git checkout 072fcae6d44e8540faed79e20370c3d7be267197   # old commit from before 2.35.0 was cut; verify that it is sufficiently old.
git bisect good

Iterate and test until you identify the offending commit.
You may need to reinstall the Beam SDK each time you move to a different commit via `pip install -e ./sdks/python[gcp,test]`.
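
If you would rather not mark each commit by hand, `git bisect run` can do it from a script's exit code: 0 means good, 125 means skip, and any other code from 1 to 127 means bad. Below is a minimal sketch of such a check, assuming the good/bad signal is the type of `publish_time` shown in the issue. The names `classify`, `main` and `observe` are hypothetical, and `observe` stands for whatever code in your setup returns `element.publish_time`.

```python
import datetime

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def classify(publish_time):
    """Map an observed publish_time value to a `git bisect run` exit code."""
    # Behaviour reported for 2.35.0: a protobuf Timestamp with seconds/nanos fields.
    if hasattr(publish_time, "seconds") and hasattr(publish_time, "nanos"):
        return GOOD
    # Behaviour reported for >= 2.36.0: DatetimeWithNanoseconds, a datetime subclass.
    if isinstance(publish_time, datetime.datetime):
        return BAD
    # Anything else cannot be judged, so ask bisect to skip this commit.
    return SKIP


def main(observe):
    """Call observe() (your own hook, an assumption here) and return an exit code."""
    try:
        return classify(observe())
    except Exception:
        # For example, the SDK did not import at this commit.
        return SKIP
```

A script kept outside the repository that ends with `sys.exit(main(your_hook))` could then be passed to `git bisect run python /path/to/script.py`. The reinstall step mentioned above would still have to happen for each commit, for instance at the start of that script.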



> Mismatch in event publish_time
> ------------------------------
>
>                 Key: BEAM-14482
>                 URL: https://issues.apache.org/jira/browse/BEAM-14482
>             Project: Beam
>          Issue Type: Bug
>          Components: beam-community
>    Affects Versions: 2.36.0
>            Reporter: Daljeet Singh
>            Priority: P2
>              Labels: python
>
> Below is the example code, where we are trying to get the publish_time of the Pub/Sub message in our DoFn. It seems the type has changed in Apache Beam starting with version 2.36.0. However, I was not able to find any release notes that mention this change. Any reference would be helpful.
> {code:java}
> import apache_beam as beam
>
> class ProtoToDictDoFn(beam.DoFn):
>     def process(self, element, publish_time=beam.DoFn.TimestampParam):
>         """element is of type PubsubMessage."""
>         print('-------------')
>         print(type(element.publish_time))
>         print(element.publish_time)
>         print('-------------')
> Output:
> — for version 2.35.0 —
> <class 'google.protobuf.timestamp_pb2.Timestamp'>
> seconds: 1652814206
> nanos: 417000000
> — for version >= 2.36.0 —
> <class 'proto.datetime_helpers.DatetimeWithNanoseconds'>
> 2022-05-17 19:02:06.314000+00:00 {code}
> This seems to be an issue. As per the [Google Pub/Sub documentation, publish_time should be of type "google.protobuf.timestamp_pb2.Timestamp"|https://github.com/googleapis/python-pubsub/blob/main/google/pubsub_v1/types/pubsub.py#L232]
> Any clue what has changed or caused this issue?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)