Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2022/01/18 21:11:00 UTC
[jira] [Commented] (BEAM-10926) Specify the event time when consuming pubsub data.
[ https://issues.apache.org/jira/browse/BEAM-10926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478210#comment-17478210 ]
Kenneth Knowles commented on BEAM-10926:
----------------------------------------
This is because Dataflow uses a native custom Pubsub source and does not execute the Java-based source in the Beam repo. It is a common issue. You can set {{--experiments=enable_custom_pubsub_source}} to use the Java implementation, which does have this capability. CC [~chamikara].
> Specify the event time when consuming pubsub data.
> --------------------------------------------------
>
> Key: BEAM-10926
> URL: https://issues.apache.org/jira/browse/BEAM-10926
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Jian Zheng
> Priority: P3
>
> I need to specify the event time when consuming pubsub data.
> {code:java}
> PCollection<PubsubMessage> pubsubMessages = pipeline.apply("Read Pub/Sub Data",
>     PubsubIO.readMessagesWithAttributes()
>         .withTimestampAttribute(options.getTimeAttribute())
>         .fromSubscription(options.getInputSubscription()));
> {code}
> The only way to do this is to use the {color:#ff0000}withTimestampAttribute(){color} method.
> But if the timestamp is in some other format, such as a 19-digit nanosecond value, or if the event time is stored in the payload, this method does not work.
> So I had to extend the PubsubClient class and override the PubsubClient.extractTimestamp() method.
> I am hoping Beam can provide a way to pass in an implementation class that parses the timestamp out of the current Pub/Sub message.
>
> My beam version is 2.19.0.
>
>
>
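For the 19-digit nanosecond case mentioned in the issue description, the parsing step itself can be sketched with the JDK alone. This is only an illustration and not part of Beam: the class name NanoTimestampParser and the method parseEpochNanos are hypothetical, and the sketch assumes the value is a decimal string of nanoseconds since the Unix epoch. Beam event timestamps are Joda-Time instants with millisecond precision, so the sub-millisecond part would be dropped when handing the value to Beam.

```java
import java.time.Instant;

// Hypothetical helper, not a Beam API: parses an epoch-nanosecond string.
public class NanoTimestampParser {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Parses a decimal string holding nanoseconds since the Unix epoch. */
    public static Instant parseEpochNanos(String value) {
        long nanos = Long.parseLong(value.trim());
        // floorDiv/floorMod keep the nanosecond adjustment non-negative,
        // so timestamps before 1970 are handled as well.
        long seconds = Math.floorDiv(nanos, NANOS_PER_SECOND);
        long nanoAdjustment = Math.floorMod(nanos, NANOS_PER_SECOND);
        return Instant.ofEpochSecond(seconds, nanoAdjustment);
    }

    public static void main(String[] args) {
        Instant parsed = parseEpochNanos("1642540260123456789");
        // Full nanosecond precision as parsed.
        System.out.println(parsed);
        // Millisecond value, the precision a Beam event timestamp can carry.
        System.out.println(parsed.toEpochMilli());
    }
}
```

The millisecond value could then be used to build the element timestamp in whatever hook the pipeline has available, for example the overridden extractTimestamp() described above.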