Posted to user@beam.apache.org by gaurav mishra <ga...@gmail.com> on 2022/03/04 20:24:45 UTC

[PubsubIO]question about withTimestampAttribute

Hi,
From the docs I can see that withTimestampAttribute is used to attach a custom
timestamp attribute, which a downstream pipeline can use while reading from
Pub/Sub.

In my publisher pipeline code I am writing JSON strings (serialized from
POJOs). withTimestampAttribute lets me specify the name of the attribute
that will contain the timestamp, but it does not let me set the value of
the timestamp. How can I set the timestamp value, which should be extracted
from each record being published?

I would like to do something like this:
PubsubIO.writeStrings()
    .withTimestampAttribute("attribute_timestamp",
        ExtractTimestampEpocFromStringFn())
    .to(outputTopicId);

Re: [PubsubIO]question about withTimestampAttribute

Posted by Reza Rokni <re...@google.com>.
Try PubsubIO.writeMessages(), which writes PubsubMessage objects:

PubsubIO.html#writeMessages--
<https://beam.apache.org/releases/javadoc/2.37.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.html#writeMessages-->

Then set the timestamp attribute in the message's attribute Map:

io/gcp/pubsub/PubsubMessage.html
<https://beam.apache.org/releases/javadoc/2.37.0/org/apache/beam/sdk/io/gcp/pubsub/PubsubMessage.html>
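
As a rough sketch of the "set the attribute in the Map" step, the helper below
builds the attribute map for one JSON record using only the Java standard
library. The field name eventTimeMillis and the class and method names are
assumptions made up for illustration; attribute_timestamp is the attribute
name from the question. The regex stands in for a real JSON parser.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: builds the Pub/Sub attribute map for one JSON record.
// "eventTimeMillis" is a hypothetical field name, not something from the thread.
final class TimestampAttributeSketch {
  // Attribute name taken from the question.
  static final String TIMESTAMP_ATTRIBUTE = "attribute_timestamp";

  // Matches "eventTimeMillis": 1646425485000 in a flat JSON object.
  // A real pipeline would use a JSON parser rather than a regex.
  private static final Pattern EVENT_TIME =
      Pattern.compile("\"eventTimeMillis\"\\s*:\\s*(\\d+)");

  // Returns the record's event time as milliseconds since the Unix epoch.
  static long extractEpochMillis(String json) {
    Matcher m = EVENT_TIME.matcher(json);
    if (!m.find()) {
      throw new IllegalArgumentException("no eventTimeMillis field in: " + json);
    }
    return Long.parseLong(m.group(1));
  }

  // Returns a single-entry attribute map holding the timestamp as a decimal string.
  static Map<String, String> attributesFor(String json) {
    return Collections.singletonMap(
        TIMESTAMP_ATTRIBUTE, Long.toString(extractEpochMillis(json)));
  }
}
```

The resulting map would be passed as the attributes argument of
new PubsubMessage(payload, attributes) before handing the messages to
PubsubIO.writeMessages(). A reader would then name the same attribute in
withTimestampAttribute.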



On Fri, Mar 4, 2022 at 1:22 PM Reuven Lax <re...@google.com> wrote:

> However, in order to get correct watermarks from PubsubIO, we currently only
> support extracting the timestamp from a Pub/Sub attribute. This timestamp
> needs to be set by the publisher of the message as a Pub/Sub attribute.

Re: [PubsubIO]question about withTimestampAttribute

Posted by Reuven Lax <re...@google.com>.
However, in order to get correct watermarks from PubsubIO, we currently only
support extracting the timestamp from a Pub/Sub attribute. This timestamp
needs to be set by the publisher of the message as a Pub/Sub attribute.
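
To illustrate what the publisher has to put in that attribute: the PubsubIO
Javadoc describes the accepted values as either milliseconds since the Unix
epoch or an RFC 3339 string. The sketch below interprets both forms with the
Java standard library. It is not Beam's actual parsing code, and the class and
method names are made up for illustration.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

// Sketch only: interprets a timestamp attribute value in either documented form.
final class TimestampAttributeParseSketch {
  static Instant parse(String value) {
    try {
      // First form: a decimal number of milliseconds since the Unix epoch.
      return Instant.ofEpochMilli(Long.parseLong(value));
    } catch (NumberFormatException notMillis) {
      // Second form: a date-time with offset, e.g. 2022-03-04T20:24:45Z.
      return OffsetDateTime.parse(value).toInstant();
    }
  }
}
```

A publisher that writes the value in one of these two forms should be
readable by a pipeline that names the same attribute in
withTimestampAttribute.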

On Fri, Mar 4, 2022 at 12:59 PM Brian Hulette <bh...@google.com> wrote:

> Hi Gaurav,
>
> The timestamp comes from the event time on the record. The Beam
> programming guide has details on how you can set this [1].
>
> Brian
>
> [1]
> https://beam.apache.org/documentation/programming-guide/#adding-timestamps-to-a-pcollections-elements

Re: [PubsubIO]question about withTimestampAttribute

Posted by Brian Hulette <bh...@google.com>.
Hi Gaurav,

The timestamp written to the attribute comes from the event time of each
record. The Beam programming guide has details on how you can set this [1].

Brian

[1]
https://beam.apache.org/documentation/programming-guide/#adding-timestamps-to-a-pcollections-elements
