Posted to dev@beam.apache.org by Sahil Modak <sm...@paloaltonetworks.com> on 2022/01/05 09:32:37 UTC

Re: Accessing side input in event_time timer callback function

Hi Luke,

Please find answers below:
1) Number of keys in main input: this is unbounded; currently we are
observing roughly 2 million keys
2) How often (if ever) does the side input change: this will stay constant
most of the time, and will change once a month in the worst case
3) How big the side input is: this is a file of size 3 KB

Regards,
Sahil

On Sat, Jan 1, 2022 at 2:14 AM Luke Cwik <lc...@google.com> wrote:

> This is a missing feature[1]. It looks like there was some partial
> progress over the past two years but it is still incomplete.
>
> Can you provide more details about:
> 1) Number of keys in main input
> 2) How often (if ever) does the side input change
> 3) How big the side input is
>
> 1: https://issues.apache.org/jira/browse/BEAM-6855
>
>
> On Thu, Dec 30, 2021 at 2:37 AM Sahil Modak <sm...@paloaltonetworks.com>
> wrote:
>
>> Hi,
>>
>> We are using Beam's EVENT_TIME based timers in a DoFn that operates on a
>> KV pair in a global window.
>>
>> We are also providing this DoFn with a side input; however, we are unable
>> to access this side input in the callback function provided for the
>> EVENT_TIME based timers.
>>
>> Is there a way to access this side input in the callback function of the
>> EVENT_TIME timers? If not, what would be the right way to do this?
>>
>> Thanks,
>> Sahil
>>
>>
>>

Re: Accessing side input in event_time timer callback function

Posted by Luke Cwik <lc...@google.com>.
You should see if it works with Dataflow Prime:
https://cloud.google.com/dataflow/docs/guides/enable-dataflow-prime
You'll want a recent SDK version (preferably the latest).

Otherwise, you could try writing the side input to a file in a shared place
like GCS, loading the data into memory once per worker from that file, and
having each worker occasionally re-read the file to pick up changes.
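
A minimal sketch of the "load once per worker, occasionally re-read" idea
follows. It is not taken from Beam: the class name RefreshingCache, the
loader, and the refresh interval are all hypothetical, and the loader stands
in for whatever code actually reads the file from the shared location. The
clock is injectable only so the refresh logic can be exercised without
waiting.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: caches one value per JVM and reloads it after a fixed interval. */
final class RefreshingCache<T> {
  private final Supplier<T> loader;   // assumption: reads and parses the shared file
  private final long refreshNanos;    // how long a loaded value is considered fresh
  private final LongSupplier clock;   // monotonic clock in nanoseconds
  private T value;
  private long loadedAt;
  private boolean loaded;

  RefreshingCache(Supplier<T> loader, Duration refreshInterval, LongSupplier clock) {
    this.loader = loader;
    this.refreshNanos = refreshInterval.toNanos();
    this.clock = clock;
  }

  RefreshingCache(Supplier<T> loader, Duration refreshInterval) {
    this(loader, refreshInterval, System::nanoTime);
  }

  /** Returns the cached value, reloading it first if it was never loaded or is stale. */
  synchronized T get() {
    long now = clock.getAsLong();
    if (!loaded || now - loadedAt >= refreshNanos) {
      value = loader.get();
      loadedAt = now;
      loaded = true;
    }
    return value;
  }
}
```

One way to use this would be to keep a single instance in a static field of
the DoFn class, so that every DoFn instance in the same worker JVM shares it,
and to call get() from both the element-processing method and the timer
callback. Since the data here is about 3 KB and changes roughly monthly, a
refresh interval of minutes or hours is probably sufficient; that value is a
guess, not a recommendation from the thread.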

On Wed, Jan 5, 2022 at 6:54 AM Sahil Modak <sm...@paloaltonetworks.com>
wrote:

> Also, to confirm: I am seeing the same exception as mentioned in the bug:
> java.lang.UnsupportedOperationException: Attempt to deliver a timer to a
> DoFn, but timers are not supported in Dataflow.
> As a result, the timers in my code are not expiring when they are supposed
> to.
>

Re: Accessing side input in event_time timer callback function

Posted by Sahil Modak <sm...@paloaltonetworks.com>.
Also, to confirm: I am seeing the same exception as mentioned in the bug:
java.lang.UnsupportedOperationException: Attempt to deliver a timer to a
DoFn, but timers are not supported in Dataflow.
As a result, the timers in my code are not expiring when they are supposed to.
