Posted to user@flink.apache.org by Jaswin Shah <ja...@outlook.com> on 2020/05/19 11:10:31 UTC

Heap out of memory issue on execution of flink job

Hi,

I am executing a Flink job on a cluster: I consume two data streams from two Kafka topics and then perform a simple interval join on them. I am not storing any state or large data/objects in memory myself, but I am hitting a heap space issue on every job execution. Can anyone please help me understand the probable cause?
These are the code snippets:
[inline images of the code snippets; not preserved in this plain-text archive]

OutOfMemoryError stack traces:

[inline images of the OutOfMemoryError stack traces; not preserved in this plain-text archive]

Any help would be highly appreciated.

Thanks,
Jaswin

Re: Heap out of memory issue on execution of flink job

Posted by Xintong Song <to...@gmail.com>.
Alternatively, you can try configuring 'taskmanager.memory.managed.size'
to 0, if you haven't already. This should leave the JVM a larger heap.
Please be aware that you should not do that with RocksDBStateBackend,
because RocksDB needs managed memory. Also, if the size of the state keeps
growing, you may still run into a heap OOM.
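Xintong's suggestion corresponds to a flink-conf.yaml entry along these lines (a sketch; the heap size shown is a placeholder to adjust for your setup):

```yaml
# Give managed memory back to the JVM heap. Only safe when NOT using
# RocksDBStateBackend, which allocates from managed memory.
taskmanager.memory.managed.size: 0m

# Optionally pin the task heap explicitly instead of letting Flink
# derive it from the total process size.
taskmanager.memory.task.heap.size: 2048m   # placeholder value
```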

Thank you~

Xintong Song



On Tue, May 19, 2020 at 9:30 PM Congxian Qiu <qc...@gmail.com> wrote:

> Hi
>
> Yes, Window Join will use state also. they will convert to window operator
> which uses state. you can ref the the code[1] for more information.
>
> [1]
> https://github.com/apache/flink/blob/cbd9ca0ca3c6d2ff1ae36a7c3aa8f88d41506929/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/JoinedStreams.java#L313
>
> Best,
> Congxian
>
>
> Ankit Singhal <an...@paytm.com> 于2020年5月19日周二 下午7:53写道:
>
>> Hi @Benchao Li <li...@gmail.com>
>>
>>
>>
>> Is the state usage only specific to Interval join ? If we use a simple
>> Window Join, the state is not used?
>>
>>
>>
>> Thanks,
>>
>> Ankit Singhal
>>
>>
>>
>> *From: *Benchao Li <li...@gmail.com>
>> *Date: *Tuesday, 19 May 2020 at 5:17 PM
>> *To: *Jaswin Shah <ja...@outlook.com>
>> *Cc: *"user@flink.apache.org" <us...@flink.apache.org>, Arvid Heise <
>> arvid@ververica.com>, Yun Tang <my...@live.com>, "
>> ankit.singhal@paytm.com" <an...@paytm.com>, "
>> isha.singhal@paytm.com" <is...@paytm.com>
>> *Subject: *Re: Heap out of memory issue on execution of flink job
>>
>>
>>
>> Hi Jaswin,
>>
>>
>>
>> The interval join operator will use state heavily, that's why you see
>> heap oom.
>>
>> Maybe you could consider rocksdb state backend if the data is too large
>> to reside in heap.
>>
>>
>>
>> Jaswin Shah <ja...@outlook.com> 于2020年5月19日周二 下午7:12写道:
>>
>> Hi,
>>
>>
>>
>> I am executing a flink job on cluster wherein I am consuming two data
>> streams from two kafka topics and then a simple interval join on two data
>> streams. I am not storing any states or some large data/objects in memory.
>> But, I am facing heap space issue on every job execution. Can anyone please
>> help me what could be the probable reason/issue for this to occur?
>>
>>  This are the code snippets:
>>
>>
>>
>>  Outofmemory errors:
>>
>>
>>
>>
>>
>>
>>
>> Any help would be highly appreciated.
>>
>>
>>
>> Thanks,
>>
>> Jaswin
>>
>>
>>
>>
>> --
>>
>> Benchao Li
>>
>> School of Electronics Engineering and Computer Science, Peking University
>>
>> Tel:+86-15650713730
>>
>> Email: libenchao@gmail.com; libenchao@pku.edu.cn
>>
>>

Re: Heap out of memory issue on execution of flink job

Posted by Congxian Qiu <qc...@gmail.com>.
Hi Isha

Is the length of the window 15 minutes? As for the timeout exception, my
gut feeling is that it may be caused by GC. You can enable GC logging and
check whether something is wrong (e.g., too-frequent GC).
PS: which state backend do you use? (Maybe you can try RocksDBStateBackend.)
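A hedged example of enabling GC logs on the TaskManagers via flink-conf.yaml (JDK 8 style flags, typical for Flink 1.10-era deployments; the log path is a placeholder):

```yaml
# JDK 8 GC-logging flags passed to every TaskManager JVM.
env.java.opts.taskmanager: "-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/taskmanager-gc.log"
```

Long or frequent full-GC entries in this log would support the theory that GC pauses, not the configuration as such, are behind the timeouts.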

Best,
Congxian


Isha Singhal <is...@paytm.com> wrote on Wed, May 20, 2020 at 9:18 PM:

> Hi Congxian,
>
> One more issue we are facing: we are getting this error every 15 minutes:
>
> *The heartbeat of TaskManager with id 9d2c5728f6f7876af3614518c1f01c2f
> timed out.*
>
> I think it is related to configuration, but I am not sure what would
> resolve it. Please suggest.
>
> Thanks,
>
> Isha Singhal
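On the heartbeat timeout Isha reports: if long GC pauses turn out to be the cause, the heartbeat messages themselves get delayed, and as a stopgap (not a fix for the underlying memory pressure) the timeout can be raised in flink-conf.yaml. The values below are illustrative:

```yaml
# Defaults are a 10 s interval and a 50 s timeout; raising the timeout
# only masks long GC pauses, it does not remove them.
heartbeat.interval: 10000    # ms between heartbeat requests
heartbeat.timeout: 180000    # ms before a TaskManager is declared lost
```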

Re: Heap out of memory issue on execution of flink job

Posted by Congxian Qiu <qc...@gmail.com>.
Hi

Yes, a window join uses state as well; it is translated into a window
operator, which uses state. You can refer to the code [1] for more
information.

[1]
https://github.com/apache/flink/blob/cbd9ca0ca3c6d2ff1ae36a7c3aa8f88d41506929/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/JoinedStreams.java#L313

Best,
Congxian



Re: Heap out of memory issue on execution of flink job

Posted by Ankit Singhal <an...@paytm.com>.
Hi @Benchao Li,

Is state usage specific to the interval join? If we use a simple window join, is state not used?

 

Thanks,

Ankit Singhal

 


Re: Heap out of memory issue on execution of flink job

Posted by Benchao Li <li...@gmail.com>.
Hi Jaswin,

The interval join operator uses state heavily; that's why you see the
heap OOM.
Maybe you could consider the RocksDB state backend if the data is too
large to reside in heap.
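To see why even a "simple" interval join needs state, here is a stdlib-only Python toy (this is not Flink code; the keys, timestamps, and values are made up for illustration): every record on both sides must be buffered per key so that records arriving later can still be matched within the interval. Those per-key buffers are what lives in Flink's keyed state, and they grow with the data.

```python
from collections import defaultdict

def interval_join(left, right, lower, upper):
    """Join (key, ts, value) records where a right record's timestamp
    falls in [left.ts + lower, left.ts + upper]. Both inputs are fully
    buffered by key -- this buffering is what Flink keeps in state."""
    left_buf = defaultdict(list)   # stands in for the left side's keyed state
    right_buf = defaultdict(list)  # stands in for the right side's keyed state
    for key, ts, val in left:
        left_buf[key].append((ts, val))
    for key, ts, val in right:
        right_buf[key].append((ts, val))
    out = []
    for key, lrecs in left_buf.items():
        for lts, lval in lrecs:
            for rts, rval in right_buf.get(key, []):
                if lts + lower <= rts <= lts + upper:
                    out.append((key, lval, rval))
    return out

left = [("order-1", 100, "payment"), ("order-2", 200, "payment")]
right = [("order-1", 105, "shipment"), ("order-2", 500, "shipment")]

# Only order-1's shipment falls within [ts-10, ts+10] of its payment.
print(interval_join(left, right, -10, 10))
# -> [('order-1', 'payment', 'shipment')]
```

In Flink the equivalent buffered state can be moved off the JVM heap by switching to the RocksDB state backend, which is the suggestion above.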



-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenchao@gmail.com; libenchao@pku.edu.cn
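For reference, switching to the RocksDB state backend suggested in this thread can be done in flink-conf.yaml along these lines (the checkpoint path is a placeholder):

```yaml
state.backend: rocksdb
state.checkpoints.dir: hdfs:///flink/checkpoints   # placeholder path
# RocksDB allocates from managed memory, so do NOT combine this with
# taskmanager.memory.managed.size: 0.
```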