Posted to user@flink.apache.org by Weihua Hu <hu...@gmail.com> on 2020/05/24 10:32:22 UTC

Single task backpressure problem with Credit-based Flow Control

Hi, all

I ran into a strange backpressure problem that affects a single task.

JobInfo:
    DAG: Source (1000) -> Map (2000) -> Sink (1000), with the operators linked via rescale.
    Flink version: 1.9.0

There is no related information in the jobmanager/taskmanager logs.

According to the metrics, the outPoolUsage of Map (242) is full, but the inPoolUsage of its downstream Sink (121) is 0.

After dumping the memory and analyzing it, I found:
Sink (121)'s RemoteInputChannel.unannouncedCredit = 0,
Map (242)'s CreditBasedSequenceNumberingViewReader.numCreditsAvailable = 0.
This is not consistent with my understanding of the Flink network transmission mechanism.

Can someone help me? Thanks a lot.


Best
Weihua Hu

Re: Single task backpressure problem with Credit-based Flow Control

Posted by Piotr Nowojski <pi...@ververica.com>.

Hi Weihua,

Good to hear that you have found the problem. Let us know if you run into any other problems.

Piotrek


Re: Single task backpressure problem with Credit-based Flow Control

Posted by Weihua Hu <hu...@gmail.com>.

Hi Piotrek,

Thanks for your suggestions. I found some network issues, which seem to be the cause of the backpressure.

Best
Weihua Hu


Re: Single task backpressure problem with Credit-based Flow Control

Posted by Piotr Nowojski <pi...@ververica.com>.

Hi Weihua,

> After dumping the memory and analyzing it, I found:
> Sink (121)'s RemoteInputChannel.unannouncedCredit = 0,
> Map (242)'s CreditBasedSequenceNumberingViewReader.numCreditsAvailable = 0.
> This is not consistent with my understanding of the Flink network transmission mechanism.

It probably is consistent. The downstream receiver has announced all of its credits (which is why unannouncedCredit is 0) and is simply waiting for the data to arrive, while the upstream sender is waiting for the data to be sent downstream.
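
One way to read the two counters from the heap dump, together with the Sink's inPoolUsage of 0, is as a bookkeeping exercise. The following is only a toy model and not Flink code: the class `ChannelSnapshot`, the field `queued_at_receiver`, the method `in_flight` and the value of 2 buffers per channel are assumptions made for illustration, and floating buffers are ignored. It assumes every receiver-side buffer of a channel is represented in exactly one place: as an unannounced credit, as a credit held by the sender, as data queued at the receiver, or as a message (credit announcement or data) still on the wire.

```python
from dataclasses import dataclass


@dataclass
class ChannelSnapshot:
    """Toy bookkeeping for one sender/receiver channel (hypothetical, not Flink's classes)."""

    total_buffers: int       # receiver-side buffers for this channel (assumed fixed)
    unannounced_credit: int  # receiver side: credits not yet announced to the sender
    credits_available: int   # sender side: credits it can still spend
    queued_at_receiver: int  # receiver side: buffers holding data not yet consumed

    def in_flight(self) -> int:
        """Buffers accounted for only by messages still on the wire."""
        accounted = (
            self.unannounced_credit
            + self.credits_available
            + self.queued_at_receiver
        )
        remaining = self.total_buffers - accounted
        if remaining < 0:
            raise ValueError("snapshot is inconsistent with total_buffers")
        return remaining


# The state reported in this thread: both counters are 0 and the Sink input is empty.
observed = ChannelSnapshot(
    total_buffers=2,  # assumed value, for illustration only
    unannounced_credit=0,
    credits_available=0,
    queued_at_receiver=0,
)
print(observed.in_flight())
```

Under these assumptions, everything that is not visible on either endpoint has to be in transit between the two tasks, which is why this state points at the data exchange rather than at the Sink itself. In the usual backpressure case the buffers would instead show up as data queued at the receiver.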

The stack trace you posted confirms that the sink has an empty input buffer - it’s waiting for input data. Assuming rescale partitioning works as expected and node 242 is indeed connected to node 121, this implies that the bottleneck is the data exchange between those two tasks. It could be:

- network bottleneck (slow network? packet loss?)
- machine swapping/long GC pauses (if the upstream node is experiencing long pauses, it might show up like this)
- CPU bottleneck in the network stack (frequent flushing? SSL?)
- some resource competition (parallelism too high for the given number of machines)
- netty threads not keeping up

It’s hard to say what the problem is without looking at the resource usage (CPU/Network/Memory/Disk IO), GC logs, and code profiling results.

Piotrek

PS Zhijiang:

RescalePartitioner in this case should connect just two upstream subtasks to one downstream sink. Upstream subtasks N and N+1 (for even N) should be connected to the sink with id N/2.
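
The mapping described here can be sketched as follows. This is a simplified illustration, not Flink's implementation: the function names `rescale_target` and `rescale_sources` are hypothetical, subtask indices are assumed to be 0-based, and only the case where the upstream parallelism is an integer multiple of the downstream parallelism is handled.

```python
from typing import List


def rescale_target(upstream_index: int, upstream_parallelism: int,
                   downstream_parallelism: int) -> int:
    """Return the downstream subtask that receives data from upstream_index."""
    if upstream_parallelism % downstream_parallelism != 0:
        raise ValueError("sketch only covers integer multiples")
    fan_in = upstream_parallelism // downstream_parallelism
    return upstream_index // fan_in


def rescale_sources(downstream_index: int, upstream_parallelism: int,
                    downstream_parallelism: int) -> List[int]:
    """Return the upstream subtasks that feed downstream_index."""
    if upstream_parallelism % downstream_parallelism != 0:
        raise ValueError("sketch only covers integer multiples")
    fan_in = upstream_parallelism // downstream_parallelism
    start = downstream_index * fan_in
    return list(range(start, start + fan_in))


# Map (2000) -> Sink (1000), as in this thread.
print(rescale_target(242, 2000, 1000))   # 121
print(rescale_sources(121, 2000, 1000))  # [242, 243]
```

With these numbers, Map subtasks 242 and 243 both feed Sink subtask 121, so it is worth checking whether Map (243) shows the same symptoms.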

> On 25 May 2020, at 04:39, Weihua Hu <hu...@gmail.com> wrote:
> 
> Hi, Zhijiang
> 
> I understand the normal credit-based backpressure mechanism. Usually the Sink's inPoolUsage would be full, and the task stack would also show some information.
> But this time it is different: the Sink's inPoolUsage is 0.
> I also checked the stacks. The Map is waiting in org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment.
> The Sink is waiting for data to process, which is not what I expected.
> 
> 
> <Pasted Graphic-2.tiff>
> 
> <Pasted Graphic-1.tiff>
> 
> 
> 
> Best
> Weihua Hu

Re: Single task backpressure problem with Credit-based Flow Control

Posted by Zhijiang <wa...@aliyun.com>.

Hi Weihua,

From the information you provided, this matches what is expected with credit-based flow control.

I guess one of the Sink subtasks is causing the backpressure, so you see that there are no available credits on the Sink side and
the outPoolUsage of Map is almost 100%. This reflects the credit-based state under backpressure.

If you want to analyze the root cause of the backpressure, you can trace the task stack of the respective Sink subtask to find which operation is expensive,
then try increasing the parallelism or improving the UDF (if it is the bottleneck). In addition, I am not sure why you chose rescale to shuffle data among the operators. The default
forward mode can give really good performance if you set the same parallelism for them.

Best,
Zhijiang