Posted to issues@flink.apache.org by "Roman Khachatryan (Jira)" <ji...@apache.org> on 2020/02/14 10:28:00 UTC
[jira] [Updated] (FLINK-16057) Performance regression in ContinuousFileReader

     [ https://issues.apache.org/jira/browse/FLINK-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Khachatryan updated FLINK-16057:
--------------------------------------
    Description: 
After switching to a single-threaded execution model, a performance regression of about 15-20% was expected (benchmarked in November).

After merging, it turned out to be about 50%.

 

 

One reason is that the chaining strategy isn't set by default in the ContinuousFileReaderOperator (CFRO) factory.

Without this, even reading and outputting all records of a split in a single mail action doesn't fully reverse the regression (it recovers only about half).

However, setting the strategy AND enabling batching fixes the regression (starting from batch size 6).

 

Batching can't be used in practice, though, because it can significantly delay checkpointing.
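
To illustrate why a fixed batch can delay checkpointing, here is a minimal sketch. It assumes a simplified model in which one mail action emits exactly one batch and pending mail (such as a checkpoint trigger) is only handled between batches. The class and method names are hypothetical and are not part of Flink.

```java
/** Hypothetical model of fixed-size batching; not Flink code. */
final class BatchDelaySketch {

    /**
     * Simulates a reader that emits records in fixed-size batches and checks for
     * mail only at batch boundaries. Returns how many records are emitted between
     * the arrival of a mail and the moment it can be handled.
     *
     * Assumes 1 <= mailArrivesAfter <= totalRecords.
     */
    static int recordsEmittedWhileMailWaits(int batchSize, int totalRecords, int mailArrivesAfter) {
        int emitted = 0;
        int waited = 0;
        boolean mailPending = false;
        while (emitted < totalRecords) {
            // One mail action: emit up to batchSize records without looking at the mailbox.
            for (int i = 0; i < batchSize && emitted < totalRecords; i++) {
                emitted++;
                if (mailPending) {
                    waited++; // this record was emitted while the mail was already waiting
                }
                if (emitted == mailArrivesAfter) {
                    mailPending = true;
                }
            }
            if (mailPending) {
                return waited; // the mail is handled only here, at the batch boundary
            }
        }
        return waited;
    }
}
```

In this model, a mail that arrives right after the first record of a batch waits for batchSize - 1 further records, so the worst-case delay grows linearly with the batch size.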

Another approach would be to process one record and then repeat until defaultMailboxActionAvailable OR haveNewMail.

This reverses the regression and even improves performance by about 50% compared to the old version.
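
The "process one record, then repeat until the default action is available or new mail arrives" loop could look roughly like the sketch below. It is a self-contained model, not the actual CFRO or mailbox implementation: the class YieldingReaderSketch, its methods, and the single-threaded ArrayDeque mailbox are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical stand-in for a mailbox-driven split reader; not Flink code. */
final class YieldingReaderSketch {
    // Pending control actions (e.g. a checkpoint trigger). Single-threaded model,
    // so a plain ArrayDeque is enough here.
    private final Queue<Runnable> mailbox = new ArrayDeque<>();
    // Reports whether the task's default action (processing new input) has work to do.
    private final BooleanSupplier defaultActionAvailable;

    YieldingReaderSketch(BooleanSupplier defaultActionAvailable) {
        this.defaultActionAvailable = defaultActionAvailable;
    }

    void sendMail(Runnable mail) {
        mailbox.add(mail);
    }

    /**
     * One mail action: always emits at least one record if the split has any, then
     * keeps going until the split ends, the default action becomes available, or
     * new mail arrives. Returns the number of records emitted.
     */
    <T> int processRecords(Iterator<T> split, Consumer<T> output) {
        int emitted = 0;
        while (split.hasNext()) {
            output.accept(split.next());
            emitted++;
            if (defaultActionAvailable.getAsBoolean() || !mailbox.isEmpty()) {
                break; // yield so that other work is not delayed
            }
        }
        return emitted;
    }

    /** Runs all pending mail and returns how many mails were executed. */
    int drainMailbox() {
        int executed = 0;
        Runnable mail;
        while ((mail = mailbox.poll()) != null) {
            mail.run();
            executed++;
        }
        return executed;
    }
}
```

Unlike a fixed batch, this loop has no upper bound on how many records it emits while nothing else is waiting, yet a pending mail is delayed by at most the one record that is in flight when it arrives.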

 

Other things tried (didn't help):
 * CFRO rework without subsequent commits (removing checkpoint lock)
 * different batch sizes, including the whole split, without chaining strategy fixed - partial improvement only
 * disabling close
 * disabling checkpointing
 * disabling output (serialization)
 * using LinkedList instead of PriorityQueue

 

> Performance regression in ContinuousFileReader
> ----------------------------------------------
>
>                 Key: FLINK-16057
>                 URL: https://issues.apache.org/jira/browse/FLINK-16057
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.11.0
>            Reporter: Roman Khachatryan
>            Priority: Blocker



--
This message was sent by Atlassian Jira
(v8.3.4#803005)