Posted to user@flink.apache.org by MBilal <mb...@gmail.com> on 2019/03/13 09:46:52 UTC

Custom Partitioner and Graph Algorithms

Hi,

I am observing behaviour in the task statistics that I don't fully understand.
Essentially, I have created a partitioner that assigns all the edges to a single partition.
I see an imbalance (in terms of records sent/received) in the task statistics across different instances of the same operator for the second and third stages.
But from the fourth stage onwards, all operator instances process pretty much the same number of records. I would have expected the imbalance to persist in those stages as well.
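
For readers who want something concrete, a partitioner of the kind described above can be sketched roughly as follows. This is not the code from the linked question. The class names SinglePartitionPartitioner and PartitionSkewDemo, the sample vertex IDs, and the parallelism of 4 are all hypothetical. The local Partitioner interface is only a stand-in with the same method shape as Flink's org.apache.flink.api.common.functions.Partitioner<K>, so that the sketch compiles without Flink on the classpath.

```java
import java.util.Arrays;

// Stand-in with the same method shape as Flink's
// org.apache.flink.api.common.functions.Partitioner<K> (assumption: declared
// locally only so this sketch compiles without Flink on the classpath).
interface Partitioner<K> {
    int partition(K key, int numPartitions);
}

// Hypothetical partitioner: sends every key to partition 0,
// regardless of the key or of how many partitions exist.
class SinglePartitionPartitioner implements Partitioner<Long> {
    @Override
    public int partition(Long key, int numPartitions) {
        return 0;
    }
}

class PartitionSkewDemo {
    // Counts how many keys land in each of numPartitions partitions.
    static long[] countPerPartition(Partitioner<Long> partitioner,
                                    long[] keys,
                                    int numPartitions) {
        long[] counts = new long[numPartitions];
        for (long key : keys) {
            counts[partitioner.partition(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up source vertex IDs of six edges, and a made-up parallelism of 4.
        long[] sourceVertexIds = {1L, 2L, 3L, 4L, 5L, 6L};
        long[] counts = countPerPartition(
                new SinglePartitionPartitioner(), sourceVertexIds, 4);
        System.out.println(Arrays.toString(counts)); // prints [6, 0, 0, 0]
    }
}
```

In an actual Flink DataSet job, a class like this would implement Flink's own Partitioner interface and be passed to partitionCustom on the edge data set. The counts printed here only illustrate the skew at the exchange where the partitioner is applied.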

Details of my code and the task statistics are in this Stack Overflow question:
https://stackoverflow.com/questions/55138553/behaviour-of-custom-partitioner-in-apache-flink

Thanks. 

- Bilal


Re: Custom Partitioner and Graph Algorithms

Posted by MBilal <mb...@gmail.com>.
I have added a working code example to the Stack Overflow question that is representative of what I am using. The GitHub repo can be found here: https://github.com/MBtech/graphtest
