Posted to user@flink.apache.org by Folani <ha...@irisa.fr> on 2018/10/15 14:26:57 UTC

The best way to get processing time of each operator?

I'm going to study the effect of parallelism for different operators on
heterogeneous machines.
I need to know the processing time of each operator instance as well as the
overall processing time of all instances of each specific operator.
I think there are several ways to do this.
However, what is the best way to get these times as precisely as possible?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: The best way to get processing time of each operator?

Posted by Kostas Kloudas <k....@data-artisans.com>.
Hi Folani,

Metrics are definitely one way. Another option, depending on your job: if you
have e.g. ProcessFunctions, you can attach timestamps to the records
(depending on what you want to measure) and do the computations you need
based on them. For example, this lets you compute the per-record latency.
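
A minimal sketch of the timestamp idea above, in plain Java with no Flink
dependency. The names RecordLatency, Stamped, stamp and latencyNanos are
hypothetical and are not part of any Flink API; in a real job the stamping and
the readout would presumably live in the user functions on either side of the
section being measured.

```java
import java.util.function.LongSupplier;

/** Plain-Java sketch (no Flink dependency); all names here are hypothetical. */
final class RecordLatency {

    /** A value plus the clock reading taken when it entered the measured section. */
    static final class Stamped<T> {
        final T value;
        final long stampNanos;

        Stamped(T value, long stampNanos) {
            this.value = value;
            this.stampNanos = stampNanos;
        }
    }

    // Injectable clock so the sketch can be exercised with a fake time source.
    private final LongSupplier clock;

    RecordLatency(LongSupplier clock) {
        this.clock = clock;
    }

    /** Upstream side: attach the current clock reading to the record. */
    <T> Stamped<T> stamp(T value) {
        return new Stamped<>(value, clock.getAsLong());
    }

    /** Downstream side: elapsed nanoseconds since the record was stamped. */
    long latencyNanos(Stamped<?> record) {
        return clock.getAsLong() - record.stampNanos;
    }

    public static void main(String[] args) {
        RecordLatency timer = new RecordLatency(System::nanoTime);
        Stamped<String> record = timer.stamp("event-1");
        String upper = record.value.toUpperCase(); // stand-in for the real per-record work
        System.out.println(upper + " took " + timer.latencyNanos(record) + " ns");
    }
}
```

Note that System.nanoTime() readings are only comparable within a single JVM.
If the stamp and the readout happen on different machines, a wall-clock source
such as System.currentTimeMillis() is needed instead, and the result is then
only as precise as the clock synchronisation between those machines.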

Now, for the overall latency of an operator (all tasks), you have to be
more creative, but I am not sure what the value of measuring it is, since in streaming,
more often than not, you are dealing with infinite streams of incoming data.

Cheers,
Kostas

> On Oct 16, 2018, at 3:11 AM, Hequn Cheng <ch...@gmail.com> wrote:
> 
> Hi Folani,
> 
> One option is to do this through metrics [1].
> Each operator can report its processing time as a metric. These metrics can be gathered and queried later. For example, you can get a metric for a specific TaskManager or get the max/min/avg value across all TaskManagers.
> 
> Best, Hequn
> 
> [1] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html


Re: The best way to get processing time of each operator?

Posted by Hequn Cheng <ch...@gmail.com>.
Hi Folani,

One option is to do this through metrics [1].
Each operator can report its processing time as a metric. These metrics
can be gathered and queried later. For example, you can get a metric for a
specific TaskManager or get the max/min/avg value across all TaskManagers.
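
To make the idea concrete, here is a rough plain-Java sketch of the
bookkeeping involved, with no Flink dependency. OperatorTiming, TimedInstance
and minMaxAvg are hypothetical names, not Flink classes; in an actual job the
per-instance value would be reported through the metrics system described
in [1] rather than kept in a local object like this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Plain-Java sketch (no Flink dependency); all names here are hypothetical. */
final class OperatorTiming {

    /** Stands in for one parallel instance of an operator and times each call. */
    static final class TimedInstance<I, O> {
        private final Function<I, O> fn;
        private final LongSupplier clock; // injectable so tests can use a fake clock
        private long count;
        private long totalNanos;

        TimedInstance(Function<I, O> fn, LongSupplier clock) {
            this.fn = fn;
            this.clock = clock;
        }

        /** Applies the wrapped function and adds the elapsed time to the running total. */
        O process(I input) {
            long start = clock.getAsLong();
            O out = fn.apply(input);
            totalNanos += clock.getAsLong() - start;
            count++;
            return out;
        }

        /** Mean processing time per record in nanoseconds, or 0 if nothing was processed. */
        double meanNanos() {
            return count == 0 ? 0.0 : (double) totalNanos / count;
        }
    }

    /** Returns {min, max, avg} of the per-instance means for one operator. */
    static double[] minMaxAvg(List<? extends TimedInstance<?, ?>> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("need at least one instance");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        for (TimedInstance<?, ?> instance : instances) {
            double mean = instance.meanNanos();
            min = Math.min(min, mean);
            max = Math.max(max, mean);
            sum += mean;
        }
        return new double[] {min, max, sum / instances.size()};
    }

    public static void main(String[] args) {
        TimedInstance<String, Integer> a = new TimedInstance<>(String::length, System::nanoTime);
        TimedInstance<String, Integer> b = new TimedInstance<>(String::length, System::nanoTime);
        for (String s : Arrays.asList("flink", "operator", "parallelism")) {
            a.process(s);
            b.process(s);
        }
        System.out.println(Arrays.toString(minMaxAvg(Arrays.asList(a, b))));
    }
}
```

The average here is an unweighted mean of the per-instance means, so an
instance that processed few records counts as much as a busy one. Weighting
by record count would be a reasonable alternative when the load is skewed.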

Best, Hequn

[1] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html

