Posted to user@flink.apache.org by M Singh <ma...@yahoo.com> on 2019/06/27 15:36:22 UTC

Apache Flink - Are counters reliable and accurate ?

Hi:
I need to collect application metrics that are counts of certain events per unit of time (e.g. per minute). There are two ways of doing this:
1. Create separate streams explicitly in the application (using split streams etc.), then aggregate the counts in a window and save them. This mixes metrics collection with application logic and makes the application logic complex.
2. Use the Flink metrics framework (counter, gauge, etc.) to save metrics.
I have a very small test with 2 events, but when I run the application the counters are not getting saved (they show a value of 0) even though that part of the code is being executed. I do see the numRecordsIn counters being updated from the source operator. I've also tried incrementing the count by 10 (instead of 1) every time the function gets executed, but the counts still remain 0.
Here is a snippet of the code:

dataStream.map(new RichMapFunction<String, String>() {
    protected Counter counter;

    @Override
    public void open(Configuration parameters) {
        // Register the counter under the user-defined group test.split
        counter = getRuntimeContext().getMetricGroup().addGroup("test", "split").counter("success");
    }

    @Override
    public String map(String value) throws Exception {
        counter.inc();
        return value;
    }
});

As I mentioned, the success metric does show up, but its value is always 0 even though the above map function was executed.
My questions are:
1. Are there any issues with counters being approximate?
2. If I want to collect accurate counts, is it recommended to use counters, or should I do it explicitly (which makes the code too complex)?
3. Do counters participate in Flink's failure/checkpointing/recovery?
4. Is there a better way of collecting application metric counts?
Thanks
Mans
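
For the first option in the question (counting events per minute in a window), the bucketing itself can be sketched in plain Java without any Flink dependency. This is only an illustrative model, not Flink's windowing API: the class MinuteCounts, the method countPerMinute, and the sample timestamps are hypothetical names and values chosen for the example. Each event timestamp is assigned to the one-minute tumbling window that contains it.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a one-minute tumbling-window count (hypothetical helper, not Flink API).
final class MinuteCounts {
    private static final long WINDOW_MS = 60_000L;

    // Returns a map from window start (epoch millis) to the number of events in that window.
    static Map<Long, Long> countPerMinute(long[] eventTimestampsMs) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long ts : eventTimestampsMs) {
            // Round the timestamp down to the start of its one-minute window.
            long windowStart = ts - Math.floorMod(ts, WINDOW_MS);
            counts.merge(windowStart, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample event times in milliseconds.
        long[] events = {5_000L, 59_999L, 60_000L, 125_000L};
        System.out.println(countPerMinute(events)); // prints {0=2, 60000=1, 120000=1}
    }
}
```

In a real job this grouping would normally be delegated to a window operator rather than written by hand, which is the extra application logic the question is trying to avoid.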

Re: Apache Flink - Are counters reliable and accurate ?

Posted by Chesnay Schepler <ch...@apache.org>.
So here's the thing: Metrics are accurate, so long as the job is 
running. Once the job terminates metrics are cleaned up and not 
persisted anywhere, with the exception of a few metrics (like numRecordsIn).

Another thing that is always worth doing is to enable DEBUG 
logging and re-run your test.

On 27/06/2019 22:41, M Singh wrote:
> Hi Chesnay:
>
> Thanks for your response.
>
> My job runs for a few minutes and I've tried setting the reporter 
> interval to 1 second.
>
> I will try the counter on a longer running job.
>
> Thanks again.


Re: Apache Flink - Are counters reliable and accurate ?

Posted by M Singh <ma...@yahoo.com>.
Hi Chesnay:
Thanks for your response.
My job runs for a few minutes and I've tried setting the reporter interval to 1 second.
I will try the counter on a longer-running job.
Thanks again.
On Thursday, June 27, 2019, 11:46:17 AM EDT, Chesnay Schepler <ch...@apache.org> wrote:

> 1) None that I'm aware of.
> 2) You should use counters.
> 3) No, counters are not checkpointed, but you could store the value in state yourself.
> 4) None that I'm aware of that doesn't require modifications to the application logic.
>
> How long does your job run for, and how do you access metrics?
 

Re: Apache Flink - Are counters reliable and accurate ?

Posted by Chesnay Schepler <ch...@apache.org>.
1) None that I'm aware of.
2) You should use counters.
3) No, counters are not checkpointed, but you could store the value in 
state yourself.
4) None that I'm aware of that doesn't require modifications to the 
application logic.

How long does your job run for, and how do you access metrics?
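
The suggestion in answer 3 (counters are not checkpointed, but the value can be kept in state) can be illustrated with a small plain-Java model. This is a sketch under assumptions, not Flink code: RestorableCounter and its snapshot/restore methods are hypothetical stand-ins for a metric counter plus user-managed state. In an actual job the snapshot and restore steps would presumably live in the function's checkpointing hooks (for example Flink's CheckpointedFunction interface), and the restored value would be used to re-seed the freshly registered counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Plain-Java model only: stands in for a metric counter whose value is also kept in state.
final class RestorableCounter {
    private final AtomicLong count = new AtomicLong();

    void inc() {
        count.incrementAndGet();
    }

    long getCount() {
        return count.get();
    }

    // The value a checkpoint would persist.
    long snapshot() {
        return count.get();
    }

    // After a restart a new counter begins at 0, so it is re-seeded from the persisted value.
    static RestorableCounter restore(long snapshotValue) {
        RestorableCounter restored = new RestorableCounter();
        restored.count.set(snapshotValue);
        return restored;
    }

    public static void main(String[] args) {
        RestorableCounter counter = new RestorableCounter();
        counter.inc();
        counter.inc();
        counter.inc();
        long checkpointed = counter.snapshot(); // 3 events counted at "checkpoint" time

        counter.inc(); // counted after the snapshot, so not part of it

        // Simulated failure: the in-memory counter is gone, only the snapshot survives.
        RestorableCounter recovered = RestorableCounter.restore(checkpointed);
        System.out.println(recovered.getCount()); // prints 3
    }
}
```

Increments made after the last snapshot are not part of the restored value; they would only be counted again if the corresponding records are reprocessed after recovery.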
