
Posted to user@flink.apache.org by Navneeth Krishnan <re...@gmail.com> on 2017/12/10 17:51:54 UTC

Custom Metrics

Hi,

I have a streaming pipeline running on Flink and I need to collect metrics
to identify how my algorithm is performing. The entire pipeline is
multi-tenant and I also need metrics per tenant. Let's say there would be
around 20 metrics to be captured per tenant. I have the following ideas for
implementation, but any suggestions on which one might be better would help.

1. Use a Flink metric group and register a group per tenant at the operator
level. The disadvantage of this approach for me is that I need the
RuntimeContext to register a metric, and I have various subclasses to which
I would need to pass this object to keep the metric scope within the
operator. Also, too many metrics will be reported if there is a high
number of subtasks.
How is everyone accessing Flink state/metrics from other classes where you
don't have access to the RuntimeContext?

2. Use a custom singleton metric registry to collect these metrics and send
them using a custom sink. Instead of using a Flink metric group to collect
metrics per operator subtask, collect per JVM and use an InfluxDB sink to
send the metric data. What I'm not sure about in this case is how to collect
only once per node/JVM.

Thanks a bunch in advance.

Re: Custom Metrics

Posted by Piotr Nowojski <pi...@data-artisans.com>.

Hi,

> I have a couple more questions related to metrics. I use the InfluxDB reporter to report Flink metrics and I see that a lot of metrics are being reported. Is there a way to select only the subset of metrics that we need to monitor the application?

At this point it is up to either the reporter or the system to which the metrics are reported. You would need to extend the InfluxDB reporter to add some configuration options to ignore some metrics.
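
A minimal sketch of the kind of filter such an extended reporter could consult before sending a metric. `MetricNameFilter` is a hypothetical helper, not an existing Flink or InfluxDB reporter option, and the identifiers in the usage note are made-up examples of fully scoped metric names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical allow-list: a customized reporter would call test(identifier)
// for each metric and skip the ones that are rejected.
final class MetricNameFilter implements Predicate<String> {
    private final List<String> allowedFragments;

    MetricNameFilter(List<String> allowedFragments) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.allowedFragments = new ArrayList<>(allowedFragments);
    }

    /** Keeps a metric if its full identifier contains any allowed fragment. */
    @Override
    public boolean test(String metricIdentifier) {
        for (String fragment : allowedFragments) {
            if (metricIdentifier.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

With the fragments ".Users." and "numRecordsIn", an identifier such as "host1.taskmanager.tm1.myJob.myOperator.0.Users.user42" would be kept, while "host1.taskmanager.tm1.Status.JVM.CPU.Load" would be dropped. Where the fragments come from (for example, the reporter's configuration) is left open here.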

> Also, is there a way to specify a custom metrics scope? Basically I register metrics like below: add a custom metric group and then add a meter per user. I would like this to be reported as measurement "Users" with the user id as a tag. This way I can easily visualize the data in Grafana or any other tool by selecting the measurement and grouping by tag. Is there a way to report like that instead of host, process_type, tm_id, job_name, task_name & subtask_index?


Can't you ignore the first couple of groups/scopes in Grafana? I think you can also add more groups in the user scope:

metricGroup.addGroup("Users").addGroup("Foo").addGroup("Bar")

Piotrek


Re: Custom Metrics

Posted by Navneeth Krishnan <re...@gmail.com>.

Thanks Piotr.

I have a couple more questions related to metrics. I use the InfluxDB
reporter to report Flink metrics and I see that a lot of metrics are being
reported. Is there a way to select only the subset of metrics that we need
to monitor the application?

Also, is there a way to specify a custom metrics scope? Basically I register
metrics like below: add a custom metric group and then add a meter per
user. I would like this to be reported as measurement "Users" with the user
id as a tag. This way I can easily visualize the data in Grafana or any
other tool by selecting the measurement and grouping by tag. Is there a way
to report like that instead of host, process_type, tm_id, job_name,
task_name & subtask_index?

metricGroup.addGroup("Users")
        .meter(userId, new DropwizardMeterWrapper(new com.codahale.metrics.Meter()));

Thanks a bunch.


Re: Custom Metrics

Posted by Piotr Nowojski <pi...@data-artisans.com>.

Hi,

Reporting once every 10 seconds shouldn’t create problems. Best to try it out. Let us know if you run into any trouble :)

Piotrek


Re: Custom Metrics

Posted by Navneeth Krishnan <re...@gmail.com>.

Thanks Piotr.

Yes, passing the metric group should be sufficient. The subcomponents will
not be able to provide the list of metrics to register, since the metrics
are created per tenant based on incoming data. Also, I am planning to have
the metrics reported every 10 seconds and hope that won't be a problem.
We use InfluxDB and Grafana to plot the metrics.

Option 2 that I had in mind was to collect all metrics and use an InfluxDB
sink to report them directly inside the pipeline. But it seems reporting
per node might not be possible.
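
A rough sketch of passing only the metric group to a helper class that registers a tenant's metric the first time data for that tenant arrives. To keep it runnable without Flink on the classpath, `SimpleCounter` and `SimpleMetricGroup` are hypothetical stand-ins for Flink's `Counter` and `MetricGroup` (only `addGroup(String)` and `counter(String)` are mirrored). `TenantMetrics`, `InMemoryGroup`, and the group name "Tenants" are also assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-ins for org.apache.flink.metrics.Counter and MetricGroup (assumption).
interface SimpleCounter {
    void inc();
    long getCount();
}

interface SimpleMetricGroup {
    SimpleMetricGroup addGroup(String name);
    SimpleCounter counter(String name);
}

// Hypothetical helper: it receives the operator's group, never the RuntimeContext.
// Assumes each operator subtask owns its own instance, so a plain HashMap is used.
final class TenantMetrics {
    private final SimpleMetricGroup tenantsGroup;
    private final Map<String, SimpleCounter> eventsByTenant = new HashMap<>();

    TenantMetrics(SimpleMetricGroup operatorGroup) {
        this.tenantsGroup = operatorGroup.addGroup("Tenants");
    }

    /** Registers the tenant's counter on first sight, then reuses it. */
    void recordEvent(String tenantId) {
        eventsByTenant
                .computeIfAbsent(tenantId, id -> tenantsGroup.addGroup(id).counter("events"))
                .inc();
    }
}

// In-memory group used only to exercise the sketch; it records full metric names.
final class InMemoryGroup implements SimpleMetricGroup {
    final Map<String, SimpleCounter> registered; // shared by all child groups
    private final String prefix;

    InMemoryGroup() {
        this("", new TreeMap<>());
    }

    private InMemoryGroup(String prefix, Map<String, SimpleCounter> registered) {
        this.prefix = prefix;
        this.registered = registered;
    }

    @Override
    public SimpleMetricGroup addGroup(String name) {
        return new InMemoryGroup(prefix + name + ".", registered);
    }

    @Override
    public SimpleCounter counter(String name) {
        AtomicLong value = new AtomicLong();
        SimpleCounter counter = new SimpleCounter() {
            @Override
            public void inc() {
                value.incrementAndGet();
            }

            @Override
            public long getCount() {
                return value.get();
            }
        };
        registered.put(prefix + name, counter);
        return counter;
    }
}
```

In a real job the constructor argument would presumably be the operator's own metric group, obtained once where the RuntimeContext is available, and the helper could then be handed to any subcomponent. Calling `recordEvent` twice for one tenant registers a single counter and increments it twice.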



Re: Custom Metrics

Posted by Piotr Nowojski <pi...@data-artisans.com>.

Hi,

I’m not sure if I completely understand your issue.

1.
- You don’t have to pass the RuntimeContext; you can always pass just the MetricGroup, or ask your components/subclasses “what metrics do you want to register” and register them at the top level.
- Reporting tens/hundreds/thousands of metrics shouldn’t be an issue for Flink, as long as you have a reasonable reporting interval. However, keep in mind that Flink only reports your metrics, and you still need something to read/handle/process/aggregate them.
2.
I don’t think that reporting per node/JVM is possible with Flink’s metric system. For that you would need some other solution, such as reporting your metrics using JMX (directly registering MBeans from your code).

Piotrek
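
A minimal sketch of the JMX route from point 2, using only the JDK's `javax.management` API: one MBean per tenant, shared by everything that runs in the same JVM. The class `TenantJmxMetrics`, the attribute `ProcessedCount`, and the domain "com.example.metrics" are hypothetical names chosen for illustration; nothing here is part of Flink.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper: a JVM-wide registry with one MBean per tenant.
final class TenantJmxMetrics {

    /** Standard MBean contract: the getter is exposed as attribute "ProcessedCount". */
    public interface TenantStatsMBean {
        long getProcessedCount();
    }

    /** Thread-safe, since several subtasks in one JVM may update the same tenant. */
    public static final class TenantStats implements TenantStatsMBean {
        private final AtomicLong processed = new AtomicLong();

        public void recordProcessed() {
            processed.incrementAndGet();
        }

        @Override
        public long getProcessedCount() {
            return processed.get();
        }
    }

    private static final ConcurrentHashMap<String, TenantStats> STATS = new ConcurrentHashMap<>();

    private TenantJmxMetrics() {}

    /** Builds the JMX name; the tenant id is quoted so special characters are safe. */
    static ObjectName objectNameFor(String tenantId) {
        try {
            return new ObjectName(
                    "com.example.metrics:type=TenantStats,tenant=" + ObjectName.quote(tenantId));
        } catch (JMException e) {
            throw new IllegalArgumentException("Invalid tenant id: " + tenantId, e);
        }
    }

    /** Returns the tenant's stats, registering the MBean the first time the tenant is seen. */
    static TenantStats forTenant(String tenantId) {
        return STATS.computeIfAbsent(tenantId, id -> {
            TenantStats stats = new TenantStats();
            try {
                ManagementFactory.getPlatformMBeanServer().registerMBean(stats, objectNameFor(id));
            } catch (JMException e) {
                throw new IllegalStateException("Could not register MBean for tenant " + id, e);
            }
            return stats;
        });
    }

    /** Reads the attribute back through the MBean server, as a JMX client would. */
    static long readProcessedCount(String tenantId) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(objectNameFor(tenantId), "ProcessedCount");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read MBean for tenant " + tenantId, e);
        }
    }
}
```

An operator would call `TenantJmxMetrics.forTenant(tenantId).recordProcessed()` per record. The sketch leaves out unregistering the MBeans; a restarted job in the same JVM could find the names already taken, so real code would likely need cleanup on close.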
