Posted to user-zh@flink.apache.org by Yang Wang <wa...@apache.org> on 2023/04/11 10:04:15 UTC

Re: Prometheus monitoring Flink frequently OOMs

You can configure metric_relabel_configs[1] on the Prometheus server to control which metrics are collected.

[1].
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs
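
As a rough sketch of what such a rule could look like: the small Python helper below builds a "keep" rule from a list of metric names and lets you check which names would survive it. The metric names are only illustrative placeholders (replace them with whatever your Grafana dashboards actually query), and the helper functions are hypothetical, not part of Prometheus or Flink. The rendered metric_relabel_configs fragment is meant to be placed under the relevant job in scrape_configs. Prometheus anchors relabel regexes at both ends, which re.fullmatch imitates here.

```python
import re

# Assumption: illustrative metric names only. Replace with the names
# your own Grafana dashboards query.
DASHBOARD_METRICS = [
    "flink_jobmanager_job_uptime",
    "flink_taskmanager_job_task_operator_numRecordsIn",
    "flink_taskmanager_Status_JVM_Memory_Heap_Used",
]


def keep_regex(names):
    """Join metric names into a single alternation pattern."""
    return "|".join(re.escape(n) for n in sorted(set(names)))


def relabel_rule(names):
    """Render a metric_relabel_configs entry that keeps only the listed metrics."""
    return (
        "metric_relabel_configs:\n"
        "  - source_labels: [__name__]\n"
        f"    regex: '{keep_regex(names)}'\n"
        "    action: keep\n"
    )


def is_kept(metric_name, names):
    """Check a name against the rule; fullmatch mimics Prometheus's anchored regex."""
    return re.fullmatch(keep_regex(names), metric_name) is not None


if __name__ == "__main__":
    # Print the fragment to paste under the scrape job in prometheus.yml.
    print(relabel_rule(DASHBOARD_METRICS))
```

With action: keep, every series whose name does not match the regex is dropped after the scrape and before it is stored, so it should be worth checking the list against your dashboards with is_kept before rolling it out.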

Best,
Yang

casel.chen <ca...@126.com> wrote on Wed, Mar 22, 2023 at 13:47:

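> Correction: we have changed the way we monitor Flink from the pushgateway
> approach to having Prometheus scrape directly at a fixed interval, set to
> 1 minute. Previously the pushgateway approach kept bringing down the
> pushgateway, and now with the pull approach Prometheus still gets brought
> down. Are there any parameters that can be configured on the Flink job
> side? On the Prometheus side, can it be configured to scrape only the
> metrics that the Grafana dashboards need?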
>
> At 2023-03-22 12:08:29, "casel.chen" <ca...@126.com> wrote:
>
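> >We report metrics to Prometheus through the pushgateway, with a reporting
> >interval of 30 seconds. The whole real-time platform has more than 200
> >jobs, and a 50GB Prometheus instance still cannot cope. Metrics are
> >retained for 1 day, and we configured them to stay in memory for 2 hours
> >before being written to disk. The largest single metric already has
> >370,000 entries. Is there any way to solve this? Can we choose which
> >metrics get reported?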
>