Posted to dev@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/05/06 16:57:29 UTC

[GitHub] [pulsar-helm-chart] ckdarby opened a new issue #9: Grafana dashboards not entirely working

ckdarby opened a new issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9


   **Describe the bug**
   Today @nickelozz and I created a fresh Pulsar install and noticed that several Grafana dashboards are broken.
   
   Broken dashboards:
   Proxy Metrics: Empty
   Node Metrics: Empty
   Overview: Storage/Backlog -> only queries returning single series or tables are supported; Nodes section -> no data
   Pulsar Logs: Empty
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Fresh Helm install from this GitHub repo
   2. Establish a consumer & producer through the proxy
   3. Open up Grafana
   4. Go to the listed dashboards
   
   **Expected behavior**
   Dashboards load the correct data, and the Helm chart either includes an agent for the missing data or provides instructions for installing one where required (for example, for the node metrics).
   
   **Screenshots**
   ![Selection_235](https://user-images.githubusercontent.com/220283/81204918-f2872c00-8f97-11ea-929e-b1dddb1d678d.png)
   ![Selection_236](https://user-images.githubusercontent.com/220283/81205034-219d9d80-8f98-11ea-8f8f-96f1daed1776.png)
   ![Selection_237](https://user-images.githubusercontent.com/220283/81205236-6cb7b080-8f98-11ea-83bf-91293487c130.png)
   ![Selection_238](https://user-images.githubusercontent.com/220283/81205248-704b3780-8f98-11ea-865c-b2d5e4cea0bb.png)
   
   **Additional context**
   We're using the latest commit of this Helm chart, 06652d7.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar-helm-chart] sijie commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625483458


   @ckdarby Yes, the goal is to push the changes from StreamNative upstream, so that we maintain them in the official repo. We will port the changes individually so that every change is clear in the git log history.
   
   The changes to be contributed back include, but are not limited to, the following (I haven't gone through all of them yet):
   
   - node_exporter integration
   - jwt token for scraping the proxy metrics
   - presto support
   - some other enhancements related to bookkeeper and broker
   





[GitHub] [pulsar-helm-chart] alexku7 edited a comment on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
alexku7 edited a comment on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625734400


   By the way, there is a mistake in the Grafana image (version 0.0.7): on the Bookie Metrics dashboard, the Ledger Disk usage chart is not working.
   I think the metric name is wrong. It should be **bookie_ledger_dir_data_bookkeeper_ledgers_usage** instead of **bookie_ledger_dir__pulsar_data_bookkeeper_ledgers_usage**, which is what the chart settings currently define. Any chance of correcting that?





[GitHub] [pulsar-helm-chart] alexku7 commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
alexku7 commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625517908


   Hello @sijie,
   Is it possible to get proxy and broker metrics with authentication enabled by supplying a client certificate instead of a JWT?





[GitHub] [pulsar-helm-chart] ckdarby edited a comment on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
ckdarby edited a comment on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625304530


   > Node metrics
   
   This repo doesn't have the node_exporter.yaml template that StreamNative's chart has.
   
   > Proxy metrics
   
   Yes, we enabled authentication. I do see that the StreamNative chart handles JWT for the metrics endpoint, but this repo doesn't. So you are correct that the change hasn't been pushed back yet.
   
   > Pulsar Logs
   
   Good to know :)
   
   Are there efforts to consolidate StreamNative's charts with this official chart? I see that the StreamNative charts have a lot of functionality that would be nice to have here eventually. I'm hopeful that StreamNative's changes end up here, so that StreamNative can then use an "umbrella" Helm chart that adds this chart as a dependency and keeps only their custom pieces.
   
   As a Pulsar PMC member, if you haven't already, can you loop in the rest of the contributors to make sure we minimize the Helm changes happening in apache/pulsar? I would prefer not to have a massive "upstream" PR that also has to be merged here while we are also trying to pull StreamNative's version into this :P





[GitHub] [pulsar-helm-chart] sannadi99 commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sannadi99 commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-662381480


   Hello @sijie, how can we publish all the Pulsar logs (for example, the extra logs produced when creating tenants or producing/consuming messages) to a Grafana dashboard? Right now, with some config changes in log4j2.yml, I am able to write the logs to my local logs folder. But how can we push them to Prometheus and then to Grafana? I have a Grafana dashboard where I can see my Pulsar metrics, but I can't see any logs: the Pulsar Logs dashboard is just empty.





[GitHub] [pulsar-helm-chart] sijie commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-662543446


   @alexku7 The latest version, `streamnative/apache-pulsar-grafana-dashboard-k8s:0.0.9` (https://github.com/streamnative/apache-pulsar-grafana-dashboard), has fixed the ledger disk usage issue.
   
   @sannadi99 You can use [Grafana Loki](https://grafana.com/oss/loki/) for log aggregation and then add a datasource for Loki to your Grafana dashboard. Then you can use Grafana for viewing the logs.
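
As a rough illustration of the "add a datasource for Loki" step, the sketch below builds the HTTP request that would register a Loki datasource through Grafana's `POST /api/datasources` endpoint. Prometheus stores metrics rather than log lines, which is why the logs go through Loki instead. The helper names, the Grafana and Loki URLs, and the API token are all hypothetical placeholders, and the request is only built here, not sent.

```python
import json
import urllib.request


def loki_datasource_payload(loki_url: str, name: str = "Loki") -> dict:
    """Hypothetical helper: JSON body describing a Loki datasource."""
    return {"name": name, "type": "loki", "access": "proxy", "url": loki_url}


def build_create_datasource_request(
    grafana_url: str, api_token: str, loki_url: str
) -> urllib.request.Request:
    """Build (but do not send) the request that registers the datasource."""
    body = json.dumps(loki_datasource_payload(loki_url)).encode("utf-8")
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All three values below are assumptions for illustration only.
    request = build_create_datasource_request(
        "http://grafana:3000", "example-token", "http://loki:3100"
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually create the datasource; the same result can be achieved by hand in the Grafana UI.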





[GitHub] [pulsar-helm-chart] sijie commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-624811832


   @ckdarby 
   
   - Overview:
   
   > Storage/Backlog -> only queries returning single series or tables are supported; 
   
   The "Storage/Backlog" panel in the Overview section is designed to display the total storage size and total backlog of the whole cluster.
   
   - Node metrics: By default, the Helm chart installs node_exporter to export k8s node metrics. Do you see node_exporter pods running in your Helm release?
   
   - Proxy metrics: Did you enable authentication? If authentication is enabled, scraping the proxy metrics can fail, because we need to add a bearer token to the scrape requests. I do have a fix in [streamnative/charts](https://github.com/streamnative/charts), but I haven't pushed the change back here yet.
   
   - Pulsar Logs: The Pulsar Logs dashboard was added to the Grafana image to support collecting logs from Pulsar pods with Grafana Loki. The current Helm chart doesn't integrate with Loki yet. We will open a pull request to add Loki as an optional dependency of the chart.
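
For the proxy metrics point above, a minimal sketch of what "adding a bearer token" means at the HTTP level: the scraper sends the JWT in an `Authorization: Bearer ...` header when it requests the metrics endpoint. This is not the fix from streamnative/charts; the helper names, the proxy URL, and the token value are assumptions for illustration, and it can serve as a manual check that a token is accepted.

```python
import urllib.request


def build_metrics_request(metrics_url: str, jwt_token: str) -> urllib.request.Request:
    """Build a GET request for a metrics endpoint, authenticated with a JWT."""
    return urllib.request.Request(
        metrics_url,
        headers={"Authorization": f"Bearer {jwt_token}"},
        method="GET",
    )


def fetch_metrics(metrics_url: str, jwt_token: str, timeout: float = 5.0) -> str:
    """Send the request and return the raw metrics text."""
    request = build_metrics_request(metrics_url, jwt_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical URL and token; substitute the values from your own release.
    request = build_metrics_request("http://pulsar-proxy:8080/metrics/", "example-jwt")
    print(request.get_method(), request.full_url)
```

If `fetch_metrics` fails with an authentication error when called with a valid token, the problem is more likely on the proxy side than in the scrape configuration.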





[GitHub] [pulsar-helm-chart] sijie commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625576883


   @alexku7 It is doable. I believe Grafana supports mTLS, but I need to double-check. The current Helm chart isn't configured with mTLS yet, but most of the TLS settings are already in place, so it should be fairly simple to do.





[GitHub] [pulsar-helm-chart] sijie edited a comment on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie edited a comment on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625576883


   @alexku7 It is doable. I believe Prometheus supports mTLS, but I need to double-check. The current Helm chart isn't configured with mTLS yet, but most of the TLS settings are already in place, so it should be fairly simple to do.





[GitHub] [pulsar-helm-chart] sijie commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625990490


   > I meant that Prometheus (not Grafana) will scrape
   
   Sorry. I meant to say Prometheus. Fixed my original comment.
   
   > On the Bookie Metrics dashboard the Ledger Disk usage chart is not working.
   > I think there is a mistake in the metric name. It should be bookie_ledger_dir_data_bookkeeper_ledgers_usage instead of bookie_ledger_dir__pulsar_data_bookkeeper_ledgers_usage
   
   Ah, I see. The metric name is dynamic: it is generated from your configured ledger dir path. It seems that the original dashboard was using "/pulsar/data" as the path, so the metric name contains "__pulsar_data". You are using "data" as the path, so the metric name contains `data`.
   
   We can use "bookie_ledger_*_usage" for the metric name. The dashboard file does not ship with the Helm chart; it is built into the Grafana image. We can fix the original dashboard file. Are you interested in contributing a fix? https://github.com/streamnative/apache-pulsar-grafana-dashboard/blob/master/dashboards.kubernetes/bookkeeper.json#L787
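
The naming behaviour described above can be sketched in a few lines of Python. This is only an illustration: the `ledger_dir_metric_name` helper is hypothetical, the full directory paths are guesses, and the sanitising rule (every character that is not a letter, digit, or underscore becomes `_`) is an assumption inferred from the two metric names quoted in this thread, not BookKeeper's actual code. PromQL has no glob syntax for metric names, but a regular-expression matcher on the `__name__` label can select the metric regardless of the configured path.

```python
import re


def ledger_dir_metric_name(ledger_dir: str) -> str:
    """Hypothetical helper: derive the usage metric name from a ledger dir path."""
    # Assumed rule: characters that are invalid in a metric name become "_".
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", ledger_dir)
    return f"bookie_ledger_dir_{sanitized}_usage"


# Regex that matches the usage metric for any ledger dir.
NAME_PATTERN = "bookie_ledger_dir_.*_usage"

# PromQL selector using a regex matcher on the metric name.
PROMQL_SELECTOR = '{__name__=~"' + NAME_PATTERN + '"}'


def matches_selector(metric_name: str) -> bool:
    """Prometheus anchors regex matchers at both ends, hence fullmatch."""
    return re.fullmatch(NAME_PATTERN, metric_name) is not None


if __name__ == "__main__":
    for path in ("/pulsar/data/bookkeeper/ledgers", "data/bookkeeper/ledgers"):
        name = ledger_dir_metric_name(path)
        print(path, "->", name, matches_selector(name))
    print(PROMQL_SELECTOR)
```

A selector like this in the dashboard query would avoid hard-coding one path, at the cost of returning one series per ledger dir when several are configured.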





[GitHub] [pulsar-helm-chart] alexku7 commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
alexku7 commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-626183865


   Hi,
   I would be happy to, but I can't get it working :( It seems that Prometheus/Grafana don't support wildcards in metric names. I experimented with it, but I got an error in the Prometheus graph and no data in Grafana when I put a wildcard in the metric name.
   
   I am thinking of implementing a workaround with a relabeling rule in the Prometheus scrape config file that simply changes the metric name to the one defined in the Grafana dashboard.
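
A rough sketch of that relabeling idea, simulated in Python rather than written as Prometheus configuration. In Prometheus itself this would correspond to a `metric_relabel_configs` rule with `source_labels: [__name__]`, a `regex`, `target_label: __name__`, and a `replacement`. The function name and the pattern below are illustrative assumptions; the target name is the one the dashboard uses according to this thread.

```python
import re

# Metric name that the dashboard queries (quoted earlier in this thread).
DASHBOARD_METRIC = "bookie_ledger_dir__pulsar_data_bookkeeper_ledgers_usage"

# Assumed pattern for whatever name the bookie actually exports.
EXPORTED_PATTERN = re.compile(r"bookie_ledger_dir_.*_usage")


def relabel_metric_name(name: str) -> str:
    """Mimic a 'replace' relabel rule that rewrites the __name__ label."""
    # Prometheus relabel regexes are fully anchored, so use fullmatch.
    if EXPORTED_PATTERN.fullmatch(name):
        return DASHBOARD_METRIC
    return name


if __name__ == "__main__":
    for scraped in (
        "bookie_ledger_dir_data_bookkeeper_ledgers_usage",
        "bookie_journal_JOURNAL_SYNC_count",
    ):
        print(scraped, "->", relabel_metric_name(scraped))
```

One caveat with this approach: if a bookie has several ledger dirs, they would all be renamed to the same metric, so the series would need some other label to stay distinguishable.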





[GitHub] [pulsar-helm-chart] alexku7 commented on issue #9: Grafana dashboards not entirely working

Posted by GitBox <gi...@apache.org>.
alexku7 commented on issue #9:
URL: https://github.com/apache/pulsar-helm-chart/issues/9#issuecomment-625731118


   Hi, I meant that Prometheus (not Grafana) will scrape the metrics from the proxy servers with authentication enabled, and that it will authenticate with a client certificate rather than a JWT.
   I played with it yesterday, but it doesn't seem to work. So as a workaround for now, I set authenticateMetricsEndpoint=false in the proxy settings.

