You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by mbode <gi...@git.apache.org> on 2017/08/25 09:40:48 UTC

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

GitHub user mbode opened a pull request:

    https://github.com/apache/flink/pull/4586

    [FLINK-7502] [metrics] Improve PrometheusReporter

    ## What is the purpose of the change
    
    *This pull request addresses several issues that came up when testing PrometheusReporter in a YARN environment. Most significantly, it is now possible to specify a port range similarly to how it is done for JmxReporter.*
    
    
    ## Brief change log
    
    * Do not throw exception when same metric is added twice
    * Add possibility to configure port range
    * Bump prometheus.version 0.0.21 -> 0.0.26
    * Use simpleclient_httpserver instead of nanohttpd
    
    
    ## Verifying this change
    
    * The changed code is covered by existing test `PrometheusReporterTest`, test cases for new functionality were added.
    * Test building the jar and adding it to Flink lib directory. Then configure Prometheus reporter and check endpoints in configured port range for metrics. In particular, test case where JobManager and one TaskManager are co-located – there now should be separately reported metrics endpoints.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): **yes**, upgrade prometheus client
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **yes**, `PrometheusReporter` is `@PublicEvolving`, but it has not been released as of now.
      - The serializers: **no**
      - The runtime per-record code paths (performance sensitive): **don't know**
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **don't know**
    
    ## Documentation
    
      - Does this pull request introduce a new feature? **yes**, configurable port range
      - If yes, how is the feature documented? **docs**

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mbode/flink prometheus_reporter_improvements

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4586.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4586
    
----
commit ef4e1aa79db1dca361f2a7e274e2355008881cfe
Author: Maximilian Bode <ma...@tngtech.com>
Date:   2017-08-24T13:59:24Z

    [FLINK-7502] Improve PrometheusReporter
    
    * Do not throw exception when same metric is added twice
    * Add possibility to configure port range
    * Bump prometheus.version 0.0.21 -> 0.0.26
    * Use simpleclient_httpserver instead of nanohttpd

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    Also there should have been the warning "Could not start PrometheusReporter HTTP server on any configured port. Ports: ...", wasn't this logged?


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    I can't think of a better solution either.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r135764923
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTest.java ---
    @@ -160,6 +151,43 @@ public void invalidCharactersAreReplacedWithUnderscore() {
     		assertThat(PrometheusReporter.replaceInvalidChars("a,=;:?'b,=;:?'c"), equalTo("a___:__b___:__c"));
     	}
     
    +	@Test
    +	public void registeringSameMetricTwiceDoesNotThrowException() {
    +		Counter counter = new SimpleCounter();
    +		counter.inc();
    +		String counterName = "testCounter";
    +		final FrontMetricGroup<TaskManagerMetricGroup> group = new FrontMetricGroup<>(0, new TaskManagerMetricGroup(registry, HOST_NAME, TASK_MANAGER));
    +
    +		reporter.notifyOfAddedMetric(counter, counterName, group);
    +		reporter.notifyOfAddedMetric(counter, counterName, group);
    +	}
    +
    +	@Test
    +	public void cannotStartTwoReportersOnSamePort() {
    +		final MetricRegistry fixedPort1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "12345")));
    +		final MetricRegistry fixedPort2 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test2", "12345")));
    +		assertThat(fixedPort1.getReporters(), hasSize(1));
    +		assertThat(fixedPort2.getReporters(), hasSize(0));
    +	}
    +
    +	@Test
    +	public void canStartTwoReportersWhenUsingPortRange() {
    +		final MetricRegistry portRange1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9249-9252")));
    +		final MetricRegistry portRange2 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test2", "9249-9252")));
    +		assertThat(portRange1.getReporters(), hasSize(1));
    +		assertThat(portRange2.getReporters(), hasSize(1));
    +	}
    +
    +	@Test
    +	public void cannotStartThreeReportersWhenPortRangeIsTooSmall() {
    +		final MetricRegistry smallPortRange1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9253-9254")));
    --- End diff --
    
    You should call `MetricRegistry#shutdown()` when you no longer need it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r144804293
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java ---
    @@ -114,39 +120,78 @@ public void notifyOfAddedMetric(final Metric metric, final String metricName, fi
     			dimensionValues.add(CHARACTER_FILTER.filterCharacters(dimension.getValue()));
     		}
     
    -		final String validMetricName = scope + SCOPE_SEPARATOR + CHARACTER_FILTER.filterCharacters(metricName);
    -		final String metricIdentifier = group.getMetricIdentifier(metricName);
    +		final String scopedMetricName = getScopedName(metricName, group);
    +		final String helpString = metricName + " (scope: " + getLogicalScope(group) + ")";
    +
     		final Collector collector;
    -		if (metric instanceof Gauge) {
    -			collector = createGauge((Gauge) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Counter) {
    -			collector = createGauge((Counter) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Meter) {
    -			collector = createGauge((Meter) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Histogram) {
    -			collector = createSummary((Histogram) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    +		Integer count = 0;
    +
    +		if (!collectorsWithCountByMetricName.containsKey(scopedMetricName)) {
    +			if (metric instanceof Gauge) {
    +				collector = newGauge(scopedMetricName, helpString, dimensionKeys, dimensionValues, gaugeFrom((Gauge) metric));
    +			} else if (metric instanceof Counter) {
    +				collector = newGauge(scopedMetricName, helpString, dimensionKeys, dimensionValues, gaugeFrom((Counter) metric));
    +			} else if (metric instanceof Meter) {
    +				collector = newGauge(scopedMetricName, helpString, dimensionKeys, dimensionValues, gaugeFrom((Meter) metric));
    +			} else if (metric instanceof Histogram) {
    +				collector = new HistogramSummaryProxy((Histogram) metric, scopedMetricName, helpString, dimensionKeys, dimensionValues);
    +			} else {
    +				LOG.warn("Cannot add unknown metric type: {}. This indicates that the metric type is not supported by this reporter.",
    +					metric.getClass().getName());
    +				return;
    +			}
    +			try {
    +				collector.register();
    +			} catch (Exception e) {
    +				LOG.warn("There was a problem registering metric {}.", metricName, e);
    +			}
     		} else {
    -			LOG.warn("Cannot add unknown metric type: {}. This indicates that the metric type is not supported by this reporter.",
    -				metric.getClass().getName());
    -			return;
    +			final AbstractMap.SimpleImmutableEntry<Collector, Integer> collectorWithCount = collectorsWithCountByMetricName.get(scopedMetricName);
    +			collector = collectorWithCount.getKey();
    +			count = collectorWithCount.getValue();
    +			if (metric instanceof Gauge) {
    +				((io.prometheus.client.Gauge) collector).setChild(gaugeFrom((Gauge) metric), toArray(dimensionValues));
    +			} else if (metric instanceof Counter) {
    +				((io.prometheus.client.Gauge) collector).setChild(gaugeFrom((Counter) metric), toArray(dimensionValues));
    +			} else if (metric instanceof Meter) {
    +				((io.prometheus.client.Gauge) collector).setChild(gaugeFrom((Meter) metric), toArray(dimensionValues));
    +			} else if (metric instanceof Histogram) {
    +				((HistogramSummaryProxy) collector).addChild((Histogram) metric, dimensionValues);
    +			}
     		}
    -		collector.register();
    -		collectorsByMetricName.put(metricName, collector);
    +		collectorsWithCountByMetricName.put(scopedMetricName, new AbstractMap.SimpleImmutableEntry<>(collector, count + 1));
    +	}
    +
    +	private static String getScopedName(String metricName, MetricGroup group) {
    +		return SCOPE_PREFIX + getLogicalScope(group) + SCOPE_SEPARATOR + CHARACTER_FILTER.filterCharacters(metricName);
     	}
     
     	@Override
     	public void notifyOfRemovedMetric(final Metric metric, final String metricName, final MetricGroup group) {
    -		CollectorRegistry.defaultRegistry.unregister(collectorsByMetricName.get(metricName));
    -		collectorsByMetricName.remove(metricName);
    +		final String scopedMetricName = getScopedName(metricName, group);
    +		final AbstractMap.SimpleImmutableEntry<Collector, Integer> collectorWithCount = collectorsWithCountByMetricName.get(scopedMetricName);
    +		final Integer count = collectorWithCount.getValue();
    +		final Collector collector = collectorWithCount.getKey();
    +		if (count == 1) {
    --- End diff --
    
    You need a separate synchronization mechanism here to prevent race-conditions between adding/removing metrics. A metric can be swallowed if it is added between retrieving the collector and removing it from the map.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4586


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    I implemented your comments and assembled a [small setup](https://github.com/mbode/flink-prometheus-example) to test the reporter again. 
    
    It currently clones *master* and build the reporter from there. In order to test another revision, you can just build *flink-metrics-prometheus-1.4-SNAPSHOT.jar* locally, put it into the *flink* subdirectory of the project and add `COPY flink-metrics-prometheus-1.4-SNAPSHOT.jar /opt/flink/lib` to *flink/Dockerfile*.


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    This is rather odd. This minimal example does fail as expected:
    ```
    
    	
    	public static void main(String[] args) throws IOException {
    		HTTPServer s1 = null;
    		HTTPServer s2 = null;
    		try {
    			s1 = new HTTPServer(9010);
    			s2 = new HTTPServer(9010);
    		} finally {
    			if (s1 != null) {
    				s1.stop();
    			}
    			if (s2 != null) {
    				s2.stop();
    			}
    		}
    	}
    ```


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    @zentol It would be great if you could have another look on occasion, I added better handling for metrics that are registered e.g. by different subtasks.
    
    [green travis](https://travis-ci.org/mbode/flink/builds/274685138)


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    And this fails as well:
    ```
    public static void main(String[] args) throws IOException {
    		PrometheusReporter p1 = new PrometheusReporter();
    		PrometheusReporter p2 = new PrometheusReporter();
    		
    		MetricConfig mc = new MetricConfig();
    		mc.put("port", "9001");
    		
    		p1.open(mc);
    		p2.open(mc);
    	}
    ```


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    I mean in general it is probably not the best thing to just rely on the port not being available as a consensus algorithm of who should claim which port. Then again, I could not think of a straightforward way to coordinate across Job/TaskManagers without using something external, which would probably get too involved for a metrics-related component – maybe there is something I am missing?


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r144807511
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTest.java ---
    @@ -124,19 +133,6 @@ public void histogramIsReportedAsPrometheusSummary() throws UnirestException {
     	}
     
     	@Test
    -	public void meterRateIsReportedAsPrometheusGauge() throws UnirestException {
    -		Meter testMeter = new TestMeter();
    -
    -		String meterName = "testMeter";
    -		String counterName = SCOPE_PREFIX + meterName;
    -
    -		assertThat(addMetricAndPollResponse(testMeter, meterName),
    -			equalTo(HELP_PREFIX + counterName + " " + getFullMetricName(meterName) + "\n" +
    --- End diff --
    
    getFullMetricName is now unused and can be removed.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r135764351
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java ---
    @@ -84,21 +84,30 @@ static String replaceInvalidChars(final String input) {
     
     	@Override
     	public void open(MetricConfig config) {
    -		int port = config.getInteger(ARG_PORT, DEFAULT_PORT);
    -		LOG.info("Using port {}.", port);
    -		prometheusEndpoint = new PrometheusEndpoint(port);
    -		try {
    -			prometheusEndpoint.start(NanoHTTPD.SOCKET_READ_TIMEOUT, true);
    -		} catch (IOException e) {
    -			final String msg = "Could not start PrometheusEndpoint on port " + port;
    -			LOG.warn(msg, e);
    -			throw new RuntimeException(msg, e);
    +		String portsConfig = config.getString(ARG_PORT, DEFAULT_PORT);
    +
    +		if (portsConfig != null) {
    --- End diff --
    
    This check isn't needed. If we keep it we should also throw an exception if portsConfig is null as otherwise we're hitting an NPE later on since httpServer is still null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    I configured 3 ports `9001-9003`. Each reporter logged `2017-10-25 13:10:14,848 INFO  org.apache.flink.metrics.prometheus.PrometheusReporter        - Started PrometheusReporter HTTP server on port 9001.`, which should only be possible if they can successfully create the HttpServer.


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    So I went back to configuring a single port, starting a flink cluster (JM + 2 TM) and 1 HTTPServer in the IDE.
    
    Starting the HTTPServer in the IDE before the cluster does in fact lead to an IOException in the reporters.
    Starting the cluster before the HTTPServer leads to an exception in the IDE.
    
    But if i only start the cluster the reporters start up without an exception.
    
    This may be an OS issue, in particular the Windows10 WSL feature. I'll try it again later on a linux machine; if it works there I'll merge this.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r135764533
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java ---
    @@ -130,7 +139,11 @@ public void notifyOfAddedMetric(final Metric metric, final String metricName, fi
     				metric.getClass().getName());
     			return;
     		}
    -		collector.register();
    +		try {
    +			collector.register();
    +		} catch (Exception e) {
    +			LOG.warn("There was a problem registering metric {}: {}", metricName, e);
    --- End diff --
    
    We usually don't include placeholders for exceptions (because they are added implicitly), i.e. it should be ```LOG.warn("There was a problem registering metric {}.", metricName, e);```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    WIll try it out now and merge it afterwards.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r144805511
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java ---
    @@ -114,39 +120,78 @@ public void notifyOfAddedMetric(final Metric metric, final String metricName, fi
     			dimensionValues.add(CHARACTER_FILTER.filterCharacters(dimension.getValue()));
     		}
     
    -		final String validMetricName = scope + SCOPE_SEPARATOR + CHARACTER_FILTER.filterCharacters(metricName);
    -		final String metricIdentifier = group.getMetricIdentifier(metricName);
    +		final String scopedMetricName = getScopedName(metricName, group);
    +		final String helpString = metricName + " (scope: " + getLogicalScope(group) + ")";
    +
     		final Collector collector;
    -		if (metric instanceof Gauge) {
    -			collector = createGauge((Gauge) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Counter) {
    -			collector = createGauge((Counter) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Meter) {
    -			collector = createGauge((Meter) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    -		} else if (metric instanceof Histogram) {
    -			collector = createSummary((Histogram) metric, validMetricName, metricIdentifier, dimensionKeys, dimensionValues);
    +		Integer count = 0;
    +
    +		if (!collectorsWithCountByMetricName.containsKey(scopedMetricName)) {
    +			if (metric instanceof Gauge) {
    +				collector = newGauge(scopedMetricName, helpString, dimensionKeys, dimensionValues, gaugeFrom((Gauge) metric));
    --- End diff --
    
    We can simplify this block a little bit by not registering the metric right away.
    
    Instead of
    
    ```
    if(!collectorExists) {
    	//create collector and add metric
    } else {
    	//get collector and add metric
    }
    ```
    
    do
    
    ```
    if(!collectorExists) {
    	//create collector
    } else {
    	//get collector
    }
    add metric
    ```


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    [Green Travis](https://travis-ci.org/mbode/flink/builds/268258386)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    Did you configure a port range with sufficient (i.e.) three ports? By default, it uses only one port. I added a sentence about this to the readme but maybe we can make this more explicit?


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    The port conflict detection doesn't appear to be working as intended. I started a jobmanager and 2 taskmanagers on the same machine, and each reporter used the same port. Prometheus could only pick up the metrics from the jobmanager, and the taskmanagers ones were silently ignored.


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    I've double checked with DEBUG logging enabled, no warning anywhere.


---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r144807031
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java ---
    @@ -194,66 +239,46 @@ private static Collector newGauge(String name, String identifier, List<String> l
     			.setChild(child, toArray(labelValues));
     	}
     
    -	private static HistogramSummaryProxy createSummary(final Histogram histogram, final String name, final String identifier, final List<String> dimensionKeys, final List<String> dimensionValues) {
    -		return new HistogramSummaryProxy(histogram, name, identifier, dimensionKeys, dimensionValues);
    -	}
    -
    -	static class PrometheusEndpoint extends NanoHTTPD {
    -		static final String MIME_TYPE = "plain/text";
    -
    -		PrometheusEndpoint(int port) {
    -			super(port);
    -		}
    -
    -		@Override
    -		public Response serve(IHTTPSession session) {
    -			if (session.getUri().equals("/metrics")) {
    -				StringWriter writer = new StringWriter();
    -				try {
    -					TextFormat.write004(writer, CollectorRegistry.defaultRegistry.metricFamilySamples());
    -				} catch (IOException e) {
    -					return newFixedLengthResponse(Response.Status.INTERNAL_ERROR, MIME_TYPE, "Unable to output metrics");
    -				}
    -				return newFixedLengthResponse(Response.Status.OK, TextFormat.CONTENT_TYPE_004, writer.toString());
    -			} else {
    -				return newFixedLengthResponse(Response.Status.NOT_FOUND, MIME_TYPE, "Not found");
    -			}
    -		}
    -	}
    -
    -	private static class HistogramSummaryProxy extends Collector {
    -		private static final List<Double> QUANTILES = Arrays.asList(.5, .75, .95, .98, .99, .999);
    +	static class HistogramSummaryProxy extends Collector {
    --- End diff --
    
    `@VisibleForTesting`?


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    Taking another look now.



---

[GitHub] flink pull request #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4586#discussion_r136503621
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTest.java ---
    @@ -160,6 +151,43 @@ public void invalidCharactersAreReplacedWithUnderscore() {
     		assertThat(PrometheusReporter.replaceInvalidChars("a,=;:?'b,=;:?'c"), equalTo("a___:__b___:__c"));
     	}
     
    +	@Test
    +	public void registeringSameMetricTwiceDoesNotThrowException() {
    +		Counter counter = new SimpleCounter();
    +		counter.inc();
    +		String counterName = "testCounter";
    +		final FrontMetricGroup<TaskManagerMetricGroup> group = new FrontMetricGroup<>(0, new TaskManagerMetricGroup(registry, HOST_NAME, TASK_MANAGER));
    +
    +		reporter.notifyOfAddedMetric(counter, counterName, group);
    +		reporter.notifyOfAddedMetric(counter, counterName, group);
    +	}
    +
    +	@Test
    +	public void cannotStartTwoReportersOnSamePort() {
    +		final MetricRegistry fixedPort1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "12345")));
    +		final MetricRegistry fixedPort2 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test2", "12345")));
    +		assertThat(fixedPort1.getReporters(), hasSize(1));
    +		assertThat(fixedPort2.getReporters(), hasSize(0));
    +	}
    +
    +	@Test
    +	public void canStartTwoReportersWhenUsingPortRange() {
    +		final MetricRegistry portRange1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9249-9252")));
    +		final MetricRegistry portRange2 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test2", "9249-9252")));
    +		assertThat(portRange1.getReporters(), hasSize(1));
    +		assertThat(portRange2.getReporters(), hasSize(1));
    +	}
    +
    +	@Test
    +	public void cannotStartThreeReportersWhenPortRangeIsTooSmall() {
    +		final MetricRegistry smallPortRange1 = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9253-9254")));
    --- End diff --
    
    done


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    [Green Travis](https://travis-ci.org/mbode/flink/builds/290468452)


---

[GitHub] flink issue #4586: [FLINK-7502] [metrics] Improve PrometheusReporter

Posted by mbode <gi...@git.apache.org>.
Github user mbode commented on the issue:

    https://github.com/apache/flink/pull/4586
  
    @zentol *ping*


---