You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2017/10/26 17:10:58 UTC

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/4908

    [FLINK-7933][metrics] Improve PrometheusReporter tests

    ## What is the purpose of the change
    
    This PR resolves the test instabilities of the prometheus reporter.
    
    ## Brief change log
    
    * use a port range instead of a single hard-coded port
      * required the exposure of the bound port by the reporter
    * move registry/reporter initialization into an `@Before` method (guarantees execution order)
    * shutdown registry in `PrometheusReporterTest`
    
    ## Verifying this change
    
    This change is already covered by existing tests.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 7933

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4908.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4908
    
----
commit 36168084423f704d8fb2647acf8dda2a09e4ee1d
Author: zentol <ch...@apache.org>
Date:   2017-10-26T17:04:30Z

    [FLINK-7933][metrics] Improve PrometheusReporter tests

----


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147357050
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    +
    +	@Before
    +	public void setupReporter() {
    +		registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
    --- End diff --
    
    Can not we bind to a random port (`0`) and later call `getPort()` as you do it now? 


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147379229
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    +
    +	@Before
    +	public void setupReporter() {
    +		registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
    +		PrometheusReporter reporter = (PrometheusReporter) registry.getReporters().get(0);
    +		port = reporter.getPort();
    +
    +		TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    +		TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    +		taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    +		taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	}
    +
    +	@After
    +	public void shutdownRegistry() {
    +		if (registry != null) {
    +			registry.shutdown();
    --- End diff --
    
    the registry is closing the reporter.


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147639583
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    +
    +	@Before
    +	public void setupReporter() {
    +		registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
    --- End diff --
    
    no. The underlying http server in the reporter does not expose the port used. We have to try the constructor with a specific port so that we know which port is actually used.


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4908


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147379207
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    --- End diff --
    
    yup we could do that too.


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147357745
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    +
    +	@Before
    +	public void setupReporter() {
    +		registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
    +		PrometheusReporter reporter = (PrometheusReporter) registry.getReporters().get(0);
    +		port = reporter.getPort();
    +
    +		TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    +		TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    +		taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    +		taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	}
    +
    +	@After
    +	public void shutdownRegistry() {
    +		if (registry != null) {
    +			registry.shutdown();
    --- End diff --
    
    you are not closing the reporter here. Isn't this a root cause of the test instability?


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147357533
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
    @@ -72,10 +72,30 @@
     	private final AbstractID taskAttemptId2 = new AbstractID();
     	private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
     
    -	private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
    -	private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
    -	private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
    -	private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
    +	private TaskMetricGroup taskMetricGroup1;
    +	private TaskMetricGroup taskMetricGroup2;
    +
    +	private MetricRegistry registry;
    +	private int port;
    --- End diff --
    
    can't we access `getPort()` instead of using this variable? Isn't it redundant?


---

[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4908#discussion_r147357330
  
    --- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTest.java ---
    @@ -70,9 +69,25 @@
     	@Rule
     	public ExpectedException thrown = ExpectedException.none();
     
    -	private final MetricRegistry registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "" + NON_DEFAULT_PORT)));
    -	private final FrontMetricGroup<TaskManagerMetricGroup> metricGroup = new FrontMetricGroup<>(0, new TaskManagerMetricGroup(registry, HOST_NAME, TASK_MANAGER));
    -	private final MetricReporter reporter = registry.getReporters().get(0);
    +	private MetricRegistry registry;
    +	private FrontMetricGroup<TaskManagerMetricGroup> metricGroup;
    +	private PrometheusReporter reporter;
    +	private int port;
    --- End diff --
    
    do we need this extra field for that? Can't we access `reporter.getPort()`? Isn't it redundant?


---

[GitHub] flink issue #4908: [FLINK-7933][metrics] Improve PrometheusReporter tests

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/4908
  
    Thanks, looks good, +1 to merge


---