You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2017/10/26 17:10:58 UTC
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
GitHub user zentol opened a pull request:
https://github.com/apache/flink/pull/4908
[FLINK-7933][metrics] Improve PrometheusReporter tests
## What is the purpose of the change
This PR resolves the test instabilities of the prometheus reporter.
## Brief change log
* use a port range instead of a single hard-coded port
* required the exposure of the bound port by the reporter
* move registry/reporter initialization into an `@Before` method (guarantees execution order)
* shutdown registry in `PrometheusReporterTest`
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zentol/flink 7933
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4908.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4908
----
commit 36168084423f704d8fb2647acf8dda2a09e4ee1d
Author: zentol <ch...@apache.org>
Date: 2017-10-26T17:04:30Z
[FLINK-7933][metrics] Improve PrometheusReporter tests
----
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147357050
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
+
+ @Before
+ public void setupReporter() {
+ registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
--- End diff --
Can not we bind to a random port (`0`) and later call `getPort()` as you do it now?
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147379229
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
+
+ @Before
+ public void setupReporter() {
+ registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
+ PrometheusReporter reporter = (PrometheusReporter) registry.getReporters().get(0);
+ port = reporter.getPort();
+
+ TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
+ TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
+ taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
+ taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ }
+
+ @After
+ public void shutdownRegistry() {
+ if (registry != null) {
+ registry.shutdown();
--- End diff --
the registry is closing the reporter.
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147639583
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
+
+ @Before
+ public void setupReporter() {
+ registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
--- End diff --
no. The underlying http server in the reporter does not expose the port used. We have to try the constructor with a specific port so that we know which port is actually used.
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/4908
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147379207
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
--- End diff --
yup we could do that too.
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147357745
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
+
+ @Before
+ public void setupReporter() {
+ registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "9400-9500")));
+ PrometheusReporter reporter = (PrometheusReporter) registry.getReporters().get(0);
+ port = reporter.getPort();
+
+ TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
+ TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
+ taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
+ taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ }
+
+ @After
+ public void shutdownRegistry() {
+ if (registry != null) {
+ registry.shutdown();
--- End diff --
you are not closing the reporter here. Isn't this a root cause of the test instability?
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147357533
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTaskScopeTest.java ---
@@ -72,10 +72,30 @@
private final AbstractID taskAttemptId2 = new AbstractID();
private final String[] labelValues2 = {jobId.toString(), taskId2.toString(), taskAttemptId2.toString(), TASK_MANAGER_HOST, TASK_NAME, "" + ATTEMPT_NUMBER, JOB_NAME, TASK_MANAGER_ID, "" + SUBTASK_INDEX_2};
- private final TaskManagerMetricGroup tmMetricGroup = new TaskManagerMetricGroup(registry, TASK_MANAGER_HOST, TASK_MANAGER_ID);
- private final TaskManagerJobMetricGroup tmJobMetricGroup = new TaskManagerJobMetricGroup(registry, tmMetricGroup, jobId, JOB_NAME);
- private final TaskMetricGroup taskMetricGroup1 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId1, taskAttemptId1, TASK_NAME, SUBTASK_INDEX_1, ATTEMPT_NUMBER);
- private final TaskMetricGroup taskMetricGroup2 = new TaskMetricGroup(registry, tmJobMetricGroup, taskId2, taskAttemptId2, TASK_NAME, SUBTASK_INDEX_2, ATTEMPT_NUMBER);
+ private TaskMetricGroup taskMetricGroup1;
+ private TaskMetricGroup taskMetricGroup2;
+
+ private MetricRegistry registry;
+ private int port;
--- End diff --
can't we access `getPort()` instead of using this variable? Isn't it redundant?
---
[GitHub] flink pull request #4908: [FLINK-7933][metrics] Improve PrometheusReporter t...
Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:
https://github.com/apache/flink/pull/4908#discussion_r147357330
--- Diff: flink-metrics/flink-metrics-prometheus/src/test/java/org/apache/flink/metrics/prometheus/PrometheusReporterTest.java ---
@@ -70,9 +69,25 @@
@Rule
public ExpectedException thrown = ExpectedException.none();
- private final MetricRegistry registry = new MetricRegistry(MetricRegistryConfiguration.fromConfiguration(createConfigWithOneReporter("test1", "" + NON_DEFAULT_PORT)));
- private final FrontMetricGroup<TaskManagerMetricGroup> metricGroup = new FrontMetricGroup<>(0, new TaskManagerMetricGroup(registry, HOST_NAME, TASK_MANAGER));
- private final MetricReporter reporter = registry.getReporters().get(0);
+ private MetricRegistry registry;
+ private FrontMetricGroup<TaskManagerMetricGroup> metricGroup;
+ private PrometheusReporter reporter;
+ private int port;
--- End diff --
do we need this extra field for that? Can't we access `reporter.getPort()`? Isn't it redundant?
---
[GitHub] flink issue #4908: [FLINK-7933][metrics] Improve PrometheusReporter tests
Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/4908
Thanks, looks good, +1 to merge
---