Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2017/06/12 13:40:16 UTC

[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/4110

    [FLINK-6900] [metrics] Limit size of metric name components

    This PR modifies the `ScheduledDropwizardReporter` to limit the size of every metric name component to 80 characters, with the same reasoning as #4109.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 6900

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4110.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4110
    

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    We don't truncate the full identifier, but each individual component before assembling the final identifier.
    
    I.e. `a.b.<reallyReallyLongSection>.c.d` would become `a.b.<notSoLongSection>.c.d`.
    
    Currently the primary causes are the names of WindowOperators and long task chains, but since the sections are partially controlled by the user there may be more cases.
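
The difference between the two orders of operation can be sketched as follows. This is only an illustration, not the code from the PR: the class and method names are hypothetical, the `.` delimiter is assumed, and a limit of 10 is used so the effect is easy to see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: contrasts the two truncation orders.
class ComponentTruncationSketch {

    /** Cuts one string down to at most maxLength characters. */
    static String truncate(String value, int maxLength) {
        return value.length() > maxLength ? value.substring(0, maxLength) : value;
    }

    /** Per-component approach: shorten every component, then join them. */
    static String truncateThenAssemble(List<String> components, int maxLength) {
        List<String> shortened = new ArrayList<>();
        for (String component : components) {
            shortened.add(truncate(component, maxLength));
        }
        return String.join(".", shortened);
    }

    /** Contrasting approach: join first, then cut the whole identifier. */
    static String assembleThenTruncate(List<String> components, int maxLength) {
        return truncate(String.join(".", components), maxLength);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "0123456789DEADBEEF", "c", "d");
        System.out.println(truncateThenAssemble(parts, 10)); // a.b.0123456789.c.d
        System.out.println(assembleThenTruncate(parts, 10)); // a.b.012345
    }
}
```

With per-component truncation the trailing `c.d` survives, whereas cutting the assembled identifier drops everything past the limit.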



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    Yes, it is still required, but this PR does address a similar issue.
    
    #4109 limits the size of the operator name in the metric identifier. This was a problem for all reporters, because a 200+ character name just isn't manageable.
    
    This PR limits the size of all components of the metric identifier for DropwizardReporters, as the backends of several subclasses store metrics in files, with each component being one directory, like "taskmanager/abcde/job/myjob/task/mytask". Since the components are used as directory names, they mustn't exceed a certain size (commonly 255). While technically a value close to 255 would suffice, I figure that anything above 80 characters isn't really manageable either.
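
A rough sketch of the directory layout described above, assuming one directory per component and `/` as the separator. The class, the method names and the 255 constant are illustrative assumptions; real backends may lay out files differently, and components that themselves contain `/` are not handled here.

```java
// Hypothetical illustration, not part of Flink or any reporter backend.
class DirectoryLayoutSketch {

    // Common per-name limit on many file systems, as mentioned in the comment above.
    static final int COMMON_FILE_NAME_LIMIT = 255;

    /** Truncates each component, then joins the components into a relative path. */
    static String toRelativePath(String[] components, int maxComponentLength) {
        StringBuilder path = new StringBuilder();
        for (String component : components) {
            String name = component.length() > maxComponentLength
                    ? component.substring(0, maxComponentLength)
                    : component;
            if (path.length() > 0) {
                path.append('/');
            }
            path.append(name);
        }
        return path.toString();
    }

    /** Returns true if every path segment stays within the assumed file-name limit. */
    static boolean fitsFileNameLimit(String relativePath) {
        for (String segment : relativePath.split("/")) {
            if (segment.length() > COMMON_FILE_NAME_LIMIT) {
                return false;
            }
        }
        return true;
    }
}
```

With a limit of 80, a 300-character task name ends up as an 80-character directory name, comfortably below the assumed 255 limit.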



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126930326
  
    --- Diff: docs/monitoring/metrics.md ---
    @@ -376,6 +376,7 @@ Parameters:
     - `dmax` - hard limit for how long an old metric should be retained
     - `ttl` - time-to-live for transmitted UDP packets
     - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
    +- `maxComponentLength` - limits the size of each scope component
    --- End diff --
    
    Alright, works for me.



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    What can cause such a long metric identifier? It seems risky to truncate the full identifier, which could even remove the base name completely.



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126918307
  
    --- Diff: docs/monitoring/metrics.md ---
    @@ -376,6 +376,7 @@ Parameters:
     - `dmax` - hard limit for how long an old metric should be retained
     - `ttl` - time-to-live for transmitted UDP packets
     - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
    +- `maxComponentLength` - limits the size of each scope component
    --- End diff --
    
    "length of each scope component" would be better, I don't think we use "name of scope component" anywhere in the docs.



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126443486
  
    --- Diff: flink-metrics/flink-metrics-statsd/src/test/java/org/apache/flink/metrics/statsd/StatsDReporterTest.java ---
    @@ -60,6 +60,20 @@
     public class StatsDReporterTest extends TestLogger {
     
     	@Test
    +	public void testNameTruncating() {
    +		StatsDReporter reporter = new StatsDReporter();
    +
    +		MetricConfig config = new MetricConfig();
    +		config.setProperty(StatsDReporter.ARG_HOST, "localhost");
    +		config.setProperty(StatsDReporter.ARG_PORT, "12345");
    +		config.setProperty(StatsDReporter.ARG_MAX_COMPONENT_LENGTH, "10");
    +		
    +		reporter.open(config);
    +		
    +		assertEquals("0123456789", reporter.filterCharacters("0123456789DEADBEEF"));
    --- End diff --
    
    ...and the additional tests



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126442516
  
    --- Diff: flink-metrics/flink-metrics-dropwizard/src/main/java/org/apache/flink/dropwizard/ScheduledDropwizardReporter.java ---
    @@ -184,7 +188,10 @@ public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup
     	@Override
     	public String filterCharacters(String metricName) {
     		char[] chars = null;
    -		final int strLen = metricName.length();
    +		if (metricName.length() > maxComponentLength) {
    +			log.warn("The metric name component {} exceeded the {} characters length limit and was truncated.", metricName, maxComponentLength);
    +		}
    +		final int strLen = Math.min(metricName.length(), maxComponentLength);
    --- End diff --
    
    You actually don't need to call `Math.min()` anymore after you already checked the condition for the warning message. You could thus assign `strLen` yourself.
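
One way to read this suggestion, sketched outside of the reporter classes: since the branch already knows which value is smaller, `strLen` can be assigned in each arm. The class name and the use of `java.util.logging` are assumptions made to keep the example self-contained; the PR itself logs through its own `log` field.

```java
import java.util.logging.Logger;

// Hypothetical stand-alone version of the length computation discussed above.
class StrLenSketch {

    private static final Logger LOG = Logger.getLogger(StrLenSketch.class.getName());

    /** Returns the number of characters to keep, warning when the input is cut. */
    static int effectiveLength(String metricName, int maxComponentLength) {
        final int strLen;
        if (metricName.length() > maxComponentLength) {
            LOG.warning(String.format(
                    "The metric name component %s exceeded the %d characters length limit and was truncated.",
                    metricName, maxComponentLength));
            strLen = maxComponentLength; // the limit is the smaller value in this branch
        } else {
            strLen = metricName.length(); // the name already fits
        }
        return strLen;
    }
}
```

The result is the same as `Math.min(metricName.length(), maxComponentLength)`; the only difference is that the comparison is not repeated.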



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126940120
  
    --- Diff: flink-metrics/flink-metrics-statsd/src/test/java/org/apache/flink/metrics/statsd/StatsDReporterTest.java ---
    @@ -60,6 +60,20 @@
     public class StatsDReporterTest extends TestLogger {
     
     	@Test
    +	public void testNameTruncating() {
    +		StatsDReporter reporter = new StatsDReporter();
    +
    +		MetricConfig config = new MetricConfig();
    +		config.setProperty(StatsDReporter.ARG_HOST, "localhost");
    +		config.setProperty(StatsDReporter.ARG_PORT, "12345");
    +		config.setProperty(StatsDReporter.ARG_MAX_COMPONENT_LENGTH, "10");
    +		
    +		reporter.open(config);
    +		
    +		assertEquals("0123456789", reporter.filterCharacters("0123456789DEADBEEF"));
    --- End diff --
    
    hmm, seems something was lost during the transfer :(
    
    iirc, I wanted to ask whether it makes sense to also test that things like `a.b.0123456789DEADBEEF.c` are properly truncated to `a.b.0123456789.c` for the whole metric name



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126446702
  
    --- Diff: docs/monitoring/metrics.md ---
    @@ -376,6 +376,7 @@ Parameters:
     - `dmax` - hard limit for how long an old metric should be retained
     - `ttl` - time-to-live for transmitted UDP packets
     - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
    +- `maxComponentLength` - limits the size of each scope component
    --- End diff --
    
    Could this be described as the "length of the name of each scope component" rather than simply "size"? I'm not sure that it's immediately obvious what this parameter is limiting.



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    That sounds reasonable. Can we add a warning as in #4109 and replace `80` with a constant (I see just now in `TaskMetricGroup.java` that `80` is hard-coded in the log string rather than using the constant)?
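
A minimal sketch of the point about the log string, assuming a hypothetical constant name (`MAX_NAME_LENGTH`) and message wording. It only shows the limit being formatted into the message from the constant, so the two cannot drift apart.

```java
// Hypothetical illustration; the constant name and message text are assumptions.
class LimitConstantSketch {

    static final int MAX_NAME_LENGTH = 80;

    /** Builds the warning text from the constant instead of a hard-coded "80". */
    static String truncationWarning(String name) {
        return String.format(
                "The name %s exceeded the %d characters length limit and was truncated.",
                name, MAX_NAME_LENGTH);
    }

    /** Applies the same constant when cutting the name. */
    static String truncate(String name) {
        return name.length() > MAX_NAME_LENGTH ? name.substring(0, MAX_NAME_LENGTH) : name;
    }
}
```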



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    @zentol then +1 after addressing @greghogan's comments (adding a warning + using a constant for the `80`)



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    just for clarification, `a.b.<notSoLongSection>.c.d` would then be stored in `a/b/<notSoLongSection>/c/d` so that each component/file/directory name is no longer than 80 characters?



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126443494
  
    --- Diff: flink-metrics/flink-metrics-statsd/src/main/java/org/apache/flink/metrics/statsd/StatsDReporter.java ---
    @@ -193,7 +198,10 @@ private void send(final String name, final String value) {
     	@Override
     	public String filterCharacters(String input) {
     		char[] chars = null;
    -		final int strLen = input.length();
    +		if (input.length() > maxComponentLength) {
    +			log.warn("The metric name component {} exceeded the {} characters length limit and was truncated.", input, maxComponentLength);
    +		}
    +		final int strLen = Math.min(input.length(), maxComponentLength);
    --- End diff --
    
    same here about the `Math.min`



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    @NicoK yes, that's how they would be stored in some backends.



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    I've updated the PR.
    * extended the change to cover the StatsDReporter
    * added the warning as requested
    * moved the limit into a configurable field
    * updated documentation
    * added tests
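
How a configurable limit might be read can be sketched like this. The key `maxComponentLength` is taken from the documentation diff in this thread; the class name, the use of plain `java.util.Properties` in place of `MetricConfig`, and the default of 80 are assumptions for illustration only.

```java
import java.util.Properties;

// Hypothetical sketch; not the reporter code from the PR.
class MaxComponentLengthConfigSketch {

    static final String ARG_MAX_COMPONENT_LENGTH = "maxComponentLength";

    // Assumed default, based on the 80-character figure discussed in this thread.
    static final int DEFAULT_MAX_COMPONENT_LENGTH = 80;

    /** Reads the limit from the config, falling back to the assumed default. */
    static int readLimit(Properties config) {
        String raw = config.getProperty(ARG_MAX_COMPONENT_LENGTH);
        return raw == null ? DEFAULT_MAX_COMPONENT_LENGTH : Integer.parseInt(raw.trim());
    }
}
```

A malformed value makes `Integer.parseInt` throw a `NumberFormatException`; whether a reporter should fail or fall back to the default in that case is a design choice this sketch leaves open.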



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    Sure, I can do that.



[GitHub] flink pull request #4110: [FLINK-6900] [metrics] Limit size of metric name c...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4110#discussion_r126915402
  
    --- Diff: flink-metrics/flink-metrics-statsd/src/test/java/org/apache/flink/metrics/statsd/StatsDReporterTest.java ---
    @@ -60,6 +60,20 @@
     public class StatsDReporterTest extends TestLogger {
     
     	@Test
    +	public void testNameTruncating() {
    +		StatsDReporter reporter = new StatsDReporter();
    +
    +		MetricConfig config = new MetricConfig();
    +		config.setProperty(StatsDReporter.ARG_HOST, "localhost");
    +		config.setProperty(StatsDReporter.ARG_PORT, "12345");
    +		config.setProperty(StatsDReporter.ARG_MAX_COMPONENT_LENGTH, "10");
    +		
    +		reporter.open(config);
    +		
    +		assertEquals("0123456789", reporter.filterCharacters("0123456789DEADBEEF"));
    --- End diff --
    
    What about the tests?



[GitHub] flink issue #4110: [FLINK-6900] [metrics] Limit size of metric name componen...

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/4110
  
    @zentol since this has not yet been reviewed I'll chance a question: is this needed in addition to #4109?

