You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by zentol <gi...@git.apache.org> on 2018/05/07 08:09:20 UTC

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/5959

    [FLINK-9258][metrics] Thread-safe initialization of variables map

    ## What is the purpose of the change
    
    This PR fixes a thread-safety issue in `ComponentMetricGroup` where a map was assigned to the `variables` field in a synchronized block before it was populated.
    This meant that a modification could be made to the map that is already visible to other threads, leading to `ConcurrentModificationExceptions`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 9258

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5959.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5959
    
----

----


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5959#discussion_r190165058
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/ComponentMetricGroup.java ---
    @@ -57,11 +57,12 @@ public ComponentMetricGroup(MetricRegistry registry, String[] scope, P parent) {
     		if (variables == null) { // avoid synchronization for common case
     			synchronized (this) {
     				if (variables == null) {
    -					variables = new HashMap<>();
    -					putVariables(variables);
    +					Map<String, String> tmpVariables = new HashMap<>();
    +					putVariables(tmpVariables);
     					if (parent != null) { // not true for Job-/TaskManagerMetricGroup
    -						variables.putAll(parent.getAllVariables());
    +						tmpVariables.putAll(parent.getAllVariables());
     					}
    +					variables = tmpVariables;
    --- End diff --
    
    This looks potentially still broken to me because it does double-checked locking, which is a defective anti-pattern.


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5959#discussion_r190168697
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/ComponentMetricGroup.java ---
    @@ -57,11 +57,12 @@ public ComponentMetricGroup(MetricRegistry registry, String[] scope, P parent) {
     		if (variables == null) { // avoid synchronization for common case
     			synchronized (this) {
     				if (variables == null) {
    -					variables = new HashMap<>();
    -					putVariables(variables);
    +					Map<String, String> tmpVariables = new HashMap<>();
    +					putVariables(tmpVariables);
     					if (parent != null) { // not true for Job-/TaskManagerMetricGroup
    -						variables.putAll(parent.getAllVariables());
    +						tmpVariables.putAll(parent.getAllVariables());
     					}
    +					variables = tmpVariables;
    --- End diff --
    
    EDIT removed my previous comment about double-checked locking because I saw that the variable is `volatile`


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by zentol <gi...@git.apache.org>.
Github user zentol closed the pull request at:

    https://github.com/apache/flink/pull/5959


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5959#discussion_r190165634
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/ComponentMetricGroup.java ---
    @@ -57,11 +57,12 @@ public ComponentMetricGroup(MetricRegistry registry, String[] scope, P parent) {
     		if (variables == null) { // avoid synchronization for common case
     			synchronized (this) {
     				if (variables == null) {
    -					variables = new HashMap<>();
    -					putVariables(variables);
    +					Map<String, String> tmpVariables = new HashMap<>();
    +					putVariables(tmpVariables);
     					if (parent != null) { // not true for Job-/TaskManagerMetricGroup
    -						variables.putAll(parent.getAllVariables());
    +						tmpVariables.putAll(parent.getAllVariables());
     					}
    +					variables = tmpVariables;
    --- End diff --
    
    To be clear, this is not exactly about the fixed lines, but the line that is commented with "optimization for common case" is the remaining problem.


---

[GitHub] flink issue #5959: [FLINK-9258][metrics] Thread-safe initialization of varia...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:

    https://github.com/apache/flink/pull/5959
  
    Sounds good. After the mentioned changes this looks ready to merge for me 👍 


---

[GitHub] flink issue #5959: [FLINK-9258][metrics] Thread-safe initialization of varia...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/5959
  
    merging.


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5959#discussion_r186630665
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/ComponentMetricGroup.java ---
    @@ -57,11 +57,12 @@ public ComponentMetricGroup(MetricRegistry registry, String[] scope, P parent) {
     		if (variables == null) { // avoid synchronization for common case
     			synchronized (this) {
     				if (variables == null) {
    -					variables = new HashMap<>();
    -					putVariables(variables);
    +					Map<String, String> tmpVariables = new HashMap<>();
    +					putVariables(tmpVariables);
     					if (parent != null) { // not true for Job-/TaskManagerMetricGroup
    -						variables.putAll(parent.getAllVariables());
    +						tmpVariables.putAll(parent.getAllVariables());
     					}
    +					variables = tmpVariables;
    --- End diff --
    
    We _could_ write a test, but it would be heavily tied to the implementation and is thus a bit questionable.


---

[GitHub] flink pull request #5959: [FLINK-9258][metrics] Thread-safe initialization o...

Posted by yuqi1129 <gi...@git.apache.org>.
Github user yuqi1129 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5959#discussion_r186603684
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/ComponentMetricGroup.java ---
    @@ -57,11 +57,12 @@ public ComponentMetricGroup(MetricRegistry registry, String[] scope, P parent) {
     		if (variables == null) { // avoid synchronization for common case
     			synchronized (this) {
     				if (variables == null) {
    -					variables = new HashMap<>();
    -					putVariables(variables);
    +					Map<String, String> tmpVariables = new HashMap<>();
    +					putVariables(tmpVariables);
     					if (parent != null) { // not true for Job-/TaskManagerMetricGroup
    -						variables.putAll(parent.getAllVariables());
    +						tmpVariables.putAll(parent.getAllVariables());
     					}
    +					variables = tmpVariables;
    --- End diff --
    
    Do not need any test that verify this change ?


---

[GitHub] flink issue #5959: [FLINK-9258][metrics] Thread-safe initialization of varia...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:

    https://github.com/apache/flink/pull/5959
  
    I have two quick questions:
    - Why does `ComponentMetricGroup` even override the method `getAllVariables` from `AbstractMetricGroup` with essentially the exact same code?
    -Why is it only fixed in `ComponentMetricGroup`? Could it make sense to fix it in `AbstractMetricGroup` and remove the overriding method in the subclass?


---

[GitHub] flink issue #5959: [FLINK-9258][metrics] Thread-safe initialization of varia...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:

    https://github.com/apache/flink/pull/5959
  
    LGTM 👍 


---

[GitHub] flink issue #5959: [FLINK-9258][metrics] Thread-safe initialization of varia...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/5959
  
    The methods have been identical since FLINK-7692, but I didn't catch it in the review. Thus, `ComponentMetricGroup#getAllVariables()` should be removed, along with `ComponentMetricGroup#putVariables()`, and the fix applied to `AbstractMetricGroup#getAllVariables()`.
    
    We try to construct as many things as possible lazily, to reduce the resource impact in case the metric system isn't used.


---