You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by revans2 <gi...@git.apache.org> on 2018/04/04 20:27:09 UTC

[GitHub] storm pull request #2622: STORM-3020: fix possible race condition in AsyncLo...

GitHub user revans2 opened a pull request:

    https://github.com/apache/storm/pull/2622

    STORM-3020: fix possible race condition in AsyncLocalizer

    There were a number of places in AsyncLocalizer that were using synchronized to try and protect some maps.  When we added in support for restarting a worker if specific blobs change one of the places that was protected before moved to a background thread and lost that protection.
    
    Now that we are java 8 we can use lambdas to provide the same protection without the need for the locks so I removed all of the course grained locking in AsyncLocalizer.  Now the only locking is on a per-blob basis.
    
    
    
    I tested this manually by launching topologies with lots of different types of blobs, modified the blobs and verified that everything worked correctly.  I shot things, including the supervisor, various times and verified that it was able to recover in each case.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/incubator-storm STORM-3020

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2622.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2622
    
----
commit 337aef8f1291aba0ab228f6e9e0800c19b8c5ceb
Author: Robert (Bobby) Evans <ev...@...>
Date:   2018-04-03T19:58:14Z

    STORM-3020: fix possible race condition in AsyncLocalizer

----


---

[GitHub] storm pull request #2622: STORM-3020: fix possible race condition in AsyncLo...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2622


---

[GitHub] storm issue #2622: STORM-3020: fix possible race condition in AsyncLocalizer

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2622
  
    Thanks for the review @danny0405 
    
    This was not trying to fix STORM-2905.  This was a separate race condition I found when reviewing your pull request for STORM-2905.


---

[GitHub] storm issue #2622: STORM-3020: fix possible race condition in AsyncLocalizer

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2622
  
    @HeartSaVioR 
    
    Great catch I forgot to update the normal has Map/HashMap to a ConcurrentHashMap.  Yes the guarantees of ConcurrentMap allow for retry and we do have side effects in some of the computes.
    
    I will update the comments as well to say what requirements we have for the type.  The computeIfAbsent methods have no side effects, so we don't need to worry about them as much.  I'll see if I can come up with a way to make it so we don't need as strong of a guarantee for `compute` so perhaps we could use other implementations.
    



---

[GitHub] storm issue #2622: STORM-3020: fix possible race condition in AsyncLocalizer

Posted by danny0405 <gi...@git.apache.org>.
Github user danny0405 commented on the issue:

    https://github.com/apache/storm/pull/2622
  
    I test it, and it worked ok as before, but a KeyNotFoundException still got of master when finally cleaned the base blobs.


---

[GitHub] storm issue #2622: STORM-3020: fix possible race condition in AsyncLocalizer

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/2622
  
    @revans2 
    Could you elaborate how protection works?
    
    At the first time I thought it leverages atomicity of compute*, but reading description of compute* in Map, looks like guaranteeing atomicity is not forced to implementations.
    
    https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#compute-K-java.util.function.BiFunction-
    
    and HashMap doesn't mention about atomicity (`topologyBlobs` is a HashMap):
    
    https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#compute-K-java.util.function.BiFunction-
    
    whereas ConcurrentHashMap mentions about atomicity so at least this guarantees atomicity:
    
    https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#compute-K-java.util.function.BiFunction-
    
    It may worth noting that implementation requirement in ConcurrentMap looks like a bit different from what ConcurrentHashMap provides. `The default implementation may retry these steps when multiple threads attempt updates including potentially calling the remapping function multiple times.` It doesn't mention locking but retrying.
    
    https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentMap.html#compute-K-java.util.function.BiFunction-


---

[GitHub] storm issue #2622: STORM-3020: fix possible race condition in AsyncLocalizer

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2622
  
    @HeartSaVioR thanks for the review.  I fixed your concerns by making them all ConcurrentHashMaps and adding a note about why they need to be that.  I could not find a good way to remove the sideeffects.


---