You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by pczb <gi...@git.apache.org> on 2018/05/10 06:31:10 UTC

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

GitHub user pczb opened a pull request:

    https://github.com/apache/storm/pull/2669

    [STORM-3055] remove conext connection cache

    workerState has already cache connection use supervisor_id + port
    and if context and worker both cache connection, there can be some problem describe in jira
    jira link: https://issues.apache.org/jira/browse/STORM-3055

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pczb/storm dev

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2669.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2669
    
----
commit 62f66cd3f1e39953391f36e2b01197a6bade4e5c
Author: pczb <pc...@...>
Date:   2018-05-09T17:41:18Z

    remove context connection
    
    Signed-off-by: pczb <pc...@163.com>

----


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    Once you make the corresponding changes on #2661 I would be happy to merge them both in.


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    LGTM, thanks for addressing comments


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by pczb <gi...@git.apache.org>.
Github user pczb commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    @srdo @revans2 thanks for your review, i just squash all commit in one


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2669


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188377925
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/IContext.java ---
    @@ -38,6 +38,7 @@
     
         /**
          * This method establishes a server side connection
    +     * implementation should return a new connection every call
    --- End diff --
    
    Bind doesn't always return a new connection, the local Context will return a cached one if possible. Consider changing to something like "This method returns a server side connection. If one does not exist for the given ID and port, a new one will be established."


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    As far as I can tell all connections are still being shut down with this change. The connections cached by WorkerState are closed during worker shutdown in https://github.com/apache/storm/blob/14b0b4fc5e0945456769fd58a3595188e3dea234/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java#L449, and the context is shut down a few lines further down. 
    
    The WorkerState.refreshConnections method also makes sure to never create a connection for a NodeInfo that is already present, so I don't think we're leaking connections there that would need to be picked up by context.term. 
    
    Would like to see the cleanup @revans2 mentioned, as well as a change in the IContext docs so it's specified that the created connections are new.


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by pczb <gi...@git.apache.org>.
Github user pczb commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188355473
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/netty/Client.java ---
    @@ -451,7 +451,6 @@ public int getPort() {
         public void close() {
             if (!closing) {
                 LOG.info("closing Netty Client {}", dstAddressPrefixedName);
    -            context.removeClient(dstHost, dstAddress.getPort());
    --- End diff --
    
    done


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    The original code added this so when shutting down the context we could be sure that all ongoing connections were also terminated as cleanly as possible.  After this change that is only true for the server side and not the client side.
    
    What are the ramifications of that?  I think it was just defensive programming so it is probably fine to make this change from that perspective, but I would to make sure that my understanding of the problem matches yours.
    
    From the comments in STORM-3055 the error appears to be triggered when a supervisor is shut down, wiped clean (so it gets a new supervisor id), and then brought back up.  At that point nimbus schedules the worker on the same node/slot as before, but with a new supervisor ID.  This confuses the connection caching because when updating the connections it gets a list of connections to shut down and a separate list of connections to create.  The new connections are created, but in this case a new one is not created because we already have it open.  Then the old unneeded connections are torn down, but in this case the connection is needed.
    
    Looking at the javadocs for IContext. It looks like the caching really does violate the spec, but it is a bit of a gray area.
    
    https://github.com/apache/storm/blob/53f38bc31f2fd315a520ba6b86c0a60be08381cc/storm-client/src/jvm/org/apache/storm/messaging/IContext.java#L48-L57
    
    I am fine with removing the caching like this JIRA does, but I do want to see the code cleaned up, because without the caching there is a lot of extra code that can be removed.


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188056330
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/netty/Context.java ---
    @@ -65,20 +65,8 @@ public synchronized IConnection bind(String storm_id, int port) {
          * establish a connection to a remote server
          */
         public synchronized IConnection connect(String storm_id, String host, int port, AtomicBoolean[] remoteBpStatus) {
    -        IConnection connection = connections.get(key(host, port));
    --- End diff --
    
    Please rename connections to `serverConnections` as it is only for Servers now.  Also it would be nice to change it back to a list or a set instead of a map, because we will no longer be using it as a cache, so we don't need to pull out a specific value any longer.  We just want it so we can close them on termination.


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188055726
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/netty/Context.java ---
    @@ -65,20 +65,8 @@ public synchronized IConnection bind(String storm_id, int port) {
          * establish a connection to a remote server
          */
         public synchronized IConnection connect(String storm_id, String host, int port, AtomicBoolean[] remoteBpStatus) {
    --- End diff --
    
    This no longer needs to be synchronized.  That was used to protect connections, and is no longer needed if all we do is create a new Client.


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by pczb <gi...@git.apache.org>.
Github user pczb commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    @srdo add doc specification, i was wonder if we need remove synchronized when bind port
    @revans2  just as srdo says the connection will be closed by the worker before worker shutdown.
    i will squash all commit after review


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by pczb <gi...@git.apache.org>.
Github user pczb commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188666612
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/IContext.java ---
    @@ -38,6 +38,7 @@
     
         /**
          * This method establishes a server side connection
    +     * implementation should return a new connection every call
    --- End diff --
    
    remove bind specification,  since the new and old netty context never cache server connections, server connections is add to array or map just for cleanup 


---

[GitHub] storm pull request #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2669#discussion_r188055495
  
    --- Diff: storm-client/src/jvm/org/apache/storm/messaging/netty/Client.java ---
    @@ -451,7 +451,6 @@ public int getPort() {
         public void close() {
             if (!closing) {
                 LOG.info("closing Netty Client {}", dstAddressPrefixedName);
    -            context.removeClient(dstHost, dstAddress.getPort());
    --- End diff --
    
    This line is the only reason to pass context into the Client.  Please remove all references to context if we are going to remove this.


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by pczb <gi...@git.apache.org>.
Github user pczb commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    @revans2 could you help review this, i see your pull request #2436 is related to connection cache . and the travis fail is not related to this pr.


---

[GitHub] storm issue #2669: [STORM-3055] remove conext connection cache

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/2669
  
    Actually we just hit this in production so thanks again @pczb for finding and fixing this.


---