You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@curator.apache.org by Randgalt <gi...@git.apache.org> on 2017/11/22 02:15:34 UTC

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

GitHub user Randgalt opened a pull request:

    https://github.com/apache/curator/pull/242

    [CURATOR-443] CuratorFrameworkImpl may sleep in foreground even if inBackground is called

    To avoid massive spinning, background operations are paused for 1 second when there is no connection. However, this can hurt performance terribly if background operations are queued, for example, prior to initial connection. This changes the behavior so that the sleeps are cleared when the connection is re-established. A separate queue of "forced sleep" operations are kept while the connection is down. This queue then gets its sleep cleared when the connection is re-established.
    
    Reviewers - please verify this logic. I think I have it right. I'm only concerned that I might be leaking operations into `forcedSleepOperations` that never get cleared.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/curator CURATOR-443

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/curator/pull/242.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #242
    
----
commit bfdb790bc69022b0eef74558e9511e6ed2b665e9
Author: randgalt <ra...@apache.org>
Date:   2017-11-22T02:11:01Z

    To avoid massive spinning, background operations are paused for 1 second when there is no connection. However, this can hurt performance terribly if background operations are queued, for example, prior to initial connection. This changes the behavior so that the sleeps are cleared when the connection is re-established. A separate queue of "forced sleep" operations are kept while the connection is down. This queue then gets its sleep cleared when the connection is re-established.

----


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153252423
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -640,12 +646,14 @@ boolean useContainerParentsIfAvailable()
             }
         }
     
    -    <DATA_TYPE> void queueOperation(OperationAndData<DATA_TYPE> operationAndData)
    +    <DATA_TYPE> boolean queueOperation(OperationAndData<DATA_TYPE> operationAndData)
    --- End diff --
    
    Please add a comment explaining how to handle the return value -- there are a lot of callers that have not been updated to check the new return value. Looks like true iff the operation was queued?


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153252731
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    +
    +    private void sleepAndQueueOperation(OperationAndData<?> operationAndData) throws InterruptedException
    +    {
    +        operationAndData.sleepFor(sleepAndQueueOperationSeconds, TimeUnit.SECONDS);
    +        if ( queueOperation(operationAndData) )
    --- End diff --
    
    If this returns false, then we just drop the operation?


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153341281
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    +
    +    private void sleepAndQueueOperation(OperationAndData<?> operationAndData) throws InterruptedException
    +    {
    +        operationAndData.sleepFor(sleepAndQueueOperationSeconds, TimeUnit.SECONDS);
    +        if ( queueOperation(operationAndData) )
    +        {
    +            forcedSleepOperations.add(operationAndData);
    +        }
    +    }
    +
    +    private void unSleepBackgroundOperations()
    +    {
    +        Collection<OperationAndData<?>> drain = new ArrayList<>(forcedSleepOperations.size());
    +        forcedSleepOperations.drainTo(drain);
    +        log.debug("Clearing sleep for {} operations", drain.size());
    +        for ( OperationAndData<?> operation : drain )
    +        {
    +            operation.clearSleep();
    +            if ( backgroundOperations.remove(operation) )
    --- End diff --
    
    Ah, I'm with you now. Can you add a method comment that explains the nuance of the implementation here?
    
    If we're really concerned about ordering, then we could update clearSleep() to set expiration time to the current time instead of setting it to zero.


---

[GitHub] curator issue #242: [CURATOR-443] CuratorFrameworkImpl may sleep in foregrou...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on the issue:

    https://github.com/apache/curator/pull/242
  
    I added another change to remove slept operations from the DelayQueue and re-add to force a resort. I'm concerned, though, that this might affect operation order. In fact, I'm concerned that old sleep that was added might affect operation order. More testing in that area is needed.


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r155086416
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/OperationAndData.java ---
    @@ -115,6 +115,11 @@ BackgroundCallback getCallback()
             return operation;
         }
     
    +    void clearSleep()
    +    {
    +        sleepUntilTimeMs.set(0);
    --- End diff --
    
    But 0 is fine and very clear


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r155071186
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    +
    +    private void sleepAndQueueOperation(OperationAndData<?> operationAndData) throws InterruptedException
    +    {
    +        operationAndData.sleepFor(sleepAndQueueOperationSeconds, TimeUnit.SECONDS);
    +        if ( queueOperation(operationAndData) )
    --- End diff --
    
    Yes, the only reason it wouldn't be queued is that the handle is closing, etc. `false` is the indication that no further operations are needed.


---

[GitHub] curator issue #242: [CURATOR-443] CuratorFrameworkImpl may sleep in foregrou...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on the issue:

    https://github.com/apache/curator/pull/242
  
    👍 


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153337468
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    +
    +    private void sleepAndQueueOperation(OperationAndData<?> operationAndData) throws InterruptedException
    +    {
    +        operationAndData.sleepFor(sleepAndQueueOperationSeconds, TimeUnit.SECONDS);
    +        if ( queueOperation(operationAndData) )
    +        {
    +            forcedSleepOperations.add(operationAndData);
    +        }
    +    }
    +
    +    private void unSleepBackgroundOperations()
    +    {
    +        Collection<OperationAndData<?>> drain = new ArrayList<>(forcedSleepOperations.size());
    +        forcedSleepOperations.drainTo(drain);
    +        log.debug("Clearing sleep for {} operations", drain.size());
    +        for ( OperationAndData<?> operation : drain )
    +        {
    +            operation.clearSleep();
    +            if ( backgroundOperations.remove(operation) )
    --- End diff --
    
    From my read of DelayQueue and research on the net it's not enough. We need the DelayQueue to re-do the sort and it only does this on item insert.


---

[GitHub] curator issue #242: [CURATOR-443] CuratorFrameworkImpl may sleep in foregrou...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on the issue:

    https://github.com/apache/curator/pull/242
  
    @madrob you OK with merging this now? 


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/curator/pull/242


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r155074746
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/OperationAndData.java ---
    @@ -115,6 +115,11 @@ BackgroundCallback getCallback()
             return operation;
         }
     
    +    void clearSleep()
    +    {
    +        sleepUntilTimeMs.set(0);
    --- End diff --
    
    I think this should be set to the time to current time (equivalent to `sleepFor(0)`), rather than blowing it away and setting to zero.


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by Randgalt <gi...@git.apache.org>.
Github user Randgalt commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153336927
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    --- End diff --
    
    I've been moving away from AtomicLong when compareAndSet ops aren't needed. Those wrappers ended up creating unnecessary garbage in the heap.


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153252872
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    +
    +    private void sleepAndQueueOperation(OperationAndData<?> operationAndData) throws InterruptedException
    +    {
    +        operationAndData.sleepFor(sleepAndQueueOperationSeconds, TimeUnit.SECONDS);
    +        if ( queueOperation(operationAndData) )
    +        {
    +            forcedSleepOperations.add(operationAndData);
    +        }
    +    }
    +
    +    private void unSleepBackgroundOperations()
    +    {
    +        Collection<OperationAndData<?>> drain = new ArrayList<>(forcedSleepOperations.size());
    +        forcedSleepOperations.drainTo(drain);
    +        log.debug("Clearing sleep for {} operations", drain.size());
    +        for ( OperationAndData<?> operation : drain )
    +        {
    +            operation.clearSleep();
    +            if ( backgroundOperations.remove(operation) )
    --- End diff --
    
    Why do we need to reinsert all of the operations? Isn't clearing the sleep enough?


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153341026
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    --- End diff --
    
    That's reasonable, thanks.


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153253397
  
    --- Diff: curator-framework/src/test/java/org/apache/curator/framework/imps/TestFrameworkEdges.java ---
    @@ -60,6 +60,37 @@
     {
         private final Timing2 timing = new Timing2();
     
    +    @Test
    +    public void testBackgroundLatencyUnSleep() throws Exception
    --- End diff --
    
    Could we test multiple operations and verify that after the first one fires the rest don't continue to sleep?


---

[GitHub] curator pull request #242: [CURATOR-443] CuratorFrameworkImpl may sleep in f...

Posted by madrob <gi...@git.apache.org>.
Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/242#discussion_r153243087
  
    --- Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java ---
    @@ -973,6 +981,33 @@ void performBackgroundOperation(OperationAndData<?> operationAndData)
             }
         }
     
    +    @VisibleForTesting
    +    volatile long sleepAndQueueOperationSeconds = 1;
    --- End diff --
    
    why volatile? did you mean to use an AtomicLong?


---