You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@geode.apache.org by anilkumar gingade <ag...@pivotal.io> on 2016/03/16 20:53:31 UTC

Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/
-----------------------------------------------------------

Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.


Repository: geode


Description
-------

GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.


Diffs
-----

  geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 

Diff: https://reviews.apache.org/r/44924/diff/


Testing
-------

Testing: 
The test used to fail once in 4 runs. With the change it run for 50 times.
precheckin


Thanks,

anilkumar gingade


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by anilkumar gingade <ag...@pivotal.io>.

> On March 17, 2016, 4:23 p.m., Hitesh Khamesra wrote:
> > geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java, line 2583
> > <https://reviews.apache.org/r/44924/diff/1/?file=1301746#file1301746line2583>
> >
> >     This function is in MessageDispatcher, which calls initializaTransient of DurableHARegionQueue. I think the issue here is "this._messageQueue.isPeekInitialized()" which seems not initialized for other thread. But intention of that code was to remove all ack events. if possible fix that part as "pausedLock" can be taken other thread first.

Looking at the code, the lock aquisation by new thread is less likely or not possible (the time delay between client disconnecting and reconnecting); by that time the dispatcher thread will already gotten hold of the pause lock...


- anilkumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review124041
-----------------------------------------------------------


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by Hitesh Khamesra <hk...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review124041
-----------------------------------------------------------




geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java (line 2582)
<https://reviews.apache.org/r/44924/#comment186393>

    This function is in MessageDispatcher, which calls initializaTransient of DurableHARegionQueue. I think the issue here is "this._messageQueue.isPeekInitialized()" which seems not initialized for other thread. But intention of that code was to remove all ack events. if possible fix that part as "pausedLock" can be taken other thread first.


- Hitesh Khamesra


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by Jason Huynh <hu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review124046
-----------------------------------------------------------


Ship it!




Interesting that we clear out the ack list when reconnecting with a client.  If we could find a better way (in the future) to remove ack'd events, it would simplify this quite a bit.

- Jason Huynh


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by Dan Smith <ds...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review123915
-----------------------------------------------------------


Ship it!




Ship It!

- Dan Smith


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by anilkumar gingade <ag...@pivotal.io>.

> On March 16, 2016, 9:51 p.m., Hitesh Khamesra wrote:
> > geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java, line 2583
> > <https://reviews.apache.org/r/44924/diff/1/?file=1301746#file1301746line2583>
> >
> >     Just quick question..I see before calling initializeTransiient it also tries to remove ack events...that doesn't help??
> >     
> >      protected void initializeTransients()
> >         {
> >             while (!this._messageQueue.isEmptyAckList()&& this._messageQueue.isPeekInitialized()) {
> >                 try {
> >                     this._messageQueue.remove();
> >                 }catch(InterruptedException e){
> >                     e.printStackTrace();
> >                 }
> >             }
> >           this._messageQueue.initializeTransients();
> >         }

No..It was getting into overrided DurableHARegionQueue.initializeTransients() where the acked events were getting cleared...I believe its all timing related, when the client is in pause state and connect happens immediately...and ack getting recieved...


- anilkumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review123938
-----------------------------------------------------------


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>


Re: Review Request 44924: GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.

Posted by Hitesh Khamesra <hk...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44924/#review123938
-----------------------------------------------------------




geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java (line 2582)
<https://reviews.apache.org/r/44924/#comment186266>

    Just quick question..I see before calling initializeTransiient it also tries to remove ack events...that doesn't help??
    
     protected void initializeTransients()
        {
            while (!this._messageQueue.isEmptyAckList()&& this._messageQueue.isPeekInitialized()) {
                try {
                    this._messageQueue.remove();
                }catch(InterruptedException e){
                    e.printStackTrace();
                }
            }
          this._messageQueue.initializeTransients();
        }


- Hitesh Khamesra


On March 16, 2016, 7:53 p.m., anilkumar gingade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44924/
> -----------------------------------------------------------
> 
> (Updated March 16, 2016, 7:53 p.m.)
> 
> 
> Review request for geode, anilkumar gingade, Barry Oglesby, Jason Huynh, nabarun nag, Dan Smith, and xiaojian zhou.
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> GEODE-934: CI failure: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyWithCacheXML.
> 
> In this test the client sees previously delivered events, when it disconnects and reconnects immediately. 
> This happens when the message dispatcher is in the process of removing the messages while its in the pause state; and client reconnects at the same time. When client reconnects, it clears out the acked-event list, this causes the message dispatcher thread not to remove dispatched event from the queue. The fix is to take the pause lock and start removing the dispatched event, this blocks the client reconnect process to wait tille the messages are removed. Also removed the sleep call, that was not needed with the lock.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/com/gemstone/gemfire/internal/cache/tier/sockets/CacheClientProxy.java f650fee 
> 
> Diff: https://reviews.apache.org/r/44924/diff/
> 
> 
> Testing
> -------
> 
> Testing: 
> The test used to fail once in 4 runs. With the change it run for 50 times.
> precheckin
> 
> 
> Thanks,
> 
> anilkumar gingade
> 
>