Posted to user@zookeeper.apache.org by Michi Mutsuzaki <mi...@cs.stanford.edu> on 2014/07/09 05:20:02 UTC

quorum connection manager shutdown takes long time

Hi,

I'm using ZooKeeper 3.4.5 (over IPSec!), and I saw a case where the
quorum connection manager takes a long time to shut down. It looks
like one of the receiver threads didn't exit for ~14 minutes.

https://paste.apache.org/2wFN?action=download

The tickTime is set to 3000 and initLimit is set to 5, so readInt()
should have gotten a socket timeout exception after 15 seconds.
Instead, it got an EOF exception after 14 minutes. I didn't get a
chance to take a thread dump when this happened, but has anybody seen
something similar?
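
To make the expectation above concrete, here is a minimal sketch of the
arithmetic (tickTime * initLimit = 15000 ms) and of a blocking readInt()
that ends in a SocketTimeoutException when a non-zero read timeout is set.
This is not ZooKeeper code: the class and method names are hypothetical, the
loopback server only stands in for a silent peer, and a short 200 ms timeout
replaces the 15 seconds so the demo finishes quickly.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of ZooKeeper.
class ReadTimeoutSketch {
    // The bound the message expects: tickTime * initLimit, in milliseconds.
    static int expectedTimeoutMs(int tickTime, int initLimit) {
        return tickTime * initLimit;
    }

    // Returns true if readInt() on a silent peer fails with SocketTimeoutException.
    // soTimeoutMs must be > 0 here; 0 means "no timeout" and would block forever.
    static boolean readTimesOut(int soTimeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            // The peer stays connected but never writes anything.
            client.setSoTimeout(soTimeoutMs);
            DataInputStream in = new DataInputStream(client.getInputStream());
            try {
                in.readInt();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("expected timeout (ms): " + expectedTimeoutMs(3000, 5));
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

With a positive SO_TIMEOUT the read gives up on its own, which is the
behavior the original message assumed for the receiver thread.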

Re: quorum connection manager shutdown takes long time

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
I heard that commenting out that line did fix the problem. I'll open a
JIRA for this.

On Thu, Jul 10, 2014 at 11:04 AM, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
> I haven't had time to try this yet. I'll let you guys know once I have
> the result.
>
> On Thu, Jul 10, 2014 at 11:02 AM, Raúl Gutiérrez Segalés
> <rg...@itevenworks.net> wrote:
>> On 9 July 2014 08:28, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
>>>
>>> I don't know how I missed that :) QA said this is reproducible, so
>>> I'll try commenting this line out. Thanks Flavio!
>>
>>
>> I am curious, was it that?
>>
>>
>> -rgs

Re: quorum connection manager shutdown takes long time

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
I haven't had time to try this yet. I'll let you guys know once I have
the result.

On Thu, Jul 10, 2014 at 11:02 AM, Raúl Gutiérrez Segalés
<rg...@itevenworks.net> wrote:
> On 9 July 2014 08:28, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
>>
>> I don't know how I missed that :) QA said this is reproducible, so
>> I'll try commenting this line out. Thanks Flavio!
>
>
> I am curious, was it that?
>
>
> -rgs

Re: quorum connection manager shutdown takes long time

Posted by Raúl Gutiérrez Segalés <rg...@itevenworks.net>.
On 9 July 2014 08:28, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:

> I don't know how I missed that :) QA said this is reproducible, so
> I'll try commenting this line out. Thanks Flavio!
>

I am curious, was it that?


-rgs

Re: quorum connection manager shutdown takes long time

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
I don't know how I missed that :) QA said this is reproducible, so
I'll try commenting this line out. Thanks Flavio!

On Wed, Jul 9, 2014 at 6:18 AM, Flavio Junqueira <fp...@yahoo.com> wrote:
> I wonder if this is the culprit:
>
> // OK to wait until socket disconnects while reading.
>                 sock.setSoTimeout(0);
>
>
> On Wednesday, July 9, 2014 5:55 AM, Michi Mutsuzaki <mi...@cs.stanford.edu>
> wrote:
>
>
>
> Hi,
>
> I'm using ZooKeeper 3.4.5 (over IPSec!), and I saw a case where the
> quorum connection manager takes a long time to shut down. It looks
> like one of the receiver threads didn't exit for ~14 minutes.
>
> https://paste.apache.org/2wFN?action=download
>
> The tickTime is set to 3000 and initLimit is set to 5, so readInt()
> should have gotten a socket timeout exception after 15 seconds.
> Instead, it got an EOF exception after 14 minutes. I didn't get a
> chance to take a thread dump when this happened, but has anybody seen
> something similar?
>
>

Re: quorum connection manager shutdown takes long time

Posted by Flavio Junqueira <fp...@yahoo.com.INVALID>.
I wonder if this is the culprit:

// OK to wait until socket disconnects while reading.
                sock.setSoTimeout(0);
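
For readers unfamiliar with the call above: in java.net.Socket, a timeout of
0 means reads block indefinitely. The sketch below illustrates that effect
under simplified assumptions; it is not the ZooKeeper receiver code. The
class and method names are hypothetical, and a loopback peer that closes
after a short delay stands in for the remote server. With setSoTimeout(0),
readInt() returns only once the peer closes, and it ends in an EOFException
rather than a SocketTimeoutException.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of ZooKeeper.
class InfiniteTimeoutSketch {
    // Blocks in readInt() with no read timeout and reports which exception ended the read.
    static String readUntilPeerCloses(long peerCloseDelayMs) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            Socket peer = server.accept();
            // The peer sends nothing and closes its end after the delay.
            Thread closer = new Thread(() -> {
                try {
                    Thread.sleep(peerCloseDelayMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                try {
                    peer.close();
                } catch (IOException ignored) {
                    // Nothing useful to do in this demo.
                }
            });
            closer.start();

            client.setSoTimeout(0); // 0 = wait forever; only a disconnect ends the read
            DataInputStream in = new DataInputStream(client.getInputStream());
            try {
                in.readInt();
                return "none";
            } catch (EOFException e) {
                return "EOFException";
            } catch (SocketTimeoutException e) {
                return "SocketTimeoutException";
            } finally {
                closer.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readUntilPeerCloses(300));
    }
}
```

If the peer's close is delayed or never reaches the reader, a thread blocked
this way has no timeout of its own to fall back on, which would be consistent
with the ~14 minute wait described in the original message.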


On Wednesday, July 9, 2014 5:55 AM, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
 

>
>
>Hi,
>
>I'm using ZooKeeper 3.4.5 (over IPSec!), and I saw a case where the
>quorum connection manager takes a long time to shut down. It looks
>like one of the receiver threads didn't exit for ~14 minutes.
>
>https://paste.apache.org/2wFN?action=download
>
>The tickTime is set to 3000 and initLimit is set to 5, so readInt()
>should have gotten a socket timeout exception after 15 seconds.
>Instead, it got an EOF exception after 14 minutes. I didn't get a
>chance to take a thread dump when this happened, but has anybody seen
>something similar?
>
>
>