You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mina.apache.org by Buddy Butterfly <bu...@web.de> on 2016/02/26 12:07:35 UTC

We still experience DIRMINA-1006 despite using 2.0.13

Hi,

we still experience isse DIRMINA-1006. We have an RFID device that
rejects connections when connection limit is reached. Our mina client
thread then goes into 100% cpu usage. Any idea if the patch maybe
did not cover the all situations?

Cheers,
Buddy





Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
Am 26.02.2016 um 13:03 schrieb Emmanuel Lécharny:
> Le 26/02/16 12:07, Buddy Butterfly a écrit :
>> Hi,
>>
>> we still experience isse DIRMINA-1006. We have an RFID device that
>> rejects connections when connection limit is reached. Our mina client
>> thread then goes into 100% cpu usage. Any idea if the patch maybe
>> did not cover the all situations?
>>
>> Cheers,
>> Buddy
>>
>>
>>
>>
> Cr*p...
> 
> Can you reopen the issue, and attach a stacktrace (kill -3) when you
> observe a 100% CPU ?
> 

We are on windows. I will try a thread dump when it happens again.

Tnx a lot,
Buddy



Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
Hi,

have you been able to find a cause for that?
Yes, we could test a nightly!

Tnx,
Buddy

Am 29.02.2016 um 19:46 schrieb Emmanuel Lécharny:
> Le 28/02/16 18:00, Buddy Butterfly a écrit :
>> Hi,
>>
>> if I understand it correct, you are telling me that there will be no
>> sultion for it? 
> 
> Did I say that ???
> 
>> Or did I get it wrong.
> Probably... I'm just trying to see what could be the root cause.
> 
>> This problem strikes
>> us heavily! Project responsible is over the top :-(.
>>
>> We saw 4 lines of this type:
>>
>> 	Line 41771: 2016-02-28 01:52:01,147 [   NioProcessor-2358] WARN
>> ing.AbstractPollingIoProcessor$Processor                  run - Create a
>> new selector. Selected is 0, delta = 0
>>
>> 	Line 71754: 2016-02-28 04:27:14,918 [   NioProcessor-3238] WARN
>> ing.AbstractPollingIoProcessor$Processor                  run - Create a
>> new selector. Selected is 0, delta = 0
>>
>> 	Line 74744: 2016-02-28 06:52:45,293 [   NioProcessor-4054] WARN
>> ing.AbstractPollingIoProcessor$Processor                  run - Create a
>> new selector. Selected is 0, delta = 0
>>
>> 	Line 87830: 2016-02-28 17:24:35,806 [   NioProcessor-7346] WARN
>> ing.AbstractPollingIoProcessor$Processor                  run - Create a
>> new selector. Selected is 0, delta = 0
> 
> So clearly something else is wrong. If we were looping because the
> select() exits immediately with no socket being active, we would have
> way more of such messages. Here, it happens only from time to time.
> 
> Now, reading teh various links you have provided, I wonder if the
> problem is not in the JDK code itself...
> 
> I'll investigate more tonite.
> 
> Would you be ready to test a nightly build if I provide one ?
> 
> Thanks !
> 
> 



Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 28/02/16 18:00, Buddy Butterfly a écrit :
> Hi,
>
> if I understand it correct, you are telling me that there will be no
> sultion for it? 

Did I say that ???

> Or did I get it wrong.
Probably... I'm just trying to see what could be the root cause.

> This problem strikes
> us heavily! Project responsible is over the top :-(.
>
> We saw 4 lines of this type:
>
> 	Line 41771: 2016-02-28 01:52:01,147 [   NioProcessor-2358] WARN
> ing.AbstractPollingIoProcessor$Processor                  run - Create a
> new selector. Selected is 0, delta = 0
>
> 	Line 71754: 2016-02-28 04:27:14,918 [   NioProcessor-3238] WARN
> ing.AbstractPollingIoProcessor$Processor                  run - Create a
> new selector. Selected is 0, delta = 0
>
> 	Line 74744: 2016-02-28 06:52:45,293 [   NioProcessor-4054] WARN
> ing.AbstractPollingIoProcessor$Processor                  run - Create a
> new selector. Selected is 0, delta = 0
>
> 	Line 87830: 2016-02-28 17:24:35,806 [   NioProcessor-7346] WARN
> ing.AbstractPollingIoProcessor$Processor                  run - Create a
> new selector. Selected is 0, delta = 0

So clearly something else is wrong. If we were looping because the
select() exits immediately with no socket being active, we would have
way more of such messages. Here, it happens only from time to time.

Now, reading teh various links you have provided, I wonder if the
problem is not in the JDK code itself...

I'll investigate more tonite.

Would you be ready to test a nightly build if I provide one ?

Thanks !


Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
Hi,

if I understand it correct, you are telling me that there will be no
sultion for it? Or did I get it wrong. I am lost. This problem strikes
us heavily! Project responsible is over the top :-(.

We saw 4 lines of this type:

	Line 41771: 2016-02-28 01:52:01,147 [   NioProcessor-2358] WARN
ing.AbstractPollingIoProcessor$Processor                  run - Create a
new selector. Selected is 0, delta = 0

	Line 71754: 2016-02-28 04:27:14,918 [   NioProcessor-3238] WARN
ing.AbstractPollingIoProcessor$Processor                  run - Create a
new selector. Selected is 0, delta = 0

	Line 74744: 2016-02-28 06:52:45,293 [   NioProcessor-4054] WARN
ing.AbstractPollingIoProcessor$Processor                  run - Create a
new selector. Selected is 0, delta = 0

	Line 87830: 2016-02-28 17:24:35,806 [   NioProcessor-7346] WARN
ing.AbstractPollingIoProcessor$Processor                  run - Create a
new selector. Selected is 0, delta = 0


What would be a workaround?

Tnx a lot!
Buddy


Am 28.02.2016 um 11:33 schrieb Emmanuel Lécharny:
> Le 27/02/16 18:08, Buddy Butterfly a écrit :
>> Obviously netty fixed this with a workaround.
> 
> Thanks for the interesting analysis and pointers.
> 
> It seems that in some case, we are still looping on the select().
> 
> Can you do one more thing to confirm that it's teh real problem ? Set
> the log to WARN, you should get messages like :
> 
>     Create a new selector. Selected is 0, delta = XXX
> 
> This would be the signal that we detected a stalled selector (ie a
> selector that return immediately on select(), even if no event has
> arrived, leading to a fast loop eating all the CPU). In thsi case, we
> recreate a new selector. In your case, and accordingly to the bug
> report, there might be a case where creating a new selector does not
> solve the issue.
> 
> Many thanks !
> 
> 



Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 27/02/16 18:08, Buddy Butterfly a écrit :
> Obviously netty fixed this with a workaround.

Thanks for the interesting analysis and pointers.

It seems that in some case, we are still looping on the select().

Can you do one more thing to confirm that it's teh real problem ? Set
the log to WARN, you should get messages like :

    Create a new selector. Selected is 0, delta = XXX

This would be the signal that we detected a stalled selector (ie a
selector that return immediately on select(), even if no event has
arrived, leading to a fast loop eating all the CPU). In thsi case, we
recreate a new selector. In your case, and accordingly to the bug
report, there might be a case where creating a new selector does not
solve the issue.

Many thanks !


Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
Obviously netty fixed this with a workaround.

Am 27.02.2016 um 18:06 schrieb Buddy Butterfly:
> 
> This bug from netty seesm to be related:
> 
> https://github.com/netty/netty/issues/3857
> 
> Am 27.02.2016 um 17:38 schrieb Buddy Butterfly:
>> Am 26.02.2016 um 13:03 schrieb Emmanuel Lécharny:
>>> Le 26/02/16 12:07, Buddy Butterfly a écrit :
>>>> Hi,
>>>>
>>>> we still experience isse DIRMINA-1006. We have an RFID device that
>>>> rejects connections when connection limit is reached. Our mina client
>>>> thread then goes into 100% cpu usage. Any idea if the patch maybe
>>>> did not cover the all situations?
>>>>
>>>> Cheers,
>>>> Buddy
>>>>
>>>>
>>>>
>>>>
>>> Cr*p...
>>>
>>> Can you reopen the issue, and attach a stacktrace (kill -3) when you
>>> observe a 100% CPU ?
>>>
>>
>> "NioProcessor-22" - Thread t@2977
>>    java.lang.Thread.State: RUNNABLE
>> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
>> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
>> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
>> 	at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
>> 	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
>> 	- locked <6ad192> (a sun.nio.ch.Util$2)
>> 	- locked <e4f17d> (a java.util.Collections$UnmodifiableSet)
>> 	- locked <aa5042> (a sun.nio.ch.WindowsSelectorImpl)
>> 	at sun.nio.ch.SelectorImpl.select(Unknown Source)
>> 	at
>> org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>> 	at
>> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1059)
>> 	at
>> org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>> 	at java.lang.Thread.run(Unknown Source)
>>
>> Any Idea?
>>
>> Cheers,
>> Buddy
>>
>>
>>
> 
> 



Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
This bug from netty seesm to be related:

https://github.com/netty/netty/issues/3857

Am 27.02.2016 um 17:38 schrieb Buddy Butterfly:
> Am 26.02.2016 um 13:03 schrieb Emmanuel Lécharny:
>> Le 26/02/16 12:07, Buddy Butterfly a écrit :
>>> Hi,
>>>
>>> we still experience isse DIRMINA-1006. We have an RFID device that
>>> rejects connections when connection limit is reached. Our mina client
>>> thread then goes into 100% cpu usage. Any idea if the patch maybe
>>> did not cover the all situations?
>>>
>>> Cheers,
>>> Buddy
>>>
>>>
>>>
>>>
>> Cr*p...
>>
>> Can you reopen the issue, and attach a stacktrace (kill -3) when you
>> observe a 100% CPU ?
>>
> 
> "NioProcessor-22" - Thread t@2977
>    java.lang.Thread.State: RUNNABLE
> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
> 	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
> 	at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
> 	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
> 	- locked <6ad192> (a sun.nio.ch.Util$2)
> 	- locked <e4f17d> (a java.util.Collections$UnmodifiableSet)
> 	- locked <aa5042> (a sun.nio.ch.WindowsSelectorImpl)
> 	at sun.nio.ch.SelectorImpl.select(Unknown Source)
> 	at
> org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
> 	at
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1059)
> 	at
> org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> 	at java.lang.Thread.run(Unknown Source)
> 
> Any Idea?
> 
> Cheers,
> Buddy
> 
> 
> 


Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Buddy Butterfly <bu...@web.de>.
Am 26.02.2016 um 13:03 schrieb Emmanuel Lécharny:
> Le 26/02/16 12:07, Buddy Butterfly a écrit :
>> Hi,
>>
>> we still experience isse DIRMINA-1006. We have an RFID device that
>> rejects connections when connection limit is reached. Our mina client
>> thread then goes into 100% cpu usage. Any idea if the patch maybe
>> did not cover the all situations?
>>
>> Cheers,
>> Buddy
>>
>>
>>
>>
> Cr*p...
> 
> Can you reopen the issue, and attach a stacktrace (kill -3) when you
> observe a 100% CPU ?
> 

"NioProcessor-22" - Thread t@2977
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
	- locked <6ad192> (a sun.nio.ch.Util$2)
	- locked <e4f17d> (a java.util.Collections$UnmodifiableSet)
	- locked <aa5042> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(Unknown Source)
	at
org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
	at
org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1059)
	at
org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

Any Idea?

Cheers,
Buddy



Re: We still experience DIRMINA-1006 despite using 2.0.13

Posted by Emmanuel Lécharny <el...@gmail.com>.
Le 26/02/16 12:07, Buddy Butterfly a écrit :
> Hi,
>
> we still experience isse DIRMINA-1006. We have an RFID device that
> rejects connections when connection limit is reached. Our mina client
> thread then goes into 100% cpu usage. Any idea if the patch maybe
> did not cover the all situations?
>
> Cheers,
> Buddy
>
>
>
>
Cr*p...

Can you reopen the issue, and attach a stacktrace (kill -3) when you
observe a 100% CPU ?