Posted to user@ignite.apache.org by 胡海麟 <h...@bulbit.jp> on 2018/06/24 16:40:51 UTC

Re-post: java.io.IOException: Too many open files

Hi,

I am re-posting this message because the logs I pasted did not come through the first time.

I have been getting repeated "Too many open files" exceptions for some time.
================================
[11:26:24,493][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Failed to process selector key [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker
[readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
finished=false, hashCode=1611196193, interrupted=false,
runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
inRecovery=null, outRecovery=null, super=GridNioSessionImpl
[locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
createTime=1529666783471, closeTime=0, bytesSent=5, bytesRcvd=1074,
bytesSent0=0, bytesRcvd0=0, sndSchedTime=1529666783481,
lastSndTime=1529666783481, lastRcvTime=1529666783481,
readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=GridTcpRestParser [marsh=JdkMarshaller
[clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
directMode=false]], accepted=true]]]
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1085)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2339)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2110)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
[11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Closing NIO session because of unhandled exception [cls=class
o.a.i.i.util.nio.GridNioException, msg=Connection reset by peer]
[11:26:24,493][WARNING][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Closed client session due to exception [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker
[readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0,
bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker
[name=grid-nio-worker-tcp-rest-1, igniteInstanceName=null,
finished=false, hashCode=1611196193, interrupted=false,
runner=grid-nio-worker-tcp-rest-1-#57]]], writeBuf=null, readBuf=null,
inRecovery=null, outRecovery=null, super=GridNioSessionImpl
[locAddr=/10.1.14.11:11211, rmtAddr=/10.1.252.184:40680,
createTime=1529666783471, closeTime=1529666784488, bytesSent=5,
bytesRcvd=1074, bytesSent0=0, bytesRcvd0=0,
sndSchedTime=1529666783481, lastSndTime=1529666783481,
lastRcvTime=1529666783481, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=GridTcpRestParser [marsh=JdkMarshaller
[clsFilter=o.a.i.i.IgniteKernal$5@331b0c4a], routerClient=false],
directMode=false]], accepted=true]], msg=Connection reset by peer]
[11:26:24,513][SEVERE][grid-nio-worker-tcp-rest-1-#57][GridTcpRestProtocol]
Caught unhandled exception in NIO worker thread (restart the node).
java.lang.NullPointerException
        at sun.nio.ch.EPollArrayWrapper.isEventsHighKilled(EPollArrayWrapper.java:174)
        at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:190)
        at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:239)
        at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:178)
        at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:132)
        at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:212)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.register(GridNioServer.java:2545)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1934)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1764)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
[11:26:30,277][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
to accept remote connection (will wait for 2000ms).
class org.apache.ignite.IgniteCheckedException: Failed to accept
connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
finished=false, hashCode=1020662787, interrupted=false,
runner=nio-acceptor-#55]
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
        ... 3 more
[11:26:32,284][SEVERE][nio-acceptor-#55][GridTcpRestProtocol] Failed
to accept remote connection (will wait for 2000ms).
class org.apache.ignite.IgniteCheckedException: Failed to accept
connection: GridWorker [name=nio-acceptor, igniteInstanceName=null,
finished=false, hashCode=1020662787, interrupted=false,
runner=nio-acceptor-#55]
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2888)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.body(GridNioServer.java:2822)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.processSelectedKeys(GridNioServer.java:2938)
        at org.apache.ignite.internal.util.nio.GridNioServer$GridNioAcceptWorker.accept(GridNioServer.java:2872)
        ... 3 more
================================

My max open files limit is 32768, and the Ignite process does have 32768 open files.
================================
$ sudo ls -hl /proc/4055/fd/ | wc -l
32768
================================

Most of them look like this
================================
...
lrwx------ 1 root root 64 Jun 23 12:22 9990 -> socket:[1167798]
lrwx------ 1 root root 64 Jun 23 12:22 9991 -> socket:[1167799]
lrwx------ 1 root root 64 Jun 23 12:22 9992 -> socket:[1166839]
lrwx------ 1 root root 64 Jun 23 12:22 9993 -> socket:[1167800]
lrwx------ 1 root root 64 Jun 23 12:22 9994 -> socket:[1168762]
lrwx------ 1 root root 64 Jun 23 12:22 9995 -> socket:[1168763]
lrwx------ 1 root root 64 Jun 23 12:22 9996 -> socket:[1164109]
lrwx------ 1 root root 64 Jun 23 12:22 9997 -> socket:[1166840]
lrwx------ 1 root root 64 Jun 23 12:22 9998 -> socket:[1164110]
lrwx------ 1 root root 64 Jun 23 12:22 9999 -> socket:[1169810]
================================
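
A listing like the one above can also be summarized programmatically. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names fdKind and countFDs are made up for this illustration. It reads the symlinks in an fd directory and tallies them by kind, which shows at a glance whether sockets dominate.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// fdKind reduces a /proc/<pid>/fd link target to a coarse category.
// (Hypothetical helper, named for this example only.)
func fdKind(target string) string {
	switch {
	case strings.HasPrefix(target, "socket:"):
		return "socket"
	case strings.HasPrefix(target, "pipe:"):
		return "pipe"
	case strings.HasPrefix(target, "anon_inode:"):
		return "anon_inode"
	default:
		return "file"
	}
}

// countFDs reads every symlink in dir and tallies the targets by kind.
func countFDs(dir string) (map[string]int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, e := range entries {
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err != nil {
			continue // the descriptor may have been closed in the meantime
		}
		counts[fdKind(target)]++
	}
	return counts, nil
}

func main() {
	// /proc/self/fd inspects this process; use /proc/<pid>/fd for another
	// process (that usually requires root, as with sudo in the listing).
	counts, err := countFDs("/proc/self/fd")
	if err != nil {
		fmt.Println("cannot read fd directory:", err)
		return
	}
	for kind, n := range counts {
		fmt.Printf("%-10s %d\n", kind, n)
	}
}
```

Note that a "socket:[inode]" entry only says the descriptor is a socket of some kind; it does not distinguish TCP from UNIX domain sockets.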

I haven't found any documentation about how Ignite uses UNIX sockets.
It seems Ignite doesn't close them properly. Any help?

Thanks.

Re: Re-post: java.io.IOException: Too many open files

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

My expectation is that, under load with timeouts, your code could skip
step 2 (the client resetting the connection).

Regards,

-- 
Ilya Kasnacheev


Re: Re-post: java.io.IOException: Too many open files

Posted by 胡海麟 <h...@bulbit.jp>.
Hi,

After some more tests, I believe this is what happens:

1. The client writes to Ignite and times out (the client timeout is 15 ms).
2. The client resets the connection because it seems dead (timeout).
3. The server catches the connection reset and throws the exception.

So this is just a normal connection-timeout case.

I guess "Too many open files" is caused by the huge number and high
frequency of timeouts and reconnects.
I have set nofile to 491403 to avoid the problem.

The 500 microseconds value was just for testing. In general, we use
millisecond-level settings.

Thanks.


Re: Re-post: java.io.IOException: Too many open files

Posted by "ilya.kasnacheev" <il...@gmail.com>.
Hello!

I have tried to reproduce your case, but I don't observe any growth in the
number of open file descriptors on the Ignite side.

I think the problem here is on the Go side. Please make sure to always close a
connection if you open it. If your program terminates, this is less
critical, but if you create connections in a loop it can become a problem.

Also, socket:[2322160] is not necessarily a UNIX socket; most often it is a
TCP socket.

I also recommend changing '500 microseconds' to '500 milliseconds' because
there's not much you can expect to happen in half a millisecond, or
one-2000th of a second, especially over a network.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Re-post: java.io.IOException: Too many open files

Posted by 胡海麟 <h...@bulbit.jp>.
Hi,

If the dial times out, client is nil, so client.Close() cannot be called.

Thanks.

Re: Re-post: java.io.IOException: Too many open files

Posted by Roman Guseinov <ro...@gromtech.ru>.
Hi,

Thank you for attaching the code sample and configuration. Most likely this
information will be enough to reproduce the issue.

Meanwhile, could you please try adding `defer client.Close()` to your code:

func main() {
  client, err := redis.DialTimeout("tcp", "10.1.14.221:11211", 500*time.Microsecond)
  if err != nil {
    panic(err)
  }

  defer client.Close()

  foo, err := client.Cmd("GET", "foo").Str()
  if err != nil {
    panic(err)
  }
  fmt.Println("foo: ", foo)
}

I think it should help to fix "Connection reset by peer" exceptions on
Ignite nodes.

Best Regards,
Roman




Re: Re-post: java.io.IOException: Too many open files

Posted by 胡海麟 <h...@bulbit.jp>.
Hi,

Sorry, I have no experience with Maven.

Here are my config file and the sample code of the client.
I reproduced "Connection reset by peer" by adjusting the timeout setting,
but Ignite's file descriptor count didn't increase.

Before Ignite was halted by "Too many open files", there was a CLOSE_WAIT
spike in the network metrics (see network.png).
I'm still looking for the reason for that.

Thanks.

Re: Re-post: java.io.IOException: Too many open files

Posted by Roman Guseinov <ro...@gromtech.ru>.
Hi, 

I checked the logs and saw that there are a lot of "Connection reset by peer"
exceptions, which can be a cause of "Too many open files".

It seems that the clients connect to the Ignite server node via the REST API.
Could you please share a small reproducer (a Maven project on GitHub) and the
server node configuration as well?

I will try to reproduce the issue in my environment. I think we need to
analyze how a client app interacts with a server node.

Thanks.

Best Regards,
Roman




Re: Re-post: java.io.IOException: Too many open files

Posted by 胡海麟 <h...@bulbit.jp>.
Hi,

I have only one cache, redis-ignite-internal-cache-0, with 1 backup and
persistence disabled, on 2 nodes in total.
Here are the full logs.

Thanks.

Re: Re-post: java.io.IOException: Too many open files

Posted by Roman Guseinov <ro...@gromtech.ru>.
Hi,

Could you please attach the full Ignite logs? An Ignite node should not hold a
large number of socket descriptors, but some network issues can cause their
number to grow.

Another possible cause: if you have one node with 30 caches and Ignite
Persistence enabled, then more than 30K open files is expected behavior:
30 x 1024 (the default partition count) = 30720. Is that your case?
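
As a quick check of that arithmetic against the limit mentioned in this thread, here is a minimal sketch. It assumes one open file per partition per cache, as the 30 x 1024 estimate implies, and the function name partitionFiles is made up for this example.

```go
package main

import "fmt"

// partitionFiles estimates how many partition files a node may hold open,
// assuming one file per partition per cache (a simplification).
func partitionFiles(caches, partitionsPerCache int) int {
	return caches * partitionsPerCache
}

func main() {
	const nofile = 32768 // open files limit reported in this thread
	need := partitionFiles(30, 1024)
	fmt.Printf("partition files: %d, headroom under nofile=%d: %d\n",
		need, nofile, nofile-need)
}
```

Under that assumption, 30 persistent caches would leave only about 2000 descriptors for sockets and everything else.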

Best Regards,
Roman




Re: Re-post: java.io.IOException: Too many open files

Posted by 胡海麟 <h...@bulbit.jp>.
I set it to 32768, and it was exhausted. I have many clients connecting to
Ignite, but not that many.
I'm afraid that setting it higher would just buy me a little more time and
would not be a real solution.

On Mon, Jun 25, 2018 at 5:20 AM, David Harvey <dh...@jobcase.com> wrote:
> You must increase the Linux NOFILE ulimit when running Ignite.  The
> documentation describes how to do this.
>
> Disclaimer
>
> The information contained in this communication from the sender is
> confidential. It is intended solely for use by the recipient and others
> authorized to receive it. If you are not the recipient, you are hereby
> notified that any disclosure, copying, distribution or taking action in
> relation of the contents of this information is strictly prohibited and may
> be unlawful.
>
> This email has been scanned for viruses and malware, and may have been
> automatically archived by Mimecast Ltd, an innovator in Software as a
> Service (SaaS) for business. Providing a safer and more useful place for
> your human generated data. Specializing in; Security, archiving and
> compliance. To find out more Click Here.

Re: Re-post: java.io.IOException: Too many open files

Posted by David Harvey <dh...@jobcase.com>.
You must increase the Linux NOFILE ulimit when running Ignite.  The
documentation describes how to do this.

