Posted to dev@storm.apache.org by 李家宏 <jh...@gmail.com> on 2014/03/04 14:12:12 UTC

Worker Halting: too many open files

Hi all,

When I submit a topology to a Storm 0.9.0.1 cluster, the following error
occurs:
----------------------------------------------------------------------------------------------------------------------
[INFO] Starting
2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@796cefa8
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.83:2181
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052083.cm3.tbsite.net/10.207.52.83:2181, initiating session
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052083.cm3.tbsite.net/10.207.52.83:2181, sessionid = 0x2423f964207c973, negotiated timeout = 20000
2014-03-04 20:24:13 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Session: 0x2423f964207c973 closed
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] EventThread shut down
2014-03-04 20:24:13 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181/tmp/storm-0.9.0.1 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@58f41393
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.82:2181
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052082.cm3.tbsite.net/10.207.52.82:2181, initiating session
2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052082.cm3.tbsite.net/10.207.52.82:2181, sessionid = 0x1423f964209c65f, negotiated timeout = 20000
2014-03-04 20:24:14 b.s.m.TransportFactory [INFO] Storm peer transport plugin:backtype.storm.messaging.netty.Context
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [2]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
2014-03-04 20:24:14 b.s.d.worker [ERROR] Error on initialization of server mk-worker
org.jboss.netty.channel.ChannelException: Failed to create a selector.
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:337) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:51) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:152) ~[netty-3.6.3.Final.jar:na]
    at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:134) ~[netty-3.6.3.Final.jar:na]
    at backtype.storm.messaging.netty.Client.<init>(Client.java:54) ~[storm-netty-0.9.0.1.jar:na]
    at backtype.storm.messaging.netty.Context.connect(Context.java:36) ~[storm-netty-0.9.0.1.jar:na]
    at backtype.storm.daemon.worker$mk_refresh_connections$this__5827$iter__5834__5838$fn__5839.invoke(worker.clj:250) ~[storm-core-0.9.0.1.jar:na]
    at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.4.0.jar:na]
    at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.4.0.jar:na]
    at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.4.0.jar:na]
    at clojure.lang.RT.next(RT.java:587) ~[clojure-1.4.0.jar:na]
    at clojure.core$next.invoke(core.clj:64) ~[clojure-1.4.0.jar:na]
    at clojure.core$dorun.invoke(core.clj:2726) ~[clojure-1.4.0.jar:na]
    at clojure.core$doall.invoke(core.clj:2741) ~[clojure-1.4.0.jar:na]
    at backtype.storm.daemon.worker$mk_refresh_connections$this__5827.invoke(worker.clj:244) ~[storm-core-0.9.0.1.jar:na]
    at backtype.storm.daemon.worker$fn__5882$exec_fn__1229__auto____5883.invoke(worker.clj:357) ~[storm-core-0.9.0.1.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.4.0.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
    at clojure.core$apply.invoke(core.clj:601) ~[clojure-1.4.0.jar:na]
    at backtype.storm.daemon.worker$fn__5882$mk_worker__5938.doInvoke(worker.clj:329) [storm-core-0.9.0.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.4.0.jar:na]
    at backtype.storm.daemon.worker$_main.invoke(worker.clj:439) [storm-core-0.9.0.1.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.4.0.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
    at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.0.1.jar:na]
Caused by: java.io.IOException: Too many open files
    at sun.nio.ch.IOUtil.initPipe(Native Method) ~[na:1.6.0_38]
    at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49) ~[na:1.6.0_38]
    at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) ~[na:1.6.0_38]
    at java.nio.channels.Selector.open(Selector.java:209) ~[na:1.6.0_38]
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:335) ~[netty-3.6.3.Final.jar:na]
    ... 32 common frames omitted
2014-03-04 20:24:14 b.s.util [INFO] Halting process: ("Error on initialization")
--------------------------------------------------------------------------------------------------------------------

This topology works fine on a Storm 0.8.0 cluster.
Also:
    ulimit -n => 131072
    sudo lsof | grep java | wc -l => 5000
so the number of open file descriptors does not seem to be reaching the limit.

What's the problem?
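
These two numbers can be cross-checked per process. `ulimit -n` reports the limit of the shell it is run in, which is not necessarily the limit a worker JVM inherited when it was launched, and a system-wide `lsof | grep java` mixes all Java processes together and lists entries that are not file descriptors. The Python sketch below assumes Linux with `/proc` mounted; the function names `parse_nofile_limits` and `report` are hypothetical helpers made up for illustration, not part of Storm.

```python
import os
import re


def parse_nofile_limits(limits_text):
    """Extract (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because the kernel may print 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns are separated by runs of spaces: name, soft, hard, units.
            fields = re.split(r"\s{2,}", line.strip())
            return fields[1], fields[2]
    raise ValueError("no 'Max open files' row found")


def report(pid):
    """Summarize open descriptors and limits for one process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        soft, hard = parse_nofile_limits(f.read())
    # Each entry in /proc/<pid>/fd is one open descriptor of that process.
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    return "pid %d: %d fds open, soft limit %s, hard limit %s" % (
        pid, open_fds, soft, hard)
```

For example, `print(report(12345))`, where 12345 is a placeholder for the PID of one worker JVM. Reading `/proc/<pid>/fd` of a process owned by another user usually requires root.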

Regards

-- 

======================================================

Gvain

Email: jh.li.em@gmail.com

Re: Worker Halting: too many open files

Posted by 李家宏 <jh...@gmail.com>.
What might be the problem?

Moreover, when I reduce the number of workers from 150 to 60, it works.
However, the Storm Netty client then throws some negative timeout exceptions,
which are probably the same problem discussed in the thread "netty errors,
chain reactions, topology breaks down" and in a recent pull request:

https://github.com/apache/incubator-storm/pull/41
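
How descriptor use might scale with the worker count can be estimated with simple arithmetic. The sketch below is only a rough upper bound under assumptions that are not confirmed anywhere in this thread: a worker opens at most one Netty client per peer worker; each client builds its own `NioClientSocketChannelFactory` (as the stack trace in the original post suggests); each factory creates one boss selector plus 2 x CPU cores I/O selectors (assumed to be the Netty 3 default); and each epoll-based selector on Linux holds three descriptors (the epoll descriptor plus both ends of the wakeup pipe opened in `initPipe`). The function name and the 24-core figure are purely illustrative.

```python
def estimated_selector_fds(workers, cores, fds_per_selector=3, boss_selectors=1):
    """Rough upper bound on selector descriptors held by ONE worker process.

    Assumptions (not confirmed by the thread): one Netty client per peer
    worker, and per client one boss selector plus 2 * cores I/O selectors.
    """
    peers = workers - 1
    selectors_per_client = boss_selectors + 2 * cores
    return peers * selectors_per_client * fds_per_selector


# Compare the two topology sizes mentioned in this thread on a
# hypothetical 24-core machine.
for workers in (60, 150):
    print(workers, "workers ->",
          estimated_selector_fds(workers, cores=24), "selector fds per worker")
```

The result is only meaningful when compared against the limit actually in effect for the worker process, and it ignores sockets, jar files, and log files that the worker also keeps open.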


2014-03-05 11:41 GMT+08:00 Andrew Feng <af...@yahoo-inc.com>:

> Please create a Jira ticket. We will submit a pull request with a fix
>
> Andy Feng
>
> Sent from my iPhone
>
> > On Mar 4, 2014, at 6:32 PM, "李家宏" <jh...@gmail.com> wrote:
> >
> > hi , Andy Feng,
> >   there are 150 workers and 450 executors in my topology.
> >
> > Thanks for your reply
> >
> >
> > 2014-03-04 23:13 GMT+08:00 Andrew Feng <af...@yahoo-inc.com>:
> >
> >> How many workers do you have in your topology?
> >>
> >> Andy Feng
> >>
> >> Sent from my iPhone
>



-- 

======================================================

Gvain

Email: jh.li.em@gmail.com

Re: Worker Halting: too many open files

Posted by Andrew Feng <af...@yahoo-inc.com>.
Please create a Jira ticket. We will submit a pull request with a fix.

Andy Feng

Sent from my iPhone

> On Mar 4, 2014, at 6:32 PM, "李家宏" <jh...@gmail.com> wrote:
> 
> hi , Andy Feng,
>   there are 150 workers and 450 executors in my topology.
> 
> Thanks for your reply
> 
> 
> 2014-03-04 23:13 GMT+08:00 Andrew Feng <af...@yahoo-inc.com>:
> 
>> How many workers do you have in your topology?
>> 
>> Andy Feng
>> 
>> Sent from my iPhone

Re: Worker Halting: too many open files

Posted by 李家宏 <jh...@gmail.com>.
Hi Andy Feng,

There are 150 workers and 450 executors in my topology.

Thanks for your reply


2014-03-04 23:13 GMT+08:00 Andrew Feng <af...@yahoo-inc.com>:

> How many workers do you have in your topology?
>
> Andy Feng
>
> Sent from my iPhone
>
> > On Mar 4, 2014, at 5:21 AM, "李家宏" <jh...@gmail.com> wrote:
> >
> > hi, all
> >
> > When I submit a topology to a storm cluster of 0.9.0.1, the following
> error
> > occurs:
> >
> ----------------------------------------------------------------------------------------------------------------------
> > [INFO] Starting
> >   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client
> connection,
> >   connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181
> sessionTimeout=20000
> >   watcher=com.netflix.curator.ConnectionState@796cefa8
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection
> to
> > server /10.207.52.83:2181
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection
> > established to
> >   storm010207052083.cm3.tbsite.net/10.207.52.83:2181, initiating session
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment
> > complete on server
> >   storm010207052083.cm3.tbsite.net/10.207.52.83:2181, sessionid =
> > 0x2423f964207c973, negotiated timeout = 20000
> >   2014-03-04 20:24:13 b.s.zookeeper [INFO] Zookeeper state update:
> > :connected:none
> >   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Session: 0x2423f964207c973
> > closed
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] EventThread shut down
> >   2014-03-04 20:24:13 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
> >   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client
> connection,
> >   connectString=10.207.52.82:2181,10.207.52.83:2181,
> > 10.207.52.84:2181/tmp/storm-0.9.0.1 sessionTimeout=20000
> >   watcher=com.netflix.curator.ConnectionState@58f41393
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection
> to
> > server /10.207.52.82:2181
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection
> > established to
> >   storm010207052082.cm3.tbsite.net/10.207.52.82:2181, initiating session
> >   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment
> > complete on server
> >   storm010207052082.cm3.tbsite.net/10.207.52.82:2181, sessionid =
> > 0x1423f964209c65f, negotiated timeout = 20000
> >   2014-03-04 20:24:14 b.s.m.TransportFactory [INFO] Storm peer transport
> > plugin:backtype.storm.messaging.netty.Context
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [2]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
> >   2014-03-04 20:24:14 b.s.d.worker [ERROR] Error on initialization of
> > server mk-worker
> >   org.jboss.netty.channel.ChannelException: Failed to create a selector.
> >   at
> >
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:337)
> >   ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.(AbstractNioSelector.java:95)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.(AbstractNioWorker.java:51)
> > ~[netty-3.6.3.Final.jar:na]
> >   at org.jboss.netty.channel.socket.nio.NioWorker.(NioWorker.java:45)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99)
> >   ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69)
> >   ~[netty-3.6.3.Final.jar:na]
> >   at
> > org.jboss.netty.channel.socket.nio.NioWorkerPool.(NioWorkerPool.java:39)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> > org.jboss.netty.channel.socket.nio.NioWorkerPool.(NioWorkerPool.java:33)
> > ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.(NioClientSocketChannelFactory.java:152)
> >   ~[netty-3.6.3.Final.jar:na]
> >   at
> >
> org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.(NioClientSocketChannelFactory.java:134)
> >   ~[netty-3.6.3.Final.jar:na]
> >   at backtype.storm.messaging.netty.Client.<init>(Client.java:54) ~[storm-netty-0.9.0.1.jar:na]
> >   at backtype.storm.messaging.netty.Context.connect(Context.java:36) ~[storm-netty-0.9.0.1.jar:na]
> >   at backtype.storm.daemon.worker$mk_refresh_connections$this__5827$iter__5834__5838$fn__5839.invoke(worker.clj:250) ~[storm-core-0.9.0.1.jar:na]
> >   at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.4.0.jar:na]
> >   at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.4.0.jar:na]
> >   at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.4.0.jar:na]
> >   at clojure.lang.RT.next(RT.java:587) ~[clojure-1.4.0.jar:na]
> >   at clojure.core$next.invoke(core.clj:64) ~[clojure-1.4.0.jar:na]
> >   at clojure.core$dorun.invoke(core.clj:2726) ~[clojure-1.4.0.jar:na]
> >   at clojure.core$doall.invoke(core.clj:2741) ~[clojure-1.4.0.jar:na]
> >   at backtype.storm.daemon.worker$mk_refresh_connections$this__5827.invoke(worker.clj:244) ~[storm-core-0.9.0.1.jar:na]
> >   at backtype.storm.daemon.worker$fn__5882$exec_fn__1229__auto____5883.invoke(worker.clj:357) ~[storm-core-0.9.0.1.jar:na]
> >   at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.4.0.jar:na]
> >   at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
> >   at clojure.core$apply.invoke(core.clj:601) ~[clojure-1.4.0.jar:na]
> >   at backtype.storm.daemon.worker$fn__5882$mk_worker__5938.doInvoke(worker.clj:329) [storm-core-0.9.0.1.jar:na]
> >   at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.4.0.jar:na]
> >   at backtype.storm.daemon.worker$_main.invoke(worker.clj:439) [storm-core-0.9.0.1.jar:na]
> >   at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.4.0.jar:na]
> >   at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
> >   at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.0.1.jar:na]
> >   Caused by: java.io.IOException: Too many open files
> >   at sun.nio.ch.IOUtil.initPipe(Native Method) ~[na:1.6.0_38]
> >   at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49) ~[na:1.6.0_38]
> >   at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) ~[na:1.6.0_38]
> >   at java.nio.channels.Selector.open(Selector.java:209) ~[na:1.6.0_38]
> >   at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:335) ~[netty-3.6.3.Final.jar:na]
> >   ... 32 common frames omitted
> >   2014-03-04 20:24:14 b.s.util [INFO] Halting process: ("Error on initialization")
> > --------------------------------------------------------------------------------------------------------------------
> >
> > This topology works fine on a Storm 0.8.0 cluster.
> > Also:
> >    ulimit -n => 131072
> >    sudo lsof | grep java | wc -l => 5000
> > so the number of open fds does not seem to have reached the limit.
> >
> > What's the problem?
> >
> > Regards
> >
> > --
> >
> > ======================================================
> >
> > Gvain
> >
> > Email: jh.li.em@gmail.com
>



-- 

======================================================

Gvain

Email: jh.li.em@gmail.com

Re: Worker Halting: too many open files

Posted by Andrew Feng <af...@yahoo-inc.com>.
How many workers do you have in your topology?

Andy Feng

Sent from my iPhone

> On Mar 4, 2014, at 5:21 AM, "李家宏" <jh...@gmail.com> wrote:
> 
> hi, all
> 
> When I submit a topology to a Storm 0.9.0.1 cluster, the following error
> occurs:
> ----------------------------------------------------------------------------------------------------------------------
> [INFO] Starting
>   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@796cefa8
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.83:2181
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052083.cm3.tbsite.net/10.207.52.83:2181, initiating session
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052083.cm3.tbsite.net/10.207.52.83:2181, sessionid = 0x2423f964207c973, negotiated timeout = 20000
>   2014-03-04 20:24:13 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
>   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Session: 0x2423f964207c973 closed
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] EventThread shut down
>   2014-03-04 20:24:13 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
>   2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181/tmp/storm-0.9.0.1 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@58f41393
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.82:2181
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052082.cm3.tbsite.net/10.207.52.82:2181, initiating session
>   2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052082.cm3.tbsite.net/10.207.52.82:2181, sessionid = 0x1423f964209c65f, negotiated timeout = 20000
>   2014-03-04 20:24:14 b.s.m.TransportFactory [INFO] Storm peer transport plugin:backtype.storm.messaging.netty.Context
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [2]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>   2014-03-04 20:24:14 b.s.d.worker [ERROR] Error on initialization of server mk-worker
>   org.jboss.netty.channel.ChannelException: Failed to create a selector.
>   at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:337) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:51) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:152) ~[netty-3.6.3.Final.jar:na]
>   at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:134) ~[netty-3.6.3.Final.jar:na]
>   at backtype.storm.messaging.netty.Client.<init>(Client.java:54) ~[storm-netty-0.9.0.1.jar:na]
>   at backtype.storm.messaging.netty.Context.connect(Context.java:36) ~[storm-netty-0.9.0.1.jar:na]
>   at backtype.storm.daemon.worker$mk_refresh_connections$this__5827$iter__5834__5838$fn__5839.invoke(worker.clj:250) ~[storm-core-0.9.0.1.jar:na]
>   at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.4.0.jar:na]
>   at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.4.0.jar:na]
>   at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.4.0.jar:na]
>   at clojure.lang.RT.next(RT.java:587) ~[clojure-1.4.0.jar:na]
>   at clojure.core$next.invoke(core.clj:64) ~[clojure-1.4.0.jar:na]
>   at clojure.core$dorun.invoke(core.clj:2726) ~[clojure-1.4.0.jar:na]
>   at clojure.core$doall.invoke(core.clj:2741) ~[clojure-1.4.0.jar:na]
>   at backtype.storm.daemon.worker$mk_refresh_connections$this__5827.invoke(worker.clj:244) ~[storm-core-0.9.0.1.jar:na]
>   at backtype.storm.daemon.worker$fn__5882$exec_fn__1229__auto____5883.invoke(worker.clj:357) ~[storm-core-0.9.0.1.jar:na]
>   at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.4.0.jar:na]
>   at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
>   at clojure.core$apply.invoke(core.clj:601) ~[clojure-1.4.0.jar:na]
>   at backtype.storm.daemon.worker$fn__5882$mk_worker__5938.doInvoke(worker.clj:329) [storm-core-0.9.0.1.jar:na]
>   at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.4.0.jar:na]
>   at backtype.storm.daemon.worker$_main.invoke(worker.clj:439) [storm-core-0.9.0.1.jar:na]
>   at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.4.0.jar:na]
>   at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
>   at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.0.1.jar:na]
>   Caused by: java.io.IOException: Too many open files
>   at sun.nio.ch.IOUtil.initPipe(Native Method) ~[na:1.6.0_38]
>   at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49) ~[na:1.6.0_38]
>   at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) ~[na:1.6.0_38]
>   at java.nio.channels.Selector.open(Selector.java:209) ~[na:1.6.0_38]
>   at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:335) ~[netty-3.6.3.Final.jar:na]
>   ... 32 common frames omitted
>   2014-03-04 20:24:14 b.s.util [INFO] Halting process: ("Error on initialization")
> --------------------------------------------------------------------------------------------------------------------
> 
> This topology works fine on a Storm 0.8.0 cluster.
> Also:
>    ulimit -n => 131072
>    sudo lsof | grep java | wc -l => 5000
> so the number of open fds does not seem to have reached the limit.
> 
> What's the problem?
> 
> Regards
> 
> -- 
> 
> ======================================================
> 
> Gvain
> 
> Email: jh.li.em@gmail.com
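
A note on the two numbers quoted above: ulimit -n reports the limit of the shell it is run in, which is not necessarily the limit the worker JVM inherited from the supervisor, and lsof | grep java | wc -l counts lines across every Java process, including entries that are not descriptors (memory-mapped files, cwd, and so on). A per-process check is usually more telling. Below is a minimal sketch in Python, assuming a Linux host with /proc mounted; the helper names parse_max_open_files and fd_usage are hypothetical and not part of Storm.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Each value is an int, or None where the kernel reports 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns are: soft limit, hard limit, units.
            soft, hard = line[len(label):].split()[:2]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a running process.

    Assumption: Linux procfs layout; reading another user's process
    needs the same user or root.
    """
    with open("/proc/%d/limits" % pid) as f:
        soft, hard = parse_max_open_files(f.read())
    # Each entry in /proc/<pid>/fd is one open descriptor of that process.
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    return open_fds, soft, hard
```

Calling fd_usage with the PID of one worker (taken from jps or ps, for instance) gives the descriptor count to compare against the soft limit that process actually runs with. Run it as the user that owns the worker, or as root.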