Posted to user@storm.apache.org by Mark Greene <ma...@evertrue.com> on 2014/01/31 04:12:07 UTC

Topology dies immediately upon deployment when configured with two workers instead of one

Exception in log:

2014-01-31 02:58:14 task [INFO] Emitting: change-spout default [[B@38fc659c]
2014-01-31 02:58:14 task [INFO] Emitting: change-spout __ack_init [1863657906985036001 0 2]
2014-01-31 02:58:14 util [ERROR] Async loop died!
java.lang.RuntimeException: org.zeromq.ZMQException: Invalid argument(0x16)
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
    at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58)
    at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
    at backtype.storm.disruptor$consume_loop_STAR_$fn__1619.invoke(disruptor.clj:73)
    at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.zeromq.ZMQException: Invalid argument(0x16)
    at org.zeromq.ZMQ$Socket.send(Native Method)
    at zilch.mq$send.invoke(mq.clj:93)
    at backtype.storm.messaging.zmq.ZMQConnection.send(zmq.clj:43)
    at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333$fn__4334.invoke(worker.clj:298)
    at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4333.invoke(worker.clj:287)
    at backtype.storm.disruptor$clojure_handler$reify__1606.onEvent(disruptor.clj:43)
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84)
    ... 6 more
2014-01-31 02:58:14 util [INFO] Halting process: ("Async loop died!")
2014-01-31 02:58:24 executor [INFO] Processing received message source: __system:-1, stream: __tick, id: {}, [30]

I see the above exception almost immediately after my spout emits the
first tuple from the queue. To narrow the problem down I have pared the
topology down to a single spout and no bolts, but the only way I can keep
the spout running is to omit the collector.emit call itself.

I'm not sure if it makes a difference, but the supervisor has three slots
and this topology would occupy two of them. When configured with two
workers I get the above exception; when configured with one, everything
works fine.

Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by Nathan Leung <nc...@gmail.com>.
Are you also using the forked version of jzmq?

Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by Mark Greene <ma...@evertrue.com>.
We are on 2.1.7 in all environments; they are all managed by Chef.

Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by Nathan Leung <nc...@gmail.com>.
It can work with ZMQ, but you MUST use the version specified (2.1.7).
Newer versions change the API, which causes errors; that might be what you
are seeing. Is the version of libzmq you installed the same as the one you
are using in production?
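One quick way to answer that question is to compare the installed zeromq
version on each host against the 2.1.7 that this Storm release expects. A
minimal sketch (the helper function is hypothetical; on a real host the
version string could come from `pkg-config --modversion libzmq`):

```shell
# Required zeromq version for this Storm release, per the setup docs
required="2.1.7"

# Hypothetical helper: compare an installed version string to the required one
check_zmq_version() {
    v="$1"
    if [ "$v" = "$required" ]; then
        echo "ok: zeromq $v matches required $required"
    else
        echo "mismatch: found zeromq $v, expected $required"
    fi
}

# Example with a version that would trigger this kind of failure;
# on a real host: check_zmq_version "$(pkg-config --modversion libzmq)"
check_zmq_version "2.1.10"
```

Running the same check on the staging and production hosts would show
whether the environments have drifted despite being managed by the same
configuration tooling.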



Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by bijoy deb <bi...@gmail.com>.
Hi Mark,

Ideally it should work once you use the correct (older) version of ZMQ.
But since you say you are getting the error with version 2.1.7, I would
suggest you try the steps below and let me know if that works:

1) Build jzmq using the steps described at:
https://github.com/nathanmarz/jzmq

After that, restart the Storm daemons and see if that resolves the issue.

2) If the issue still persists, try downgrading to version 2.1.4 of
zeromq, followed by rebuilding jzmq (as in step 1 above), as suggested by
Nathan Marz at:

https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster , which
says:

*Note that you should not install version 2.1.10, as that version has some
serious bugs that can cause strange issues for a Storm cluster. In some
rare cases, users have reported an "IllegalArgumentException" bubbling up
from the ZeroMQ code when using 2.1.7 - in these cases downgrading to 2.1.4
fixed the problem*
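Sketched as shell, the downgrade path in step 2 looks roughly like this
(the download URL and install prefix are assumptions; the build commands
follow the zeromq and jzmq READMEs of that era):

```shell
# Versions from the thread: zeromq 2.1.4, plus Nathan Marz's jzmq fork
ZMQ_VERSION="2.1.4"
PREFIX="/usr/local"

echo "would build zeromq-$ZMQ_VERSION and jzmq into $PREFIX"
# On a real host, the steps would be roughly:
#   wget http://download.zeromq.org/zeromq-$ZMQ_VERSION.tar.gz
#   tar xzf zeromq-$ZMQ_VERSION.tar.gz && cd zeromq-$ZMQ_VERSION
#   ./configure --prefix="$PREFIX" && make && sudo make install
#   git clone https://github.com/nathanmarz/jzmq.git && cd jzmq
#   ./autogen.sh && ./configure --prefix="$PREFIX" && make && sudo make install
# ...then restart the Storm daemons so the workers pick up the new library.
```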

Let me know how it goes.

Thanks
Bijoy



Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by Mark Greene <ma...@evertrue.com>.
> Storm uses the internal queuing (through ZMQ) only when communication
> between two worker processes is required, which is why this error comes
> up only when you set num_workers>1.


I'm a little confused by the answer; are you suggesting that Storm cannot
run more than one worker even with the correct (older) version of ZMQ?

What's unique about the environment I was having trouble with is that it
only had 1 supervisor, whereas my prod environment has multiple supervisors
and I am not seeing a problem there.



Re: Topology dies immediately upon deployment when configured with two workers instead of one

Posted by bijoy deb <bi...@gmail.com>.
Hi Mark,

Storm uses the internal queuing (through ZMQ) only when communication
between two worker processes is required, which is why this error comes up
only when you set num_workers>1.

Though I won't be able to help you with an exact solution for this, I can
provide some pointers:

a) Regarding the reason for the error, the documentation says that the
ZMQ/zeromq version needs to be downgraded to 2.1.7 if it is higher than
that.
b) A future version of Storm (I don't recollect the exact version number,
or whether it has already been released) is supposed to remove the ZMQ
dependency entirely, so the above error should not occur then.

Thanks
Bijoy

