Posted to user@storm.apache.org by Geeta Iyer <iy...@gmail.com> on 2014/05/05 18:18:55 UTC

Getting error while running topology using netty with storm 0.9.1-incubating

Hi,

I am exploring storm-0.9.1-incubating for running a topology. The topology
consists of 1 spout and 4 bolts. I was trying this out on a 10-node
cluster. When I start streaming messages through the topology, the workers
fail again and again with this exception:

2014-05-05 10:40:28 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
2014-05-05 10:40:31 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
        at backtype.storm.disruptor$consume_loop_STAR_$fn__1577.invoke(disruptor.clj:89) ~[na:na]
        at backtype.storm.util$async_loop$fn__384.invoke(util.clj:433) ~[na:na]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
        at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.messaging.netty.Client.send(Client.java:125) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398$fn__4399.invoke(worker.clj:319) ~[na:na]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398.invoke(worker.clj:308) ~[na:na]
        at backtype.storm.disruptor$clojure_handler$reify__1560.onEvent(disruptor.clj:58) ~[na:na]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:104) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
        ... 6 common frames omitted

I tried multiple configurations: running multiple executors/tasks on a
single worker versus assigning one task per worker and running multiple
workers per node. Every time, it is the same issue.

The same topology works fine on storm-0.8.2 with the same amount of
traffic.

Is there any configuration that needs to be tweaked? Any suggestions will
be really helpful.

I want to compare the performance of storm-0.8.2 with 0mq against 0.9.1
with netty for my topology, and see if we can achieve better performance
with storm 0.9.1.

This is what my storm.yaml on the supervisor nodes looks like currently:

########### These MUST be filled in for a storm configuration
 storm.zookeeper.servers:
     - "<zk-hostname-1>"
     - "<zk-hostname-2>"
     - "<zk-hostname-3>"

 nimbus.host: "<nimbus-host>"

 storm.local.dir: "/tmp/forStorm"

 supervisor.slots.ports:
    - 6900
    - 6901
    - 6902
    - 6903
    - 6904

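 # Netty tuning below: buffer_size is in bytes (5 MB here); reconnect
 # attempts back off between min_wait_ms and max_wait_ms, up to max_retries.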
 storm.messaging.transport: "backtype.storm.messaging.netty.Context"
 storm.messaging.netty.server_worker_threads: 1
 storm.messaging.netty.client_worker_threads: 1
 storm.messaging.netty.buffer_size: 5242880
 storm.messaging.netty.max_retries: 10
 storm.messaging.netty.max_wait_ms: 1000
 storm.messaging.netty.min_wait_ms: 100
 worker.childopts: "-Xmx6144m -Djava.net.preferIPv4Stack=true
-Dcom.sun.management.jmxremote.port=1%ID%
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false"




-- 
Thanks,
Geeta

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Ebot Tabi <eb...@gmail.com>.
OK, try dropping the parallelism and the number of tasks your new topology
has and see what happens. It is weird: if you say it is working on the
other version, then it should work on this one as well.
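
(Not code from this thread, just a rough sketch of what "dropping the
parallelism" could look like when building the topology; RestSpout,
ParseBolt, and the component names are hypothetical placeholders for the
real classes:)

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class LowParallelismTest {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // Hypothetical components standing in for the real spout/bolts.
        builder.setSpout("rest-spout", new RestSpout(), 1);   // one spout executor
        builder.setBolt("parse-bolt", new ParseBolt(), 2)     // two bolt executors
               .setNumTasks(2)                                // one task per executor
               .shuffleGrouping("rest-spout");

        Config conf = new Config();
        conf.setNumWorkers(2); // fewer workers while isolating the failure
        StormSubmitter.submitTopology("low-parallelism-test", conf,
                builder.createTopology());
    }
}

Shrinking the moving parts this way makes it easier to tell whether the
netty failures track the number of workers or something else.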


-- 
Ebot T.

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Geeta Iyer <iy...@gmail.com>.
Basically, our spout is REST-service based; that is what accounts for the
HTTP requests. We have the same topology with the exact same configuration
running without any issues on 0.8.2. So is there any configuration on
0.9.1 that I should be tweaking explicitly?

Also, apart from the REST-spout-based workers, even the workers that
execute the bolt tasks fail consistently.


-- 
Thanks,
Geeta

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Ebot Tabi <eb...@gmail.com>.
Hey Geeta,
from what I see in the logs, your topology makes an HTTP request to an
external service for some extra data, and those calls take a lot of time
and hit failed connections as well; this ends up causing your topology not
to work correctly. I would suggest you implement a better way to do HTTP
requests. Remember, Storm is pretty fast, and if you have a stream of data
coming in and you hit the external source for every tuple, that is not
good.
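
(For illustration only, not Ebot's exact code, and the timeout values are
arbitrary placeholders. The minimum version of "a better way" is to put
hard timeouts on every external call so a slow or dead service fails fast
instead of stalling the executor thread:)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class BoundedHttpFetch {
    // Fetch a URL with hard connect/read timeouts; returns null on failure
    // instead of blocking the calling executor thread indefinitely.
    public static String fetch(String urlString) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(urlString).openConnection();
            conn.setConnectTimeout(2000); // ms to establish the connection
            conn.setReadTimeout(2000);    // ms to wait for response data
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                StringBuilder body = new StringBuilder();
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line);
                }
                return body.toString();
            }
        } catch (Exception e) {
            return null; // caller decides whether to retry or fail the tuple
        }
    }
}

Caching responses or batching lookups would also cut down how often the
external service is hit at all, which is closer to the point above.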


-- 
Ebot T.

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Geeta Iyer <iy...@gmail.com>.
I am attaching the logs of one of the workers. Hope that helps...


-- 
Thanks,
Geeta

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Ebot Tabi <eb...@gmail.com>.
Hi Sajith,
yeah, Netty is quite a bit faster for Storm compared to 0MQ; we moved from
60K to over 120K writes per second when we migrated to Netty.


-- 
Ebot T.

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Sajith <sa...@gmail.com>.
Hi Geeta,

In terms of performance, Netty is much better than ZeroMQ [1]. I also
personally evaluated Netty and ZeroMQ in a multi-node cluster with
storm-0.9.2-SNAPSHOT, and Netty showed much higher performance than
ZeroMQ.

Thanks,
Sajith.

[1] http://yahooeng.tumblr.com/post/64758709722/making-storm-fly-with-netty



Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Ebot Tabi <eb...@gmail.com>.
Can you check the logs on your production server and see why it keeps
restarting? I would suspect ZooKeeper, but I am not sure if that is the
case here. If you can get the logs from the production server, that would
be great.


-- 
Ebot T.

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Geeta Iyer <iy...@gmail.com>.
I verified the nimbus hostname in the configuration (nimbus.host:
"<nimbus-host>"). It is correct. The topology does run for a short time
and acks a very small number of messages successfully, but as time
progresses, the workers keep restarting.


-- 
Thanks,
Geeta

Re: Getting error while running topology using netty with storm 0.9.1-incubating

Posted by Ebot Tabi <eb...@gmail.com>.
Hi Geeta,
check the first line of your error log; you will see that it says: "Remote
address is not reachable. We will close this client."
Your remote Nimbus isn't available to receive the topology you are
submitting; make sure you have the right IP address in your
/home/yourname/.storm/storm.yaml file.
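
(For reference, a minimal client-side storm.yaml for submitting needs
little more than the Nimbus address. A sketch, with the hostname as a
placeholder; nimbus.thrift.port only needs setting if it was changed from
the default:)

 nimbus.host: "<nimbus-host>"    # must resolve and be reachable from the submitting machine
 nimbus.thrift.port: 6627        # default Thrift port; include only if non-default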


-- 
Ebot T.