Posted to dev@storm.apache.org by sam mohel <sa...@gmail.com> on 2017/06/28 06:44:03 UTC

java.io.IOException: Connection reset by peer in DRPC

I submitted two topologies in production mode. The first has a 215 MB data
set; it ran well and gave me the results. The second has a 170 MB data set
and the same configuration, but it stopped working after some time and did
not complete its result.

The error I got in the DRPC log file is:

    TNonblockingServer [WARN] Got an IOException in internalRead!
    java.io.IOException: Connection reset by peer

I can't figure out where the problem is. It should work, since the second
data set is smaller.

Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
I have a guess. Can you help me?
I ran my topology in production mode on Machine A and Machine B.
storm.yaml on Machine A is:

    storm.zookeeper.servers:
        - "192.168.x.x"

    nimbus.host: "192.168.x.x"

    supervisor.childopts: "-Xmx4g"
    worker.childopts: "-Xmx4g"

storm.yaml on Machine B is:

    storm.zookeeper.servers:
        - "192.168.x.x"

    nimbus.host: "192.168.x.x"

    supervisor.childopts: "-Xmx4g"
    worker.childopts: "-Xmx4g"

I set DRPC in the code:

    Config conf = new Config();
    List<String> dprcServers = new ArrayList<String>();
    dprcServers.add("192.168.x.x");
    conf.put(Config.DRPC_SERVERS, dprcServers);
    conf.put(Config.DRPC_PORT, 3772);

    // distributed mode
    Config conf = createTopologyConfiguration(prop, true);
    LocalDRPC drpc = null;
    StormSubmitter.submitTopology(args[0], conf, buildTopology(drpc));
    client = new DRPCClient("192.168.x.x", 3772);

I used the same IP address for storm.zookeeper.servers, nimbus.host,
dprcServers and DRPCClient. Is that wrong?
I ran nimbus, drpc and ui on Machine A, and the supervisor on Machine B.




Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
@Bobby @Navin Thanks for your time. Is there something I should check?

Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
I found just these two statements in the DRPC log file:

    2017-06-28T15:49:22.463+0200 b.s.d.drpc [INFO] Starting Distributed RPC servers...
    2017-06-28T16:02:07.473+0200 b.s.d.drpc [WARN] Timeout DRPC request id: 200 start at 1498657923

The strange thing is that I used the same configuration in both projects,
and the first has a larger data set than the second. Why does this error
appear in the second? Only the file of tweets is different.
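
One way to read the timeout line, as a rough sketch: assuming the "start at" value is a Unix timestamp in seconds (the log does not say so), it can be converted to the log's +0200 offset and compared with the time the warning was written. The class and method names below are made up for illustration, and the log timestamp is rewritten with a colon in the offset (+02:00) so that java.time can parse it.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

public class DrpcTimeoutGap {
    /** Converts an epoch-seconds value to a timestamp at the given UTC offset. */
    public static OffsetDateTime startTime(long epochSeconds, ZoneOffset offset) {
        return Instant.ofEpochSecond(epochSeconds).atOffset(offset);
    }

    /** Whole seconds between the request start and the moment the timeout was logged. */
    public static long elapsedSeconds(long startEpochSeconds, String timeoutLoggedAt) {
        OffsetDateTime logged = OffsetDateTime.parse(timeoutLoggedAt);
        return Duration.between(Instant.ofEpochSecond(startEpochSeconds), logged.toInstant())
                .getSeconds();
    }

    public static void main(String[] args) {
        long start = 1498657923L;                        // "start at" value from the DRPC log
        String logged = "2017-06-28T16:02:07.473+02:00"; // time of the [WARN] line

        System.out.println("request started: " + startTime(start, ZoneOffset.ofHours(2)));
        System.out.println("seconds until timeout was logged: " + elapsedSeconds(start, logged));
    }
}
```

Under that assumption the request started at 15:52:03 and the warning came about 604 seconds later. That is close to a ten-minute limit, which would fit a DRPC request timeout of 600 seconds (I believe that is the default for drpc.request.timeout.secs, but check your own defaults.yaml). If so, the request was simply never answered in time rather than rejected.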


Re: java.io.IOException: Connection reset by peer in DRPC

Posted by Bobby Evans <ev...@yahoo-inc.com>.
OK, it looks like the exception is being caused by an EOF being received.
https://github.com/nathanmarz/thrift/blob/storm/lib/java/src/org/apache/thrift7/transport/TIOStreamTransport.java#L132
Did you see an exception in the DRPC server at about the same time?  The DRPC server should never close its connection unless it is shutting down.  So either the DRPC server was shut down, possibly crashed (but I would expect a different IOException for that one), or there is something external that is injecting packets to shut down the connection (some network security devices do this).

- Bobby



Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
Sorry for the small font. I found this in the worker log file:

    DRPCSpout [ERROR] Failed to fetch DRPC result from DRPC server
    org.apache.thrift7.transport.TTransportException: null
        at org.apache.thrift7.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[storm-core-0.9.6.jar:0.9.6]
    28T16:18:28.280+0200 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [100]
    2017-06-28T16:18:28.315+0200 b.s.d.DRPCSpout [ERROR] Failed to fetch DRPC result from DRPC server
    org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
        at org.apache.thrift7.transport.TSocket.open(TSocket.java:183) ~[storm-core-0.9.6.jar:0.9.6]
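
"Connection refused" from TSocket.open usually means that nothing was accepting connections on the DRPC host and port at that moment. A quick way to check this from the supervisor machine is a plain TCP probe. The sketch below uses only the JDK; the class name PortProbe is hypothetical, and port 3772 is the DRPC port used elsewhere in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused", unreachable hosts and timeouts all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java PortProbe <drpc-host> <port>; defaults are placeholders.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3772;
        String state = isListening(host, port, 2000) ? "is accepting connections" : "is not reachable";
        System.out.println(host + ":" + port + " " + state);
    }
}
```

If the probe fails while the topology is stuck, the DRPC daemon has probably stopped or is blocked by something on the network. If it succeeds, the refusal was transient and the DRPC and worker logs around that timestamp are the next place to look.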


Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
I found this in the worker log file:

    DRPCSpout [ERROR] Failed to fetch DRPC result from DRPC server
    org.apache.thrift7.transport.TTransportException: null
        at org.apache.thrift7.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[storm-core-0.9.6.jar:0.9.6]
    28T16:18:28.280+0200 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [100]
    2017-06-28T16:18:28.315+0200 b.s.d.DRPCSpout [ERROR] Failed to fetch DRPC result from DRPC server
    org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
        at org.apache.thrift7.transport.TSocket.open(TSocket.java:183) ~[storm-core-0.9.6.jar:0.9.6]


Re: java.io.IOException: Connection reset by peer in DRPC

Posted by Bobby Evans <ev...@yahoo-inc.com>.
OK so I misread your comment.  All it really means is that someone closed a connection to the thrift DRPC server.  You should look in the logs of the workers to see if any of them have been killed and relaunched recently.


- Bobby



Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
I haven't seen any exception except what I wrote here: Connection reset by
peer, and that is in the DRPC log file. In local mode it just stopped
without any message.


Re: java.io.IOException: Connection reset by peer in DRPC

Posted by Bobby Evans <ev...@yahoo-inc.com>.
It should have been in the exception.  You have not included enough information from the logs to be able to actually debug this.  I am just guessing from similar exceptions I have seen in the past.


- Bobby


On Wednesday, June 28, 2017, 9:23:22 AM CDT, sam mohel <sa...@gmail.com> wrote:

@Bobby Thanks for replying. How can I check the IP address and port to see whether it belongs to another worker? I'm on Ubuntu 14.04 LTS.

On Wed, Jun 28, 2017 at 4:15 PM, Bobby Evans <ev...@yahoo-inc.com> wrote:

Connection reset by peer typically means that another worker was rescheduled some place else and that worker closed its connection to this host.  If it did come from the worker then you should see the IP address + port of the other worker and see if it was rescheduled.  If this was because something else closed the connection then it is hard to tell what is happening.
With DRPC, a request is not guaranteed to be processed.  If the message is not processed in a timely manner you do need to retry it.


- Bobby


On Wednesday, June 28, 2017, 9:00:08 AM CDT, sam mohel <sa...@gmail.com> wrote:

Thanks for replying and for the help. I tried increasing worker.heap.memory.mb to 2048, but it did not work; DRPC stopped working. I wonder why it stopped when the data set I used is smaller than the first one. My data set is tweets, and I'm working on processing them. I tried local mode, but that did not work either: the result size stopped at 57.7 KB. Is there anything I should share?
On Wed, Jun 28, 2017 at 2:56 PM, Navin Ipe <navin.ipe@searchlighthealth.com> wrote:

@Sam: You've provided very little information for us to help you. Prima facie, if you have allocated too little memory for your topologies, Storm is obviously running out of memory, and the spouts and bolts are restarting, which causes the Connection reset by peer error.
The solution is to allow Storm to use more RAM (assuming there is more RAM).

int RAM_IN_MB = 2048;
Use stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, RAM_IN_MB);

If you provide more details of the error, when it happens and what your program is trying to accomplish, the others on this forum would be able to help you better.


On Wed, Jun 28, 2017 at 3:30 PM, sam mohel <sa...@gmail.com> wrote:

Is there any help, please?

On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
> I submitted two topologies in production mode . First one has a data set with size 215 MB and worked well  and gave me the results . Second topology has a data set with size 170 MB with same configurations but stopped worked after some times and didn't complete its result 
> The error i got is drpc log file 
>     TNonblockingServer [WARN] Got an IOException in internalRead!
>     java.io.IOException: Connection reset by peer
> I couldn't figure where is the problem as it supposed to work well as second data set is smaller in size 



-- 
Regards,
Navin




Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
@Bobby Thanks for replying. How can I check whether the IP address and port
belong to another worker? I'm on Ubuntu 14.04 LTS.

On Wed, Jun 28, 2017 at 4:15 PM, Bobby Evans <ev...@yahoo-inc.com> wrote:

> Connection reset by peer typically means that another worker was
> rescheduled some place else and that worker closed its connection to this
> host.  If it did come from the worker then you should see the IP address +
> port of the other worker and see if it was rescheduled.  If this was
> because something else closed the connection then it is hard to tell what
> is happening.
>
> With DRPC, a request is not guaranteed to be processed.  If the message is
> not processed in a timely manner you do need to retry it.
>
>
> - Bobby
>
>
>
> On Wednesday, June 28, 2017, 9:00:08 AM CDT, sam mohel <
> sammohel5@gmail.com> wrote:
>
>
> Thanks for replying and for the help. I tried increasing
> worker.heap.memory.mb to 2048, but it did not work; DRPC stopped working.
> I wonder why it stopped when the data set I used is smaller than the first
> one. My data set is tweets, and I'm working on processing them. I tried
> local mode, but that did not work either: the result size stopped at
> 57.7 KB. Is there anything I should share?
>
> On Wed, Jun 28, 2017 at 2:56 PM, Navin Ipe <navin.ipe@searchlighthealth.com> wrote:
>
> @Sam: You've provided very less information for us to help you. Prima
> facie, if you have allocated very less memory for your topologies, Storm is
> obviously running out of memory, the spouts and bolts are restarting which
> causes the Connection reset by peer error.
> The solution is to allow Storm to use more RAM (assuming there is more
> RAM).
>
> int RAM_IN_MB = 2048;
> Use stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, RAM_IN_MB);
>
> If you provide more details of the error, when it happens and what your
> program is trying to accomplish, the others on this forum would be able to
> help you better.
>
>
> On Wed, Jun 28, 2017 at 3:30 PM, sam mohel <sa...@gmail.com> wrote:
>
> Is there any help.please ?
>
> On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
> > I submitted two topologies in production mode . First one has a data set
> with size 215 MB and worked well  and gave me the results . Second topology
> has a data set with size 170 MB with same configurations but stopped worked
> after some times and didn't complete its result
> > The error i got is drpc log file
> >     TNonblockingServer [WARN] Got an IOException in internalRead!
> >     java.io.IOException: Connection reset by peer
> > I couldn't figure where is the problem as it supposed to work well as
> second data set is smaller in size
>
>
>
>
> --
> Regards,
> Navin
>
>
>

Re: java.io.IOException: Connection reset by peer in DRPC

Posted by Bobby Evans <ev...@yahoo-inc.com>.
Connection reset by peer typically means that another worker was rescheduled some place else and that worker closed its connection to this host.  If it did come from the worker then you should see the IP address + port of the other worker and see if it was rescheduled.  If this was because something else closed the connection then it is hard to tell what is happening.
With DRPC, a request is not guaranteed to be processed.  If the message is not processed in a timely manner you do need to retry it.
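
A minimal sketch of what such a client-side retry might look like, in plain Java. Nothing in it is Storm API: `DrpcRetry`, `callWithRetry`, the attempt count and the doubling backoff are all hypothetical names and choices made for illustration, and the `Callable` stands in for whatever request the client actually sends (for example the `DRPCClient` call from the earlier post).

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper: retries a request a bounded number of times.
 * The Callable is a placeholder for the real DRPC call; no Storm
 * classes are used here.
 */
public class DrpcRetry {

    /**
     * Runs the request until it succeeds or maxAttempts is reached,
     * sleeping between attempts and doubling the delay each time.
     * If every attempt fails, the last exception is rethrown.
     */
    public static <T> T callWithRetry(Callable<T> request,
                                      int maxAttempts,
                                      long initialBackoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff); // wait before the next attempt
                    backoff *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request: fails twice with the error from this thread, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("Connection reset by peer");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retrying only makes sense if the request is safe to run more than once, so whether this fits depends on what the topology does with each call.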


- Bobby


On Wednesday, June 28, 2017, 9:00:08 AM CDT, sam mohel <sa...@gmail.com> wrote:

Thanks for replying and for the help. I tried increasing worker.heap.memory.mb to 2048, but it did not work; DRPC stopped working. I wonder why it stopped when the data set I used is smaller than the first one. My data set is tweets, and I'm working on processing them. I tried local mode, but that did not work either: the result size stopped at 57.7 KB. Is there anything I should share?
On Wed, Jun 28, 2017 at 2:56 PM, Navin Ipe <na...@searchlighthealth.com> wrote:

@Sam: You've provided very little information for us to help you. Prima facie, if you have allocated too little memory for your topologies, Storm is obviously running out of memory, and the spouts and bolts are restarting, which causes the Connection reset by peer error.
The solution is to allow Storm to use more RAM (assuming there is more RAM).

int RAM_IN_MB = 2048;
Use stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, RAM_IN_MB);

If you provide more details of the error, when it happens and what your program is trying to accomplish, the others on this forum would be able to help you better.


On Wed, Jun 28, 2017 at 3:30 PM, sam mohel <sa...@gmail.com> wrote:

Is there any help, please?

On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
> I submitted two topologies in production mode . First one has a data set with size 215 MB and worked well  and gave me the results . Second topology has a data set with size 170 MB with same configurations but stopped worked after some times and didn't complete its result 
> The error i got is drpc log file 
>     TNonblockingServer [WARN] Got an IOException in internalRead!
>     java.io.IOException: Connection reset by peer
> I couldn't figure where is the problem as it supposed to work well as second data set is smaller in size 



-- 
Regards,
Navin


Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
Thanks for replying and for the help. I tried increasing
worker.heap.memory.mb to 2048, but it did not work; DRPC stopped working.
I wonder why it stopped when the data set I used is smaller than the first
one. My data set is tweets, and I'm working on processing them. I tried
local mode, but that did not work either: the result size stopped at
57.7 KB. Is there anything I should share?

On Wed, Jun 28, 2017 at 2:56 PM, Navin Ipe <na...@searchlighthealth.com>
wrote:

> @Sam: You've provided very less information for us to help you. Prima
> facie, if you have allocated very less memory for your topologies, Storm is
> obviously running out of memory, the spouts and bolts are restarting which
> causes the Connection reset by peer error.
> The solution is to allow Storm to use more RAM (assuming there is more
> RAM).
>
> int RAM_IN_MB = 2048;
> Use stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, RAM_IN_MB);
>
> If you provide more details of the error, when it happens and what your
> program is trying to accomplish, the others on this forum would be able to
> help you better.
>
>
> On Wed, Jun 28, 2017 at 3:30 PM, sam mohel <sa...@gmail.com> wrote:
>
>> Is there any help.please ?
>>
>> On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
>> > I submitted two topologies in production mode . First one has a data
>> set with size 215 MB and worked well  and gave me the results . Second
>> topology has a data set with size 170 MB with same configurations but
>> stopped worked after some times and didn't complete its result
>> > The error i got is drpc log file
>> >     TNonblockingServer [WARN] Got an IOException in internalRead!
>> >     java.io.IOException: Connection reset by peer
>> > I couldn't figure where is the problem as it supposed to work well as
>> second data set is smaller in size
>>
>
>
>
> --
> Regards,
> Navin
>

Re: java.io.IOException: Connection reset by peer in DRPC

Posted by Navin Ipe <na...@searchlighthealth.com>.
@Sam: You've provided very little information for us to help you. Prima
facie, if you have allocated too little memory for your topologies, Storm is
obviously running out of memory, and the spouts and bolts are restarting,
which causes the Connection reset by peer error.
The solution is to allow Storm to use more RAM (assuming there is more
RAM).

int RAM_IN_MB = 2048;
Use stormConfig.put(Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB, RAM_IN_MB);

If you provide more details of the error, when it happens and what your
program is trying to accomplish, the others on this forum would be able to
help you better.
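
For anyone who wants to try the setting above in isolation, here is a rough sketch. It uses a plain `Map` instead of Storm's `Config` so that it runs without Storm on the classpath; `HeapConfigSketch` and `withWorkerHeap` are hypothetical names, and the key string is an assumption about what `Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB` resolves to, so check it against the Storm version in use.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: builds a topology config map with a worker heap cap. */
public class HeapConfigSketch {

    // Assumed to be the string behind Config.TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB.
    public static final String WORKER_MAX_HEAP_KEY = "topology.worker.max.heap.size.mb";

    /** Returns a copy of conf with the per-worker heap cap set, in megabytes. */
    public static Map<String, Object> withWorkerHeap(Map<String, Object> conf, int ramInMb) {
        if (ramInMb <= 0) {
            throw new IllegalArgumentException("heap size must be positive, got " + ramInMb);
        }
        Map<String, Object> copy = new HashMap<>(conf);
        copy.put(WORKER_MAX_HEAP_KEY, ramInMb);
        return copy;
    }

    public static void main(String[] args) {
        int RAM_IN_MB = 2048;
        Map<String, Object> stormConfig = withWorkerHeap(new HashMap<>(), RAM_IN_MB);
        System.out.println(stormConfig);
    }
}
```

In a real topology the same value would go into the `Config` object that is passed to `StormSubmitter.submitTopology`.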


On Wed, Jun 28, 2017 at 3:30 PM, sam mohel <sa...@gmail.com> wrote:

> Is there any help.please ?
>
> On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
> > I submitted two topologies in production mode . First one has a data set
> with size 215 MB and worked well  and gave me the results . Second topology
> has a data set with size 170 MB with same configurations but stopped worked
> after some times and didn't complete its result
> > The error i got is drpc log file
> >     TNonblockingServer [WARN] Got an IOException in internalRead!
> >     java.io.IOException: Connection reset by peer
> > I couldn't figure where is the problem as it supposed to work well as
> second data set is smaller in size
>



-- 
Regards,
Navin

Re: java.io.IOException: Connection reset by peer in DRPC

Posted by sam mohel <sa...@gmail.com>.
Is there any help, please?

On Wednesday, June 28, 2017, sam mohel <sa...@gmail.com> wrote:
> I submitted two topologies in production mode . First one has a data set
> with size 215 MB and worked well  and gave me the results . Second topology
> has a data set with size 170 MB with same configurations but stopped worked
> after some times and didn't complete its result
> The error i got is drpc log file
>     TNonblockingServer [WARN] Got an IOException in internalRead!
>     java.io.IOException: Connection reset by peer
> I couldn't figure where is the problem as it supposed to work well as
> second data set is smaller in size
