Posted to user@accumulo.apache.org by "mohit.kaushik" <mo...@orkash.com> on 2016/02/22 07:36:48 UTC

IOException in internalRead!

I am seeing the exception below continuously. The count keeps increasing every second (currently around 3000 on one server), and I can see the exception on all 3 tablet servers.

ACCUMULO-2420 <https://issues.apache.org/jira/browse/ACCUMULO-2420> says that this exception occurs when a client closes a connection before a scan completes. But the connection is not closed; every thread uses a common connection object to ingest and query. What could cause this exception, then?

	java.io.IOException: Connection reset by peer
		at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
		at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
		at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
		at sun.nio.ch.IOUtil.read(IOUtil.java:197)
		at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
		at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:537)
		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:338)
		at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:203)
		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.select(CustomNonBlockingServer.java:228)
		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.run(CustomNonBlockingServer.java:184)

Regards
Mohit kaushik


Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by Josh Elser <jo...@gmail.com>.

Keith Turner wrote:
> On Fri, Feb 26, 2016 at 7:33 AM, mohit.kaushik <mo...@orkash.com> wrote:
>
>     Thanks Keith, but my Accumulo clients are using the same connection
>     object, and the count for these WARN messages increases every second.
>     Can the Monitor cause these exceptions?
>
>
> I don't think so, but not 100% sure.  I think the Master process usually
> talks to the tservers to gather info and then the monitor talks to the
> tserver.
>
> I am wondering if there is any reason that this message should be logged
> at WARN.  Seems like a routine event; should we open an issue to look
> into logging this at a lower level?

Yeah, this is my hunch too. I know we can be a little lax in letting
connections close via their configured timeout. It seems to me like this
is just the server saying "oh, this got closed when I was expecting more
data from the client!". We would need to be certain, like you say, though.
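
For illustration, the situation described here (the server expecting more
data when the peer goes away) can be reproduced with plain java.net sockets.
The sketch below is not Thrift or Accumulo code; the class name
TruncatedFrameSketch and the 4-byte length header are assumptions made up
for the example. A client announces a 100-byte frame, sends only 10 bytes,
and closes. The server-side read then fails with an IOException (an
EOFException on a clean close, or "Connection reset" when the peer aborts
the connection).

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class TruncatedFrameSketch {

    /**
     * Runs a loopback server and a client that closes mid-frame.
     * Returns true if the server-side read failed with an IOException
     * before the announced frame was complete.
     */
    static boolean serverSeesEarlyClose() throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            server.setSoTimeout(5000); // do not wait forever in accept()

            Thread client = new Thread(() -> {
                try (Socket s = new Socket(loopback, server.getLocalPort());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    out.writeInt(100);       // frame header: promise 100 bytes
                    out.write(new byte[10]); // deliver only 10 of them
                    out.flush();
                } catch (IOException ignored) {
                    // client-side failures are not the point of this sketch
                }
            });
            client.start();

            try (Socket conn = server.accept();
                 DataInputStream in = new DataInputStream(conn.getInputStream())) {
                int frameSize = in.readInt();
                byte[] frame = new byte[frameSize];
                in.readFully(frame); // blocks until the full frame arrives or the peer goes away
                return false;        // the full frame arrived (not expected here)
            } catch (IOException e) {
                // The peer closed while the server still expected data.
                return true;
            } finally {
                client.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("server saw early close: " + serverSeesEarlyClose());
    }
}
```

From the server's point of view this is a routine event rather than a
fault, which is the argument for logging it below WARN.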

Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by Josh Elser <jo...@gmail.com>.
Are you using a Scanner and then a BatchWriter for that existence check?

mohit.kaushik wrote:
> I have upgraded to Accumulo 1.7.1, but the problem has not gone away
> completely. Strangely, I am now getting the same error on a single
> server rather than on all of them. Is it because of the lookup that the
> application always does to check the existence of a document before
> inserting one?
>
> [image: recent logs]
> Keith, if you think it is required, I will create a JIRA issue for this.
>
> Thanks

Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by "mohit.kaushik" <mo...@orkash.com>.
I have upgraded to Accumulo 1.7.1, but the problem has not gone away
completely. Strangely, I am now getting the same error on a single
server rather than on all of them. Is it because of the lookup that the
application always does to check the existence of a document before
inserting one?

[image: recent logs]
Keith, if you think it is required, I will create a JIRA issue for this.

Thanks



Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Feb 26, 2016 at 7:33 AM, mohit.kaushik <mo...@orkash.com>
wrote:

> Thanks Keith, but my Accumulo clients are using the same connection object,
> and the count for these WARN messages increases every second. Can the
> Monitor cause these exceptions?
>

I don't think so, but not 100% sure.  I think the Master process usually
talks to the tservers to gather info and then the monitor talks to the
tserver.

I am wondering if there is any reason that this message should be logged at
WARN.  Seems like a routine event, should we open an issue to look into
logging this at a lower level?


Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by "mohit.kaushik" <mo...@orkash.com>.
Thanks Keith, but my Accumulo clients are using the same connection object,
and the count for these WARN messages increases every second. Can the
Monitor cause these exceptions?

On 02/24/2016 08:18 PM, Keith Turner wrote:
> You can probably ignore those.  I think it's caused by an Accumulo
> client closing its connection.

Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by Keith Turner <ke...@deenlo.com>.
You can probably ignore those.  I think it's caused by an Accumulo client
closing its connection.

On Wed, Feb 24, 2016 at 6:35 AM, mohit.kaushik <mo...@orkash.com>
wrote:

> Here is a screenshot; should I ignore these warnings?
>
>
> [image: internal read exception]

Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by "mohit.kaushik" <mo...@orkash.com>.
Here is a screenshot; should I ignore these warnings?


[image: internal read exception]

On 02/22/2016 12:23 PM, mohit.kaushik wrote:
> Sent too early...
>
> Another exception that I am getting frequently involves ZooKeeper, and
> it is a bigger problem.
> ACCUMULO-3336 <https://issues.apache.org/jira/browse/ACCUMULO-3336>
> says it is still unresolved.
> Saw (possibly) transient exception communicating with ZooKeeper
> 	org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f8708e0d-9238-41f5-b948-8f435fd01207/gc/lock
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 		at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
> 		at org.apache.accumulo.fate.zookeeper.ZooReader.getStatus(ZooReader.java:132)
> 		at org.apache.accumulo.fate.zookeeper.ZooLock.process(ZooLock.java:383)
> 		at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> 		at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> The worst case is that whenever a ZooKeeper server goes down, the
> cluster becomes unreachable, and the ingest process halts until it
> restarts.
>
> What do you suggest? I need to resolve these problems; I do not want
> the ingest process to ever stop.
>
> Thanks
> Mohit kaushik


-- 

*Mohit Kaushik*
Software Engineer
A Square,Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
*Tel:*+91 (124) 4969352 | *Fax:*+91 (124) 4033553

<http://politicomapper.orkash.com>interactive social intelligence at work...

<https://www.facebook.com/Orkash2012> 
<http://www.linkedin.com/company/orkash-services-private-limited> 
<https://twitter.com/Orkash> <http://www.orkash.com/blog/> 
<http://www.orkash.com>
<http://www.orkash.com> ... ensuring Assurance in complexity and uncertainty

/This message including the attachments, if any, is a confidential 
business communication. If you are not the intended recipient it may be 
unlawful for you to read, copy, distribute, disclose or otherwise use 
the information in this e-mail. If you have received it in error or are 
not the intended recipient, please destroy it and notify the sender 
immediately. Thank you /


Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by "mohit.kaushik" <mo...@orkash.com>.
Thanks for the reply Josh, I am running 3 zookeeper servers.
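
For reference, a ZooKeeper ensemble stays available only while a majority
of its servers are up. The arithmetic for a 3-server ensemble can be
sketched as follows; QuorumSketch and its method names are hypothetical,
made up for this example.

```java
class QuorumSketch {

    /** Smallest number of servers that forms a majority of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still alive. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        int servers = 3; // the ensemble size mentioned in this message
        System.out.println("ensemble size:      " + servers);
        System.out.println("majority needed:    " + majority(servers));
        System.out.println("tolerable failures: " + tolerableFailures(servers));
    }
}
```

With 3 servers the majority is 2, so losing a single ZooKeeper server
should not by itself make the ensemble unavailable, assuming the clients
list all three servers in their connection string.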

On 02/24/2016 10:29 PM, Josh Elser wrote:
> ZooKeeper is a funny system. This kind of ConnectionLossException is a
> normal "state" that a ZooKeeper client can enter. We handle this
> condition in Accumulo, retrying the operation (in this case, an
> `exists()` call) after the client can reconnect to the ZooKeeper
> servers in the background.
>
> ConnectionLossExceptions can be indicative of over-saturation of your
> nodes. A ZooKeeper client might lose its connection because it is
> starved for CPU time. It can also indicate that the ZooKeeper servers
> might be starved for resources.
>
> * Check the ZooKeeper server logs for any errors about dropped 
> connections (maxClientCnxns)
> * Make sure your servers running Accumulo are not running at 100% 
> total CPU usage and that there is free memory (no swapping).
>
> ACCUMULO-3336 is about a different ZooKeeper error condition called a 
> "session loss". This is when the entire ZooKeeper session needs to be 
> torn down and recreated. This only happens after prolonged pauses in 
> the client JVM or the ZooKeeper servers actively drop your connections 
> due to the internal configuration (maxClientCnxns). The stacktrace you 
> copied is not a session loss error.
>
> Are you saying that when a ZooKeeper server dies, you cannot use 
> Accumulo? How many are you running?
>


-- 
*Mohit Kaushik*
Software Engineer
A Square,Plot No. 278, Udyog Vihar, Phase 2, Gurgaon 122016, India
*Tel:*+91 (124) 4969352 | *Fax:*+91 (124) 4033553



Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by Josh Elser <jo...@gmail.com>.
ZooKeeper is a funny system. This kind of ConnectionLossException is a 
normal "state" that a ZooKeeper client can enter. We handle this 
condition in Accumulo by retrying the operation (in this case, an 
`exists()` call, as the stack trace shows) once the client has 
reconnected to the ZooKeeper servers in the background.
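
The retry behaviour described above can be sketched roughly as follows. 
This is a simplified, self-contained illustration and not Accumulo's 
actual code: `ConnectionLossSketchException` is a hypothetical stand-in 
for ZooKeeper's `KeeperException.ConnectionLossException` (so the example 
compiles without the ZooKeeper jar), and `RetrySketch`, its method name, 
and the backoff values are likewise assumptions.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for KeeperException.ConnectionLossException,
// used so this sketch has no dependency on the ZooKeeper jar.
class ConnectionLossSketchException extends Exception {
    ConnectionLossSketchException(String message) {
        super(message);
    }
}

class RetrySketch {
    // Assumed upper bound on the pause between attempts.
    private static final long MAX_BACKOFF_MS = 5000L;

    // Runs op, retrying only when it fails with a connection loss.
    // Any other exception propagates immediately. After maxAttempts
    // failed attempts, the last connection-loss exception is rethrown.
    static <T> T retryOnConnectionLoss(Callable<T> op, int maxAttempts,
                                       long initialBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossSketchException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the failure
                }
                // Pause so the client can reconnect in the background,
                // then double the pause up to the assumed cap.
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, MAX_BACKOFF_MS);
            }
        }
    }
}
```

A caller would wrap the ZooKeeper operation in the `Callable`, for 
example `RetrySketch.retryOnConnectionLoss(() -> readStatus(), 5, 100L)`, 
where `readStatus` is a placeholder for the real call.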

ConnectionLossExceptions can be indicative of over-saturation of your 
nodes. A ZooKeeper client might lose its connection because it is 
starved for CPU time. It can also indicate that the ZooKeeper servers 
might be starved for resources.

* Check the ZooKeeper server logs for any errors about dropped 
connections (maxClientCnxns)
* Make sure your servers running Accumulo are not running at 100% total 
CPU usage and that there is free memory (no swapping).

ACCUMULO-3336 is about a different ZooKeeper error condition called a 
"session loss". This is when the entire ZooKeeper session needs to be 
torn down and recreated. This only happens after prolonged pauses in the 
client JVM or when the ZooKeeper servers actively drop your connections 
due to their internal configuration (maxClientCnxns). The stack trace you 
copied is not a session loss error.

Are you saying that when a ZooKeeper server dies, you cannot use 
Accumulo? How many are you running?
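
The reason the number of servers matters: a ZooKeeper ensemble keeps 
serving requests only while a majority (quorum) of its servers is up. A 
minimal sketch of that arithmetic, with hypothetical class and method 
names that are not part of ZooKeeper or Accumulo:

```java
class QuorumSketch {
    // Smallest number of servers that forms a strict majority.
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensembleSize must be >= 1");
        }
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can be down while a majority is still up.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }
}
```

So an ensemble of 3 survives the loss of one server, while an ensemble 
of 1 or 2 survives none.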

mohit.kaushik wrote:
> Sent too early...
>
> I am also frequently getting another exception with ZooKeeper, which
> is a bigger problem.
> ACCUMULO-3336 <https://issues.apache.org/jira/browse/ACCUMULO-3336> is
> still unresolved.
>
> Saw (possibly) transient exception communicating with ZooKeeper
> 	org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f8708e0d-9238-41f5-b948-8f435fd01207/gc/lock
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> 		at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 		at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
> 		at org.apache.accumulo.fate.zookeeper.ZooReader.getStatus(ZooReader.java:132)
> 		at org.apache.accumulo.fate.zookeeper.ZooLock.process(ZooLock.java:383)
> 		at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> 		at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>
> Worst of all, whenever a ZooKeeper server goes down, the cluster becomes
> unreachable until that server restarts, and the ingest process halts.
>
> What do you suggest? I need to resolve these problems. I do not want
> the ingest process to ever stop.
>
> Thanks
> Mohit kaushik
>
>
> On 02/22/2016 12:06 PM, mohit.kaushik wrote:
>> I am continuously seeing the exception below. The count keeps increasing every second (currently around 3000 on one server), and I can see the exception on all 3 tablet servers.
>>
>> ACCUMULO-2420  <https://issues.apache.org/jira/browse/ACCUMULO-2420>  says that this exception occurs when a client closes a connection before a scan completes. But the connection is not closed: every thread uses a common connection object to ingest and query. What could cause this exception, then?
>>
>> 	java.io.IOException: Connection reset by peer
>> 		at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>> 		at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>> 		at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>> 		at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>> 		at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>> 		at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
>> 		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:537)
>> 		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:338)
>> 		at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:203)
>> 		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.select(CustomNonBlockingServer.java:228)
>> 		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.run(CustomNonBlockingServer.java:184)
>>
>> Regards
>> Mohit kaushik
>>

Re: IOException in internalRead! & transient exception communicating with ZooKeeper

Posted by "mohit.kaushik" <mo...@orkash.com>.
Sent too early...

I am also frequently getting another exception with ZooKeeper, which is a 
bigger problem.
ACCUMULO-3336 <https://issues.apache.org/jira/browse/ACCUMULO-3336> is 
still unresolved.

Saw (possibly) transient exception communicating with ZooKeeper
	org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f8708e0d-9238-41f5-b948-8f435fd01207/gc/lock
		at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
		at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
		at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
		at org.apache.accumulo.fate.zookeeper.ZooReader.getStatus(ZooReader.java:132)
		at org.apache.accumulo.fate.zookeeper.ZooLock.process(ZooLock.java:383)
		at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
		at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)

Worst of all, whenever a ZooKeeper server goes down, the cluster becomes 
unreachable until that server restarts, and the ingest process halts.

What do you suggest? I need to resolve these problems. I do not want 
the ingest process to ever stop.

Thanks
Mohit kaushik


On 02/22/2016 12:06 PM, mohit.kaushik wrote:
> I am continuously seeing the exception below. The count keeps increasing every second (currently around 3000 on one server), and I can see the exception on all 3 tablet servers.
>
> ACCUMULO-2420  <https://issues.apache.org/jira/browse/ACCUMULO-2420>  says that this exception occurs when a client closes a connection before a scan completes. But the connection is not closed: every thread uses a common connection object to ingest and query. What could cause this exception, then?
>
> 	java.io.IOException: Connection reset by peer
> 		at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> 		at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> 		at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> 		at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> 		at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> 		at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
> 		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:537)
> 		at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:338)
> 		at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:203)
> 		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.select(CustomNonBlockingServer.java:228)
> 		at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.run(CustomNonBlockingServer.java:184)
>
> Regards
> Mohit kaushik