Posted to user@accumulo.apache.org by Keith Turner <ke...@deenlo.com> on 2016/02/24 15:48:48 UTC
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
You can probably ignore those. I think it's caused by an Accumulo client
closing its connection.
On Wed, Feb 24, 2016 at 6:35 AM, mohit.kaushik <mo...@orkash.com>
wrote:
> Here is a screenshot. Should I ignore these warnings?
>
>
> [image: internal read exception]
>
>
> On 02/22/2016 12:23 PM, mohit.kaushik wrote:
>
> Sent too early...
>
> Another exception I am getting frequently, this one with ZooKeeper, is a
> bigger problem.
> ACCUMULO-3336 <https://issues.apache.org/jira/browse/ACCUMULO-3336> shows
> it is still unresolved.
>
> Saw (possibly) transient exception communicating with ZooKeeper
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /accumulo/f8708e0d-9238-41f5-b948-8f435fd01207/gc/lock
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
> at org.apache.accumulo.fate.zookeeper.ZooReader.getStatus(ZooReader.java:132)
> at org.apache.accumulo.fate.zookeeper.ZooLock.process(ZooLock.java:383)
> at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>
> And in the worst case, whenever a ZooKeeper server goes down, the cluster
> becomes unreachable for the time being; until it restarts, the ingest
> process halts.
>
> What do you suggest? I need to resolve these problems. I do not want the
> ingest process to stop, ever.
>
> Thanks
> Mohit kaushik
>
>
> On 02/22/2016 12:06 PM, mohit.kaushik wrote:
>
> I am seeing the exception below continuously. The count keeps increasing every second (currently around 3000 on one server), and I can see the exception on all 3 tablet servers.
> ACCUMULO-2420 <https://issues.apache.org/jira/browse/ACCUMULO-2420> says that this exception occurs when a client closes a connection before a scan completes. But the connection is not closed; every thread uses a common connection object to ingest and query. What could cause this exception, then?
>
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
> at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:537)
> at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:338)
> at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:203)
> at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.select(CustomNonBlockingServer.java:228)
> at org.apache.accumulo.server.rpc.CustomNonBlockingServer$SelectAcceptThread.run(CustomNonBlockingServer.java:184)
>
> Regards
> Mohit kaushik
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
Posted by Josh Elser <jo...@gmail.com>.
Keith Turner wrote:
> On Fri, Feb 26, 2016 at 7:33 AM, mohit.kaushik <mohit.kaushik@orkash.com
> <ma...@orkash.com>> wrote:
>
> Thanks Keith, but my Accumulo clients are all using the same connection
> object, and the count of these WARN messages increases every second. Can
> the Monitor cause these exceptions?
>
>
> I don't think so, but not 100% sure. I think the Master process usually
> talks to the tservers to gather info and then the monitor talks to the
> tserver.
>
> I am wondering if there is any reason that this message should be logged
> at WARN. Seems like a routine event, should we open an issue to look
> into logging this at a lower level?
Yeah, this is my hunch too. I know we can be a little lax in letting
connections close via their configured timeout. It seems to me like this
is just the server saying "oh, this got closed when I was expecting more
data from the client!". Would need to be certain, like you say, tho.
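To make that hunch concrete, here is a minimal sketch in plain Java of the situation described above: a server that is still waiting for the rest of a message when the client aborts the connection. It is not Accumulo or Thrift code. The class and method names are made up for illustration, and it uses blocking sockets rather than the non-blocking server in the stack trace, so the exact exception text may differ from "Connection reset by peer".

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ConnectionResetDemo {

    /**
     * Opens a loopback connection, lets the client send half of a 4-byte
     * message and then abort the connection. Returns the exception the
     * server side hits while reading the rest, or null on a clean end of
     * stream.
     */
    static IOException readAfterClientAbort() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            try (Socket accepted = server.accept()) {
                // The client sends only part of what the server expects.
                client.getOutputStream().write(new byte[] {1, 2});
                // SO_LINGER with a zero timeout makes close() abort the
                // connection (TCP reset) instead of closing it gracefully.
                client.setSoLinger(true, 0);
                client.close();

                InputStream in = accepted.getInputStream();
                byte[] message = new byte[4];
                int total = 0;
                try {
                    while (total < message.length) {
                        int n = in.read(message, total, message.length - total);
                        if (n < 0) {
                            return null; // clean end of stream, no reset seen
                        }
                        total += n;
                    }
                    return null; // whole message arrived
                } catch (IOException e) {
                    return e; // the server was expecting more data
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        IOException seen = readAfterClientAbort();
        System.out.println(seen == null ? "clean end of stream" : "server saw: " + seen);
    }
}
```

If the server side reports an exception here even though nothing is wrong with the server itself, that would be consistent with treating the WARN as a routine event rather than a tablet server fault.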
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
Posted by Josh Elser <jo...@gmail.com>.
Are you using a Scanner and then a BatchWriter for that existence check?
mohit.kaushik wrote:
> I have upgraded to Accumulo 1.7.1, but the problem has not gone away
> completely. Now, strangely, I am getting the same error on a single server,
> not on all of them. Is it because of the lookup that the application always
> does to check the existence of a document before inserting one?
>
> recent logs
> Keith, if you say so, I will create a JIRA issue for this.
>
> Thanks
>
>
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
Posted by "mohit.kaushik" <mo...@orkash.com>.
I have upgraded to Accumulo 1.7.1, but the problem has not gone away
completely. Now, strangely, I am getting the same error on a single server,
not on all of them. Is it because of the lookup that the application always
does to check the existence of a document before inserting one?
recent logs
Keith, if you say so, I will create a JIRA issue for this.
Thanks
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Feb 26, 2016 at 7:33 AM, mohit.kaushik <mo...@orkash.com>
wrote:
> Thanks Keith, but my Accumulo clients are all using the same connection
> object, and the count of these WARN messages increases every second. Can
> the Monitor cause these exceptions?
>
I don't think so, but not 100% sure. I think the Master process usually
talks to the tservers to gather info and then the monitor talks to the
tserver.
I am wondering if there is any reason that this message should be logged at
WARN. Seems like a routine event, should we open an issue to look into
logging this at a lower level?
Re: IOException in internalRead! & transient exception communicating
with ZooKeeper
Posted by "mohit.kaushik" <mo...@orkash.com>.
Thanks Keith, but my Accumulo clients are all using the same connection
object, and the count of these WARN messages increases every second. Can
the Monitor cause these exceptions?