
Posted to user@accumulo.apache.org by "mohit.kaushik" <mo...@orkash.com> on 2016/01/22 12:19:24 UTC

error sending update to orkash1:9997

Dear All,
error sending update to orkash1:9997:
org.apache.thrift.transport.TTransportException:
java.net.SocketTimeoutException: 120000 millis timeout while waiting for
channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/192.168.10.122:59662
remote=orkash1/192.168.10.121:9997]

This morning I found this error in the monitor logs for all three
servers. One of them (orkash3) was dead, possibly because of network
issues that I am still trying to diagnose. It showed these messages
before it died:

GC pause checker not called in a timely fashion. Expected every 30.0 seconds but was 160.6 seconds since last check

unable to get tablet server status orkash3:9997[1523a7ae07e0081] org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out

Lost tablet server lock (reason = SESSION_EXPIRED), exiting.

The other two servers do not have any network issues, yet they also show
errors sending updates, and 36000 mutations were rejected yesterday.

One more thing I do not understand: every time I compact a table, the
shell says

WARN : Thread "shell" stuck on IO to orkash5:9999 (0) for at least
120084 ms

and after a long time (4 - 5 hours) it completes with the message

INFO : Thread "shell" no longer stuck on IO to orkash5:9999 (0) sawError
= false

I have no idea why the shell always gets stuck on big tables, or why
there are errors sending updates to the other servers. Please shed some
light on this.

There are some more errors/warnings that might help or be related.

Tracing spans are being dropped because there are already 5000 spans queued for delivery.
This does not affect performance, security or data integrity, but distributed tracing information is being lost.

Thread "gc" stuck on IO to orkash5:9999 (0) for at least 120429 ms (on orkash2)

2016-01-22 16:33:57,825 [gc.SimpleGarbageCollector] INFO : List of
delete candidates has exceeded the memory threshold. Attempting to
delete what has been gathered so far.

Do I need to change any GC configuration? The ingest rate is not very
high for now.
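
To see which hosts the "stuck on IO" warnings point at, the messages quoted in this post could be tallied with a small script. This is only a sketch: the regular expression is assumed from the two message formats shown here, and the function name summarize_stuck_io is made up for illustration, not part of Accumulo.

```python
import re
from collections import defaultdict

# Pattern assumed from the messages quoted in this post, e.g.
#   Thread "shell" stuck on IO to orkash5:9999 (0) for at least 120084 ms
STUCK_RE = re.compile(
    r'Thread "(?P<thread>[^"]+)" stuck on IO to (?P<host>\S+:\d+) \(\d+\) '
    r'for at least\s+(?P<ms>\d+) ms'
)


def summarize_stuck_io(text):
    """Return {(thread, host): longest reported stall in ms}."""
    # Log lines pasted into mail are often wrapped, so collapse whitespace first.
    flat = " ".join(text.split())
    longest = defaultdict(int)
    for m in STUCK_RE.finditer(flat):
        key = (m.group("thread"), m.group("host"))
        longest[key] = max(longest[key], int(m.group("ms")))
    return dict(longest)


# Sample input copied from the warnings in this post.
sample = '''
WARN : Thread "shell" stuck on IO to orkash5:9999 (0) for at least
120084 ms
Thread "gc" stuck on IO to orkash5:9999 (0) for at least 120429 ms
'''

for (thread, host), ms in summarize_stuck_io(sample).items():
    print(f"{thread} -> {host}: {ms} ms")
```

In the excerpts above both the shell and the gc thread are waiting on the same endpoint (orkash5:9999), which may be a useful place to start looking.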

Thanks
Mohit Kaushik