Posted to user@accumulo.apache.org by James Srinivasan <ja...@gmail.com> on 2019/12/08 12:39:48 UTC

Accumulo 1.7 tserver TTransportException errors every minute

I'm running Accumulo 1.7 (HDP3) on a Kerberized cluster. When trying
to debug some client libthrift issues, I noticed errors like this
every minute (pretty much on the minute) in my tserver logs:

2019-12-08 12:35:01,255 [server.TThreadPoolServer] ERROR: Error
occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
        at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
        at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
        at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
        at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        ... 11 more

I get them even if there are no clients running and the tserver is the
only accumulo process running in the cluster (no master, no other
tservers etc.) and curiously I don't see any network traffic on port
9997. Any idea how to debug further?

Many thanks,

James

Re: Accumulo 1.7 tserver TTransportException errors every minute

Posted by Josh Elser <el...@apache.org>.
Yeah, this is an annoying combination of the ambari-agent and Thrift. 
Good on you to get to the bottom of it, and thanks for commenting back 
to the list!

The agent opens a socket to the Thrift service to do the Ambari "port 
check", but sends no data (as it's just checking network connectivity). 
If the server accepts the connection, the Agent just hangs up, assuming 
everything is good. However, this triggers an error in Thrift instead of 
just proceeding.
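
To make the sequence above concrete, here is a minimal sketch of a connect-and-hang-up port check against a server that expects to read a handshake first. It is not ambari-agent's or Thrift's actual code: the names `port_check`, `serve_once` and `demo`, the loopback listener, and the 5-byte read standing in for the SASL start message are all assumptions made for illustration.

```python
import socket
import threading


def port_check(host, port, timeout=1.0):
    """Hypothetical port check: open a TCP connection, send nothing, hang up."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True  # the connection was accepted, so the port is "up"
    except OSError:
        return False


def serve_once(listener, results):
    """Stand-in for a server that expects a handshake message on connect."""
    conn, _ = listener.accept()
    with conn:
        # Assumption: the server tries to read a small header first.
        # The peer has already closed, so recv() returns b"" (end of stream).
        results.append(conn.recv(5))


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    results = []
    worker = threading.Thread(target=serve_once, args=(listener, results))
    worker.start()

    ok = port_check("127.0.0.1", port)
    worker.join(timeout=5)
    listener.close()
    return ok, results


if __name__ == "__main__":
    print(demo())
```

The check succeeds from the client's point of view, while the server side sees end-of-stream where it expected handshake bytes. That appears to be the situation the `TTransportException` in `receiveSaslMessage` is reporting.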

IMO, Thrift shouldn't log this as an error, but it's what we have :)

On 12/8/19 12:12 PM, James Srinivasan wrote:
> Ahh, looks like it was the ambari-agent process, part of HDP. Since
> that runs on the same machine, it wasn't in my tcpdump.
> 
> Second time ambari-agent has done something unexpected for me! (first
> was renewing a keytab behind my back)

Re: Accumulo 1.7 tserver TTransportException errors every minute

Posted by James Srinivasan <ja...@gmail.com>.
Ahh, looks like it was the ambari-agent process, part of HDP. Since
that runs on the same machine, it wasn't in my tcpdump.

Second time ambari-agent has done something unexpected for me! (first
was renewing a keytab behind my back)
