Posted to user@hbase.apache.org by apratim sharma <ap...@gmail.com> on 2018/02/27 18:16:01 UTC

HBase failed on local exception and failed servers list.

Hi Guys,

I am using HBase 1.2.0 on a Kerberos-secured Cloudera CDH 5.8 cluster.
I have a persistent application that authenticates using a keytab and creates
an HBase connection. Our code also takes care of reauthentication and
recreating a broken connection.
The code worked fine with previous versions of HBase. However, what we see
with HBase 1.2 is that after 24 hours the HBase connection stops working
and gives the following error:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=2, exceptions:
Tue Feb 13 12:57:51 PST 2018,
RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=137,
waitTime=11
Tue Feb 13 12:58:01 PST 2018,
RpcRetryingCaller{globalStartTime=1518555467140, pause=100, retries=2},
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to
pdmcdh01.xyz.com/192.168.145.62:60020 failed on local exception:
org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection
to pdmcdh01.xyz.com/192.168.145.62:60020 is closing. Call id=139,
waitTime=13

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
Our code reauthenticates and creates the connection again, but it still keeps
failing:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=2, exceptions:
Wed Feb 21 14:30:31 PST 2018,
RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
java.io.IOException: Couldn't setup connection for pipe@HADOOP.XYZ.COM to
hbase/pdmcdh01.xyz.com@HADOOP.XYZ.COM
Wed Feb 21 14:30:31 PST 2018,
RpcRetryingCaller{globalStartTime=1519252219159, pause=100, retries=2},
org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the
failed servers list: pdmcdh01.xyz.com/192.168.145.62:60020

        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)
        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)
I know that the client keeps a server in the failed list for a few seconds in
order to avoid too many connection attempts. So I waited and tried again after
some time, but got the same error.
Once we restart our application, everything works fine again for the
next 24 hours.
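
The failed-servers behaviour described above can be pictured with a minimal sketch. This is not HBase's actual class; the name `FailedServerListSketch`, its methods, and the 2000 ms window are assumptions chosen for illustration. In HBase 1.x the window appears to be set by `hbase.ipc.client.failed.servers.expiry`, which is worth verifying against the version in use. The point of the sketch is that every new failed attempt puts the server back on the list, so if the underlying connection setup keeps failing, waiting a few seconds between tries would not be expected to clear the error.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a time-expiring "failed servers" list, not HBase's real implementation. */
final class FailedServerListSketch {
    // Maps a server address to the time (in millis) at which its failed status expires.
    private final Map<String, Long> expiryByAddress = new HashMap<>();
    private final long expiryMillis;

    FailedServerListSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Records a failed connection attempt to the address at time nowMillis. */
    synchronized void addFailure(String address, long nowMillis) {
        expiryByAddress.put(address, nowMillis + expiryMillis);
    }

    /** Returns true while the address is still inside its expiry window. */
    synchronized boolean isFailed(String address, long nowMillis) {
        // Drop entries whose window has passed, then look the address up.
        expiryByAddress.values().removeIf(expiry -> expiry <= nowMillis);
        return expiryByAddress.containsKey(address);
    }

    public static void main(String[] args) {
        // 2000 ms is an assumed window for this demonstration.
        FailedServerListSketch failed = new FailedServerListSketch(2000L);
        String server = "pdmcdh01.xyz.com/192.168.145.62:60020";

        failed.addFailure(server, 0L);
        System.out.println("inside window:  " + failed.isFailed(server, 1000L));
        System.out.println("after window:   " + failed.isFailed(server, 2500L));

        // A fresh failure (for example, connection setup failing again) restarts the window.
        failed.addFailure(server, 2500L);
        System.out.println("after new fail: " + failed.isFailed(server, 3000L));
    }
}
```

Passing the clock value in as a parameter keeps the sketch deterministic; a real client would read the system time instead.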

This 24-hour gap suggests that it could be related to the Kerberos
ticket expiry time; however, there is nothing in the logs to indicate a
Kerberos authentication issue.
Moreover, we are handling the exception and trying to authenticate and
create the connection again, but nothing works until we restart the JVM. This
is very strange.

I would really appreciate any help or pointers on this issue.

Thanks a lot
Apratim

Re: HBase failed on local exception and failed servers list.

Posted by Saad Mufti <sa...@gmail.com>.
Are you using the AuthUtil class to reauthenticate? This class is in HBase and
uses the Hadoop class UserGroupInformation to do the actual login and
re-login. But if your UserGroupInformation class is from Hadoop 2.5.1 or
earlier, it has a bug when running on Java 8, as most of us are. The
relogin code uses a test to decide whether the login is Kerberos/keytab
based. That test used to pass on Java 7 but fails on Java 8, because it
checks for a specific class in an underlying list of Kerberos objects
assigned to your principal, and that class no longer appears there in the
Java 8 implementation. We fixed this by explicitly upgrading our Hadoop
dependency to a newer version, in our case 2.6.1, where this problem has
been fixed.

If this is what is affecting your application, it is an easy enough
fix.
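
One quick way to see whether an application falls into the affected range is to compare the Hadoop version on its classpath against 2.5.1. The following is a minimal sketch under stated assumptions: `HadoopVersionCheck` and its methods are hypothetical names, and the version strings in `main` are made-up examples. In a real application the string would typically come from Hadoop's `org.apache.hadoop.util.VersionInfo.getVersion()`. Vendor builds may backport fixes, so the number alone is a hint rather than proof.

```java
/** Illustrative helper: decides whether a Hadoop version string is 2.5.1 or earlier. */
final class HadoopVersionCheck {

    /** Parses the leading "major.minor.patch" numbers, ignoring any suffix after the first '-'. */
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] numbers = new int[3]; // missing components default to 0
        for (int i = 0; i < 3 && i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** Returns true if the version is less than or equal to 2.5.1. */
    static boolean isAtOrBelow251(String version) {
        int[] v = parse(version);
        int[] limit = {2, 5, 1};
        for (int i = 0; i < 3; i++) {
            if (v[i] != limit[i]) {
                return v[i] < limit[i];
            }
        }
        return true; // exactly 2.5.1
    }

    public static void main(String[] args) {
        // Example strings only; in practice use VersionInfo.getVersion() from hadoop-common.
        String[] samples = {"2.5.1", "2.6.1", "2.6.0-cdh5.8.0"};
        for (String sample : samples) {
            System.out.println(sample + " at or below 2.5.1: " + isAtOrBelow251(sample));
        }
    }
}
```

The parser expects purely numeric components before any `-` suffix and will throw a `NumberFormatException` on anything else, which seems preferable to silently guessing.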

Hope this helps.

Cheers.

----
Saad


