Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2016/04/08 19:09:25 UTC

[jira] [Updated] (HBASE-15618) Abort if security credentials become invalid

     [ https://issues.apache.org/jira/browse/HBASE-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-15618:
-----------------------------------
    Description: 
We are investigating a production incident where a bad keytab push seems to have caused one regionserver, serving hot regions, to lose the ability to communicate with clients. After the fact we see a steady stream of GSS initiation failure messages in the logs. The affected regionserver lingered in an unhealthy state for too long. HBase did not automatically take any corrective action, like an abort of the affected process, which would have recovered service without operator intervention.

Consider detecting and aborting if security credentials like the Kerberos keytab become invalid during runtime.

  was:
We are investigating a production incident where a bad keytab push seems to have caused one regionserver, serving hot regions, to lose the ability to communicate with clients. After the fact we see this server threw a steady stream of GSS initiate failed errors and lingered in an unhealthy state for too long. HBase did not automatically take any corrective action, like an abort of the affected process, which would have recovered service without operator intervention.

Consider detecting and aborting if security credentials like the Kerberos keytab become invalid during runtime.


> Abort if security credentials become invalid
> --------------------------------------------
>
>                 Key: HBASE-15618
>                 URL: https://issues.apache.org/jira/browse/HBASE-15618
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>
> We are investigating a production incident where a bad keytab push seems to have caused one regionserver, serving hot regions, to lose the ability to communicate with clients. After the fact we see a steady stream of GSS initiation failure messages in the logs. The affected regionserver lingered in an unhealthy state for too long. HBase did not automatically take any corrective action, like an abort of the affected process, which would have recovered service without operator intervention.
> Consider detecting and aborting if security credentials like the Kerberos keytab become invalid during runtime.
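
One possible shape for the detect-and-abort idea above, as a minimal sketch only. Nothing here is HBase code: the class name CredentialWatchdog, the failure threshold, and the abort callback are all hypothetical. The sketch assumes the caller can tell when an authentication attempt fails (for example, when a GSS initiation failure is observed) and supplies its own abort action.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch, not an HBase class. Counts consecutive
 * authentication failures and runs an abort action once a threshold
 * is reached. How failures are detected is left to the caller.
 */
final class CredentialWatchdog {
    private final int maxConsecutiveFailures;   // assumed tunable threshold
    private final Runnable abortAction;         // assumed abort hook supplied by the caller
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final AtomicBoolean aborted = new AtomicBoolean(false);

    CredentialWatchdog(int maxConsecutiveFailures, Runnable abortAction) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.abortAction = Objects.requireNonNull(abortAction, "abortAction");
    }

    /** A successful authentication resets the failure streak. */
    void recordSuccess() {
        consecutiveFailures.set(0);
    }

    /** A failed authentication extends the streak; abort at most once. */
    void recordFailure() {
        int failures = consecutiveFailures.incrementAndGet();
        if (failures >= maxConsecutiveFailures && aborted.compareAndSet(false, true)) {
            abortAction.run();
        }
    }

    boolean isAborted() {
        return aborted.get();
    }
}
```

Requiring a streak of failures rather than a single one is a design choice in this sketch: it avoids aborting on a transient error, at the cost of a short delay before recovery. A suitable threshold would depend on the deployment.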



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)