Posted to issues@hive.apache.org by "Peter Vary (JIRA)" <ji...@apache.org> on 2016/09/20 14:30:20 UTC

[jira] [Commented] (HIVE-9423) HiveServer2: Implement some admission control mechanism for graceful degradation when resources are exhausted

    [ https://issues.apache.org/jira/browse/HIVE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506729#comment-15506729 ] 

Peter Vary commented on HIVE-9423:
----------------------------------

I have investigated the issue, and here is what I found:
- There was an issue in the Thrift code: if there were not enough executor threads, the TThreadPoolServer got stuck in an infinite loop (see THRIFT-2046; a simplified sketch of the fix follows this list). This issue was resolved in Thrift 0.9.2.
- Hive 1.x and 2.x use Thrift 0.9.3.
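
For context, the Thrift 0.9.2 fix changed the server so that an accepted connection which cannot be handed to a worker is dropped after a configurable timeout, instead of the server looping forever. A simplified sketch of that approach (not Thrift's actual code; all names here are illustrative):
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

// Simplified illustration of the THRIFT-2046-style fix: rather than
// spinning forever when the worker pool is full, retry submitting the
// new connection until a deadline, then give up so the caller can
// close the client socket.
final class BoundedAccept {
  static boolean trySubmit(ExecutorService workers, Runnable clientHandler,
                           long requestTimeoutMillis, long backoffMillis)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + requestTimeoutMillis;
    while (true) {
      try {
        workers.execute(clientHandler); // hand the connection to a worker
        return true;                    // accepted
      } catch (RejectedExecutionException rejected) {
        if (System.currentTimeMillis() >= deadline) {
          return false;                 // pool exhausted: reject the client
        }
        Thread.sleep(backoffMillis);    // back off, then retry
      }
    }
  }
}
{code}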

I have tested the behavior on Hive 2.2.0-SNAPSHOT with the following configuration:
- Add the following lines to hive-site.xml:
{code}
<property>
  <name>hive.server2.thrift.max.worker.threads</name>
  <value>1</value>
</property>
<property>
  <name>hive.server2.thrift.min.worker.threads</name>
  <value>1</value>
</property>
{code}
- Start a metastore and an HS2 instance
- Start two BeeLine clients and connect them to the HS2
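
The same check can be reproduced programmatically; here is a minimal JDBC snippet equivalent to one BeeLine attempt (assuming an unsecured HS2 on the default port and the hive-jdbc driver on the classpath):
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Minimal JDBC equivalent of one BeeLine connection attempt. Running a
// second instance while the first holds the only worker thread should
// reproduce the failure below.
public class HS2ConnectTest {
  public static void main(String[] args) {
    String url = "jdbc:hive2://localhost:10000";
    try (Connection conn = DriverManager.getConnection(url, "", "")) {
      System.out.println("Connected: " + conn.getMetaData().getURL());
    } catch (SQLException e) {
      // With an exhausted worker pool this surfaces as a transport error.
      System.err.println("Could not open client transport: " + e.getMessage());
    }
  }
}
{code}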

The 1st BeeLine connected as expected; the 2nd BeeLine, after the configured timeout period (default 20s), printed the following:
{code}
Connecting to jdbc:hive2://localhost:10000
16/09/20 16:23:57 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
HS2 may be unavailable, check server status
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: null (state=08S01,code=0)
Beeline version 2.2.0-SNAPSHOT by Apache Hive
beeline> 
{code}

This behavior is much better than the original problem (no HS2 restart is needed, and closing unused connections helps), but it is not a perfect solution, since the client cannot distinguish a non-running HS2 from an HS2 with an exhausted executor pool.
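
One heuristic that could help a client tell the two cases apart is to check whether the TCP connection itself is accepted: a dead HS2 refuses the connect outright, while a saturated HS2 still accepts the socket and only fails later in the handshake. A rough sketch (my own illustration, not part of the Hive JDBC driver):
{code}
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Heuristic probe: a plain TCP connect is accepted by the listening
// socket even when all worker threads are busy, whereas a server that
// is not running refuses it outright.
final class HS2Probe {
  static String probe(String host, int port) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), 5_000);
      return "listening (a later handshake timeout suggests an exhausted pool)";
    } catch (ConnectException e) {
      return "not running (connection refused)";
    } catch (IOException e) {
      return "unreachable: " + e.getMessage();
    }
  }
}
{code}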

> HiveServer2: Implement some admission control mechanism for graceful degradation when resources are exhausted
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9423
>                 URL: https://issues.apache.org/jira/browse/HIVE-9423
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0
>            Reporter: Vaibhav Gumashta
>
> An example of where it is needed: it has been reported that when the # of client connections is greater than {{hive.server2.thrift.max.worker.threads}}, HiveServer2 stops accepting new connections and ends up having to be restarted. This should be handled more gracefully by the server and the JDBC driver, so that the end user becomes aware of the problem and can take appropriate steps (either close existing connections, bump up the config value, or use multiple server instances with dynamic service discovery enabled). Similarly, we should also review the behavior of the background thread pool to have a well defined behavior when the pool gets exhausted.
> Ideally, implementing some form of general admission control would be a better solution, so that we do not accept new work unless sufficient resources are available, and degrade gracefully under overload.
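
A minimal sketch of the kind of admission control proposed above, where excess work is rejected immediately with an explicit error instead of queueing indefinitely (illustrative only, not Hive's implementation):
{code}
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Illustrative admission-control gate: when capacity is exhausted, new
// work fails fast with a clear error, so clients can distinguish
// overload from an unavailable server.
final class AdmissionGate {
  private final Semaphore permits;

  AdmissionGate(int maxConcurrent) {
    this.permits = new Semaphore(maxConcurrent);
  }

  <T> T admit(Callable<T> work) throws Exception {
    if (!permits.tryAcquire()) {
      throw new IllegalStateException(
          "Server overloaded: too many concurrent requests, try again later");
    }
    try {
      return work.call();
    } finally {
      permits.release();
    }
  }
}
{code}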



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)