Posted to issues@hive.apache.org by "Chao Sun (JIRA)" <ji...@apache.org> on 2016/08/12 04:30:22 UTC

[jira] [Commented] (HIVE-14524) BaseSemanticAnalyzer may leak HMS connection

    [ https://issues.apache.org/jira/browse/HIVE-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418356#comment-15418356 ] 

Chao Sun commented on HIVE-14524:
---------------------------------

OK, did some debugging. Here's how step 4) above plays out:
1. In the HS2 handler thread, when {{Driver#compile}} is called, a {{BaseSemanticAnalyzer}} is initialized. Since the config has changed,
the old {{Hive}} instance is replaced with a new one. Let's call the old one *A* and the new one *B*. The {{BaseSemanticAnalyzer}} instance
is initialized with *B*.
2. Immediately after that, an HMS connection is created for *B* to flush the metastore cache.
3. The handler thread launches a background thread for query execution, which first sets the thread-local {{Hive}} instance from the handler's {{parentHive}} field, *which still refers to A*. So now *B* is overwritten by *A*! No variable refers to *B* any more, and *B* holds an open HMS connection...
4. The background thread executes the query and opens new HMS connections.
5. After the background thread is done, the handler thread sets the thread-local {{Hive}} instance to {{sessionHive}}, which also points to *A*. So now the handler thread is using *A* again.

As a result, the copy *B* is permanently lost, along with its connection.
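
To make the sequence concrete, here is a minimal, self-contained sketch that replays the five steps with plain Java objects. It is not Hive code: {{FakeHive}}, {{CURRENT}} and {{replayLeak}} are hypothetical names made up for illustration, the "connection" is just a counter, and the {{parentHive}} / {{sessionHive}} locals only mimic the handler fields mentioned above. In this simplified model the last reference to *B* disappears when the handler resets its thread-local in step 5.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HiveThreadLocalLeakSketch {

    /** Hypothetical stand-in for Hive: holds at most one open "HMS connection". */
    static final class FakeHive {
        private final AtomicInteger openConnections;
        private boolean connected;

        FakeHive(AtomicInteger openConnections) {
            this.openConnections = openConnections;
        }

        synchronized void connect() {
            if (!connected) {
                connected = true;
                openConnections.incrementAndGet();
            }
        }

        synchronized void close() {
            if (connected) {
                connected = false;
                openConnections.decrementAndGet();
            }
        }
    }

    /** Plays the role of the thread-local Hive holder (an assumption of this sketch). */
    static final ThreadLocal<FakeHive> CURRENT = new ThreadLocal<>();

    /** Replays steps 1-5 and returns how many connections are still open after cleanup. */
    static int replayLeak() {
        AtomicInteger open = new AtomicInteger();

        FakeHive a = new FakeHive(open);
        CURRENT.set(a);
        final FakeHive parentHive = a;   // handler's saved reference, still A
        final FakeHive sessionHive = a;  // session's saved reference, also A

        // Step 1: the config changed, so the thread-local is replaced by a new instance B.
        CURRENT.set(new FakeHive(open));
        // Step 2: B opens a connection (the metastore cache flush).
        CURRENT.get().connect();

        // Steps 3-4: the background thread installs A and runs the query on it.
        Thread background = new Thread(() -> {
            CURRENT.set(parentHive);
            CURRENT.get().connect();
        });
        background.start();
        try {
            background.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Step 5: the handler switches back to A; nothing references B any more.
        CURRENT.set(sessionHive);

        // Session cleanup can only close what is still reachable, i.e. A.
        CURRENT.get().close();
        CURRENT.remove();
        return open.get();
    }

    public static void main(String[] args) {
        System.out.println("connections still open: " + replayLeak());
    }
}
```

Running {{main}} reports one connection still open: the one opened through *B* in step 2, which no cleanup path can reach.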

> BaseSemanticAnalyzer may leak HMS connection
> --------------------------------------------
>
>                 Key: HIVE-14524
>                 URL: https://issues.apache.org/jira/browse/HIVE-14524
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.2.0
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>
> Currently {{BaseSemanticAnalyzer}} keeps a copy of thread-local {{Hive}} object to connect to HMS. However, in some cases Hive may overwrite the existing {{Hive}} object:
> {{Hive#getInternal}}:
> {code}
>   private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck,
>       boolean doRegisterAllFns) throws HiveException {
>     Hive db = hiveDB.get();
>     if (db == null || !db.isCurrentUserOwner() || needsRefresh
>         || (c != null && db.metaStoreClient != null && !isCompatible(db, c, isFastCheck))) {
>       return create(c, false, db, doRegisterAllFns);
>     }
>     if (c != null) {
>       db.conf = c;
>     }
>     return db;
>   }
> {code}
> *This poses a potential problem*: if one first instantiates a {{BaseSemanticAnalyzer}} object with the current {{Hive}} object (let's call it A), and for some reason A is overwritten by B through the code above, then {{BaseSemanticAnalyzer}} may keep using A to contact HMS, which will leak connections.
> This can be reproduced by the following steps:
> 1. open a session
> 2. execute some simple query such as {{desc formatted src}}
> 3. change a metastore property (I know, this is not a perfect example...), for instance: {{set hive.txn.timeout=500}}
> 4. run another command such as {{desc formatted src}} again
> Notice that in step 4), since a metavar has changed, {{isCompatible}} will return false, and hence a new {{Hive}} object is created. As a result, you'll observe in the HS2 log that a connection has been leaked.


