Posted to dev@hive.apache.org by "Vaibhav Gumashta (JIRA)" <ji...@apache.org> on 2014/08/06 18:23:12 UTC

[jira] [Commented] (HIVE-7353) HiveServer2 using embedded MetaStore leaks JDOPersistanceManager

    [ https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087857#comment-14087857 ] 

Vaibhav Gumashta commented on HIVE-7353:
----------------------------------------

Thanks for the review comments. I've taken a different approach now and the problem seems more generic. Here's the core issue:
RawStore is kept as a thread-local variable. Each RawStore object holds a reference to a JDOPersistanceManager object, which JDOPersistanceManagerFactory caches. Removing the JDOPersistanceManager from that cache requires an explicit JDOPersistanceManager#close call.
The issue is that HiveServer2 keeps two thread pools (handler, for binary/HTTP mode, and async), each managed by an ExecutorService. Based on the config, each pool keeps a certain number of threads alive and kills excess threads after a configurable keepAliveTime expires. However, ExecutorService does not provide a hook for custom cleanup code when a thread is killed; ideally, that is where we'd plug in code to close the JDOPersistanceManager held by the thread-local RawStore.
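To make the leak pattern concrete, here is a minimal sketch, not Hive code. LeakSketch, CACHE and STORE are hypothetical stand-ins for the factory-side cache and the thread-local RawStore. For determinism it ends the pool threads with shutdown() rather than waiting for keepAliveTime to expire, but the effect on the cache should be the same: the threads are gone and their entries are not.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LeakSketch {
    // Hypothetical stand-in for the cache the factory keeps of the managers it hands out.
    static final Set<Object> CACHE = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for the thread-local RawStore: the first access on a
    // thread creates a "manager" and registers it in the cache.
    static final ThreadLocal<Object> STORE = ThreadLocal.withInitial(() -> {
        Object manager = new Object();
        CACHE.add(manager);
        return manager;
    });

    // Runs one task on each of `threads` pool threads, lets the pool threads die,
    // and returns how many cached entries are still present afterwards.
    static int runAndCountCached(int threads) {
        CACHE.clear();
        // A fixed pool starts a new thread for each of the first `threads` tasks.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> STORE.get()); // touch the thread-local on a pool thread
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Nothing ever removed the entries, so they outlive their threads.
        return CACHE.size();
    }
}
```

Calling LeakSketch.runAndCountCached(3) leaves three entries in the cache even though all three threads have exited, because no code path removes them.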

The current approach I've taken provides a custom ThreadFactory when creating the thread pool; the threads it creates have a finalize method that does the cleanup. The ThreadFactory also maintains a map from each Thread to its RawStore object; the finalize method of each thread retrieves its RawStore object from the map and shuts it down.
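The shape of such a factory can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: CleanupFactorySketch, Store and storeForCurrentThread are hypothetical names, and the cleanup runs in a try/finally around the thread's run() rather than in finalize, because finalize timing depends on the garbage collector and would make the example non-deterministic.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class CleanupFactorySketch implements ThreadFactory {
    // Hypothetical stand-in for RawStore; shutdown() represents closing its manager.
    static final class Store {
        static final AtomicInteger OPEN = new AtomicInteger();
        Store() { OPEN.incrementAndGet(); }
        void shutdown() { OPEN.decrementAndGet(); }
    }

    // One store per thread, keyed by the thread so the exit path can find it.
    private final Map<Thread, Store> stores = new ConcurrentHashMap<>();
    // Threads handed to the pool, kept only so the demo can join them.
    private final List<Thread> created = new CopyOnWriteArrayList<>();

    // Called by task code running on a pool thread to get (or lazily create) its store.
    Store storeForCurrentThread() {
        return stores.computeIfAbsent(Thread.currentThread(), t -> new Store());
    }

    @Override
    public Thread newThread(Runnable work) {
        Thread t = new Thread(work) {
            @Override
            public void run() {
                try {
                    super.run(); // runs the pool's worker loop until the thread is retired
                } finally {
                    // Cleanup on thread exit: drop the map entry and shut the store down.
                    Store s = stores.remove(this);
                    if (s != null) {
                        s.shutdown();
                    }
                }
            }
        };
        created.add(t);
        return t;
    }

    // Uses one store on each of `threads` pool threads, retires the threads,
    // and returns how many stores are still open afterwards.
    static int runAndCountOpen(int threads) {
        CleanupFactorySketch factory = new CleanupFactorySketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads, factory);
        for (int i = 0; i < threads; i++) {
            pool.execute(factory::storeForCurrentThread);
        }
        pool.shutdown();
        try {
            // Join the threads themselves: the pool can report termination
            // slightly before a thread's finally block has finished.
            for (Thread t : factory.created) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return Store.OPEN.get();
    }
}
```

The try/finally variant cleans up as soon as the pool retires a thread, whether through keepAliveTime or shutdown, whereas a finalize-based cleanup runs only once the Thread object is collected.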

On another note, the remote metastore also uses an ExecutorService to maintain its thread pool. I haven't tested it there, but a similar problem should exist in that case.

> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> ----------------------------------------------------------------
>
>                 Key: HIVE-7353
>                 URL: https://issues.apache.org/jira/browse/HIVE-7353
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.13.0
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch
>
>
> When using the embedded metastore, HiveServer2 creates background threads to run async operations, and these end up creating new instances of JDOPersistanceManager rather than using the one from the foreground (handler) thread. Since JDOPersistanceManagerFactory caches JDOPersistanceManager instances, they are never GCed.


