You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@hive.apache.org by "Bennie Schut (JIRA)" <ji...@apache.org> on 2010/08/13 17:38:19 UTC

[jira] Created: (HIVE-1539) Concurrent metastore threading problem

Concurrent metastore threading problem 
---------------------------------------

                 Key: HIVE-1539
                 URL: https://issues.apache.org/jira/browse/HIVE-1539
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.7.0
            Reporter: Bennie Schut
            Assignee: Bennie Schut


When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.

Looking at these threads I notice this thread(484e):
at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)

But on a different thread(63a2):
at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898810#action_12898810 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

That would be HIVE-1482 not 28 :(

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bennie Schut updated HIVE-1539:
-------------------------------

    Attachment: thread_dump_hanging.txt

Thread dump.

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898808#action_12898808 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

This is starting to look more and more like a similar problem we have on the jdbc driver (HIVE-1428). the HiveMetatStoreClient is doing calls on the thrift client but the thrift client isn't threadsafe (probably by design). So when the HiveMetaStoreClient is used it should be synchronized or a new HiveMetaStoreClient object should be created for each thread.

>From the "Hive" class:

  private IMetaStoreClient getMSC() throws MetaException {
    if (metaStoreClient == null) {
      metaStoreClient = createMetaStoreClient();
    }
    return metaStoreClient;
  }

I could make the sync factory we use on HIVE-1428 a common thing and reuse it here. Any objections?

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905455#action_12905455 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

JDOClassLoaderResolver doesn't seem thread safe. That's a bit of a surprise. I filed a bug with datanucleus: 
http://www.datanucleus.org/servlet/jira/browse/NUCCORE-559
I just made my own threadsafe version of the JDOClassLoaderResolver and am loading it to see if that fixes it. Will probably take a few days to be sure it got fixed.

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908472#action_12908472 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

Patch got comitted on datanucleus 2.2.0.m2
We could consider moving from 2.0 to 2.2
The temporary fix of loading your own classloader also wroked nicely.


> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: ClassLoaderResolver.patch, thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901356#action_12901356 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

thanks Paul, I didn't notice that yet so in theory it shouldn't be the cause. However after synchronizing the metastore client I never saw this problem again where I used to see this problem every day.
Strange issue. I'll see if I can create something reproducible but that might be tricky for this type of problem.

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bennie Schut updated HIVE-1539:
-------------------------------

    Attachment: ClassLoaderResolver.patch

Ok still testing it but this is a temporary fix we add our own sync. version of the classloader:

Just make sure you add these properties and it should work:
<property>
  <name>datanucleus.classLoaderResolverName</name>
  <value>syncloader</value>
</property>



> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: ClassLoaderResolver.patch, thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899157#action_12899157 ] 

Paul Yang commented on HIVE-1539:
---------------------------------

>From looking at the Hive class, we are creating a ThreadLocal instance of Hive when we call get(). Since the HiveMetaStoreClient is a member of Hive, shouldn't the client be accessed by only 1 thread at a time?

> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HIVE-1539) Concurrent metastore threading problem

Posted by "Bennie Schut (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904093#action_12904093 ] 

Bennie Schut commented on HIVE-1539:
------------------------------------

Ok, I just had another few instances of hanging threads. Last time I used a SynchronizedFactory to eliminate any threading issues in the metastore client but that doesn't solve it like Paul correctly saw.

"pool-1-thread-224" prio=10 tid=0x00007f717d62c800 nid=0x7b4d runnable [0x000000004557c000]
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.get(HashMap.java:303)
        at org.datanucleus.util.ReferenceValueMap.get(ReferenceValueMap.java:186)
        at org.datanucleus.JDOClassLoaderResolver.classForName(JDOClassLoaderResolver.java:185)
        at org.datanucleus.JDOClassLoaderResolver.classForName(JDOClassLoaderResolver.java:415)
        at org.datanucleus.store.mapped.MappedTypeManager.isSupportedMappedType(MappedTypeManager.java:84)
        at org.datanucleus.store.rdbms.query.legacy.AbstractIteratorStatement.<init>(AbstractIteratorStatement.java:83)
        at org.datanucleus.store.rdbms.query.legacy.UnionIteratorStatement.<init>(UnionIteratorStatement.java:147)
        at org.datanucleus.store.rdbms.query.legacy.ClassTableExtent.newQueryStatement(ClassTableExtent.java:204)
        at org.datanucleus.store.rdbms.query.legacy.QueryCompiler.executionCompile(QueryCompiler.java:323)
        at org.datanucleus.store.rdbms.query.legacy.JDOQLQueryCompiler.compile(JDOQLQueryCompiler.java:225)
        at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.compileInternal(JDOQLQuery.java:175)
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1628)
        at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
        at org.datanucleus.store.query.Query.executeWithArray(Query.java:1499)
        at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
        at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:323)
        at org.apache.hadoop.hive.metastore.ObjectStore.convertToMTable(ObjectStore.java:640)
        at org.apache.hadoop.hive.metastore.ObjectStore.alterTable(ObjectStore.java:946)
        at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:177)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$23.run(HiveMetaStore.java:1329)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$23.run(HiveMetaStore.java:1326)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:229)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table(HiveMetaStore.java:1326)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:143)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hive.common.SynchronizedFactory$Handler.invoke(SynchronizedFactory.java:46)
        - locked <0x00000007ffc6fd00> (a org.apache.hadoop.hive.metastore.HiveMetaStoreClient)
        at $Proxy8.alter_table(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:267)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:875)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:174)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:609)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:478)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:356)
        at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:114)
        at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:378)
        at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:366)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)


> Concurrent metastore threading problem 
> ---------------------------------------
>
>                 Key: HIVE-1539
>                 URL: https://issues.apache.org/jira/browse/HIVE-1539
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.7.0
>            Reporter: Bennie Schut
>            Assignee: Bennie Schut
>         Attachments: thread_dump_hanging.txt
>
>
> When running hive as a service and running a high number of queries concurrently I end up with multiple threads running at 100% cpu without any progress.
> Looking at these threads I notice this thread(484e):
> at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:598)
> But on a different thread(63a2):
> at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoReplaceField(MStorageDescriptor.java)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.