You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2013/10/18 01:24:43 UTC

[jira] [Commented] (HIVE-5575) ZooKeeper connection closed when unlock with retry

    [ https://issues.apache.org/jira/browse/HIVE-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798591#comment-13798591 ] 

Hive QA commented on HIVE-5575:
-------------------------------



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12608914/HIVE-5575.patch

{color:green}SUCCESS:{color} +1 4415 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1159/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1159/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

> ZooKeeper connection closed when unlock with retry
> --------------------------------------------------
>
>                 Key: HIVE-5575
>                 URL: https://issues.apache.org/jira/browse/HIVE-5575
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Chun Chen
>            Assignee: Chun Chen
>             Fix For: 0.13.0
>
>         Attachments: D13515.1.patch, HIVE-5575.patch, zookeeper session closed.png
>
>
> See the attachment, I have encountered a scenario that hive retries to  unlock all locks, but zookeeper session is closed. If there are hundreds of locks, say dynamic partition, the process will hang up for several days.
> The stack is 
> {code}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.21-b01 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x000000000683f000 nid=0x34d0 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
> 	- None
> "LeaseChecker" daemon prio=10 tid=0x0000000006693800 nid=0x2713 waiting on condition [0x0000000042af7000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
> 	at java.lang.Thread.sleep(Native Method)
> 	at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1376)
> 	at java.lang.Thread.run(Thread.java:722)
>    Locked ownable synchronizers:
> 	- None
> "Service Thread" daemon prio=10 tid=0x00002aaab8001000 nid=0x2651 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
> 	- None
> "C2 CompilerThread1" daemon prio=10 tid=0x0000000005c7c800 nid=0x2650 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
> 	- None
> "C2 CompilerThread0" daemon prio=10 tid=0x0000000005c71000 nid=0x264f waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
> 	- None
> "Signal Dispatcher" daemon prio=10 tid=0x0000000005c6f000 nid=0x264e runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
> 	- None
> "Finalizer" daemon prio=10 tid=0x0000000005c22000 nid=0x264d in Object.wait() [0x00000000427f4000]
>    java.lang.Thread.State: WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> 	- locked <0x000000078324b110> (a java.lang.ref.ReferenceQueue$Lock)
> 	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> 	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)
>    Locked ownable synchronizers:
> 	- None
> "Reference Handler" daemon prio=10 tid=0x0000000005c1a000 nid=0x264c in Object.wait() [0x0000000041900000]
>    java.lang.Thread.State: WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:503)
> 	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
> 	- locked <0x000000078328fbc0> (a java.lang.ref.Reference$Lock)
>    Locked ownable synchronizers:
> 	- None
> "main" prio=10 tid=0x0000000005b76800 nid=0x263d waiting on condition [0x0000000040f46000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
> 	at java.lang.Thread.sleep(Native Method)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:426)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:415)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:257)
> 	at org.apache.hadoop.hive.ql.Driver.releaseLocks(Driver.java:864)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:953)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
> 	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
> 	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:601)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>    Locked ownable synchronizers:
> 	- None
> "VM Thread" prio=10 tid=0x0000000005c12800 nid=0x264b runnable 
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000005b84800 nid=0x263e runnable 
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000005b86000 nid=0x263f runnable 
> "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000005b88000 nid=0x2640 runnable 
> "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000005b8a000 nid=0x2641 runnable 
> "GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000005b8b800 nid=0x2642 runnable 
> "GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000005b8d800 nid=0x2643 runnable 
> "GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000005b8f800 nid=0x2644 runnable 
> "GC task thread#7 (ParallelGC)" prio=10 tid=0x0000000005b91000 nid=0x2645 runnable 
> "GC task thread#8 (ParallelGC)" prio=10 tid=0x0000000005b93000 nid=0x2646 runnable 
> "GC task thread#9 (ParallelGC)" prio=10 tid=0x0000000005b95000 nid=0x2647 runnable 
> "GC task thread#10 (ParallelGC)" prio=10 tid=0x0000000005b96800 nid=0x2648 runnable 
> "GC task thread#11 (ParallelGC)" prio=10 tid=0x0000000005b98800 nid=0x2649 runnable 
> "GC task thread#12 (ParallelGC)" prio=10 tid=0x0000000005b9a800 nid=0x264a runnable 
> "VM Periodic Task Thread" prio=10 tid=0x00002aaab800c000 nid=0x2652 waiting on condition 
> JNI global references: 294
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)