Posted to issues@drill.apache.org by "Chun Chang (JIRA)" <ji...@apache.org> on 2015/04/28 03:15:06 UTC

[jira] [Closed] (DRILL-1804) random failures while running large number of queries

     [ https://issues.apache.org/jira/browse/DRILL-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun Chang closed DRILL-1804.
-----------------------------
    Assignee: Chun Chang  (was: Chris Westin)

general automation run.

> random failures while running large number of queries
> -----------------------------------------------------
>
>                 Key: DRILL-1804
>                 URL: https://issues.apache.org/jira/browse/DRILL-1804
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.7.0
>            Reporter: Chun Chang
>            Assignee: Chun Chang
>            Priority: Blocker
>             Fix For: 0.8.0
>
>
> #Tue Dec 02 14:38:34 EST 2014
> git.commit.id.abbrev=757e9a2
> When running the Mondrian regression tests (over 6000 queries), I sometimes get one or two random failures. Here is the stack trace when it happens:
> 2014-12-02 17:49:32,271 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - Error aeae057b-ed0a-43aa-902d-fe3a41531511: Query failed: Unexpected exception during fragment initialization.
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization.
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. Failure while accessing Zookeeper
>   at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 4 common frames omitted
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>   at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 10 common frames omitted
> Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[curator-client-2.5.0.jar:na]
>   at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44) ~[curator-framework-2.5.0.jar:na]
>   at org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:51) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 11 common frames omitted
> 2014-12-02 17:49:32,287 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:frag:0:0] WARN  o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
> java.lang.InterruptedException: null
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301) ~[na:1.7.0_45]
>   at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_45]
>   at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)