Posted to issues@kylin.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/06/13 03:09:00 UTC

[jira] [Commented] (KYLIN-4017) Build engine get zk(zookeeper) lock failed when building job, it causes the whole build engine doesn't work.

    [ https://issues.apache.org/jira/browse/KYLIN-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862675#comment-16862675 ] 

ASF subversion and git services commented on KYLIN-4017:
--------------------------------------------------------

Commit a74dc055a163e6adb0269e0924fbc78e8f997db2 in kylin's branch refs/heads/master from wangxiaojing
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=a74dc05 ]

KYLIN-4017 Build engine get zk(zookeeper) lock failed when building job, this causes the whole build engine doesn't work.


> Build engine get zk(zookeeper) lock failed when building job, it causes the whole build engine doesn't work.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4017
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4017
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine, Tools, Build and Test
>    Affects Versions: Future, v3.0.0, v3.0.0-alpha
>            Reporter: wangxiaojing
>            Priority: Critical
>              Labels: build
>             Fix For: Future, v3.0.0-alpha
>
>         Attachments: zkinstancestart.png
>
>
> Kylin throws an exception while acquiring the ZooKeeper (ZK) lock when it builds a job. Only a restart resolves the problem; until then no job can be built and the whole build engine stops working. The problem recurs one day after the restart. The log looks like this:
> {code:java}
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:59 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:63 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} scheduled
> 2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 878974c4-4c65-88a4-a912-b238fcc33bdc-132] zookeeper.ZookeeperDistributedLock:92 : 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
> 2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] threadpool.DistributedScheduler:115 : unknown error execute job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: 18786@bigdata-kylin-build01.gz01.diditaxi.com
> java.lang.IllegalStateException: Error while 18786@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
>  at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: instance must be started before calling this method
>  at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
>  at org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
>  ... 5 more{code}
>  
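
The "Caused by" line in the trace above is the message Curator raises when a client is used while it is not in the STARTED state. As a rough illustration only (not the change made in commit a74dc05), the sketch below shows one way a lock helper could check the client state and replace an unusable client before trying to create the lock node. It assumes the shared client was closed or never started, which the log does not confirm. FakeClient, GuardedLock, and GuardedLockSketch are hypothetical stand-ins written for this example and are not Kylin or Curator classes; only the three state names and the exception message are taken from Curator.

```java
import java.util.function.Supplier;

public class GuardedLockSketch {

    /** Same three lifecycle states a Curator client reports. */
    enum State { LATENT, STARTED, STOPPED }

    /** Hypothetical stand-in for a ZooKeeper client; NOT the real Curator API. */
    static class FakeClient {
        private State state = State.LATENT;

        void start() { state = State.STARTED; }

        void close() { state = State.STOPPED; }

        State getState() { return state; }

        /** Fails the same way the trace does when the client is not started. */
        void create(String path) {
            if (state != State.STARTED) {
                throw new IllegalStateException(
                        "instance must be started before calling this method");
            }
            // A real client would create the ephemeral lock node at 'path' here.
        }
    }

    /** Hypothetical lock helper that never calls create() on an unusable client. */
    static class GuardedLock {
        private final Supplier<FakeClient> factory;
        private FakeClient client;

        GuardedLock(Supplier<FakeClient> factory) {
            this.factory = factory;
            this.client = newStartedClient();
        }

        private FakeClient newStartedClient() {
            FakeClient c = factory.get();
            c.start();
            return c;
        }

        FakeClient client() { return client; }

        synchronized void lock(String path) {
            // Assumption: an unusable client is either LATENT or STOPPED,
            // so anything other than STARTED is replaced before use.
            if (client.getState() != State.STARTED) {
                client = newStartedClient();
            }
            client.create(path);
        }
    }

    public static void main(String[] args) {
        GuardedLock lock = new GuardedLock(FakeClient::new);
        lock.lock("/job_engine/lock/job-1");
        lock.client().close();               // simulate the client being closed elsewhere
        lock.lock("/job_engine/lock/job-2"); // succeeds because the client is replaced first
        System.out.println("both locks acquired");
    }
}
```

Without the state check in lock(), the second call would fail with the same IllegalStateException as in the log, and every later job would fail the same way until the process is restarted.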



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)