Posted to issues@kylin.apache.org by "Yaqian Zhang (Jira)" <ji...@apache.org> on 2020/12/25 07:10:00 UTC

[jira] [Comment Edited] (KYLIN-4847) Cuboid to HFile step failed on multiple job server env because of trying to read the metric jar file from the inactive job server's location.

    [ https://issues.apache.org/jira/browse/KYLIN-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17254712#comment-17254712 ] 

Yaqian Zhang edited comment on KYLIN-4847 at 12/25/20, 7:09 AM:
----------------------------------------------------------------

Hi:
I reproduced this error in my local environment.
I think you are right. HBaseSparkSteps.java#L69 adds the metrics-core.jar located under the $KYLIN_HOME directory of the inactive job server where the build job was submitted, but this directory does not exist on the leader job server.
Actually, metrics-core.jar can also be loaded from the Hadoop directory, but the same jar happens to exist on the class-loading path under $KYLIN_HOME, so the $KYLIN_HOME copy is picked up first.
As a workaround, you can move $KYLIN_HOME/tomcat/webapps/kylin/WEB-INF/lib/metrics-core.jar out of the class-loading path; HBaseSparkSteps will then add metrics-core.jar from the Hadoop directory, which can be found on every node in the Hadoop cluster.
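For anyone who wants to verify which copy wins, here is a minimal sketch (plain JDK, not Kylin code; the only assumption is the metrics-core 2.x class name): resolving the class through the classloader prints the absolute path of the jar it was loaded from on the local machine.
{code:java}
// Minimal sketch, not Kylin code: shows that resolving a class through the
// classloader yields an absolute jar path on the *local* machine, so a path
// computed on one job server is meaningless on another host.
public class FindMetricsJar {
    public static void main(String[] args) throws Exception {
        // com.yammer.metrics.core.Gauge ships in metrics-core 2.x
        Class<?> clazz = Class.forName("com.yammer.metrics.core.Gauge");
        // Prints the webapp copy while it is still on the class-loading path,
        // and the Hadoop copy after the workaround above moves it away.
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}
{code}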
I will try to fix this.



> Cuboid to HFile step failed on multiple job server env because of trying to read the metric jar file from the inactive job server's location.
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4847
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4847
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v3.1.0
>            Reporter: yoonsung.lee
>            Priority: Major
>
> h1. My Cluster Setting
> 1. Version: 3.1.0
>  2. 2 job servers (job & query mode) and 2 query-only servers, each running on a different host machine.
>  3. The Spark engine is used for build jobs.
> h1. Problem Circumstances
> h2. Root cause
> The active job server submits the Spark job that executes `Convert Cuboid Data to HFile`, but it gets an error because one resource used to submit the Spark job has a wrong path which the active job server cannot read.
>  * wrong resource: ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-INF/lib/metrics-core-2.2.0.jar
>  * ${KYLIN_HOME} here resolves to the inactive job server's installation path; this happens only for the jar file above.
> This situation occurs in the following two circumstances.
> h2. On build cube
> 1. Request the build API on the inactive job server. (exactly: /kylin/api/cubes/${cube_name}/rebuild )
>  2. The inactive job server stores the build task in the meta store.
>  3. The active job server takes the build task and processes it.
>  4. The active job server fails on the `Convert Cuboid Data to HFile` step.
> **This doesn't occur when I request the build API on the active job server.**
> h2. On merge
> 1. A merge cube job is triggered periodically.
>  2. The active job server takes the merge task and processes it.
>  3. The active job server fails on the `Convert Cuboid Data to HFile` step.
> **This doesn't occur when there is only one job server in the cluster.**
> h1. Progress toward solving this
> I'm trying to find which code sets the wrong metrics-core-2.2.0.jar path.
>  So far, my guess is that the following line sets metrics-core-2.2.0.jar for the `Cuboid to HFile` Spark job (see the sketch below):
>  * [https://github.com/apache/kylin/blob/kylin-3.1.0/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/steps/HBaseSparkSteps.java#L69]
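> A hedged sketch of what that line likely does (illustrative only, not the actual Kylin code; the only real API used is Hadoop's org.apache.hadoop.util.ClassUtil): the jar is located on the classpath of the JVM that generates the job step, so the recorded absolute path belongs to that server, not to the server that later executes the step.
> {code:java}
> // Illustrative sketch, not the actual Kylin code: the class and variable
> // names besides org.apache.hadoop.util.ClassUtil are made up.
> import org.apache.hadoop.util.ClassUtil;
>
> public class JarListSketch {
>     public static void main(String[] args) throws Exception {
>         // findContainingJar returns the absolute path of the jar that the
>         // *current* JVM loaded the class from, i.e. a path on the server
>         // that generates the job step, not on the one that executes it.
>         String metricsJar = ClassUtil.findContainingJar(
>                 Class.forName("com.yammer.metrics.core.Gauge"));
>         System.out.println("jar added to the Spark job: " + metricsJar);
>     }
> }
> {code}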
> h1. Questions
> 1. I'm trying to remote debug with an IDE to confirm my guess, but the breakpoint on that line is never hit at runtime; it seems to be executed during the booting phase. Is that right?
>  2. Is there any hint or guess for solving this issue, independent of my progress above?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)