Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2016/02/28 02:48:18 UTC

[jira] [Reopened] (KYLIN-1082) Hive dependencies should be add to tmpjars

     [ https://issues.apache.org/jira/browse/KYLIN-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaofeng SHI reopened KYLIN-1082:
---------------------------------

I deployed the latest build from 1.x-staging, created a new cube, and built it. The job failed at the second step (fact distinct); all mappers hit this error:

{code}
Error: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at org.apache.hive.hcatalog.mapreduce.HCatSplit.readFields(HCatSplit.java:139)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:371)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{code} 

It seems some Hive dependencies are missing. Yanghong, could you please double-check this? The sandbox VM is not a good choice for testing this, as it has only one node and all the jars exist on it. It would be better to use a clustered environment where the Hadoop nodes don't have Hive installed.
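
For context, Hadoop ships extra jars to task nodes through the comma-separated `tmpjars` job property (the same property that `-libjars` populates). The snippet below is only a rough sketch of merging Hive jar URIs into an existing `tmpjars` value without duplicates; it is not the code from the attached patches. The class name `TmpJarsSketch`, the method `mergeTmpJars`, and all jar paths are hypothetical placeholders, and in a real job the result would be written back to the job configuration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Kylin or Hadoop.
class TmpJarsSketch {

    /**
     * Merges extra jar URIs into an existing comma-separated tmpjars value.
     * Order is preserved and duplicates are dropped.
     */
    public static String mergeTmpJars(String existing, List<String> extraJars) {
        Set<String> merged = new LinkedHashSet<>();
        if (existing != null) {
            for (String entry : existing.split(",")) {
                String trimmed = entry.trim();
                if (!trimmed.isEmpty()) {
                    merged.add(trimmed);
                }
            }
        }
        merged.addAll(extraJars);
        return String.join(",", merged);
    }

    public static void main(String[] args) {
        // Placeholder paths; a real deployment would discover these from the
        // Hive client installation on the machine that submits the job.
        String existing = "file:///placeholder/app-job.jar";
        List<String> hiveJars = Arrays.asList(
                "file:///placeholder/hive-exec.jar",
                "file:///placeholder/hive-hcatalog-core.jar",
                "file:///placeholder/app-job.jar"); // duplicate, will be skipped

        // In a real MR job this string would be set as the "tmpjars" property.
        System.out.println(mergeTmpJars(existing, hiveJars));
    }
}
```

Because the list is built on the submitting machine, the task nodes do not need Hive installed, which is the multi-node scenario that should be tested.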

> Hive dependencies should be add to tmpjars
> ------------------------------------------
>
>                 Key: KYLIN-1082
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1082
>             Project: Kylin
>          Issue Type: Bug
>          Components: Environment , Job Engine
>            Reporter: liyang
>            Assignee: Zhong Yanghong
>              Labels: newbie
>             Fix For: v2.1, v1.3
>
>         Attachments: auto_hive_tmpjars_1_x_staging.patch, auto_hive_tmpjars_2_x_staging.patch
>
>
> Currently Kylin assumes all data nodes have Hive deployed at exactly the same FS location. However, a better approach is to treat Hive as a client-side app, which means we need to ship the Hive jars with the MR job every time.
> This makes deploying Kylin a lot easier in clusters that do not have Hive on all data nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)