Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2016/02/28 02:48:18 UTC
[jira] [Reopened] (KYLIN-1082) Hive dependencies should be add to tmpjars
[ https://issues.apache.org/jira/browse/KYLIN-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaofeng SHI reopened KYLIN-1082:
---------------------------------
I deployed the latest build from 1.x-staging, created a new cube, and then built it. The job failed at the second step (fact distinct); every mapper reported this error:
{code}
Error: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.HiveStorageHandler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.hive.hcatalog.mapreduce.HCatSplit.readFields(HCatSplit.java:139)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:371)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{code}
It seems some Hive dependencies are missing. Yanghong, could you please double-check this? The sandbox VM is not a good choice for testing this, because it has only one node and all the jars exist on it. It would be better to pick a clustered environment where the Hadoop nodes don't have Hive installed.
> Hive dependencies should be add to tmpjars
> ------------------------------------------
>
> Key: KYLIN-1082
> URL: https://issues.apache.org/jira/browse/KYLIN-1082
> Project: Kylin
> Issue Type: Bug
> Components: Environment, Job Engine
> Reporter: liyang
> Assignee: Zhong Yanghong
> Labels: newbie
> Fix For: v2.1, v1.3
>
> Attachments: auto_hive_tmpjars_1_x_staging.patch, auto_hive_tmpjars_2_x_staging.patch
>
>
> Currently Kylin assumes that all data nodes have Hive deployed at exactly the same FS location. However, a better approach is to treat Hive as a client-side app. Then we need to ship the Hive jars with the MR job every time.
> This makes deploying Kylin a lot easier in clusters that do not have Hive on all data nodes.
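For illustration, here is a rough sketch of the idea in the description above: collect the Hive jars on the client side and merge them into the job's comma-separated "tmpjars" value, so that Hadoop ships them with the MR job. This is not the code from the attached patches. The class name TmpJarsSketch, the method mergeTmpJars, and the jar paths are all made up for the example, and the sketch avoids Hadoop dependencies so it can run on its own.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not Kylin's actual implementation.
class TmpJarsSketch {

    // Merge extra jar paths into an existing comma-separated "tmpjars" value,
    // keeping first-seen order and dropping blank entries and duplicates.
    static String mergeTmpJars(String existing, List<String> extraJars) {
        Set<String> jars = new LinkedHashSet<>();
        if (existing != null) {
            for (String jar : existing.split(",")) {
                if (!jar.trim().isEmpty()) {
                    jars.add(jar.trim());
                }
            }
        }
        for (String jar : extraJars) {
            if (jar != null && !jar.trim().isEmpty()) {
                jars.add(jar.trim());
            }
        }
        return String.join(",", jars);
    }

    public static void main(String[] args) {
        // All paths below are invented; a real deployment would discover them.
        String current = "file:///opt/kylin/lib/kylin-job.jar";
        List<String> hiveJars = Arrays.asList(
                "file:///usr/lib/hive/lib/hive-exec.jar",
                "file:///usr/lib/hive/lib/hive-metastore.jar",
                "file:///usr/lib/hive/lib/hive-exec.jar"); // duplicate, dropped
        String merged = mergeTmpJars(current, hiveJars);
        System.out.println(merged);
        // In a real job the merged value would be stored before submission,
        // e.g. job.getConfiguration().set("tmpjars", merged).
    }
}
```

The merge keeps whatever is already in "tmpjars" rather than overwriting it, so jars added elsewhere (for example through -libjars) would not be lost. Which Hive and HCatalog jars are actually required is left open here.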
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)