Posted to dev@kylin.apache.org by "LIU Ze (刘则)" <li...@wanda.cn> on 2015/10/21 04:14:16 UTC

Reply: Re: Error in Step 2 "Extract Fact Table Distinct Columns"

 Thanks for all the help!   I applied the fix from https://issues.apache.org/jira/browse/KYLIN-1021, but it produces another error:

kylin.log:
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Unknown Job job_1444723293631_16444
        at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:218)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getCounters(HistoryClientService.java:232)
        at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getCounters(MRClie


yarn logs -applicationId application_1444723293631_16444:

2015-10-21 09:08:08,593 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Deserialization error: invalid stream header: FED007E8
        at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUtil.java:119)
        at org.apache.hive.hcatalog.mapreduce.HCatSplit.readFields(HCatSplit.java:139)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
        at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:754)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.StreamCorruptedException: invalid stream header: FED007E8
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
        at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUtil.java:116)
        ... 11 more
________________________________

From: Shi, Shaofeng<ma...@ebay.com>
Sent: 2015-10-19 10:42
To: dev@kylin.incubator.apache.org<ma...@kylin.incubator.apache.org>; LIU Ze (刘则)<ma...@wanda.cn>
Subject: Re: Error in Step 2 "Extract Fact Table Distinct Columns"

Good share, Vadim; Kylin is working to formally support this case (Hive jars
not present on the Hadoop nodes). Please look into:
https://issues.apache.org/jira/browse/KYLIN-1021

On 10/19/15, 10:31 AM, "Vadim Semenov" <_...@databuryat.com> wrote:

>You can change job/pom.xml so the hive-hcatalog package is included in the
>job jar, then rebuild Kylin using scripts/package.sh.
>I.e., in job/pom.xml change this:
>            <version>${hive-hcatalog.version}</version>
>            <scope>provided</scope>
>to:
>            <version>${hive-hcatalog.version}</version>
>            <!--<scope>provided</scope>-->
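
The change above, shown in context: a sketch of what the full dependency block in job/pom.xml might look like after the edit. The groupId/artifactId coordinates are my assumption of the usual hive-hcatalog-core Maven coordinates; verify against the actual pom before building.

```xml
<!-- Hypothetical surrounding dependency block in job/pom.xml.
     Commenting out <scope>provided</scope> makes the jar a compile-scope
     dependency, so it is bundled into the Kylin job jar at package time. -->
<dependency>
  <groupId>org.apache.hive.hcatalog</groupId>
  <artifactId>hive-hcatalog-core</artifactId>
  <version>${hive-hcatalog.version}</version>
  <!--<scope>provided</scope>-->
</dependency>
```
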
>
>
>On October 18, 2015 at 10:15:51 PM, yu feng (olaptestyu@gmail.com) wrote:
>
>From the error log, I think this is caused by missing jars such as
>hive-hcatalog; likely you did not deploy the Hive environment on every
>node of your Hadoop cluster, as we did. We solved this problem by
>modifying the source code to upload those dependency jars and files
>and add them to tmpjars before every MapReduce job is submitted.
>
>The JIRA ticket is here: https://issues.apache.org/jira/browse/KYLIN-1021
>
>Hope it helps ~
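
The "tmpjars" approach mentioned above works by setting a comma-separated list of jar paths on the job's Hadoop Configuration; Hadoop ships those jars via the distributed cache and adds them to the task classpath, so the workers find HCatInputFormat even without a local Hive install. A minimal sketch of building that property value (this is not Kylin's actual code, and the HDFS path is hypothetical; in real code you would pass the result to `conf.set("tmpjars", ...)`):

```java
import java.util.ArrayList;
import java.util.List;

public class TmpJars {

    // Build the comma-separated value Hadoop expects in the "tmpjars"
    // property, preserving any jars already configured. In Kylin-style
    // code you would then call conf.set("tmpjars", value) on the job's
    // org.apache.hadoop.conf.Configuration before submitting the job.
    static String buildTmpJars(String existing, List<String> extraJars) {
        List<String> all = new ArrayList<>();
        if (existing != null && !existing.isEmpty()) {
            all.add(existing);
        }
        all.addAll(extraJars);
        return String.join(",", all);
    }

    public static void main(String[] args) {
        // Hypothetical HDFS location of the jar the map tasks were missing.
        List<String> jars = List.of("hdfs:///kylin/lib/hive-hcatalog-core.jar");
        // prints hdfs:///kylin/lib/hive-hcatalog-core.jar
        System.out.println(buildTmpJars("", jars));
    }
}
```
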
>
>2015-10-16 15:52 GMT+08:00 LIU Ze (刘则) <li...@wanda.cn>:
>
>> Hi all,
>> I hit an error in step 2:
>>
>> kylin.log:
>> ________________________________
>> [pool-5-thread-9]:[2015-10-16 15:47:06,398][ERROR][org.apache.kylin.job.common.HadoopCmdOutput.updateJobCounter(HadoopCmdOutput.java:100)]
>> - java.io.IOException: Unknown Job job_1444723293631_9288
>>         at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:218)
>>         at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getCounters(HistoryClientService.java:232)
>>         at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getCounters(MRClientProtocolPBServiceImpl.java:159)
>>         at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:281)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>
>> yarn logs -applicationId application_1444723293631_9286 :
>>
>> 2015-10-16 15:43:07,543 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
>> 2015-10-16 15:43:07,826 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
>> 2015-10-16 15:43:07,834 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
>> 2015-10-16 15:43:07,901 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
>>         at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>> Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
>>         at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
>>         ... 8 more
>>
>>


Re: Re: Error in Step 2 "Extract Fact Table Distinct Columns"

Posted by Li Yang <li...@apache.org>.
Discard the old job and re-launch a new build?  The old step 1 may have
created something outdated after the config change.
