Posted to user@kylin.apache.org by crossme <cr...@aliyun.com> on 2017/07/13 11:51:31 UTC

common.HadoopJobStatusChecker:58 : error check status

Hi All     The cube build fails on Step 3, "Extract Fact Table Distinct Columns". The error message is below. Any help would be appreciated.
    Details: I created 4 test cubes. Only one of them ran through every step, built successfully, and can be queried. The others fail at the third step with the error log below; the status of the Job cannot be retrieved.
production environment:        CDH-5.9   Kylin-2.0

2017-07-13 14:16:39,835 INFO  [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] mapred.ClientServiceDelegate:277 : Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-07-13 14:16:39,849 ERROR [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] common.HadoopJobStatusChecker:58 : error check status
java.io.IOException: Job status not available 
 at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:334)
 at org.apache.hadoop.mapreduce.Job.getStatus(Job.java:341)
 at org.apache.kylin.engine.mr.common.HadoopJobStatusChecker.checkStatus(HadoopJobStatusChecker.java:38)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:153)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-13 14:16:39,850 ERROR [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] common.MapReduceExecutable:197 : error execute MapReduceExecutable{id=3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-07, name=Convert Cuboid Data to HFile, state=RUNNING}
java.lang.NullPointerException
 at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)
 at org.apache.kylin.engine.mr.common.HadoopCmdOutput.getInfo(HadoopCmdOutput.java:61)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:162)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)

yarn-site.xml

<property>
         <name>yarn.resourcemanager.webapp.address</name>
         <value>dn1:8088</value>
</property>

Re: common.HadoopJobStatusChecker:58 : error check status

Posted by crossme <cr...@aliyun.com>.
     Thank you very much for your reply.

I followed this article: https://discuss.pivotal.io/hc/en-us/articles/201180246-IOException-Job-status-not-available-when-mapreduce-job-exits-successfully and added the following two properties to mapred-site.xml:

<property>
      <name>mapreduce.jobhistory.intermediate-done-dir</name>
      <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
<property>
      <name>mapreduce.jobhistory.done-dir</name>
      <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
</property>

After restarting the YARN service, I found an error in the HDFS log: the ${yarn.app.mapreduce.am.staging-dir}/history directory was not writable. After fixing its permissions, the earlier exception no longer appears when Kylin builds cubes, the Job status is retrieved normally, and Kylin is now working fine.
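For anyone hitting the same issue, the permission fix described above can usually be checked and applied with standard HDFS shell commands. This is only a sketch: the staging directory shown is Hadoop's default for yarn.app.mapreduce.am.staging-dir, and the owner/group are examples (CDH typically runs the JobHistory Server as user mapred) — adjust both to your cluster.

```shell
# Inspect current ownership/permissions of the history directories
# (default staging dir is /tmp/hadoop-yarn/staging; adjust if overridden)
hadoop fs -ls /tmp/hadoop-yarn/staging/history

# Give the JobHistory Server user write access (owner/group are examples)
hadoop fs -chown -R mapred:hadoop /tmp/hadoop-yarn/staging/history

# done_intermediate is commonly world-writable with the sticky bit set
hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging/history/done_intermediate
```

After changing permissions, restart the JobHistory Server (or the YARN service, as above) so it re-scans the directories.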
-----------------------------------------------------------------------
R&D Center    Zhao Long
Shenzhen Sundun Information Technology Co., Ltd.
Email: zhaolong@sundun.cn
Address: Room 804, 8/F, Research Building, Tsinghua Information Port, North Zone, Nanshan Hi-Tech Industrial Park, Shenzhen
-----------------------------------------------------------------------

------------------------------------------------------------------
From: Li Yang <li...@apache.org>
Sent: Wednesday, July 19, 2017, 14:19
To: user <us...@kylin.apache.org>; crossme <cr...@aliyun.com>
Subject: Re: common.HadoopJobStatusChecker:58 : error check status
Given the exception actually happens in Hadoop code:
> java.lang.NullPointerException at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)

And since you had a cube built successfully before, you might want to check recent changes to your Hadoop environment. It seems broken somewhere.

On Fri, Jul 14, 2017 at 11:07 AM, crossme <cr...@aliyun.com> wrote:
         

>>>>> Log output from the third step:
         Counters: 53
 File System Counters
  FILE: Number of bytes read=326082918
  FILE: Number of bytes written=639475115
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=375767996
  HDFS: Number of bytes written=154906
  HDFS: Number of read operations=48
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=8
 Job Counters 
  Failed reduce tasks=7
  Killed reduce tasks=4
  Launched map tasks=9
  Launched reduce tasks=15
  Data-local map tasks=7
  Rack-local map tasks=2
  Total time spent by all maps in occupied slots (ms)=554536
  Total time spent by all reduces in occupied slots (ms)=1035019
  Total time spent by all map tasks (ms)=554536
  Total time spent by all reduce tasks (ms)=1035019
  Total vcore-seconds taken by all map tasks=554536
  Total vcore-seconds taken by all reduce tasks=1035019
  Total megabyte-seconds taken by all map tasks=567844864
  Total megabyte-seconds taken by all reduce tasks=1059859456
 Map-Reduce Framework
  Map input records=8758042
  Map output records=70064417
  Map output bytes=1547698142
  Map output materialized bytes=310833429
  Input split bytes=96597
  Combine input records=70064417
  Combine output records=42960200
  Reduce input groups=289020
  Reduce shuffle bytes=8450082
  Reduce input records=1652047
  Reduce output records=0
  Spilled Records=87572447
  Shuffled Maps =36
  Failed Shuffles=0
  Merged Map outputs=36
  GC time elapsed (ms)=6372
  CPU time spent (ms)=677610
  Physical memory (bytes) snapshot=8539713536
  Virtual memory (bytes) snapshot=36823269376
  Total committed heap usage (bytes)=8535932928
 Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
 File Input Format Counters 
  Bytes Read=0
 File Output Format Counters 
  Bytes Written=0
 org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper$RawDataCounter
  BYTES=1721507003       

>>>>> The following error appears in kylin.log. There is no corresponding error in the Hadoop log files.
2017-07-14 10:40:10,427 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 10 already succeed, 8 error, 6 discarded, 0 others
2017-07-14 10:40:12,442 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:22,450 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:32,457 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:42,467 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:52,478 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:02,493 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:10,430 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 10 already succeed, 8 error, 6 discarded, 0 others
2017-07-14 10:41:12,518 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:22,527 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:32,535 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:42,544 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:53,548 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:54,549 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:55,549 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:55,663 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] mapred.ClientServiceDelegate:277 : Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2017-07-14 10:41:55,686 ERROR [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] common.HadoopJobStatusChecker:58 : error check status
java.io.IOException: Job status not available 
 at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:334)
 at org.apache.hadoop.mapreduce.Job.getStatus(Job.java:341)
 at org.apache.kylin.engine.mr.common.HadoopJobStatusChecker.checkStatus(HadoopJobStatusChecker.java:38)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:153)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-14 10:41:55,687 ERROR [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] common.MapReduceExecutable:197 : error execute MapReduceExecutable{id=14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02, name=Extract Fact Table Distinct Columns, state=RUNNING}
java.lang.NullPointerException
 at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)
 at org.apache.kylin.engine.mr.common.HadoopCmdOutput.getInfo(HadoopCmdOutput.java:61)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:162)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-14 10:41:55,687 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:55,697 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:55,703 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.ExecutableManager:389 : job id:14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02 from RUNNING to ERROR
2017-07-14 10:41:55,713 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,728 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,731 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,734 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.ExecutableManager:389 : job id:14691c4a-64d2-4b1d-ace5-d2d6ad9618d0 from RUNNING to ERROR
2017-07-14 10:41:55,734 WARN  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.AbstractExecutable:258 : no need to send email, user list is empty
2017-07-14 10:41:55,745 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:42:10,431 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:43:10,432 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:44:10,431 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
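As an aside, the ipc.Client lines in the log above show Hadoop's RetryUpToMaximumCountWithFixedSleep policy: the client retries the connection to the history server a fixed number of times with a fixed sleep, then gives up and the IOException surfaces to Kylin. A rough illustration of that policy (Python, hypothetical function name, not Hadoop's actual implementation):

```python
import time

def retry_up_to_maximum_count_with_fixed_sleep(action, max_retries=3, sleep_seconds=1.0):
    """Roughly mimics Hadoop's RetryUpToMaximumCountWithFixedSleep:
    retry a failing call up to max_retries times, sleeping a fixed
    interval between attempts, then re-raise the last error."""
    attempt = 0
    while True:
        try:
            return action()
        except IOError:
            if attempt >= max_retries:
                raise  # retries exhausted: caller sees "Job status not available"
            attempt += 1
            time.sleep(sleep_seconds)
```

The fixed count and interval correspond to the maxRetries=3 and sleepTime=1000 MILLISECONDS values printed in the log.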



Re: common.HadoopJobStatusChecker:58 : error check status

Posted by Li Yang <li...@apache.org>.
Given the exception actually happens in Hadoop code:

> java.lang.NullPointerException at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)

And since you had a cube built successfully before, you might want to check recent changes to your Hadoop environment. It seems broken somewhere.


回复:common.HadoopJobStatusChecker:58 : error check status

Posted by crossme <cr...@aliyun.com>.
         

>>>>>Perform third steps of logging information
         Counters: 53
 File System Counters
  FILE: Number of bytes read=326082918
  FILE: Number of bytes written=639475115
  FILE: Number of read operations=0
  FILE: Number of large read operations=0
  FILE: Number of write operations=0
  HDFS: Number of bytes read=375767996
  HDFS: Number of bytes written=154906
  HDFS: Number of read operations=48
  HDFS: Number of large read operations=0
  HDFS: Number of write operations=8
 Job Counters 
  Failed reduce tasks=7
  Killed reduce tasks=4
  Launched map tasks=9
  Launched reduce tasks=15
  Data-local map tasks=7
  Rack-local map tasks=2
  Total time spent by all maps in occupied slots (ms)=554536
  Total time spent by all reduces in occupied slots (ms)=1035019
  Total time spent by all map tasks (ms)=554536
  Total time spent by all reduce tasks (ms)=1035019
  Total vcore-seconds taken by all map tasks=554536
  Total vcore-seconds taken by all reduce tasks=1035019
  Total megabyte-seconds taken by all map tasks=567844864
  Total megabyte-seconds taken by all reduce tasks=1059859456
 Map-Reduce Framework
  Map input records=8758042
  Map output records=70064417
  Map output bytes=1547698142
  Map output materialized bytes=310833429
  Input split bytes=96597
  Combine input records=70064417
  Combine output records=42960200
  Reduce input groups=289020
  Reduce shuffle bytes=8450082
  Reduce input records=1652047
  Reduce output records=0
  Spilled Records=87572447
  Shuffled Maps =36
  Failed Shuffles=0
  Merged Map outputs=36
  GC time elapsed (ms)=6372
  CPU time spent (ms)=677610
  Physical memory (bytes) snapshot=8539713536
  Virtual memory (bytes) snapshot=36823269376
  Total committed heap usage (bytes)=8535932928
 Shuffle Errors
  BAD_ID=0
  CONNECTION=0
  IO_ERROR=0
  WRONG_LENGTH=0
  WRONG_MAP=0
  WRONG_REDUCE=0
 File Input Format Counters 
  Bytes Read=0
 File Output Format Counters 
  Bytes Written=0
 org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper$RawDataCounter
  BYTES=1721507003       

>>>>> The following error appears in the kylin.log file; there is no error in the Hadoop log files.
2017-07-14 10:40:10,427 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 10 already succeed, 8 error, 6 discarded, 0 others
2017-07-14 10:40:12,442 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:22,450 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:32,457 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:42,467 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:40:52,478 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:02,493 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:10,430 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 10 already succeed, 8 error, 6 discarded, 0 others
2017-07-14 10:41:12,518 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:22,527 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:32,535 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:42,544 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:53,548 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:54,549 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:55,549 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] ipc.Client:867 : Retrying connect to server: dn1/10.50.229.209:51098. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2017-07-14 10:41:55,663 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] mapred.ClientServiceDelegate:277 : Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2017-07-14 10:41:55,686 ERROR [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] common.HadoopJobStatusChecker:58 : error check status
java.io.IOException: Job status not available 
 at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:334)
 at org.apache.hadoop.mapreduce.Job.getStatus(Job.java:341)
 at org.apache.kylin.engine.mr.common.HadoopJobStatusChecker.checkStatus(HadoopJobStatusChecker.java:38)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:153)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-14 10:41:55,687 ERROR [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] common.MapReduceExecutable:197 : error execute MapReduceExecutable{id=14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02, name=Extract Fact Table Distinct Columns, state=RUNNING}
java.lang.NullPointerException
 at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)
 at org.apache.kylin.engine.mr.common.HadoopCmdOutput.getInfo(HadoopCmdOutput.java:61)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:162)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-14 10:41:55,687 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:55,697 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02
2017-07-14 10:41:55,703 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.ExecutableManager:389 : job id:14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-02 from RUNNING to ERROR
2017-07-14 10:41:55,713 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,728 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,731 DEBUG [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] dao.ExecutableDao:217 : updating job output, id: 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0
2017-07-14 10:41:55,734 INFO  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.ExecutableManager:389 : job id:14691c4a-64d2-4b1d-ace5-d2d6ad9618d0 from RUNNING to ERROR
2017-07-14 10:41:55,734 WARN  [Job 14691c4a-64d2-4b1d-ace5-d2d6ad9618d0-297] execution.AbstractExecutable:258 : no need to send email, user list is empty
2017-07-14 10:41:55,745 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:42:10,431 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:43:10,432 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others
2017-07-14 10:44:10,431 INFO  [pool-9-thread-1] threadpool.DefaultScheduler:124 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 10 already succeed, 9 error, 6 discarded, 0 others


------------------------------------------------------------------
From: crossme <cr...@aliyun.com>
Sent: Thursday, July 13, 2017 19:51
To: user <us...@kylin.apache.org>
Subject: common.HadoopJobStatusChecker:58 : error check status
Hi All,
    The cube build fails on Step 3, Extract Fact Table Distinct Columns. Here is the error message. Any help please.
    Explanation: I created 4 test cubes. Only one of them ran through every step; that cube was built successfully and can be queried. The rest fail at the third step with the error log below, and the status of the job cannot be retrieved.
Production environment:        CDH-5.9   Kylin-2.0

2017-07-13 14:16:39,835 INFO  [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] mapred.ClientServiceDelegate:277 : Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2017-07-13 14:16:39,849 ERROR [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] common.HadoopJobStatusChecker:58 : error check status
java.io.IOException: Job status not available 
 at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:334)
 at org.apache.hadoop.mapreduce.Job.getStatus(Job.java:341)
 at org.apache.kylin.engine.mr.common.HadoopJobStatusChecker.checkStatus(HadoopJobStatusChecker.java:38)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:153)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)
2017-07-13 14:16:39,850 ERROR [Job 3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-437] common.MapReduceExecutable:197 : error execute MapReduceExecutable{id=3895f42c-8ee4-4eee-a0fc-9b511f9c0be4-07, name=Convert Cuboid Data to HFile, state=RUNNING}
java.lang.NullPointerException
 at org.apache.hadoop.mapreduce.Job.getTrackingURL(Job.java:380)
 at org.apache.kylin.engine.mr.common.HadoopCmdOutput.getInfo(HadoopCmdOutput.java:61)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:162)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:748)

yarn-site.xml

<property>
         <name>yarn.resourcemanager.webapp.address</name>
         <value>dn1:8088</value>
</property>
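
For context on where this error comes from: the "Job status not available" IOException is raised when the MapReduce client can no longer reach the ApplicationMaster (the log shows retries against dn1:51098 failing) and its fallback to the JobHistory Server also does not return a status. Alongside the yarn.resourcemanager.webapp.address shown above, a minimal mapred-site.xml sketch of the history-server properties that are usually checked in this situation is below. The host name dn1 is an assumption carried over from the snippet above, and the ports are the Hadoop defaults, not values confirmed from this cluster:

```xml
<!-- mapred-site.xml: JobHistory Server addresses that the MR client falls
     back to after "Redirecting to job history server". The host "dn1" is
     assumed here; use the node where the history server actually runs. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <!-- RPC address of the JobHistory Server (default port 10020) -->
  <value>dn1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <!-- web UI address used for the job tracking URL (default port 19888) -->
  <value>dn1:19888</value>
</property>
```

These properties need to be visible in the Hadoop client configuration that the Kylin server reads, and the JobHistory Server process must be running on the configured host.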