Posted to user@kylin.apache.org by Li Yang <li...@apache.org> on 2017/10/08 00:40:09 UTC

Re: why hardcode hbase.client.retries.number=1 ?

This setting applies to metadata retrieval. We want Kylin to fail fast
when it retrieves the most basic information, such as models, tables, and
cubes. Without fail-fast, HBase's default retries can last anywhere from
30 seconds to 30 minutes, leaving users complaining that Kylin is dead.
The intended behavior is to report an HBase failure as soon as possible,
so that the user can focus on fixing the root cause.
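
To illustrate why the retry count matters so much, here is a rough sketch
of how cumulative backoff sleep grows with the number of attempts. It is
not Kylin or HBase code: the class name, the method name, and the retry
counts in main() are made up for illustration, and the multiplier table is
an assumption modelled on the HBase client's backoff table. It also ignores
jitter and the time each failed RPC itself takes, so real waits would be
longer.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Kylin or HBase.
final class RetryBackoffSketch {

    // Assumed multipliers, modelled on the HBase client's retry backoff table.
    private static final int[] BACKOFF =
            {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Total sleep between attempts: one pause after every failed attempt
    // except the last one, after which the caller gives up.
    static long totalSleepMillis(long pauseMs, int retries) {
        long total = 0;
        for (int attempt = 0; attempt < retries - 1; attempt++) {
            // Past the end of the table, keep using the last multiplier.
            int idx = Math.min(attempt, BACKOFF.length - 1);
            total += pauseMs * BACKOFF[idx];
        }
        return total;
    }

    public static void main(String[] args) {
        long pauseMs = 50; // the pause value shown in the quoted log
        // Illustrative retry counts only; 1 is the value discussed here.
        for (int retries : new int[] {1, 5, 10, 35}) {
            long ms = totalSleepMillis(pauseMs, retries);
            System.out.printf("retries=%d -> about %d s of backoff sleep%n",
                    retries, TimeUnit.MILLISECONDS.toSeconds(ms));
        }
    }
}
```

With a single attempt there is no sleep at all, which matches the
"Failed after attempts=1" in the quoted log: the error surfaces immediately
instead of after a long series of pauses.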

On Mon, Sep 25, 2017 at 1:53 PM, yuyong.zhai <yu...@ele.me> wrote:

> When I merge my cube, I get this error log:
>
>
>
> 2017-09-25 13:27:11,116 ERROR [Job f86b4e43-2136-417b-b38f-7e8d37e80c02-4384] dao.ExecutableDao:201 : error get job output id:f86b4e43-2136-417b-b38f-7e8d37e80c02-03
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Mon Sep 25 13:27:11 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1506317231114, pause=50, retries=1}, org.apache.hadoop.hbase.exceptions.RegionMovedException: Region moved to: hostname=xxx port=16020 startCode=1505724495412. As of locationSeqNum=19987.
>
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:157)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:865)
>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:831)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.internalGetFromHTable(HBaseResourceStore.java:384)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.getFromHTable(HBaseResourceStore.java:362)
>         at org.apache.kylin.storage.hbase.HBaseResourceStore.getResourceImpl(HBaseResourceStore.java:272)
>         at org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:154)
>         at org.apache.kylin.job.dao.ExecutableDao.readJobOutputResource(ExecutableDao.java:100)
>         at org.apache.kylin.job.dao.ExecutableDao.getJobOutput(ExecutableDao.java:193)
>         at org.apache.kylin.job.execution.ExecutableManager.getOutput(ExecutableManager.java:150)
>         at org.apache.kylin.job.execution.AbstractExecutable.getOutput(AbstractExecutable.java:312)
>         at org.apache.kylin.job.execution.AbstractExecutable.isDiscarded(AbstractExecutable.java:392)
>         at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:150)
>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
>         at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:65)
>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:141)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hbase.exceptions.RegionMovedException: Region moved to: hostname=xxx port=16020 startCode=1505724495412. As of locationSeqNum=19987.
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
>