Posted to user@kylin.apache.org by Na Zhai <na...@kyligence.io> on 2018/12/06 04:38:44 UTC

RE: Error (2.5.2) - fail to find the statistics file in base dir (Step 5: Save Cuboid Statistics)

Hi, Jon Shoberg.
Maybe you can try Hive 1.x. Could you also check the HDFS and Hive logs? They may contain a more detailed error message.

Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10

________________________________
From: Jon Shoberg <jo...@gmail.com>
Sent: Wednesday, December 5, 2018 2:23:44 PM
To: user@kylin.apache.org
Subject: Error (2.5.2) - fail to find the statistics file in base dir (Step 5: Save Cuboid Statistics)

I'm getting the error message below when the build reaches Step 5: Save Cuboid Statistics.

Any ideas or suggestions? The error is below, followed by the steps I've already tried.

java.io.IOException: fail to find the statistics file in base dir: hdfs://192.168.1.20:9000/kylin/kylin_metadata/kylin-46adf439-7f25-91fa-a3cf-a7c27732e77c/HoldingNodeCube/fact_distinct_columns/statistics
at org.apache.kylin.engine.mr.steps.SaveStatisticsStep.doWork(SaveStatisticsStep.java:78)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:70)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
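
To rule out a simple visibility problem, something like the rough, untested sketch below (plain Hadoop FileSystem API; the class name is just for illustration and the hard-coded URI/path are copied from the error above) shows whether the base dir and its statistics sub-directory exist and what the previous step actually wrote there:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckStatisticsDir {
    public static void main(String[] args) throws Exception {
        // Base dir of the job, copied from the error message above.
        String base = "hdfs://192.168.1.20:9000/kylin/kylin_metadata/"
                + "kylin-46adf439-7f25-91fa-a3cf-a7c27732e77c/HoldingNodeCube/fact_distinct_columns";
        FileSystem fs = FileSystem.get(URI.create(base), new Configuration());

        Path baseDir = new Path(base);
        if (!fs.exists(baseDir)) {
            System.out.println("base dir does not exist: " + baseDir);
            return;
        }

        // Does the statistics sub-directory the step is looking for exist at all?
        System.out.println("statistics dir exists: " + fs.exists(new Path(baseDir, "statistics")));

        // What did the Fact Distinct Columns step actually leave under the base dir?
        for (FileStatus status : fs.listStatus(baseDir)) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
    }
}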

The system is CentOS 7, Hadoop 2.8.5, Hive 2.3.4, HBase 1.4.8, Kylin 2.5.2 - all installed from tarballs.

HDFS/YARN both seem to be working properly. There are no HDFS errors and the NameNode is not in safe mode.

Hive is working correctly; the source tables are populated with data and SQL tests look OK.

HBase seems to be working OK. It's installed/used only for Kylin, and even after re-installing, Kylin wrote its metadata into HBase just fine.

Kylin's web UI comes up OK. The data model, cube, and processing are all done via the web UI, with no errors other than the one above.

I looked at the GitHub source where the error message comes from: the job is not creating the statistics directory, so the step cannot find the statistics files (they are never created).

In the HDFS/Hadoop logs I see some messages mentioning statistics and the job ID from the HDFS URL, but nothing that points to what needs fixing.
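
From the error and stack trace, my reading (a paraphrase for illustration only, not the actual Kylin source; all names below are my own) is that the step lists the statistics directory, keeps the files whose names look like statistics output, and throws exactly this IOException when it finds none:

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SaveStatisticsCheckSketch {

    // Rough illustration of the failing check: look for statistics output
    // files under the given directory and give up when none are there.
    static Path[] findStatisticsFiles(FileSystem fs, Path statisticsDir) throws IOException {
        if (!fs.exists(statisticsDir)) {
            return null; // the directory was never created
        }
        FileStatus[] entries = fs.listStatus(statisticsDir,
                path -> path.getName().contains("statistics"));
        if (entries.length == 0) {
            return null; // directory exists but holds no statistics files
        }
        Path[] files = new Path[entries.length];
        for (int i = 0; i < entries.length; i++) {
            files[i] = entries[i].getPath();
        }
        return files;
    }

    public static void main(String[] args) throws Exception {
        // Pass the .../fact_distinct_columns/statistics path from the error as the only argument.
        Path statisticsDir = new Path(args[0]);
        FileSystem fs = FileSystem.get(URI.create(args[0]), new Configuration());
        Path[] files = findStatisticsFiles(fs, statisticsDir);
        if (files == null) {
            throw new IOException("fail to find the statistics file in base dir: " + statisticsDir);
        }
        for (Path file : files) {
            System.out.println("found statistics file: " + file);
        }
    }
}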

Any thoughts, ideas, or experiences to share would be greatly appreciated!

J


Re: Error (2.5.2) - fail to find the statistics file in base dir (Step 5: Save Cuboid Statistics)

Posted by ShaoFeng Shi <sh...@apache.org>.
Interesting; feel free to report a JIRA if the problem occurs again.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng.shi@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org




Re: Error (2.5.2) - fail to find the statistics file in base dir (Step 5: Save Cuboid Statistics)

Posted by Jon Shoberg <jo...@gmail.com>.
Thanks for the suggestion! This cluster is dedicated to Kylin, so I tried a few other changes.

After switching to Hadoop 2.7.7 and the stable versions of Hive, Tez, and HBase (1.4.3) for that version of Hadoop, everything is working better.

I'm able to build a cube with Kylin 2.5.2 with compression enabled, and that works for now.

The only remaining issue is that Kylin 2.5.2 has a problem figuring out KYLIN_CONF (even when KYLIN_HOME is properly set), so some job steps and the web UI fail because they cannot find various properties or conf files.

Looking at the log, it is consistently mis-calculating KYLIN_CONF, so a symlink fixed this as a workaround and I can move forward :)
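
For anyone who hits the same thing, here is my rough understanding of the lookup order (an illustrative sketch, not the actual Kylin code; only KYLIN_CONF, KYLIN_HOME, and kylin.properties are real names) and why pointing KYLIN_CONF, or a symlink, at $KYLIN_HOME/conf works around it:

import java.io.File;

public class KylinConfLookupSketch {

    // Illustrative resolution order: KYLIN_CONF first, then KYLIN_HOME/conf.
    static File resolveKylinProperties() {
        String kylinConf = System.getenv("KYLIN_CONF");
        if (kylinConf != null && !kylinConf.isEmpty()) {
            return new File(kylinConf, "kylin.properties");
        }
        String kylinHome = System.getenv("KYLIN_HOME");
        if (kylinHome == null || kylinHome.isEmpty()) {
            throw new IllegalStateException("neither KYLIN_CONF nor KYLIN_HOME is set");
        }
        return new File(new File(kylinHome, "conf"), "kylin.properties");
    }

    public static void main(String[] args) {
        File props = resolveKylinProperties();
        // If this prints a path that does not exist, the job steps and web UI will
        // not find their properties either; a symlink at that location (or setting
        // KYLIN_CONF explicitly) is the workaround described above.
        System.out.println(props.getAbsolutePath() + "  exists: " + props.exists());
    }
}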

Thanks! J