Posted to user@kylin.apache.org by 于奎星 <yu...@xiaomi.com> on 2016/10/26 01:46:01 UTC

which hdfs should the property "kylin.hdfs.working.dir" point to when deploying kylin with standalone hbase cluster

Hi,

I'm following this guide: http://kylin.apache.org/blog/2016/06/10/standalone-hbase-cluster/

I'm not sure which HDFS the property "kylin.hdfs.working.dir" should point to: the HDFS in the main cluster or the HDFS in the HBase cluster?

Any help would be appreciated.

Thanks.

________________________________

With best regards,

Yu Kuixing


Re: Reply: which hdfs should the property "kylin.hdfs.working.dir" point to when deploying kylin with standalone hbase cluster

Posted by ShaoFeng Shi <sh...@apache.org>.
"kylin.hdfs.working.dir" should point to the HDFS of the source (Hive) cluster.

-- 
Best regards,

Shaofeng Shi 史少锋

Reply: which hdfs should the property "kylin.hdfs.working.dir" point to when deploying kylin with standalone hbase cluster

Posted by 于奎星 <yu...@xiaomi.com>.
I have two clusters, A and B.

A is the main cluster: Hive and MapReduce run on it, but not HBase.

B is the HBase cluster: HBase runs on it, but not Hive.

These are my local config files:

hadoop core-site.xml (points to Cluster A)


  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://A</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>xxxx</value>
  </property>

hadoop hdfs-site.xml (points to hdfs in Cluster A)

  <property>
    <name>dfs.nameservices</name>
    <value>c3prc-hadoop</value>
  </property>

hbase client core-site.xml (points to Cluster B)


  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://c3prc-xiaomi98</value>
  </property>

hbase hdfs-site.xml (points to hdfs in Cluster B)

  <property>
    <name>dfs.nameservices</name>
    <value>c3prc-xiaomi98</value>
  </property>

hive client points to Cluster A


Here's my problem:

When Kylin runs my job, it throws the following exception in the step "Create Intermediate Flat Hive Table":

2016-10-26 10:27:05,885 ERROR [pool-7-thread-4] hive.CreateFlatHiveTableStep:125 : job:1a2ae09d-5cb9-4c99-83a4-8e0d53b29966-02 execute finished with exception
java.lang.IllegalArgumentException: java.net.UnknownHostException: A
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:249)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:151)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:599)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:544)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:147)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2405)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2439)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2421)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.kylin.source.hive.CreateFlatHiveTableStep.readRowCountFromFile(CreateFlatHiveTableStep.java:52)
at org.apache.kylin.source.hive.CreateFlatHiveTableStep.doWork(CreateFlatHiveTableStep.java:112)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:116)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:61)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:116)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:137)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


This exception happens when Kylin tries to read the rowCountFile created in the step "Create Count File".

In my case, this file is created in Cluster A. It seems that Kylin has loaded the local HBase config files, so it cannot resolve Cluster A.

Do you have any suggestions?

Thanks.

________________________________

With best regards,

Yu Kuixing
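
One possible reading of the trace above: the frame NameNodeProxies.createNonHAProxy suggests that the HDFS client treated "A" as a plain hostname rather than as an HA nameservice. As far as I understand the Hadoop 2.x client, that happens when the configuration it actually loaded has no failover proxy provider entry for that name. The sketch below is a hypothetical helper, not part of Kylin or Hadoop; its function names are made up for illustration, and the sample values are copied from this post (where "A" may itself be a placeholder). It uses only the Python standard library to check whether the fs.defaultFS authority is listed in dfs.nameservices and has a dfs.client.failover.proxy.provider.<name> key.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def parse_hadoop_conf(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style XML config."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def check_default_fs(conf):
    """Report whether the fs.defaultFS authority looks like a configured HA nameservice."""
    authority = urlparse(conf.get("fs.defaultFS", "")).netloc
    nameservices = [
        s.strip() for s in conf.get("dfs.nameservices", "").split(",") if s.strip()
    ]
    # Assumption: the client takes the HA path only when this key is present.
    provider_key = "dfs.client.failover.proxy.provider." + authority
    return {
        "authority": authority,
        "listed_in_nameservices": authority in nameservices,
        "has_failover_provider": provider_key in conf,
    }


# Sample values from the post: core-site.xml and hdfs-site.xml for Cluster A, merged.
SAMPLE = """
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://A</value></property>
  <property><name>dfs.nameservices</name><value>c3prc-hadoop</value></property>
</configuration>
"""

if __name__ == "__main__":
    print(check_default_fs(parse_hadoop_conf(SAMPLE)))
```

With the sample values, both boolean fields come out False, which would be consistent with the client falling back to a hostname lookup for "A". Running the same check against the configuration directory that the Kylin process really loads may help confirm or rule out this explanation.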


Reply: which hdfs should the property "kylin.hdfs.working.dir" point to when deploying kylin with standalone hbase cluster

Posted by 于奎星 <yu...@xiaomi.com>.
Yes, I configured kylin.hbase.cluster.fs to point to my HBase cluster.

What exactly does "kylin.hdfs.working.dir" mean? It is confusing when Kylin works with two different HDFS clusters.

________________________________

With best regards,

Yu Kuixing


Re: which hdfs should the property "kylin.hdfs.working.dir" point to when deploying kylin with standalone hbase cluster

Posted by "Billy(Yiming) Liu" <li...@gmail.com>.
Use kylin.hbase.cluster.fs.


-- 
With Warm regards

Yiming Liu (刘一鸣)