Posted to dev@kylin.apache.org by Diego Pinheiro <di...@gmail.com> on 2015/08/21 03:03:09 UTC

Kylin and Remote Server

Hi all,

Until now I have been playing with Kylin in a sandbox. After some
tests I would like to run it in my production environment. However,
it is not clear to me how Kylin works with a remote server.

In the Kylin properties there is an option, kylin.job.run.as.remote.cmd.
If I set it to true, Kylin will assume that the job engine runs on
another server. Do only the jobs run on another machine? What about
the data?

On http://kylin.incubator.apache.org/docs/install/index.html , it says
that the most common case is to run Kylin on a Hadoop client machine.
So I assumed that Kylin could use Thrift and JDBC clients to
communicate with the server. However, looking into
${KYLIN_HOME}/bin/kylin.sh, I saw that the Hive and HBase command lines
are used to start Kylin, and that their classpaths are used to load other
JAR files.
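For illustration, here is a minimal sketch of the classpath-borrowing pattern described above. `hbase classpath` is a real HBase subcommand, but the paths below are made up and the hbase command is stubbed so the sketch runs standalone:

```shell
# Stub 'hbase' so this sketch runs without a real HBase install;
# the actual kylin.sh invokes the real `hbase classpath` subcommand.
hbase() { echo "/opt/hbase/lib/hbase-client.jar:/opt/hbase/lib/zookeeper.jar"; }

# Borrow HBase's classpath and prepend Kylin's own jar (paths are made up)
hbase_cp=$(hbase classpath)
kylin_cp="/opt/kylin/lib/kylin-job.jar:${hbase_cp}"
echo "$kylin_cp"
```

This is why a classpath mismatch between the client's Hive/HBase install and the cluster's can surface as class-definition conflicts at job time.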

My question is: how are you running Kylin? Are you always running it
on the server itself? If not, how are you doing it? I would like to run it
on another machine, while all Kylin data, such as intermediate
Hive tables and HBase cube data, stays on the server. Am I
missing something in the docs?

I believe this is clear to most of you, but I am quite confused
in this case.


Regards,
Diego

Re: Kylin and Remote Server

Posted by hongbin ma <ma...@apache.org>.
Well organized!

It's definitely helpful to other community members,

thanks!

-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by Diego Pinheiro <di...@gmail.com>.
Hi,

(I hope this can help someone)

after a long time, I managed to run Kylin from a client machine and send
the jobs to my cluster (VM HDP 2.1).

Since I am a newbie in Hadoop, it took me a while to configure my client
machine from scratch. I did not install any package from Hortonworks
or Cloudera. The step-by-step guide is quite long, but I would like to
share it to help others who may be in the same situation I was in.
I describe it here:
https://gist.github.com/diegoValhalla/b54fa0d16a904f80b1f5 (as I do
not have a blog). All configuration files that I used are attached. I
verified that these properties are the minimum needed to set up the
environment.

When setting up the Kylin client machine I faced two major problems:
hive-hcatalog-core.jar and the updateJobCounter() function. The first
problem was caused by different class definitions[1], so I replaced
hive-hcatalog-core.jar with the HDP 2.1 one. Then I hit the problem
in the updateJobCounter() method[2]. I guess this is related to some
misconfiguration in the Hadoop properties. However, it does not affect
the cube build. This is how I ran Kylin.

If anyone has run Kylin in a different way, please let me know.

[1] http://mail-archives.apache.org/mod_mbox/kylin-dev/201506.mbox/%3CBN1PR02MB0706A02D7C0EE200FC7D9E3CAB60@BN1PR02MB070.namprd02.prod.outlook.com%3E
[2] http://mail-archives.apache.org/mod_mbox/kylin-dev/201509.mbox/%3CCAO44d8EL7=XHc_ZgNpLttz7unur0P14daOu1nb=b4i0DhCiv1g@mail.gmail.com%3E


On Mon, Aug 31, 2015 at 11:19 PM, Shi, Shaofeng <sh...@ebay.com> wrote:
> or you can change “kylin.job.hive.database.for.intermediatetable” back to
> “default” to bypass this issue;
>
> On 9/1/15, 8:59 AM, "Shi, Shaofeng" <sh...@ebay.com> wrote:
>
>>should be related with: https://issues.apache.org/jira/browse/KYLIN-975
>>
>>The patch is available now; You can make a new build by pull 0.7-staging
>>branch and then run scripts/package.sh
>>
>>On 9/1/15, 7:46 AM, "Diego Pinheiro" <di...@gmail.com> wrote:
>>
>>>After some changes, I am getting the following error:
>>>
>>>[pool-5-thread-2]:[2015-08-31 19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)]
>>>- error in FactDistinctColumnsJob
>>>java.io.IOException: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
>>>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>>>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>>>        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
>>>        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>>        at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
>>>        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>>        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>>        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>        at java.lang.Thread.run(Thread.java:745)
>>>Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008 table not found)
>>>        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>>        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>>>        at com.sun.proxy.$Proxy45.get_table(Unknown Source)
>>>        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>>>        at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
>>>        at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
>>>        at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>>>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>>>        ... 13 more
>>>
>>>Did you already face it?
>>>
>>>
>>>On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro
>>><di...@gmail.com> wrote:
>>>> @DroopyHoo, it is good to know that, since we are planning to change
>>>> authentication, but it is not the cause of my error. Following hongbin
>>>> ma's comments, I changed my Kylin properties to use the same values as
>>>> the sandbox (which is my VM HDP 2.1). So I am using LDAP auth for now.
>>>>
>>>> @hongbin ma, kylin.job.run.as.remote.cmd was indeed set to true. I
>>>> changed it to false, but I am still getting the same errors:
>>>>
>>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>>> - error check status
>>>> java.net.ConnectException: Connection refused
>>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>     at java.net.Socket.connect(Socket.java:579)
>>>>
>>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>
>>>> Just to summarize, this is my current environment:
>>>>
>>>> - one machine with HDP 2.1 (on which I already ran Kylin 0.7.2 and it
>>>> worked well);
>>>> - HDP 2.1 is my cluster, my machine is the client;
>>>> - I did not change any configuration in my cluster;
>>>> - Kylin is on my client machine with the default configuration files;
>>>> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use on my
>>>> client as a Hadoop CLI;
>>>> - In the templates of these files: hadoop/core-site.xml,
>>>> hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two
>>>> properties to set my cluster IP. All these tools are apparently OK
>>>> since I can access my cluster from them.
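For readers following along, the kind of client-side override the list above describes is a sketch like the following; the host name "my-cluster-host" and the port are placeholders, not values from the thread (hbase-site.xml and hive-site.xml would similarly point hbase.zookeeper.quorum and hive.metastore.uris at the cluster):

```xml
<!-- Sketch of hadoop/core-site.xml on the client machine.
     "my-cluster-host" is a placeholder for the cluster's address. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://my-cluster-host:8020</value>
  </property>
</configuration>
```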
>>>>
>>>> Since my cluster is OK, I wonder whether my problem is in the Hadoop
>>>> CLI configuration files on the client... what changes have you made
>>>> in those three configuration files? Did you also change
>>>> yarn-site.xml or hdfs-site.xml?
>>>>
>>>> HDP 2.1 makes Kylin installation really easy. But since I am new to
>>>> Hadoop, I am facing these problems when setting up the client machine.
>>>>
>>>>
>>>> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <ol...@gmail.com> wrote:
>>>>> Hi Diego
>>>>>
>>>>> We met this error stack when deploying in our Hadoop environment (not
>>>>> a sandbox). The problem we had was that the function for checking MR
>>>>> job status does not support Kerberos auth (our Hadoop cluster uses the
>>>>> Kerberos service), so we made some changes to that part of the source
>>>>> code.
>>>>>
>>>>> I'm not sure whether this case can help you analyse the problem.
>>>>>
>>>>> On 15/8/26 at 10:46 AM, Diego Pinheiro wrote:
>>>>>
>>>>>> Hi Bin Mahone,
>>>>>>
>>>>>> sorry for the late reply. Thank you for your support. I didn't know
>>>>>> about Kylin instances. It is really interesting.
>>>>>>
>>>>>> However, let me ask you: I was setting up my Hadoop client machine
>>>>>> with Kylin to communicate with my sandbox, but things are not working
>>>>>> well.
>>>>>>
>>>>>> I have installed Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1. All of
>>>>>> them are working and I can access my "remote server" from my client
>>>>>> machine (actually, I configured Kylin as for the sandbox, since all
>>>>>> my Hadoop CLI tools point to my sandbox). Then Kylin was built and
>>>>>> everything was OK until I tried to build the cube.
>>>>>>
>>>>>> I always got the following errors in the second step of the cube build:
>>>>>>
>>>>>> [pool-5-thread-2]:[2015-08-25 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>>>>> - error check status
>>>>>> java.net.ConnectException: Connection refused
>>>>>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>>>      at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>>>      at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>>>>      at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>>>      at java.net.Socket.connect(Socket.java:579)
>>>>>>      at java.net.Socket.connect(Socket.java:528)
>>>>>>      at java.net.Socket.<init>(Socket.java:425)
>>>>>>      at java.net.Socket.<init>(Socket.java:280)
>>>>>>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>>>>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>>>>      at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>>>>      at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>>>>      at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>>>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>>>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>>>      at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>>>>      at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>>>>      at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>>>>      at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>>>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>      at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>>>>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>>>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>>>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>      at java.lang.Thread.run(Thread.java:745)
>>>>>> Caused by: java.lang.NullPointerException
>>>>>>      at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>>>>      ... 6 more
>>>>>>
>>>>>> Do you have any thoughts about these errors? (detailed log is attached)
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org>
>>>>>>wrote:
>>>>>>>
>>>>>>> the document is summarized at
>>>>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>>>>
>>>>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
>>>>>>>wrote:
>>>>>>>
>>>>>>>> hi Diego
>>>>>>>>
>>>>>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous: it
>>>>>>>> is enabled when you cannot run the Kylin server on the same machine
>>>>>>>> as your Hadoop CLI. For example, if you're starting Kylin from your
>>>>>>>> local IDE and your Hadoop CLI is a sandbox on another machine, this
>>>>>>>> is the "remote" case.
>>>>>>>>
>>>>>>>> In most production deployments we suggest using the "non-remote"
>>>>>>>> mode, that is, the Kylin instance is started on the Hadoop CLI.
>>>>>>>> This picture depicts the scenario:
>>>>>>>>
>>>>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>>>>
>>>>>>>> Kylin instances are stateless; the runtime state is saved in the
>>>>>>>> "Metadata Store" in HBase (the kylin.metadata.url config in
>>>>>>>> conf/kylin.properties). For load-balancing purposes it is possible
>>>>>>>> to start multiple Kylin instances sharing the same metadata store
>>>>>>>> (thus sharing the same state on table schemas, job status, cube
>>>>>>>> status, etc.).
>>>>>>>>
>>>>>>>> Each Kylin instance has a kylin.server.mode entry in
>>>>>>>> conf/kylin.properties specifying the runtime mode. It has three
>>>>>>>> options: 1. "job" for running the job engine only; 2. "query" for
>>>>>>>> running the query engine only; and 3. "all" for running both.
>>>>>>>> Notice that only one server can run the job engine ("all" mode or
>>>>>>>> "job" mode); all the others must be in "query" mode.
>>>>>>>>
>>>>>>>> A typical scenario is depicted in the attached chart.
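The multi-instance setup described above can be sketched as a conf/kylin.properties fragment; the property names come from the message, but the metadata-store value is illustrative, not from the thread:

```properties
# Sketch of conf/kylin.properties for a multi-instance deployment.
# Only the property names come from the message; the value of
# kylin.metadata.url is an illustrative placeholder.
kylin.metadata.url=kylin_metadata@hbase

# Exactly one instance may run the job engine ("all" or "job") ...
kylin.server.mode=all

# ... every other instance sharing the same metadata store must use:
# kylin.server.mode=query
```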
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> *Bin Mahone | 马洪宾*
>>>>>>>> Apache Kylin: http://kylin.io
>>>>>>>> Github: https://github.com/binmahone
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> *Bin Mahone | 马洪宾*
>>>>>>> Apache Kylin: http://kylin.io
>>>>>>> Github: https://github.com/binmahone
>>>>>
>>>>>
>>>>> --
>>>>> -------
>>>>> Wei Hu
>>
>

Re: Kylin and Remote Server

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
or you can change “kylin.job.hive.database.for.intermediatetable” back to
“default” to bypass this issue;
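As a concrete sketch, the workaround above corresponds to a single line in conf/kylin.properties (the property name is from the message; "default" is Hive's stock database):

```properties
# Workaround: keep Kylin's intermediate tables in Hive's default database
kylin.job.hive.database.for.intermediatetable=default
```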

On 9/1/15, 8:59 AM, "Shi, Shaofeng" <sh...@ebay.com> wrote:

>should be related with: https://issues.apache.org/jira/browse/KYLIN-975
>
>The patch is available now; You can make a new build by pull 0.7-staging
>branch and then run scripts/package.sh
>
>On 9/1/15, 7:46 AM, "Diego Pinheiro" <di...@gmail.com> wrote:
>
>>After some changes, I am getting the following error:
>>
>>[pool-5-thread-2]:[2015-08-31
>>19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumns
>>J
>>ob.run(FactDistinctColumnsJob.java:83)]
>>- error in FactDistinctColumnsJob
>>java.io.IOException:
>>NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube
>>_
>>desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
>>table not found)
>>        at 
>>org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputForm
>>a
>>t.java:97)
>>        at 
>>org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputForm
>>a
>>t.java:51)
>>        at 
>>org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactD
>>i
>>stinctColumnsJob.java:101)
>>        at 
>>org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctC
>>o
>>lumnsJob.java:74)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>        at 
>>org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutabl
>>e
>>.java:112)
>>        at 
>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecuta
>>b
>>le.java:107)
>>        at 
>>org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultCha
>>i
>>nedExecutable.java:50)
>>        at 
>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecuta
>>b
>>le.java:107)
>>        at 
>>org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(Defau
>>l
>>tScheduler.java:132)
>>        at 
>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java
>>:
>>1145)
>>        at 
>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.jav
>>a
>>:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: 
>>NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube
>>_
>>desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
>>table not found)
>>        at 
>>org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveM
>>e
>>taStore.java:1560)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at 
>>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java
>>:
>>57)
>>        at 
>>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorI
>>m
>>pl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at 
>>org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHan
>>d
>>ler.java:105)
>>        at com.sun.proxy.$Proxy45.get_table(Unknown Source)
>>        at 
>>org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaSto
>>r
>>eClient.java:997)
>>        at 
>>org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
>>        at 
>>org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(Initia
>>l
>>izeInput.java:105)
>>        at 
>>org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInp
>>u
>>t.java:86)
>>        at 
>>org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputForm
>>a
>>t.java:95)
>>        ... 13 more
>>
>>Did you already face it?
>>
>>
>>On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro
>><di...@gmail.com> wrote:
>>> @DroopyHoo, it is good to know that since we are planning to change
>>> authentication, but this is not the cause of my error. Since hongbin
>>> ma comments, I changed my kylin properties to use the same as sandbox
>>> (which is my VM HDP 2.1). So, I am using LDAP auth for now.
>>>
>>> @hongbin ma, kylin.job.run.as.remote.cmd was set to true indeed. I
>>> changed it to false, but I am still getting the same errors:
>>>
>>> [pool-5-thread-2]:[2015-08-25
>>> 
>>>19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.chec
>>>k
>>>Status(HadoopStatusChecker.java:91)]
>>> - error check status
>>> java.net.ConnectException: Connection refused
>>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>     at 
>>>java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:
>>>3
>>>39)
>>>     at 
>>>java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImp
>>>l
>>>.java:198)
>>>     at 
>>>java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:18
>>>2
>>>)
>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>     at java.net.Socket.connect(Socket.java:579)
>>>
>>> org.apache.kylin.job.exception.ExecuteException:
>>>java.lang.NullPointerException
>>>     at 
>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecut
>>>a
>>>ble.java:110)
>>>     at 
>>>org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultCh
>>>a
>>>inedExecutable.java:50)
>>>     at 
>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecut
>>>a
>>>ble.java:106)
>>>
>>> Just to resume, this is my current environment:
>>>
>>> - one machine with HDP 2.1 (which I already ran Kylin 0.7.2 and it
>>>worked well);
>>> - HDP 2.1 is my cluster, my machine is the client;
>>> - I did not change any configuration in my cluster;
>>> - Kylin is in my client machine with the default configuration files;
>>> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use in my
>>> client as a Hadoop CLI;
>>> - In the template of these files: hadoop/core-site.xml,
>>> hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two
>>> properties to set my cluster IP. All these tools are apparently ok
>>> since I can access my cluster from them.
>>>
>>> Whereas my cluster is ok, I was thinking if my problem is in Hadoop
>>> CLI configuration files in the client...what changes have you guys
>>> done in those three configuration files? Did you also change
>>> yarn-site.xml or hdfs-site.xml?
>>>
>>> HDP 2.1 makes Kylin installation really easy. But, since I am new in
>>> Hadoop, I am facing these problems when setting up the client machine.
>>>
>>>
>>> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <ol...@gmail.com> wrote:
>>>> Hi Diego
>>>>
>>>> We met this error stack when deploying in our Hadoop enviroment(not
>>>> sandbox). The problem we met is  the function for checking MR job
>>>>status do
>>>> not support kerberos auth (our hadoop cluster use kerberos service).
>>>>So we
>>>> made some changes to this part of source code.
>>>>
>>>> I'm not sure whether this case could help you to analyse the problem.
>>>>
>>>> 在 15/8/26 上午10:46, Diego Pinheiro 写道:
>>>>
>>>>> Hi Bin Mahone,
>>>>>
>>>>> sorry for the late reply. Thank you for your support. I didn't know
>>>>> about Kylin instances. It is really interesting.
>>>>>
>>>>> However, let me ask you, I was setting up my hadoop client machine
>>>>> with Kylin to communicate to my sandbox. But things are not working
>>>>> well.
>>>>>
>>>>> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
>>>>> are working and I can access my "remote server" from my client
>>>>>machine
>>>>> (actually, I set kylin as sandbox since all my hadoop cli is pointing
>>>>> to my sandbox). Then, Kylin was built and everything was ok until I
>>>>> tried to build the cube.
>>>>>
>>>>> I got the following errors always in the second step of cube build:
>>>>>
>>>>> [pool-5-thread-2]:[2015-08-25
>>>>>
>>>>> 
>>>>>19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.ch
>>>>>e
>>>>>ckStatus(HadoopStatusChecker.java:91)]
>>>>> - error check status
>>>>> java.net.ConnectException: Connection refused
>>>>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>>      at
>>>>> 
>>>>>java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.jav
>>>>>a
>>>>>:339)
>>>>>      at
>>>>> 
>>>>>java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketI
>>>>>m
>>>>>pl.java:198)
>>>>>      at
>>>>> 
>>>>>java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:
>>>>>1
>>>>>82)
>>>>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>>      at java.net.Socket.connect(Socket.java:579)
>>>>>      at java.net.Socket.connect(Socket.java:528)
>>>>>      at java.net.Socket.<init>(Socket.java:425)
>>>>>      at java.net.Socket.<init>(Socket.java:280)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.cr
>>>>>e
>>>>>ateSocket(DefaultProtocolSocketFactory.java:80)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.cr
>>>>>e
>>>>>ateSocket(DefaultProtocolSocketFactory.java:122)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:
>>>>>7
>>>>>07)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(Http
>>>>>M
>>>>>ethodDirector.java:387)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMet
>>>>>h
>>>>>odDirector.java:171)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java
>>>>>:
>>>>>397)
>>>>>      at
>>>>> 
>>>>>org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java
>>>>>:
>>>>>323)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopSt
>>>>>a
>>>>>tusGetter.java:78)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.j
>>>>>a
>>>>>va:55)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatu
>>>>>s
>>>>>Checker.java:56)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecut
>>>>>a
>>>>>ble.java:136)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExec
>>>>>u
>>>>>table.java:106)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(Default
>>>>>C
>>>>>hainedExecutable.java:50)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExec
>>>>>u
>>>>>table.java:106)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(De
>>>>>f
>>>>>aultScheduler.java:133)
>>>>>      at
>>>>> 
>>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>a
>>>>>va:1145)
>>>>>      at
>>>>> 
>>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>j
>>>>>ava:615)
>>>>>      at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> org.apache.kylin.job.exception.ExecuteException:
>>>>> java.lang.NullPointerException
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExec
>>>>>u
>>>>>table.java:110)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(Default
>>>>>C
>>>>>hainedExecutable.java:50)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExec
>>>>>u
>>>>>table.java:106)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(De
>>>>>f
>>>>>aultScheduler.java:133)
>>>>>      at
>>>>> 
>>>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.j
>>>>>a
>>>>>va:1145)
>>>>>      at
>>>>> 
>>>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.
>>>>>j
>>>>>ava:615)
>>>>>      at java.lang.Thread.run(Thread.java:745)
>>>>> Caused by: java.lang.NullPointerException
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapRedu
>>>>>c
>>>>>eExecutable.java:73)
>>>>>      at
>>>>> 
>>>>>org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExec
>>>>>u
>>>>>table.java:105)
>>>>>      ... 6 more
>>>>>
>>>>> Do you have any thoughts about these errors? (detailed log is
>>>>>attached)
>>>>>
>>>>>
>>>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org>
>>>>>wrote:
>>>>>>
>>>>>> the document is summarized at
>>>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>>>
>>>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
>>>>>>wrote:
>>>>>>
>>>>>>> hi Diego
>>>>>>>
>>>>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous; it is
>>>>>>> enabled when you cannot run the Kylin server on the same machine as
>>>>>>> your hadoop CLI. For example, if you're starting Kylin from your local
>>>>>>> IDE and your hadoop CLI is a sandbox on another machine, this is the
>>>>>>> "remote" case.
>>>>>>>
>>>>>>> In most of the production deployments we suggest using "non-remote"
>>>>>>> mode, that is, the kylin instance is started on the hadoop CLI. The
>>>>>>> picture depicts the scenario:
>>>>>>>
>>>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>>>
>>>>>>> Kylin instances are stateless; the runtime state is saved in its
>>>>>>> "Metadata Store" in hbase (kylin.metadata.url config in
>>>>>>> conf/kylin.properties). For load balance considerations it is possible
>>>>>>> to start multiple Kylin instances sharing the same metadata store (thus
>>>>>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>>>>>
>>>>>>> Each of the kylin instances has a kylin.server.mode entry in
>>>>>>> conf/kylin.properties specifying the runtime mode; it has three
>>>>>>> options: 1. "job" for running the job engine only, 2. "query" for
>>>>>>> running the query engine only, and 3. "all" for running both. Notice
>>>>>>> that only one server can run the job engine ("all" mode or "job" mode);
>>>>>>> the others must all be "query" mode.
>>>>>>>
>>>>>>> A typical scenario is depicted in the attachment chart.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> *Bin Mahone | 马洪宾*
>>>>>>> Apache Kylin: http://kylin.io
>>>>>>> Github: https://github.com/binmahone
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> *Bin Mahone | 马洪宾*
>>>>>> Apache Kylin: http://kylin.io
>>>>>> Github: https://github.com/binmahone
>>>>
>>>>
>>>> --
>>>> -------
>>>> Wei Hu
>
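The multi-instance constraint quoted above (only one instance per shared metadata store may run the job engine, i.e. "all" or "job" mode; the rest must be "query") can be sketched as a small check. This is an illustration only, not Kylin source code:

```python
# Illustration (not Kylin code): validates the kylin.server.mode constraint
# described above for a set of instances sharing one metadata store.
def validate_server_modes(modes):
    if any(m not in ("all", "job", "query") for m in modes):
        raise ValueError("kylin.server.mode must be 'all', 'job' or 'query'")
    # At most one instance may run the job engine ("all" or "job" mode).
    if sum(m in ("all", "job") for m in modes) > 1:
        raise ValueError("only one instance may run the job engine")
    return True

validate_server_modes(["all"])                    # single-instance deployment
validate_server_modes(["job", "query", "query"])  # typical cluster layout
```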


Re: Kylin and Remote Server

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
should be related to: https://issues.apache.org/jira/browse/KYLIN-975

The patch is available now; you can make a new build by pulling the
0.7-staging branch and then running scripts/package.sh

On 9/1/15, 7:46 AM, "Diego Pinheiro" <di...@gmail.com> wrote:

>After some changes, I am getting the following error:
>
>[pool-5-thread-2]:[2015-08-31
>19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)]
>- error in FactDistinctColumnsJob
>java.io.IOException:
>NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
>table not found)
>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
>        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>        at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
>        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>        at java.lang.Thread.run(Thread.java:745)
>Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
>table not found)
>        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:606)
>        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>        at com.sun.proxy.$Proxy45.get_table(Unknown Source)
>        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
>        at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
>        at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
>        at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>        ... 13 more
>
>Did you already face it?
>
>
>On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro
><di...@gmail.com> wrote:
>> @DroopyHoo, it is good to know that since we are planning to change
>> authentication, but this is not the cause of my error. Since hongbin
>> ma comments, I changed my kylin properties to use the same as sandbox
>> (which is my VM HDP 2.1). So, I am using LDAP auth for now.
>>
>> @hongbin ma, kylin.job.run.as.remote.cmd was set to true indeed. I
>> changed it to false, but I am still getting the same errors:
>>
>> [pool-5-thread-2]:[2015-08-25
>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>> - error check status
>> java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>     at java.net.Socket.connect(Socket.java:579)
>>
>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>
>> Just to summarize, this is my current environment:
>>
>> - one machine with HDP 2.1 (which I already ran Kylin 0.7.2 and it
>>worked well);
>> - HDP 2.1 is my cluster, my machine is the client;
>> - I did not change any configuration in my cluster;
>> - Kylin is in my client machine with the default configuration files;
>> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use in my
>> client as a Hadoop CLI;
>> - In the template of these files: hadoop/core-site.xml,
>> hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two
>> properties to set my cluster IP. All these tools are apparently ok
>> since I can access my cluster from them.
>>
>> Whereas my cluster is ok, I was thinking if my problem is in Hadoop
>> CLI configuration files in the client...what changes have you guys
>> done in those three configuration files? Did you also change
>> yarn-site.xml or hdfs-site.xml?
>>
>> HDP 2.1 makes Kylin installation really easy. But, since I am new in
>> Hadoop, I am facing these problems when setting up the client machine.
>>
>>
>> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <ol...@gmail.com> wrote:
>>> Hi Diego
>>>
>>> We met this error stack when deploying in our Hadoop enviroment(not
>>> sandbox). The problem we met is  the function for checking MR job
>>>status do
>>> not support kerberos auth (our hadoop cluster use kerberos service).
>>>So we
>>> made some changes to this part of source code.
>>>
>>> I'm not sure whether this case could help you to analyse the problem.
>>>
>>> On 15/8/26 10:46 AM, Diego Pinheiro wrote:
>>>
>>>> Hi Bin Mahone,
>>>>
>>>> sorry for the late reply. Thank you for your support. I didn't know
>>>> about Kylin instances. It is really interesting.
>>>>
>>>> However, let me ask you, I was setting up my hadoop client machine
>>>> with Kylin to communicate to my sandbox. But things are not working
>>>> well.
>>>>
>>>> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
>>>> are working and I can access my "remote server" from my client machine
>>>> (actually, I set kylin as sandbox since all my hadoop cli is pointing
>>>> to my sandbox). Then, Kylin was built and everything was ok until I
>>>> tried to build the cube.
>>>>
>>>> I got the following errors always in the second step of cube build:
>>>>
>>>> [pool-5-thread-2]:[2015-08-25
>>>>
>>>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>>> - error check status
>>>> java.net.ConnectException: Connection refused
>>>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>      at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>>      at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>>      at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>>      at java.net.Socket.connect(Socket.java:579)
>>>>      at java.net.Socket.connect(Socket.java:528)
>>>>      at java.net.Socket.<init>(Socket.java:425)
>>>>      at java.net.Socket.<init>(Socket.java:280)
>>>>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>>      at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>>      at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>>      at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>>      at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>>      at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>>      at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>>      at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>      at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>      at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.lang.NullPointerException
>>>>      at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>>      ... 6 more
>>>>
>>>> Do you have any thoughts about these errors? (detailed log is attached)
>>>>
>>>>
>>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org>
>>>>wrote:
>>>>>
>>>>> the document is summarized at
>>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>>
>>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
>>>>>wrote:
>>>>>
>>>>>> hi Diego
>>>>>>
>>>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous; it is
>>>>>> enabled when you cannot run the Kylin server on the same machine as your
>>>>>> hadoop CLI. For example, if you're starting Kylin from your local IDE
>>>>>> and your hadoop CLI is a sandbox on another machine, this is the
>>>>>> "remote" case.
>>>>>>
>>>>>> In most of the production deployments we suggest using "non-remote"
>>>>>> mode, that is, the kylin instance is started on the hadoop CLI. The
>>>>>> picture depicts the scenario:
>>>>>>
>>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>>
>>>>>> Kylin instances are stateless; the runtime state is saved in its
>>>>>> "Metadata Store" in hbase (kylin.metadata.url config in
>>>>>> conf/kylin.properties). For load balance considerations it is possible
>>>>>> to start multiple Kylin instances sharing the same metadata store (thus
>>>>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>>>>
>>>>>> Each of the kylin instances has a kylin.server.mode entry in
>>>>>> conf/kylin.properties specifying the runtime mode; it has three
>>>>>> options: 1. "job" for running the job engine only, 2. "query" for
>>>>>> running the query engine only, and 3. "all" for running both. Notice
>>>>>> that only one server can run the job engine ("all" mode or "job" mode);
>>>>>> the others must all be "query" mode.
>>>>>>
>>>>>> A typical scenario is depicted in the attachment chart.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> *Bin Mahone | 马洪宾*
>>>>>> Apache Kylin: http://kylin.io
>>>>>> Github: https://github.com/binmahone
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> *Bin Mahone | 马洪宾*
>>>>> Apache Kylin: http://kylin.io
>>>>> Github: https://github.com/binmahone
>>>
>>>
>>> --
>>> -------
>>> Wei Hu


Re: Kylin and Remote Server

Posted by Diego Pinheiro <di...@gmail.com>.
After some changes, I am getting the following error:

[pool-5-thread-2]:[2015-08-31
19:42:13,394][ERROR][org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:83)]
- error in FactDistinctColumnsJob
java.io.IOException:
NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
table not found)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
        at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:74)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:112)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:132)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: NoSuchObjectException(message:DEFAULT.kylin_intermediate_kylin_sales_cube_desc_20120101000000_20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008
table not found)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
        at com.sun.proxy.$Proxy45.get_table(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
        at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
        ... 13 more

Have you faced this before?
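As an aside, the intermediate table name in the message above follows the pattern kylin_intermediate_<cube_desc>_<segment_start>_<segment_end>_<uuid> (with underscores replacing the dashes of the UUID), so the segment a failed job belongs to can be read back out of the name. The parser below is a hypothetical illustration, not Kylin code:

```python
# Hypothetical helper (not Kylin source): parse the cube-desc name and
# segment boundaries out of an intermediate Hive table name like the one
# in the stack trace above.
import re

def parse_intermediate_table(name):
    m = re.match(
        r"kylin_intermediate_(?P<desc>.+)_(?P<start>\d{14})_(?P<end>\d{14})_"
        r"(?P<uuid>[0-9a-f]{8}(?:_[0-9a-f]{4}){3}_[0-9a-f]{12})$",
        name)
    return m.groupdict() if m else None

info = parse_intermediate_table(
    "kylin_intermediate_kylin_sales_cube_desc_20120101000000_"
    "20150727000000_a7572e8b_2c6a_4904_a19b_d74933e6d008")
print(info["start"], info["end"])  # -> 20120101000000 20150727000000
```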


On Wed, Aug 26, 2015 at 11:20 PM, Diego Pinheiro
<di...@gmail.com> wrote:
> @DroopyHoo, it is good to know that since we are planning to change
> authentication, but this is not the cause of my error. Since hongbin
> ma comments, I changed my kylin properties to use the same as sandbox
> (which is my VM HDP 2.1). So, I am using LDAP auth for now.
>
> @hongbin ma, kylin.job.run.as.remote.cmd was set to true indeed. I
> changed it to false, but I am still getting the same errors:
>
> [pool-5-thread-2]:[2015-08-25
> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
> - error check status
> java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>
> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>
> Just to summarize, this is my current environment:
>
> - one machine with HDP 2.1 (which I already ran Kylin 0.7.2 and it worked well);
> - HDP 2.1 is my cluster, my machine is the client;
> - I did not change any configuration in my cluster;
> - Kylin is in my client machine with the default configuration files;
> - I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use in my
> client as a Hadoop CLI;
> - In the template of these files: hadoop/core-site.xml,
> hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two
> properties to set my cluster IP. All these tools are apparently ok
> since I can access my cluster from them.
>
> Whereas my cluster is ok, I was thinking if my problem is in Hadoop
> CLI configuration files in the client...what changes have you guys
> done in those three configuration files? Did you also change
> yarn-site.xml or hdfs-site.xml?
>
> HDP 2.1 makes Kylin installation really easy. But, since I am new in
> Hadoop, I am facing these problems when setting up the client machine.
>
>
> On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <ol...@gmail.com> wrote:
>> Hi Diego
>>
>> We met this error stack when deploying in our Hadoop enviroment(not
>> sandbox). The problem we met is  the function for checking MR job status do
>> not support kerberos auth (our hadoop cluster use kerberos service). So we
>> made some changes to this part of source code.
>>
>> I'm not sure whether this case could help you to analyse the problem.
>>
>> On 15/8/26 10:46 AM, Diego Pinheiro wrote:
>>
>>> Hi Bin Mahone,
>>>
>>> sorry for the late reply. Thank you for your support. I didn't know
>>> about Kylin instances. It is really interesting.
>>>
>>> However, let me ask you, I was setting up my hadoop client machine
>>> with Kylin to communicate to my sandbox. But things are not working
>>> well.
>>>
>>> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All of them
>>> are working and I can access my "remote server" from my client machine
>>> (actually, I set kylin as sandbox since all my hadoop cli is pointing
>>> to my sandbox). Then, Kylin was built and everything was ok until I
>>> tried to build the cube.
>>>
>>> I got the following errors always in the second step of cube build:
>>>
>>> [pool-5-thread-2]:[2015-08-25
>>>
>>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>>> - error check status
>>> java.net.ConnectException: Connection refused
>>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>      at
>>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>>      at
>>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>>      at
>>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>>      at java.net.Socket.connect(Socket.java:579)
>>>      at java.net.Socket.connect(Socket.java:528)
>>>      at java.net.Socket.<init>(Socket.java:425)
>>>      at java.net.Socket.<init>(Socket.java:280)
>>>      at
>>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>>      at
>>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>>      at
>>> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>>      at
>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>>      at
>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>>      at
>>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>>      at
>>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>>      at
>>> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>>      at
>>> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>>      at
>>> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>>      at
>>> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>>      at
>>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>      at
>>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>      at
>>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>      at
>>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>      at java.lang.Thread.run(Thread.java:745)
>>>
>>> org.apache.kylin.job.exception.ExecuteException:
>>> java.lang.NullPointerException
>>>      at
>>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>>      at
>>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>>      at
>>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>>      at
>>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>      at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>      at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.NullPointerException
>>>      at
>>> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>>      at
>>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>>      ... 6 more
>>>
>>> Do you have any thoughts about these errors? (detailed log is attached)
>>>
>>>
>>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
>>>>
>>>> the document is summarized at
>>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>>
>>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org> wrote:
>>>>
>>>>> hi Diego
>>>>>
>>>>> the config "kylin.job.run.as.remote.cmd" is somewhat ambiguous; it is
>>>>> enabled when you cannot run the Kylin server on the same machine as your
>>>>> hadoop CLI. For example, if you're starting Kylin from your local IDE
>>>>> and your hadoop CLI is a sandbox on another machine, this is the
>>>>> "remote" case.
>>>>>
>>>>> In most of the production deployments we suggest using "non-remote"
>>>>> mode, that is, the kylin instance is started on the hadoop CLI. The
>>>>> picture depicts the scenario:
>>>>>
>>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>>
>>>>> Kylin instances are stateless,  the runtime state is saved in its
>>>>> "Metadata Store" in hbase (kylin.metadata.url config in
>>>>> conf/kylin.properties). For load balance considerations it is possible
>>>>> to
>>>>> start multiple Kylin instances sharing the same metadata store (thus
>>>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>>>
>>>>> Each of the kylin instances has a kylin.server.mode entry in
>>>>> conf/kylin.properties specifying the runtime mode, it has three options:
>>>>> 1.
>>>>> "job" for running job engine only 2. "query" for running query engine
>>>>> only
>>>>> and 3 "all" for running both. Notice that only one server can run the
>>>>> job
>>>>> engine("all" mode or "job" mode), the others must all be "query" mode.
>>>>>
>>>>> A typical scenario is depicted in the attachment chart.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> *Bin Mahone | 马洪宾*
>>>>> Apache Kylin: http://kylin.io
>>>>> Github: https://github.com/binmahone
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> *Bin Mahone | 马洪宾*
>>>> Apache Kylin: http://kylin.io
>>>> Github: https://github.com/binmahone
>>
>>
>> --
>> -------
>> Wei Hu
>>

Re: Kylin and Remote Server

Posted by Diego Pinheiro <di...@gmail.com>.
@DroopyHoo, that is good to know, since we are planning to change our
authentication, but it is not the cause of my error. Following hongbin
ma's comments, I changed my Kylin properties to use the same values as the
sandbox (which is my VM with HDP 2.1). So, I am using LDAP auth for now.

@hongbin ma, kylin.job.run.as.remote.cmd was set to true indeed. I
changed it to false, but I am still getting the same errors:

[pool-5-thread-2]:[2015-08-25
19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
- error check status
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)

org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)

Just to summarize, this is my current environment:

- one machine with HDP 2.1 (which I already ran Kylin 0.7.2 and it worked well);
- HDP 2.1 is my cluster, my machine is the client;
- I did not change any configuration in my cluster;
- Kylin is in my client machine with the default configuration files;
- I downloaded Hadoop 2.4.0, HBase 0.98.0 and Hive 0.13.1 to use in my
client as a Hadoop CLI;
- In the template of these files: hadoop/core-site.xml,
hbase/hbase-site.xml and hive/hive-site.xml, I just added one or two
properties to set my cluster IP. All these tools are apparently ok
since I can access my cluster from them.

Since my cluster is OK, I am wondering whether my problem is in the Hadoop
CLI configuration files on the client... what changes have you made to
those three configuration files? Did you also change yarn-site.xml or
hdfs-site.xml?

HDP 2.1 makes Kylin installation really easy, but since I am new to
Hadoop, I am running into these problems when setting up the client machine.
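For reference, the kind of one-property override described above might look like the following in the client's core-site.xml. The host name "my-hdp-vm" and the port are placeholders, not values from this thread; hbase-site.xml and hive-site.xml would get analogous entries (for example hbase.zookeeper.quorum and hive.metastore.uris) pointing at the same cluster:

```xml
<?xml version="1.0"?>
<!-- Hypothetical client-side core-site.xml; host and port are placeholders -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://my-hdp-vm:8020</value>
  </property>
</configuration>
```

In practice it is usually safer to copy the full *-site.xml files from the cluster than to hand-write single properties, since jobs may depend on settings beyond the filesystem URI.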


On Wed, Aug 26, 2015 at 5:40 AM, DroopyHoo <ol...@gmail.com> wrote:
> Hi Diego
>
> We met this error stack when deploying in our Hadoop environment (not a
> sandbox). The problem we met is that the function for checking MR job
> status does not support Kerberos auth (our Hadoop cluster uses the
> Kerberos service), so we made some changes to that part of the source code.
>
> I'm not sure whether this case could help you to analyse the problem.
>
> On 15/8/26 10:46 AM, Diego Pinheiro wrote:
>
>> Hi Bin Mahone,
>>
>> sorry for the late reply. Thank you for your support. I didn't know
>> about Kylin instances. It is really interesting.
>>
>> However, let me ask you, I was setting up my hadoop client machine
>> with Kylin to communicate to my sandbox. But things are not working
>> well.
>>
>> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
>> are working and I can access my "remote server" from my client machine
>> (actually, I set kylin as sandbox since all my hadoop cli is pointing
>> to my sandbox). Then, Kylin was built and everything was ok until I
>> tried to build the cube.
>>
>> I got the following errors always in the second step of cube build:
>>
>> [pool-5-thread-2]:[2015-08-25
>>
>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>> - error check status
>> java.net.ConnectException: Connection refused
>>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>>      at
>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>      at
>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>      at
>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>      at java.net.Socket.connect(Socket.java:579)
>>      at java.net.Socket.connect(Socket.java:528)
>>      at java.net.Socket.<init>(Socket.java:425)
>>      at java.net.Socket.<init>(Socket.java:280)
>>      at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>      at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>      at
>> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>      at
>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>      at
>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>      at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>      at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>      at
>> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>      at
>> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>      at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>      at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>      at java.lang.Thread.run(Thread.java:745)
>>
>> org.apache.kylin.job.exception.ExecuteException:
>> java.lang.NullPointerException
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>      at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>      at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>      at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>      at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>      at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.NullPointerException
>>      at
>> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>      at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>      ... 6 more
>>
>> Do you have any thoughts about these errors? (detailed log is attached)
>>
>>
>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
>>>
>>> the document is summarized at
>>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>>
>>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org> wrote:
>>>
>>>> hi Diego
>>>>
>>>> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
>>>> enabled when you cannot run Kylin server on the same machine as your
>>>> hadoop
>>>> CLI, for example, if you're starting Kylin from you local IDE, and you
>>>> hadoop CLI is a sandbox in another machine, this is the "remote" case.
>>>>
>>>> In most of the production deployments we suggest using '"non-remote"
>>>> mode,
>>>> that is, kylin instance is started on the hadoop CLI. The picture
>>>> depicts
>>>> the scenario:
>>>>
>>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>>
>>>> Kylin instances are stateless,  the runtime state is saved in its
>>>> "Metadata Store" in hbase (kylin.metadata.url config in
>>>> conf/kylin.properties). For load balance considerations it is possible
>>>> to
>>>> start multiple Kylin instances sharing the same metadata store (thus
>>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>>
>>>> Each of the kylin instances has a kylin.server.mode entry in
>>>> conf/kylin.properties specifying the runtime mode, it has three options:
>>>> 1.
>>>> "job" for running job engine only 2. "query" for running query engine
>>>> only
>>>> and 3 "all" for running both. Notice that only one server can run the
>>>> job
>>>> engine("all" mode or "job" mode), the others must all be "query" mode.
>>>>
>>>> A typical scenario is depicted in the attachment chart.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> *Bin Mahone | 马洪宾*
>>>> Apache Kylin: http://kylin.io
>>>> Github: https://github.com/binmahone
>>>>
>>>
>>>
>>> --
>>> Regards,
>>>
>>> *Bin Mahone | 马洪宾*
>>> Apache Kylin: http://kylin.io
>>> Github: https://github.com/binmahone
>
>
> --
> -------
> Wei Hu
>

Re: Kylin and Remote Server

Posted by DroopyHoo <ol...@gmail.com>.
Hi Diego

We met this error stack when deploying in our Hadoop environment (not a 
sandbox). The problem we met is that the function for checking MR job status 
does not support Kerberos authentication (our Hadoop cluster uses the 
Kerberos service), so we made some changes to that part of the source code.

I'm not sure whether this case could help you analyse the problem.
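For context, the status check that fails in Diego's trace is a plain HTTP GET against the YARN ResourceManager's REST API (HadoopStatusGetter builds the URL, HadoopStatusChecker polls it). A rough, illustrative sketch of that poll is below; the hostname, port, and application id are made-up placeholders, and `rm_status_url` is a hypothetical helper, not Kylin's actual code. A "Connection refused" here usually means the ResourceManager web address Kylin derived is unreachable, while on a Kerberos-secured cluster the same request is rejected unless the client attaches a SPNEGO token, which is the part we had to patch:

```python
# Illustrative sketch only -- not Kylin source. Host/port and app id are
# placeholders. HadoopStatusChecker performs essentially this request; on a
# Kerberos cluster the HTTP client must additionally negotiate SPNEGO auth.

def rm_status_url(webapp_address, app_id):
    """Build the ResourceManager REST URL used to poll an application's state."""
    return "http://%s/ws/v1/cluster/apps/%s" % (webapp_address, app_id)

url = rm_status_url("sandbox.hortonworks.com:8088",
                    "application_1440000000000_0001")
print(url)
```

If this URL does not respond to a plain GET from the Kylin machine (e.g. with curl), the job engine's status check will fail the same way.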

在 15/8/26 上午10:46, Diego Pinheiro 写道:
> Hi Bin Mahone,
>
> sorry for the late reply. Thank you for your support. I didn't know
> about Kylin instances. It is really interesting.
>
> However, let me ask you, I was setting up my hadoop client machine
> with Kylin to communicate to my sandbox. But things are not working
> well.
>
> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
> are working and I can access my "remote server" from my client machine
> (actually, I set kylin as sandbox since all my hadoop cli is pointing
> to my sandbox). Then, Kylin was built and everything was ok until I
> tried to build the cube.
>
> I got the following errors always in the second step of cube build:
>
> [pool-5-thread-2]:[2015-08-25
> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
> - error check status
> java.net.ConnectException: Connection refused
>      at java.net.PlainSocketImpl.socketConnect(Native Method)
>      at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>      at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>      at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>      at java.net.Socket.connect(Socket.java:579)
>      at java.net.Socket.connect(Socket.java:528)
>      at java.net.Socket.<init>(Socket.java:425)
>      at java.net.Socket.<init>(Socket.java:280)
>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>      at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>      at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>      at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>      at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>      at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>      at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>      at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>      at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>      at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>      at java.lang.Thread.run(Thread.java:745)
>
> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>      at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>      at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>      at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>      at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>      at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>      ... 6 more
>
> Do you have any thoughts about these errors? (detailed log is attached)
>
>
> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
>> the document is summarized at
>> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>>
>> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org> wrote:
>>
>>> hi Diego
>>>
>>> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
>>> enabled when you cannot run Kylin server on the same machine as your hadoop
>>> CLI, for example, if you're starting Kylin from you local IDE, and you
>>> hadoop CLI is a sandbox in another machine, this is the "remote" case.
>>>
>>> In most of the production deployments we suggest using '"non-remote" mode,
>>> that is, kylin instance is started on the hadoop CLI. The picture depicts
>>> the scenario:
>>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>>
>>> Kylin instances are stateless,  the runtime state is saved in its
>>> "Metadata Store" in hbase (kylin.metadata.url config in
>>> conf/kylin.properties). For load balance considerations it is possible to
>>> start multiple Kylin instances sharing the same metadata store (thus
>>> sharing the same state on table schemas, job status, cube status, etc.)
>>>
>>> Each of the kylin instances has a kylin.server.mode entry in
>>> conf/kylin.properties specifying the runtime mode, it has three options: 1.
>>> "job" for running job engine only 2. "query" for running query engine only
>>> and 3 "all" for running both. Notice that only one server can run the job
>>> engine("all" mode or "job" mode), the others must all be "query" mode.
>>>
>>> A typical scenario is depicted in the attachment chart.
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>>
>>> *Bin Mahone | 马洪宾*
>>> Apache Kylin: http://kylin.io
>>> Github: https://github.com/binmahone
>>>
>>
>>
>> --
>> Regards,
>>
>> *Bin Mahone | 马洪宾*
>> Apache Kylin: http://kylin.io
>> Github: https://github.com/binmahone

-- 
-------
Wei Hu


Re: Kylin and Remote Server

Posted by hongbin ma <ma...@apache.org>.
in this case, you should leave kylin.job.run.as.remote.cmd alone.

On Wed, Aug 26, 2015 at 11:52 AM, Diego Pinheiro <di...@gmail.com>
wrote:

> I see...my settings are the two first machines you pointed out:
>
> Machine 1:  a sandbox which works acts as the "hadoop cluster"
> Machine 2:  a hadoop client machine which installed the client libraries
> and is running Kylin.sh
>
> I will take a look at my kylin.properties to check
> 'kylin.job.run.as.remote.cmd". Unfortunately, I can't do it right now,
> but, as soon as I checked it. I will let you know.
>
>
> On Tue, Aug 25, 2015 at 11:11 PM, hongbin ma <ma...@apache.org> wrote:
> > Hi, I'm not quite sure about your settings because we may have messed up
> > the terms.
> > How many machines do you have in your settings? Correct me if I'm wrong:
> > Machine 1:  a sandbox which works acts as the "hadoop cluster"
> > Machine 2:  a hadoop client machine which installed the client libraries
> > and is running Kylin.sh
> > Machine 3:  you working laptop/PC ?
> >
> > The config 'kylin.job.run.as.remote.cmd" might be confusing, it should
> not
> > be set to "true" unless you're NOT running Kylin.sh on a hadoop client
> > machine (Thus kylin instance has to ssh to another real hadoop client
> > machine to execute hbase,hive,hadoop commands). So normally, if you're
> > running Kylin.sh on "Machine 2", you should leave
> > 'kylin.job.run.as.remote.cmd"  to false
> >
> >
> >
> > On Wed, Aug 26, 2015 at 10:46 AM, Diego Pinheiro <
> diegoquintanap@gmail.com>
> > wrote:
> >
> >> Hi Bin Mahone,
> >>
> >> sorry for the late reply. Thank you for your support. I didn't know
> >> about Kylin instances. It is really interesting.
> >>
> >> However, let me ask you, I was setting up my hadoop client machine
> >> with Kylin to communicate to my sandbox. But things are not working
> >> well.
> >>
> >> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
> >> are working and I can access my "remote server" from my client machine
> >> (actually, I set kylin as sandbox since all my hadoop cli is pointing
> >> to my sandbox). Then, Kylin was built and everything was ok until I
> >> tried to build the cube.
> >>
> >> I got the following errors always in the second step of cube build:
> >>
> >> [pool-5-thread-2]:[2015-08-25
> >>
> >>
> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
> >> - error check status
> >> java.net.ConnectException: Connection refused
> >>     at java.net.PlainSocketImpl.socketConnect(Native Method)
> >>     at
> >>
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> >>     at
> >>
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
> >>     at
> >>
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> >>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> >>     at java.net.Socket.connect(Socket.java:579)
> >>     at java.net.Socket.connect(Socket.java:528)
> >>     at java.net.Socket.<init>(Socket.java:425)
> >>     at java.net.Socket.<init>(Socket.java:280)
> >>     at
> >>
> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
> >>     at
> >>
> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
> >>     at
> >>
> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
> >>     at
> >>
> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
> >>     at
> >>
> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
> >>     at
> >>
> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
> >>     at
> >>
> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
> >>     at
> >>
> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
> >>     at
> >>
> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
> >>     at
> >>
> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
> >>     at
> >>
> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
> >>     at
> >>
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> >>     at
> >>
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >>     at
> >>
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> >>     at
> >>
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
> >>     at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>     at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>     at java.lang.Thread.run(Thread.java:745)
> >>
> >> org.apache.kylin.job.exception.ExecuteException:
> >> java.lang.NullPointerException
> >>     at
> >>
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
> >>     at
> >>
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
> >>     at
> >>
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> >>     at
> >>
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
> >>     at
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>     at
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>     at java.lang.Thread.run(Thread.java:745)
> >> Caused by: java.lang.NullPointerException
> >>     at
> >>
> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
> >>     at
> >>
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
> >>     ... 6 more
> >>
> >> Do you have any thoughts about these errors? (detailed log is attached)
> >>
> >>
> >> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org>
> wrote:
> >> > the document is summarized at
> >> > http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
> >> >
> >> > On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
> >> wrote:
> >> >
> >> >> hi Diego
> >> >>
> >> >> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
> >> >> enabled when you cannot run Kylin server on the same machine as your
> >> hadoop
> >> >> CLI, for example, if you're starting Kylin from you local IDE, and
> you
> >> >> hadoop CLI is a sandbox in another machine, this is the "remote"
> case.
> >> >>
> >> >> In most of the production deployments we suggest using '"non-remote"
> >> mode,
> >> >> that is, kylin instance is started on the hadoop CLI. The picture
> >> depicts
> >> >> the scenario:
> >> >>
> >>
> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
> >> >>
> >> >> Kylin instances are stateless,  the runtime state is saved in its
> >> >> "Metadata Store" in hbase (kylin.metadata.url config in
> >> >> conf/kylin.properties). For load balance considerations it is
> possible
> >> to
> >> >> start multiple Kylin instances sharing the same metadata store (thus
> >> >> sharing the same state on table schemas, job status, cube status,
> etc.)
> >> >>
> >> >> Each of the kylin instances has a kylin.server.mode entry in
> >> >> conf/kylin.properties specifying the runtime mode, it has three
> >> options: 1.
> >> >> "job" for running job engine only 2. "query" for running query engine
> >> only
> >> >> and 3 "all" for running both. Notice that only one server can run the
> >> job
> >> >> engine("all" mode or "job" mode), the others must all be "query"
> mode.
> >> >>
> >> >> A typical scenario is depicted in the attachment chart.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Regards,
> >> >>
> >> >> *Bin Mahone | 马洪宾*
> >> >> Apache Kylin: http://kylin.io
> >> >> Github: https://github.com/binmahone
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Regards,
> >> >
> >> > *Bin Mahone | 马洪宾*
> >> > Apache Kylin: http://kylin.io
> >> > Github: https://github.com/binmahone
> >>
> >
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by Diego Pinheiro <di...@gmail.com>.
I see... my settings are the first two machines you pointed out:

Machine 1:  a sandbox which acts as the "hadoop cluster"
Machine 2:  a hadoop client machine on which the client libraries are
installed and which is running Kylin.sh

I will take a look at my kylin.properties to check
"kylin.job.run.as.remote.cmd". Unfortunately, I can't do it right now,
but as soon as I have checked it, I will let you know.


On Tue, Aug 25, 2015 at 11:11 PM, hongbin ma <ma...@apache.org> wrote:
> Hi, I'm not quite sure about your settings because we may have messed up
> the terms.
> How many machines do you have in your settings? Correct me if I'm wrong:
> Machine 1:  a sandbox which works acts as the "hadoop cluster"
> Machine 2:  a hadoop client machine which installed the client libraries
> and is running Kylin.sh
> Machine 3:  you working laptop/PC ?
>
> The config 'kylin.job.run.as.remote.cmd" might be confusing, it should not
> be set to "true" unless you're NOT running Kylin.sh on a hadoop client
> machine (Thus kylin instance has to ssh to another real hadoop client
> machine to execute hbase,hive,hadoop commands). So normally, if you're
> running Kylin.sh on "Machine 2", you should leave
> 'kylin.job.run.as.remote.cmd"  to false
>
>
>
> On Wed, Aug 26, 2015 at 10:46 AM, Diego Pinheiro <di...@gmail.com>
> wrote:
>
>> Hi Bin Mahone,
>>
>> sorry for the late reply. Thank you for your support. I didn't know
>> about Kylin instances. It is really interesting.
>>
>> However, let me ask you, I was setting up my hadoop client machine
>> with Kylin to communicate to my sandbox. But things are not working
>> well.
>>
>> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
>> are working and I can access my "remote server" from my client machine
>> (actually, I set kylin as sandbox since all my hadoop cli is pointing
>> to my sandbox). Then, Kylin was built and everything was ok until I
>> tried to build the cube.
>>
>> I got the following errors always in the second step of cube build:
>>
>> [pool-5-thread-2]:[2015-08-25
>>
>> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
>> - error check status
>> java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at
>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>>     at
>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>>     at
>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>     at java.net.Socket.connect(Socket.java:579)
>>     at java.net.Socket.connect(Socket.java:528)
>>     at java.net.Socket.<init>(Socket.java:425)
>>     at java.net.Socket.<init>(Socket.java:280)
>>     at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>>     at
>> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>>     at
>> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>>     at
>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>>     at
>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>>     at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>>     at
>> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>>     at
>> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>>     at
>> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>>     at
>> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>>     at
>> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>>     at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>     at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>     at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>     at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>     at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> org.apache.kylin.job.exception.ExecuteException:
>> java.lang.NullPointerException
>>     at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>>     at
>> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>>     at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>>     at
>> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>>     at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.NullPointerException
>>     at
>> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>>     at
>> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>>     ... 6 more
>>
>> Do you have any thoughts about these errors? (detailed log is attached)
>>
>>
>> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
>> > the document is summarized at
>> > http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>> >
>> > On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
>> wrote:
>> >
>> >> hi Diego
>> >>
>> >> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
>> >> enabled when you cannot run Kylin server on the same machine as your
>> hadoop
>> >> CLI, for example, if you're starting Kylin from you local IDE, and you
>> >> hadoop CLI is a sandbox in another machine, this is the "remote" case.
>> >>
>> >> In most of the production deployments we suggest using '"non-remote"
>> mode,
>> >> that is, kylin instance is started on the hadoop CLI. The picture
>> depicts
>> >> the scenario:
>> >>
>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>> >>
>> >> Kylin instances are stateless,  the runtime state is saved in its
>> >> "Metadata Store" in hbase (kylin.metadata.url config in
>> >> conf/kylin.properties). For load balance considerations it is possible
>> to
>> >> start multiple Kylin instances sharing the same metadata store (thus
>> >> sharing the same state on table schemas, job status, cube status, etc.)
>> >>
>> >> Each of the kylin instances has a kylin.server.mode entry in
>> >> conf/kylin.properties specifying the runtime mode, it has three
>> options: 1.
>> >> "job" for running job engine only 2. "query" for running query engine
>> only
>> >> and 3 "all" for running both. Notice that only one server can run the
>> job
>> >> engine("all" mode or "job" mode), the others must all be "query" mode.
>> >>
>> >> A typical scenario is depicted in the attachment chart.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Regards,
>> >>
>> >> *Bin Mahone | 马洪宾*
>> >> Apache Kylin: http://kylin.io
>> >> Github: https://github.com/binmahone
>> >>
>> >
>> >
>> >
>> > --
>> > Regards,
>> >
>> > *Bin Mahone | 马洪宾*
>> > Apache Kylin: http://kylin.io
>> > Github: https://github.com/binmahone
>>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by hongbin ma <ma...@apache.org>.
Hi, I'm not quite sure about your settings because we may have mixed up
the terms.
How many machines do you have in your setup? Correct me if I'm wrong:
Machine 1:  a sandbox which acts as the "hadoop cluster"
Machine 2:  a hadoop client machine on which the client libraries are
installed and which is running Kylin.sh
Machine 3:  your working laptop/PC?

The config "kylin.job.run.as.remote.cmd" might be confusing; it should not
be set to "true" unless you're NOT running Kylin.sh on a hadoop client
machine (in which case the Kylin instance has to SSH to a real hadoop client
machine to execute hbase, hive, and hadoop commands). So normally, if you're
running Kylin.sh on "Machine 2", you should leave
"kylin.job.run.as.remote.cmd" set to false.
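Putting the advice in this thread together, a minimal sketch of the relevant conf/kylin.properties entries for this setup might look like the following. This is an assumption-laden illustration, not a tested configuration; the metadata table name is a placeholder:

```
# Illustrative kylin.properties fragment -- values are placeholders.

# Kylin.sh runs directly on the hadoop client machine ("Machine 2"),
# so no SSH-based remote command execution is needed:
kylin.job.run.as.remote.cmd=false

# Metadata store in HBase; every instance sharing state points at the
# same table (placeholder table name):
kylin.metadata.url=kylin_metadata@hbase

# Exactly one instance may run the job engine ("all" or "job");
# any additional load-balanced instances must use "query":
kylin.server.mode=all
```

With several instances behind a load balancer, only this one keeps mode "all"; the others set kylin.server.mode=query and the same kylin.metadata.url.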



On Wed, Aug 26, 2015 at 10:46 AM, Diego Pinheiro <di...@gmail.com>
wrote:

> Hi Bin Mahone,
>
> sorry for the late reply. Thank you for your support. I didn't know
> about Kylin instances. It is really interesting.
>
> However, let me ask you, I was setting up my hadoop client machine
> with Kylin to communicate to my sandbox. But things are not working
> well.
>
> I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All them
> are working and I can access my "remote server" from my client machine
> (actually, I set kylin as sandbox since all my hadoop cli is pointing
> to my sandbox). Then, Kylin was built and everything was ok until I
> tried to build the cube.
>
> I got the following errors always in the second step of cube build:
>
> [pool-5-thread-2]:[2015-08-25
>
> 19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
> - error check status
> java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
>     at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>     at java.net.Socket.connect(Socket.java:528)
>     at java.net.Socket.<init>(Socket.java:425)
>     at java.net.Socket.<init>(Socket.java:280)
>     at
> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
>     at
> org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
>     at
> org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
>     at
> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
>     at
> org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>     at
> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>     at
> org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>     at
> org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
>     at
> org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
>     at
> org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
>     at
> org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
>     at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>     at
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>     at
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> org.apache.kylin.job.exception.ExecuteException:
> java.lang.NullPointerException
>     at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
>     at
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>     at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
>     at
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at
> org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
>     at
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
>     ... 6 more
>
> Do you have any thoughts about these errors? (detailed log is attached)
>
>
> On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
> > the document is summarized at
> > http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
> >
> > On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org>
> wrote:
> >
> >> hi Diego
> >>
> >> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
> >> enabled when you cannot run Kylin server on the same machine as your
> hadoop
> >> CLI, for example, if you're starting Kylin from you local IDE, and you
> >> hadoop CLI is a sandbox in another machine, this is the "remote" case.
> >>
> >> In most of the production deployments we suggest using '"non-remote"
> mode,
> >> that is, kylin instance is started on the hadoop CLI. The picture
> depicts
> >> the scenario:
> >>
> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
> >>
> >> Kylin instances are stateless,  the runtime state is saved in its
> >> "Metadata Store" in hbase (kylin.metadata.url config in
> >> conf/kylin.properties). For load balance considerations it is possible
> to
> >> start multiple Kylin instances sharing the same metadata store (thus
> >> sharing the same state on table schemas, job status, cube status, etc.)
> >>
> >> Each of the kylin instances has a kylin.server.mode entry in
> >> conf/kylin.properties specifying the runtime mode, it has three
> options: 1.
> >> "job" for running job engine only 2. "query" for running query engine
> only
> >> and 3 "all" for running both. Notice that only one server can run the
> job
> >> engine("all" mode or "job" mode), the others must all be "query" mode.
> >>
> >> A typical scenario is depicted in the attachment chart.
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> *Bin Mahone | 马洪宾*
> >> Apache Kylin: http://kylin.io
> >> Github: https://github.com/binmahone
> >>
> >
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by Diego Pinheiro <di...@gmail.com>.
Hi Bin Mahone,

sorry for the late reply. Thank you for your support. I didn't know
about Kylin instances. It is really interesting.

However, let me ask you something. I was setting up my hadoop client
machine with Kylin to communicate with my sandbox, but things are not
working well.

I have installed hadoop 2.4.0, hbase 0.98.0 and hive 0.13.1. All of them
are working and I can access my "remote server" from my client machine
(actually, I configured Kylin against the sandbox, since all my hadoop
CLI tools point to it). Then Kylin was built and everything was ok until
I tried to build the cube.

I always got the following errors in the second step of the cube build:

[pool-5-thread-2]:[2015-08-25
19:16:03,679][ERROR][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:91)]
- error check status
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:280)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:78)
    at org.apache.kylin.job.tools.HadoopStatusGetter.get(HadoopStatusGetter.java:55)
    at org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:56)
    at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:136)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:110)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:73)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:105)
    ... 6 more

Do you have any thoughts about these errors? (detailed log is attached)
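
For anyone hitting the same "Connection refused": the stack trace shows
Kylin's HadoopStatusChecker making a plain HTTP request (via
commons-httpclient) to poll job status, which means the YARN
ResourceManager's REST endpoint is not reachable from the Kylin machine.
A minimal, hypothetical reachability check, independent of Kylin (the
sandbox host name and port 8088 are assumptions -- substitute your own
ResourceManager's web address):

```python
import urllib.request
import urllib.error

def rm_reachable(url, timeout=5):
    """Return True if an HTTP endpoint (e.g. the YARN RM REST API) answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (even with an error status), so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Covers "Connection refused", unknown host, and timeouts alike.
        return False

# Example: rm_reachable("http://sandbox:8088/ws/v1/cluster/info")
```

If this returns False, check firewalls and port forwarding between the
client machine and the sandbox before looking at Kylin itself; also make
sure the status-check URL Kylin polls (a yarn status-check entry in
conf/kylin.properties in this generation, e.g.
kylin.job.yarn.app.rest.check.status.url -- check your version's template
for the exact key) points at the right host.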


On Fri, Aug 21, 2015 at 3:21 AM, hongbin ma <ma...@apache.org> wrote:
> the document is summarized at
> http://kylin.incubator.apache.org/docs/install/kylin_cluster.html
>
> On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org> wrote:
>
>> hi Diego
>>
>> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
>> enabled when you cannot run Kylin server on the same machine as your hadoop
>> CLI, for example, if you're starting Kylin from you local IDE, and you
>> hadoop CLI is a sandbox in another machine, this is the "remote" case.
>>
>> In most of the production deployments we suggest using '"non-remote" mode,
>> that is, kylin instance is started on the hadoop CLI. The picture depicts
>> the scenario:
>> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>>
>> Kylin instances are stateless,  the runtime state is saved in its
>> "Metadata Store" in hbase (kylin.metadata.url config in
>> conf/kylin.properties). For load balance considerations it is possible to
>> start multiple Kylin instances sharing the same metadata store (thus
>> sharing the same state on table schemas, job status, cube status, etc.)
>>
>> Each of the kylin instances has a kylin.server.mode entry in
>> conf/kylin.properties specifying the runtime mode, it has three options: 1.
>> "job" for running job engine only 2. "query" for running query engine only
>> and 3 "all" for running both. Notice that only one server can run the job
>> engine("all" mode or "job" mode), the others must all be "query" mode.
>>
>> A typical scenario is depicted in the attachment chart.
>>
>>
>>
>>
>>
>> --
>> Regards,
>>
>> *Bin Mahone | 马洪宾*
>> Apache Kylin: http://kylin.io
>> Github: https://github.com/binmahone
>>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by hongbin ma <ma...@apache.org>.
the document is summarized at
http://kylin.incubator.apache.org/docs/install/kylin_cluster.html

On Fri, Aug 21, 2015 at 1:51 PM, hongbin ma <ma...@apache.org> wrote:

> ​hi Diego
>
> the config "kylin.job.run.as.remote.cmd" is somehow ambiguous , it is
> enabled when you cannot run Kylin server on the same machine as your hadoop
> CLI, for example, if you're starting Kylin from you local IDE, and you
> hadoop CLI is a sandbox in another machine, this is the "remote" case.
>
> In most of the production deployments we suggest using '"non-remote" mode,
> that is, kylin instance is started on the hadoop CLI. The picture depicts
> the scenario:
> https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png
>
> Kylin instances are stateless,  the runtime state is saved in its
> "Metadata Store" in hbase (kylin.metadata.url config in
> conf/kylin.properties). For load balance considerations it is possible to
> start multiple Kylin instances sharing the same metadata store (thus
> sharing the same state on table schemas, job status, cube status, etc.)
>
> Each of the kylin instances has a kylin.server.mode entry in
> conf/kylin.properties specifying the runtime mode, it has three options: 1.
> "job" for running job engine only 2. "query" for running query engine only
> and 3 "all" for running both. Notice that only one server can run the job
> engine("all" mode or "job" mode), the others must all be "query" mode.
>
> A typical scenario is depicted in the attachment chart.
>
>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone

Re: Kylin and Remote Server

Posted by hongbin ma <ma...@apache.org>.
Hi Diego,

The config "kylin.job.run.as.remote.cmd" is somewhat ambiguous. It is
enabled when you cannot run the Kylin server on the same machine as your
hadoop CLI; for example, if you're starting Kylin from your local IDE and
your hadoop CLI is a sandbox on another machine, this is the "remote" case.
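
For reference, this is roughly what the "remote" case looks like in
conf/kylin.properties. The property names below are taken from the 0.7.x
template, and the host and credentials are placeholders -- verify them
against your own version before relying on this sketch:

```properties
# Ship job-related commands to a remote hadoop CLI box over SSH
kylin.job.run.as.remote.cmd=true
kylin.job.remote.cli.hostname=sandbox
kylin.job.remote.cli.username=root
kylin.job.remote.cli.password=changeme
```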

In most production deployments we suggest using "non-remote" mode, that
is, the Kylin instance is started on the hadoop CLI. The picture depicts
the scenario:
https://github.com/apache/incubator-kylin/blob/0.7/website/images/install/on_cli_install_scene.png

Kylin instances are stateless; the runtime state is saved in the "Metadata
Store" in hbase (the kylin.metadata.url config in conf/kylin.properties).
For load balancing it is possible to start multiple Kylin instances
sharing the same metadata store (thus sharing the same state on table
schemas, job status, cube status, etc.).

Each Kylin instance has a kylin.server.mode entry in conf/kylin.properties
specifying the runtime mode. It has three options: 1. "job" for running
the job engine only, 2. "query" for running the query engine only, and
3. "all" for running both. Notice that only one server can run the job
engine ("all" mode or "job" mode); the others must all be in "query" mode.
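
As a sketch, the relevant conf/kylin.properties entries for a small
cluster sharing one metadata store might look like this (the table name
and modes are the 0.7.x defaults; treat the exact values as placeholders
for your setup):

```properties
# Shared by every instance: all runtime state lives in this HBase table
kylin.metadata.url=kylin_metadata@hbase

# Instance A -- the single node allowed to run the job engine
kylin.server.mode=all

# Instances B, C, ... -- query-only nodes behind the load balancer
#kylin.server.mode=query
```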

A typical scenario is depicted in the attachment chart.





-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone