Posted to user@kylin.apache.org by Sonny Heer <so...@gmail.com> on 2018/02/28 14:53:14 UTC

running spark on kylin 2.2

Anyone know what I need to set in order for spark-submit to use the HDP
version of spark and not the internal one?

currently i see:

export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
/ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit


I see in the kylin.properties files:
## Spark conf (default is in spark/conf/spark-defaults.conf)

Although it doesn't show how I can change this to use the HDP spark-submit.

Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.  Not
sure if that matters during submit.  I can't seem to get more than 2
executors to run without it failing with other errors.  We have about 44
slots on our cluster.

Also uncommented:
## uncomment for HDP

kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current

kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current

kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current

see attached for other properties set.
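What I was hoping for is something along these lines (just a sketch; I'm
assuming Kylin's launcher honors a pre-set SPARK_HOME rather than always
falling back to the bundled copy, and that /usr/hdp/current/spark-client is
the usual HDP client location):

# point Kylin at the HDP Spark client instead of the bundled spark/
export SPARK_HOME=/usr/hdp/current/spark-client
export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf
$KYLIN_HOME/bin/kylin.sh start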

Re: running spark on kylin 2.2

Posted by Sonny Heer <so...@gmail.com>.
2.0 shows Spark as a beta release.  Is anyone using it in prod successfully?

On Wed, Feb 28, 2018 at 4:26 PM, ShaoFeng Shi <sh...@apache.org>
wrote:

> Kylin 2.1/2.2/2.3 compiles (and ships) with Spark 2.1; if you want to run
> with Spark 1.6, please use Kylin 2.0.
>
> 2018-03-01 6:45 GMT+08:00 Ted Yu <yu...@gmail.com>:
>
>> Please use vendor's forum.
>>
>> Thanks
>>
>> -------- Original message --------
>> From: Sonny Heer <so...@gmail.com>
>> Date: 2/28/18 2:35 PM (GMT-08:00)
>> To: user@kylin.apache.org
>> Subject: Re: running spark on kylin 2.2
>>
>> So when I run it with just spark-submit it gets further, but now there
>> is an API difference.  Does Kylin 2.2 work with Spark 1.6.1?  This is the
>> version that comes with HDP 2.4.2.0-258
>>
>> ERROR:
>> Exception in thread "main" java.lang.NoSuchMethodError:
>> org.apache.spark.sql.hive.HiveContext.table(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
>> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:167
>>
>>
>> It appears it only supports spark 2.x?  Please advise what we can do to
>> make this work on HDP 2.4...
>>
>> Thanks
>>
>> On Wed, Feb 28, 2018 at 2:07 PM, Sonny Heer <so...@gmail.com> wrote:
>>
>>> I don't see spark-libs.jar under $KYLIN_HOME/spark/jars
>>>
>>> per this doc: http://kylin.apache.org/docs21/tutorial/cube_spark.html
>>>
>>> On Wed, Feb 28, 2018 at 10:30 AM, Sonny Heer <so...@gmail.com>
>>> wrote:
>>>
>>>> Hi Billy
>>>> Looks like the current error is this:
>>>>
>>>> Error: Could not find or load main class
>>>> org.apache.spark.deploy.yarn.ApplicationMaster
>>>>
>>>> End of LogType:stderr
>>>>
>>>> Thanks
>>>>
>>>> On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:
>>>>
>>>>> Any exception in logs?
>>>>>
>>>>> With Warm regards
>>>>>
>>>>> Billy Liu
>>>>>
>>>>>
>>>>> 2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
>>>>> > Anyone know what I need to set in order for spark-submit to use the
>>>>> HDP
>>>>> > version of spark and not the internal one?
>>>>> >
>>>>> > currently i see:
>>>>> >
>>>>> > export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
>>>>> > /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>>>>> >
>>>>> >
>>>>> > I see in the kylin.properties files:
>>>>> > ## Spark conf (default is in spark/conf/spark-defaults.conf)
>>>>> >
>>>>> > Although it doesn't show how I can change this to use the HDP
>>>>> spark-submit.
>>>>> >
>>>>> > Also HDP is on 1.6.1 version of spark and kylin internally uses
>>>>> 2.x.  Not
>>>>> > sure if that matters during submit.  I can't seem to get more than 2
>>>>> > executors to run without it failing with other errors.  We have
>>>>> about 44
>>>>> > slots on our cluster.
>>>>> >
>>>>> > Also uncommented:
>>>>> > ## uncomment for HDP
>>>>> >
>>>>> > kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>>>>> >
>>>>> > kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>>>>> >
>>>>> > kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>>>>> >
>>>>> > see attached for other properties set.
>>>>>
>>>>
>>>>
>>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>

Re: running spark on kylin 2.2

Posted by ShaoFeng Shi <sh...@apache.org>.
Kylin 2.1/2.2/2.3 compiles (and ships) with Spark 2.1; if you want to run
with Spark 1.6, please use Kylin 2.0.
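
A quick way to double-check what a given package actually bundles (assuming
the Spark distribution under $KYLIN_HOME/spark keeps the standard RELEASE
file; spark-submit --version works as well):

cat $KYLIN_HOME/spark/RELEASE
$KYLIN_HOME/spark/bin/spark-submit --version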

2018-03-01 6:45 GMT+08:00 Ted Yu <yu...@gmail.com>:

> Please use vendor's forum.
>
> Thanks
>
> -------- Original message --------
> From: Sonny Heer <so...@gmail.com>
> Date: 2/28/18 2:35 PM (GMT-08:00)
> To: user@kylin.apache.org
> Subject: Re: running spark on kylin 2.2
>
> So when I run it with just spark-submit it gets further, but now there is
> an API difference.  Does Kylin 2.2 work with Spark 1.6.1?  This is the
> version that comes with HDP 2.4.2.0-258
>
> ERROR:
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.spark.sql.hive.HiveContext.table(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:167
>
>
> It appears it only supports spark 2.x?  Please advise what we can do to
> make this work on HDP 2.4...
>
> Thanks
>
> On Wed, Feb 28, 2018 at 2:07 PM, Sonny Heer <so...@gmail.com> wrote:
>
>> I don't see spark-libs.jar under $KYLIN_HOME/spark/jars
>>
>> per this doc: http://kylin.apache.org/docs21/tutorial/cube_spark.html
>>
>> On Wed, Feb 28, 2018 at 10:30 AM, Sonny Heer <so...@gmail.com> wrote:
>>
>>> Hi Billy
>>> Looks like the current error is this:
>>>
>>> Error: Could not find or load main class
>>> org.apache.spark.deploy.yarn.ApplicationMaster
>>>
>>> End of LogType:stderr
>>>
>>> Thanks
>>>
>>> On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:
>>>
>>>> Any exception in logs?
>>>>
>>>> With Warm regards
>>>>
>>>> Billy Liu
>>>>
>>>>
>>>> 2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
>>>> > Anyone know what I need to set in order for spark-submit to use the
>>>> HDP
>>>> > version of spark and not the internal one?
>>>> >
>>>> > currently i see:
>>>> >
>>>> > export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
>>>> > /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>>>> >
>>>> >
>>>> > I see in the kylin.properties files:
>>>> > ## Spark conf (default is in spark/conf/spark-defaults.conf)
>>>> >
>>>> > Although it doesn't show how I can change this to use the HDP
>>>> spark-submit.
>>>> >
>>>> > Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.
>>>> Not
>>>> > sure if that matters during submit.  I can't seem to get more than 2
>>>> > executors to run without it failing with other errors.  We have about
>>>> 44
>>>> > slots on our cluster.
>>>> >
>>>> > Also uncommented:
>>>> > ## uncomment for HDP
>>>> >
>>>> > kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>>>> >
>>>> > kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>>>> >
>>>> > kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>>>> >
>>>> > see attached for other properties set.
>>>>
>>>
>>>
>>
>


-- 
Best regards,

Shaofeng Shi 史少锋

Re: running spark on kylin 2.2

Posted by Ted Yu <yu...@gmail.com>.
Please use vendor's forum.

Thanks

-------- Original message --------
From: Sonny Heer <so...@gmail.com>
Date: 2/28/18 2:35 PM (GMT-08:00)
To: user@kylin.apache.org
Subject: Re: running spark on kylin 2.2

So when I run it with just spark-submit it gets further, but now there is
an API difference.  Does Kylin 2.2 work with Spark 1.6.1?  This is the
version that comes with HDP 2.4.2.0-258

ERROR:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext.table(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:167

It appears it only supports spark 2.x?  Please advise what we can do to
make this work on HDP 2.4...

Thanks

On Wed, Feb 28, 2018 at 2:07 PM, Sonny Heer <so...@gmail.com> wrote:

I don't see spark-libs.jar under $KYLIN_HOME/spark/jars

per this doc: http://kylin.apache.org/docs21/tutorial/cube_spark.html

On Wed, Feb 28, 2018 at 10:30 AM, Sonny Heer <so...@gmail.com> wrote:

Hi Billy
Looks like the current error is this:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

End of LogType:stderr

Thanks

On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:

Any exception in logs?

With Warm regards

Billy Liu


2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:

> Anyone know what I need to set in order for spark-submit to use the HDP
> version of spark and not the internal one?
>
> currently i see:
>
> export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
> /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>
>
> I see in the kylin.properties files:
> ## Spark conf (default is in spark/conf/spark-defaults.conf)
>
> Although it doesn't show how I can change this to use the HDP spark-submit.
>
> Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.  Not
> sure if that matters during submit.  I can't seem to get more than 2
> executors to run without it failing with other errors.  We have about 44
> slots on our cluster.
>
> Also uncommented:
> ## uncomment for HDP
>
> kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>
> see attached for other properties set.

Re: running spark on kylin 2.2

Posted by Sonny Heer <so...@gmail.com>.
So when I run it with just spark-submit it gets further, but now there is
an API difference.  Does Kylin 2.2 work with Spark 1.6.1?  This is the
version that comes with HDP 2.4.2.0-258

ERROR:
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.sql.hive.HiveContext.table(Ljava/lang/String;)Lorg/apache/spark/sql/Dataset;
at
org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:167


It appears it only supports spark 2.x?  Please advise what we can do to
make this work on HDP 2.4...
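
If I'm reading the error right, the Dataset in the method descriptor is what
table() returns in Spark 2.x (where DataFrame is just an alias for
Dataset[Row]); on 1.6 the method returns the old DataFrame class, hence the
NoSuchMethodError.  A quick sanity check of which Spark the vendor client
actually provides (the path is an assumption based on the usual HDP layout):

/usr/hdp/current/spark-client/bin/spark-submit --version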

Thanks

On Wed, Feb 28, 2018 at 2:07 PM, Sonny Heer <so...@gmail.com> wrote:

> I don't see spark-libs.jar under $KYLIN_HOME/spark/jars
>
> per this doc: http://kylin.apache.org/docs21/tutorial/cube_spark.html
>
> On Wed, Feb 28, 2018 at 10:30 AM, Sonny Heer <so...@gmail.com> wrote:
>
>> Hi Billy
>> Looks like the current error is this:
>>
>> Error: Could not find or load main class
>> org.apache.spark.deploy.yarn.ApplicationMaster
>>
>> End of LogType:stderr
>>
>> Thanks
>>
>> On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:
>>
>>> Any exception in logs?
>>>
>>> With Warm regards
>>>
>>> Billy Liu
>>>
>>>
>>> 2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
>>> > Anyone know what I need to set in order for spark-submit to use the HDP
>>> > version of spark and not the internal one?
>>> >
>>> > currently i see:
>>> >
>>> > export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
>>> > /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>>> >
>>> >
>>> > I see in the kylin.properties files:
>>> > ## Spark conf (default is in spark/conf/spark-defaults.conf)
>>> >
>>> > Although it doesn't show how I can change this to use the HDP
>>> spark-submit.
>>> >
>>> > Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.
>>> Not
>>> > sure if that matters during submit.  I can't seem to get more than 2
>>> > executors to run without it failing with other errors.  We have about
>>> 44
>>> > slots on our cluster.
>>> >
>>> > Also uncommented:
>>> > ## uncomment for HDP
>>> >
>>> > kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>>> >
>>> > kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>>> >
>>> > kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>>> >
>>> > see attached for other properties set.
>>>
>>
>>
>

Re: running spark on kylin 2.2

Posted by Sonny Heer <so...@gmail.com>.
I don't see spark-libs.jar under $KYLIN_HOME/spark/jars

per this doc: http://kylin.apache.org/docs21/tutorial/cube_spark.html
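
If I'm reading that tutorial right, spark-libs.jar isn't shipped in the
binary package; you build it once from the bundled jars and upload it to
HDFS, roughly like this (from memory of the docs21 page, and the HDFS URI is
a placeholder for your namenode):

jar cv0f spark-libs.jar -C $KYLIN_HOME/spark/jars/ .
hadoop fs -mkdir -p /kylin/spark/
hadoop fs -put spark-libs.jar /kylin/spark/

and then point Kylin at it in kylin.properties:

kylin.engine.spark-conf.spark.yarn.archive=hdfs://namenode:8020/kylin/spark/spark-libs.jar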

On Wed, Feb 28, 2018 at 10:30 AM, Sonny Heer <so...@gmail.com> wrote:

> Hi Billy
> Looks like the current error is this:
>
> Error: Could not find or load main class
> org.apache.spark.deploy.yarn.ApplicationMaster
>
> End of LogType:stderr
>
> Thanks
>
> On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:
>
>> Any exception in logs?
>>
>> With Warm regards
>>
>> Billy Liu
>>
>>
>> 2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
>> > Anyone know what I need to set in order for spark-submit to use the HDP
>> > version of spark and not the internal one?
>> >
>> > currently i see:
>> >
>> > export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
>> > /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>> >
>> >
>> > I see in the kylin.properties files:
>> > ## Spark conf (default is in spark/conf/spark-defaults.conf)
>> >
>> > Although it doesn't show how I can change this to use the HDP
>> spark-submit.
>> >
>> > Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.
>> Not
>> > sure if that matters during submit.  I can't seem to get more than 2
>> > executors to run without it failing with other errors.  We have about 44
>> > slots on our cluster.
>> >
>> > Also uncommented:
>> > ## uncomment for HDP
>> >
>> > kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>> >
>> > kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>> >
>> > kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>> >
>> > see attached for other properties set.
>>
>
>

Re: running spark on kylin 2.2

Posted by Sonny Heer <so...@gmail.com>.
Hi Billy
Looks like the current error is this:

Error: Could not find or load main class
org.apache.spark.deploy.yarn.ApplicationMaster

End of LogType:stderr
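
(In case it helps anyone searching later: that snippet is from the
aggregated YARN container log; assuming log aggregation is enabled, the full
log can be pulled with something like the following, where the application
id is a placeholder for the one shown in the Kylin step output.)

yarn logs -applicationId application_1519000000000_0001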

Thanks

On Wed, Feb 28, 2018 at 8:04 AM, Billy Liu <bi...@apache.org> wrote:

> Any exception in logs?
>
> With Warm regards
>
> Billy Liu
>
>
> 2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
> > Anyone know what I need to set in order for spark-submit to use the HDP
> > version of spark and not the internal one?
> >
> > currently i see:
> >
> > export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
> > /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
> >
> >
> > I see in the kylin.properties files:
> > ## Spark conf (default is in spark/conf/spark-defaults.conf)
> >
> > Although it doesn't show how I can change this to use the HDP
> spark-submit.
> >
> > Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.  Not
> > sure if that matters during submit.  I can't seem to get more than 2
> > executors to run without it failing with other errors.  We have about 44
> > slots on our cluster.
> >
> > Also uncommented:
> > ## uncomment for HDP
> >
> > kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
> >
> > kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
> >
> > kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
> >
> > see attached for other properties set.
>

Re: running spark on kylin 2.2

Posted by Billy Liu <bi...@apache.org>.
Any exception in logs?

With Warm regards

Billy Liu


2018-02-28 22:53 GMT+08:00 Sonny Heer <so...@gmail.com>:
> Anyone know what I need to set in order for spark-submit to use the HDP
> version of spark and not the internal one?
>
> currently i see:
>
> export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
> /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>
>
> I see in the kylin.properties files:
> ## Spark conf (default is in spark/conf/spark-defaults.conf)
>
> Although it doesn't show how I can change this to use the HDP spark-submit.
>
> Also HDP is on 1.6.1 version of spark and kylin internally uses 2.x.  Not
> sure if that matters during submit.  I can't seem to get more than 2
> executors to run without it failing with other errors.  We have about 44
> slots on our cluster.
>
> Also uncommented:
> ## uncomment for HDP
>
> kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>
> see attached for other properties set.