Posted to users@zeppelin.apache.org by "M. Dale" <me...@yahoo.com> on 2015/02/19 16:40:18 UTC

Using Spark with Kryo - how to set Spark configuration?

I would like to use Spark with Kryo serialization.

From the command line spark-shell I would add:
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator \

Or I can add those to conf/spark-defaults.conf:
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator

How can I set these properties for the Zeppelin-provided spark context?

Thanks for your help,
Markus

Re: Using Spark with Kryo - how to set Spark configuration?

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi Moon,

Your patch LGTM.

On Mon, Feb 23, 2015 at 1:39 PM, moon soo Lee <le...@gmail.com> wrote:

> Hi JL, Markus.
>
> Any configuration property that starts with 'spark.' (even one not listed
> in the GUI by default) can be added in the interpreter menu, and it will be
> consumed by the SparkContext.
>
> As for reading an existing spark-defaults.conf, I think the quickest way is
> to add a small helper function to zeppelin-env.sh that reads the
> configuration. For example, if you place the following code in your
> zeppelin-env.sh, it will read spark-defaults.conf and set the properties.
>
> function readSparkConf() {
>     SPARK_CONF_PATH="${1}"
>     echo "Reading ${SPARK_CONF_PATH}"
>     while read line; do
>         echo "${line}" | grep -e "^spark[.]" > /dev/null
>         if [ $? -ne 0 ]; then
>             # skip lines that do not start with 'spark.'
>             continue;
>         fi
>         SPARK_CONF_KEY=`echo "${line}" | sed -e 's/\(^spark[^ ]*\)[ \t]*\(.*\)/\1/g'`
>         SPARK_CONF_VALUE=`echo "${line}" | sed -e 's/\(^spark[^ ]*\)[ \t]*\(.*\)/\2/g'`
>         export ZEPPELIN_JAVA_OPTS+=" -D${SPARK_CONF_KEY}=\"${SPARK_CONF_VALUE}\""
>     done < "${SPARK_CONF_PATH}"
> }
> readSparkConf [YOUR_SPARK_HOME]/conf/spark-defaults.conf
>
>
>
> Thanks,
> moon


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: Using Spark with Kryo - how to set Spark configuration?

Posted by moon soo Lee <le...@gmail.com>.
Hi JL, Markus.

Any configuration property that starts with 'spark.' (even one not listed
in the GUI by default) can be added in the interpreter menu, and it will be
consumed by the SparkContext.

As for reading an existing spark-defaults.conf, I think the quickest way is
to add a small helper function to zeppelin-env.sh that reads the
configuration. For example, if you place the following code in your
zeppelin-env.sh, it will read spark-defaults.conf and set the properties.

function readSparkConf() {
    SPARK_CONF_PATH="${1}"
    echo "Reading ${SPARK_CONF_PATH}"
    while read line; do
        echo "${line}" | grep -e "^spark[.]" > /dev/null
        if [ $? -ne 0 ]; then
            # skip lines that do not start with 'spark.'
            continue;
        fi
        SPARK_CONF_KEY=`echo "${line}" | sed -e 's/\(^spark[^ ]*\)[ \t]*\(.*\)/\1/g'`
        SPARK_CONF_VALUE=`echo "${line}" | sed -e 's/\(^spark[^ ]*\)[ \t]*\(.*\)/\2/g'`
        export ZEPPELIN_JAVA_OPTS+=" -D${SPARK_CONF_KEY}=\"${SPARK_CONF_VALUE}\""
    done < "${SPARK_CONF_PATH}"
}
readSparkConf [YOUR_SPARK_HOME]/conf/spark-defaults.conf
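
The helper above ends the key at the first space, so a spark-defaults.conf written as key=value lines (as in the original question) would end up with the whole line as the key and an empty value. A minimal sketch of a variant that also accepts '=' or tabs as the separator follows. The name read_spark_conf is hypothetical, it relies only on shell parameter expansion instead of sed, and it mirrors the -Dkey="value" quoting of the helper above; treat it as an illustration under those assumptions, not as a tested Zeppelin recipe.

```shell
# Hypothetical variant of readSparkConf (not part of Zeppelin).
# Accepts "key value", "key<TAB>value" and "key=value" lines.
read_spark_conf() {
    rsc_path="$1"
    while IFS= read -r rsc_line || [ -n "${rsc_line}" ]; do
        # Only lines that start with 'spark.' are used; comments and blanks are skipped.
        case "${rsc_line}" in
            spark.*) ;;
            *) continue ;;
        esac
        # Key: everything before the first whitespace character or '='.
        rsc_key="${rsc_line%%[[:space:]=]*}"
        # Rest: the line with the key removed from the front.
        rsc_value="${rsc_line#"${rsc_key}"}"
        # Value: the rest with its leading run of separator characters removed.
        rsc_value="${rsc_value#"${rsc_value%%[![:space:]=]*}"}"
        ZEPPELIN_JAVA_OPTS="${ZEPPELIN_JAVA_OPTS:-} -D${rsc_key}=\"${rsc_value}\""
    done < "${rsc_path}"
    export ZEPPELIN_JAVA_OPTS
}
```

It would be called the same way as the original, for example read_spark_conf [YOUR_SPARK_HOME]/conf/spark-defaults.conf. Trailing whitespace in values is not trimmed in this sketch.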



Thanks,
moon


On Mon, Feb 23, 2015 at 12:56 PM, M. Dale <me...@yahoo.com> wrote:

>  JL and Moon,
>    Thank you for your responses below! I would second JL's observation.
> Creating a new interpreter via the GUI allows this customization but it
> would be nice to be able to just point at a configuration file (the same
> configuration file used to run spark-shell) and maybe add that to the
> default interpreter.
>
> Thank you,
> Markus

Re: Using Spark with Kryo - how to set Spark configuration?

Posted by "M. Dale" <me...@yahoo.com>.
JL and Moon,

Thank you for your responses below! I would second JL's observation.
Creating a new interpreter via the GUI allows this customization, but it
would be nice to be able to just point at a configuration file (the same
configuration file used to run spark-shell) and maybe add that to the
default interpreter.

Thank you,
Markus

On 02/22/2015 10:09 PM, Jongyoul Lee wrote:
> Hi Markus,
>
> I also cannot find an easy way to pass Spark configurations to Zeppelin,
> but you can do it with JAVA_OPTS. In your case:
> JAVA_OPTS="-Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator"
>
> Hi Moon,
> Zeppelin needs to either read spark-defaults.conf, add a configuration
> option that points to spark-defaults.conf, or allow spark.* properties to
> be set in zeppelin-site.xml. Please give us your opinion.
>
> Regards,
> JL


Re: Using Spark with Kryo - how to set Spark configuration?

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi Markus,

I also cannot find an easy way to pass Spark configurations to Zeppelin, but
you can do it with JAVA_OPTS. In your case:
JAVA_OPTS="-Dspark.serializer=org.apache.spark.serializer.KryoSerializer -Dspark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator"
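
As a minimal sketch, the same two system properties could be appended in conf/zeppelin-env.sh as shown here. This assumes the launcher passes ZEPPELIN_JAVA_OPTS (the variable that appears elsewhere in this thread) to the JVM rather than plain JAVA_OPTS; which variable is honored may depend on the Zeppelin version, so check your own bin/ scripts.

```shell
# Hypothetical addition to conf/zeppelin-env.sh.
# Assumption: the Zeppelin launcher passes ZEPPELIN_JAVA_OPTS to the JVM.
ZEPPELIN_JAVA_OPTS="${ZEPPELIN_JAVA_OPTS:-} -Dspark.serializer=org.apache.spark.serializer.KryoSerializer"
ZEPPELIN_JAVA_OPTS="${ZEPPELIN_JAVA_OPTS} -Dspark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator"
export ZEPPELIN_JAVA_OPTS
```

Appending to the existing value, rather than overwriting it, keeps any options that were set earlier in the file.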

Hi Moon,
Zeppelin needs to either read spark-defaults.conf, add a configuration
option that points to spark-defaults.conf, or allow spark.* properties to be
set in zeppelin-site.xml. Please give us your opinion.

Regards,
JL

On Mon, Feb 23, 2015 at 10:24 AM, moon soo Lee <le...@gmail.com> wrote:

> Hi Markus,
>
> I think the following directories
>
> conf/         (basic configurations, interpreter configurations)
> notebook/     (notebooks)
> interpreter/  (if you have any 3rd party interpreter installed)
>
> are the things to take care of when you move to the next Zeppelin version.
>
> Thanks
> moon


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: Using Spark with Kryo - how to set Spark configuration?

Posted by moon soo Lee <le...@gmail.com>.
Hi Markus,

I think the following directories

conf/         (basic configurations, interpreter configurations)
notebook/     (notebooks)
interpreter/  (if you have any 3rd party interpreter installed)

are the things to take care of when you move to the next Zeppelin version.
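
One cautious way to act on this list is to back up the three directories before upgrading and then restore what is needed by hand. The sketch below does only the backup step. The function name backup_zeppelin_state and both path arguments are hypothetical, and it assumes the directories sit directly under the Zeppelin home as listed above.

```shell
# Hypothetical helper: back up the directories listed above before an upgrade.
#   $1 = existing Zeppelin home, $2 = backup destination (both are placeholders)
backup_zeppelin_state() {
    bzs_home="$1"
    bzs_backup="$2"
    for bzs_dir in conf notebook interpreter; do
        # Skip any directory that does not exist in this install.
        [ -d "${bzs_home}/${bzs_dir}" ] || continue
        mkdir -p "${bzs_backup}/${bzs_dir}"
        # Copy the directory contents recursively, preserving file attributes.
        cp -Rp "${bzs_home}/${bzs_dir}/." "${bzs_backup}/${bzs_dir}/"
    done
}
```

Restoring is left manual on purpose: copying an old interpreter/ directory wholesale into a new install could overwrite interpreters that ship with the new version.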

Thanks
moon


On Fri, Feb 20, 2015 at 12:57 AM, M. Dale <me...@yahoo.com> wrote:

>  Anthony,
>   That did the trick! Thank you so much for your quick reply.
>
> It seems that conf/interpreter.json got changed and now saves my new
> interpreter. When I move to the next version of Zeppelin, is that what I
> need to copy to keep that interpreter as an option?
>
> Thanks again for your help,
> Markus

Re: Using Spark with Kryo - how to set Spark configuration?

Posted by "M. Dale" <me...@yahoo.com>.
Anthony,
   That did the trick! Thank you so much for your quick reply.

It seems that conf/interpreter.json got changed and now saves my new 
interpreter. When I move to the next version of Zeppelin, is that what I 
need to copy to keep that interpreter as an option?

Thanks again for your help,
Markus

On 02/19/2015 10:46 AM, Anthony Corbacho wrote:
>
> hello Markus,
>
> In Zeppelin you can configure those options through the interpreters.
>
> Basically, you will have to create a new interpreter, set the
> serialization options (you will see that you can add custom options to
> the Spark interpreter), and then activate this interpreter in your notebook.
>
> Hope it helps.


Re: Using Spark with Kryo - how to set Spark configuration?

Posted by Anthony Corbacho <an...@apache.org>.
Hello Markus,

In Zeppelin you can configure those options through the interpreters.

Basically, you will have to create a new interpreter, set the
serialization options (you will see that you can add custom options to
the Spark interpreter), and then activate this interpreter in your notebook.

Hope it helps.
On Feb 20, 2015 12:41 AM, "M. Dale" <me...@yahoo.com> wrote:

> I would like to use Spark with Kryo serialization.
>
> From the command line spark-shell I would add:
> --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
> --conf spark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator \
>
> Or I can add those to conf/spark-defaults.conf:
> spark.serializer=org.apache.spark.serializer.KryoSerializer
> spark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator
>
> How can I set these properties for the Zeppelin-provided spark context?
>
> Thanks for your help,
> Markus
>