Posted to user@spark.apache.org by amit tewari <am...@gmail.com> on 2016/01/27 12:13:55 UTC
spark.kryo.classesToRegister
This is what I have added in my code:
rdd.persist(StorageLevel.MEMORY_ONLY_SER())
conf.set("spark.serializer","org.apache.spark.serializer.KryoSerializer");
Do I compulsorily need to do anything via spark.kryo.classesToRegister?
Or is the above code sufficient to achieve a performance gain using Kryo
serialization?
Thanks
Amit
Re: spark.kryo.classesToRegister
Posted by Jagrut Sharma <ja...@gmail.com>.
I have run into this issue
(https://issues.apache.org/jira/browse/SPARK-10251) with Kryo on Spark
version 1.4.1.
Just something to be aware of when setting
spark.kryo.registrationRequired to 'true'.
Thanks.
--
Jagrut
On Thu, Jan 28, 2016 at 6:32 AM, Jim Lohse <sp...@megalearningllc.com>
wrote:
> You are only required to register classes with Kryo if you use a
> specific setting:
>
> // require registration of all classes with Kryo
> .set("spark.kryo.registrationRequired", "true")
>
> Here's an example of my setup. I think this is the best approach because
> it forces me to really think about what I am serializing:
>
> // the Kryo serializer wants all classes that need to be serialized to be registered
> Class<?>[] kryoClassArray = new Class<?>[]{DropResult.class, DropEvaluation.class, PrintHetSharing.class};
> SparkConf sparkConf = new SparkConf()
>     .setAppName("MyAppName")
>     .setMaster("spark://ipaddress:7077")
>     // now for the Kryo stuff
>     .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>     // require registration of all classes with Kryo
>     .set("spark.kryo.registrationRequired", "true")
>     // don't forget to register ALL classes or you will get an error
>     .registerKryoClasses(kryoClassArray);
Re: spark.kryo.classesToRegister
Posted by Jim Lohse <sp...@megalearningllc.com>.
You are only required to register classes with Kryo if you use a
specific setting:
// require registration of all classes with Kryo
.set("spark.kryo.registrationRequired", "true")
Here's an example of my setup. I think this is the best approach because
it forces me to really think about what I am serializing:
// the Kryo serializer wants all classes that need to be serialized to be registered
Class<?>[] kryoClassArray = new Class<?>[]{DropResult.class, DropEvaluation.class, PrintHetSharing.class};
SparkConf sparkConf = new SparkConf()
    .setAppName("MyAppName")
    .setMaster("spark://ipaddress:7077")
    // now for the Kryo stuff
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    // require registration of all classes with Kryo
    .set("spark.kryo.registrationRequired", "true")
    // don't forget to register ALL classes or you will get an error
    .registerKryoClasses(kryoClassArray);
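For reference, the property named in the subject line, spark.kryo.classesToRegister, takes a comma-separated list of fully qualified class names, so the same registration can also be supplied as plain configuration (for example via spark-submit --conf key=value). According to the Spark tuning guide, Kryo still works without registration, but it then has to store the full class name with each object, which is wasteful. The following is only a minimal sketch using the JDK alone: KryoConfSketch, its two methods, and the nested DropResult and DropEvaluation stand-ins are hypothetical names introduced for illustration and are not part of Spark.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Spark): builds Kryo-related settings as plain strings.
final class KryoConfSketch {

    // Hypothetical stand-ins for application classes such as DropResult in the message above.
    static final class DropResult {}
    static final class DropEvaluation {}

    // Joins the binary class names with commas, the format spark.kryo.classesToRegister expects.
    static String classesToRegister(Class<?>... classes) {
        return Arrays.stream(classes)
                .map(Class::getName)
                .collect(Collectors.joining(","));
    }

    // Collects the three settings discussed in this thread, in insertion order.
    static Map<String, String> kryoSettings(Class<?>... classes) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        settings.put("spark.kryo.registrationRequired", "true");
        settings.put("spark.kryo.classesToRegister", classesToRegister(classes));
        return settings;
    }

    public static void main(String[] args) {
        // Print each setting as key=value, the form accepted by spark-submit --conf.
        kryoSettings(DropResult.class, DropEvaluation.class)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each printed pair could be passed to spark-submit with --conf or set on a SparkConf with set(key, value). Class.getName() is used rather than getSimpleName() because the property needs names that can be loaded by name at runtime.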
On 01/27/2016 12:58 PM, Shixiong(Ryan) Zhu wrote:
> It depends. The default Kryo serializer cannot handle all cases. If
> you encounter any issue, you can follow the Kryo doc to set up a custom
> serializer:
> https://github.com/EsotericSoftware/kryo/blob/master/README.md
Re: spark.kryo.classesToRegister
Posted by "Shixiong(Ryan) Zhu" <sh...@databricks.com>.
It depends. The default Kryo serializer cannot handle all cases. If you
encounter any issue, you can follow the Kryo doc to set up a custom
serializer: https://github.com/EsotericSoftware/kryo/blob/master/README.md