Posted to dev@spark.apache.org by Soren Macbeth <so...@yieldbot.com> on 2014/06/02 00:40:39 UTC

ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

https://github.com/apache/spark/blob/v1.0.0/core/src/main/scala/org/apache/spark/serializer/Serializer.scala#L64-L66

These changes to SerializerInstance make it really gross to call
serialize and deserialize from non-Scala languages. I'm not sure what the
purpose of a ClassTag is, but if we could get some other arities that don't
require ClassTags, that would help a ton.
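
For readers coming from other JVM languages, here is a minimal sketch of why the extra argument shows up. It assumes the linked methods are declared with a context bound such as `[T: ClassTag]`; `DemoSerializer` and its toy byte format are made up for illustration and are not Spark's code. A context bound is shorthand for an extra implicit parameter list, so the compiled method takes the value plus a ClassTag, and a non-Scala caller has to pass both.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import scala.reflect.ClassTag

// Hypothetical stand-in for illustration only; this is NOT Spark's SerializerInstance.
object DemoSerializer {
  // Context-bound form, as it would be written in Scala source.
  def serialize[T: ClassTag](t: T): ByteBuffer =
    serializeExplicit(t)(implicitly[ClassTag[T]])

  // Equivalent explicit form: the tag is an ordinary extra parameter,
  // which is the shape that non-Scala callers see.
  def serializeExplicit[T](t: T)(implicit tag: ClassTag[T]): ByteBuffer = {
    val text = tag.runtimeClass.getName + ":" + t
    ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8))
  }

  // Helper so the result can be inspected as a string.
  def decode(buf: ByteBuffer): String =
    new String(buf.array(), StandardCharsets.UTF_8)
}

object DemoSerializerUsage {
  // Scala callers: the compiler supplies the ClassTag.
  val inferred: String = DemoSerializer.decode(DemoSerializer.serialize("hi"))

  // Other JVM languages have to do the equivalent of this by hand.
  val explicit: String =
    DemoSerializer.decode(DemoSerializer.serialize[AnyRef]("hi")(ClassTag.AnyRef))
}
```

In this sketch the inferred call records `java.lang.String` as the type, while the hand-written call with the Object tag records `java.lang.Object`.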

Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Matei Zaharia <ma...@gmail.com>.
Very cool, looking forward to it!

Matei

On Jun 1, 2014, at 5:42 PM, Soren Macbeth <so...@yieldbot.com> wrote:

> Yep, that's what I'm doing.
> 
> (def OBJECT-CLASS-TAG (.apply ClassTag$/MODULE$ java.lang.Object))
> 
> ps - I'm planning to open source this Clojure DSL soon as well


Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Soren Macbeth <so...@yieldbot.com>.
Yep, that's what I'm doing.

(def OBJECT-CLASS-TAG (.apply ClassTag$/MODULE$ java.lang.Object))

ps - I'm planning to open source this Clojure DSL soon as well


On Sun, Jun 1, 2014 at 5:10 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> Ah, got it. In general it will always be safe to pass the ClassTag for
> java.lang.Object here — this is what our Java API does to say that type
> info is not known. So you can always pass that. Look at the Java code for
> how to get this ClassTag.
>
> Matei

Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Matei Zaharia <ma...@gmail.com>.
Ah, got it. In general it will always be safe to pass the ClassTag for java.lang.Object here — this is what our Java API does to say that type info is not known. So you can always pass that. Look at the Java code for how to get this ClassTag.
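
A rough sketch of what "the ClassTag for java.lang.Object" looks like on the Scala side, using only the Scala standard library. The `ObjectTags` object and the `unknownTag` helper are hypothetical names for illustration, not something taken from Spark's Java API. From Java, the second form below corresponds to calling `ClassTag$.MODULE$.apply(Object.class)`.

```scala
import scala.reflect.{ClassTag, classTag}

object ObjectTags {
  // Three equivalent ways to obtain the ClassTag for java.lang.Object.
  val viaConstant: ClassTag[AnyRef] = ClassTag.AnyRef
  val viaApply: ClassTag[AnyRef] = ClassTag[AnyRef](classOf[Object])
  val viaImplicit: ClassTag[AnyRef] = classTag[AnyRef]

  // Hypothetical helper: reuse the Object tag for any T when the real
  // type is unknown. The cast is unchecked; the tag still reports
  // java.lang.Object as its runtime class.
  def unknownTag[T]: ClassTag[T] = ClassTag.AnyRef.asInstanceOf[ClassTag[T]]
}
```

The trade-off is that code receiving such a tag learns nothing more specific than Object, so any type-based optimization is simply skipped.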

Matei

On Jun 1, 2014, at 4:33 PM, Soren Macbeth <so...@yieldbot.com> wrote:

> I'm writing a Clojure DSL for Spark. I use Kryo to serialize my Clojure
> functions, and for efficiency I hook into Spark's Kryo serializer. In order
> to do that, I get a SerializerInstance from SparkEnv and call the serialize
> and deserialize methods. I was able to work around it by making a ClassTag
> object in Clojure, but it's less than ideal.


Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Soren Macbeth <so...@yieldbot.com>.
I'm writing a Clojure DSL for Spark. I use Kryo to serialize my Clojure
functions, and for efficiency I hook into Spark's Kryo serializer. In order
to do that, I get a SerializerInstance from SparkEnv and call the serialize
and deserialize methods. I was able to work around it by making a ClassTag
object in Clojure, but it's less than ideal.


On Sun, Jun 1, 2014 at 4:25 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> BTW, passing a ClassTag tells the Serializer the type of the object being
> serialized, as known when you compile your program, which allows for more
> efficient serializers (especially on streams).
>
> Matei

Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Matei Zaharia <ma...@gmail.com>.
BTW, passing a ClassTag tells the Serializer the type of the object being serialized, as known when you compile your program, which allows for more efficient serializers (especially on streams).
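
To make that concrete, here is a small sketch using only the Scala standard library. `TagDemo` and `needsPerElementType` are hypothetical and only illustrate the kind of decision a serializer could make; they are not how Spark's serializers are implemented. The point is that the tag carries the element type past erasure, so code can allocate a properly typed array or tell a specific type apart from plain Object.

```scala
import scala.reflect.ClassTag

object TagDemo {
  // Without the ClassTag, T is erased and a correctly typed array
  // could not be allocated here.
  def fill[T: ClassTag](n: Int, value: T): Array[T] = Array.fill(n)(value)

  // Hypothetical policy: a stream writer could skip per-element type
  // information when the tag names something more specific than Object.
  def needsPerElementType[T](implicit tag: ClassTag[T]): Boolean =
    tag.runtimeClass == classOf[Object]
}

object TagDemoUsage {
  // The element type survives to runtime: this is a primitive int array.
  val ints: Array[Int] = TagDemo.fill(3, 7)

  val specific: Boolean = TagDemo.needsPerElementType[String]
  val unknown: Boolean = TagDemo.needsPerElementType[AnyRef]
}
```

With a String tag the hypothetical writer can treat every element as a String, whereas with the Object tag it has to fall back to recording the type of each element.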

Matei

On Jun 1, 2014, at 4:24 PM, Matei Zaharia <ma...@gmail.com> wrote:

> Why do you need to call Serializer from your own program? It’s an internal developer API so ideally it would only be called to extend Spark. Are you looking to implement a custom Serializer?
> 
> Matei


Re: ClassTag in Serializer in 1.0.0 makes non-scala callers sad panda

Posted by Matei Zaharia <ma...@gmail.com>.
Why do you need to call Serializer from your own program? It’s an internal developer API so ideally it would only be called to extend Spark. Are you looking to implement a custom Serializer?

Matei

On Jun 1, 2014, at 3:40 PM, Soren Macbeth <so...@yieldbot.com> wrote:

> https://github.com/apache/spark/blob/v1.0.0/core/src/main/scala/org/apache/spark/serializer/Serializer.scala#L64-L66
> 
> These changes to the SerializerInstance make it really gross to call
> serialize and deserialize from non-scala languages. I'm not sure what the
> purpose of a ClassTag is, but if we could get some other arities that don't
> require classtags that would help a ton.