Posted to user@avro.apache.org by Leo Romanoff <ro...@yahoo.com> on 2013/03/21 08:39:30 UTC

Speed improvements for reflection-based serialization

Hi,

I've been experimenting with Avro's reflection-based serialization, i.e. with the classes from the org.apache.avro.generic package.
It works fine in this mode, but it is rather slow compared to frameworks such as protostuff or kryo.
A quick look at the source code showed that many reflection-based operations and class lookups are not cached, and such operations are usually quite expensive on a JVM.

So I changed some of the org.apache.avro.generic classes, introducing caching and a few other optimizations. It now seems to perform much better.
I could submit my patch for review if anyone is interested in such improvements. I'm new to Avro, but I got the impression that Voldemort and maybe a few other Big Data projects use it. Maybe they would be interested, though I don't know whether they use this reflection-based serialization.
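
To illustrate the kind of caching meant here, the following is a minimal sketch only, not the actual patch and not Avro code. It assumes Java 8 or later, and the names FieldCache, lookup, get, readInt and Point are hypothetical, chosen for the example. It shows a reflective field lookup being resolved once per (class, field name) pair and then reused, instead of being repeated on every call.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of Avro): caches reflective Field lookups.
public class FieldCache {
    // One inner map per class, keyed by field name.
    private static final Map<Class<?>, Map<String, Field>> CACHE =
        new ConcurrentHashMap<>();

    // Uncached path: walks the class hierarchy reflectively on every call.
    public static Field lookup(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(name);
                f.setAccessible(true);
                return f;
            } catch (NoSuchFieldException e) {
                // Not declared on this class; try the superclass.
            }
        }
        throw new IllegalArgumentException("No field " + name + " in " + type);
    }

    // Cached path: the reflective lookup runs only on the first request.
    public static Field get(Class<?> type, String name) {
        return CACHE
            .computeIfAbsent(type, t -> new ConcurrentHashMap<>())
            .computeIfAbsent(name, n -> lookup(type, n));
    }

    // Convenience reader that uses the cached Field.
    public static int readInt(Object target, String name) {
        try {
            return get(target.getClass(), name).getInt(target);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sample class used only for the demonstration below.
    public static class Point {
        int x = 3;
        int y = 4;
    }

    public static void main(String[] args) {
        Point p = new Point();
        System.out.println(readInt(p, "x"));                 // prints 3
        // The same Field instance is returned on repeated requests.
        System.out.println(get(Point.class, "x") == get(Point.class, "x")); // prints true
    }
}
```

How much this helps in practice depends on the workload, so any gain should be confirmed with a benchmark rather than assumed.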

Best Regards,
  Leo

Re: Speed improvements for reflection-based serialization

Posted by Doug Cutting <cu...@apache.org>.
More generally, Avro contribution guidelines are at:

https://cwiki.apache.org/AVRO/how-to-contribute.html

Doug

On Thu, Mar 21, 2013 at 9:30 AM, Doug Cutting <cu...@apache.org> wrote:
> Leo,
>
> Please submit your patches, I'd love to see them.  Create an issue in
> Jira and attach your changes there.
>
> https://issues.apache.org/jira/browse/AVRO
>
> Thanks,
>
> Doug

Re: Speed improvements for reflection-based serialization

Posted by Doug Cutting <cu...@apache.org>.
Leo,

Please submit your patches, I'd love to see them.  Create an issue in
Jira and attach your changes there.

https://issues.apache.org/jira/browse/AVRO

Thanks,

Doug
