Posted to dev@avro.apache.org by "Justin Cunningham (JIRA)" <ji...@apache.org> on 2015/01/29 23:13:35 UTC

[jira] [Commented] (AVRO-1504) Improve python implementation performance

    [ https://issues.apache.org/jira/browse/AVRO-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297800#comment-14297800 ] 

Justin Cunningham commented on AVRO-1504:
-----------------------------------------

I started looking into performance improvements for writing in the Python client library before I came across this ticket. I found that two of Steven's changes alone, the lookup table implemented at https://github.com/smoy/avro/commit/71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d#diff-438b29138d73e88e1a515a63c8250e25R124 and the property replacement at https://github.com/smoy/avro/commit/71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d#diff-438b29138d73e88e1a515a63c8250e25R268, resulted in a roughly 15% performance improvement.  To encode 100,000 records, runtime dropped from 6.587 seconds to 5.616 seconds in my benchmark.
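
For anyone who wants a feel for why these two kinds of change help, here is a minimal sketch. It is not the code from Steven's commits: the class names, the _WRITERS table, and the bench helper are all hypothetical, and the encodings are simplified stand-ins rather than real Avro binary encoding. It only contrasts an if/elif chain plus a property lookup with a dict lookup plus a plain attribute.

```python
import timeit


class SlowEncoder:
    """Hypothetical encoder: if/elif dispatch, schema type read via a property."""

    def __init__(self, schema_type):
        self._type = schema_type

    @property
    def type(self):
        # Every access goes through a descriptor call.
        return self._type

    def write(self, datum):
        # Each comparison re-reads the property, so late branches cost more.
        if self.type == "null":
            return b""
        elif self.type == "boolean":
            return b"\x01" if datum else b"\x00"
        elif self.type == "string":
            return datum.encode("utf-8")  # simplified: no length prefix
        elif self.type == "bytes":
            return bytes(datum)  # simplified: no length prefix
        raise ValueError("unknown type: %s" % self.type)


# Hypothetical lookup table: schema type name -> writer function.
_WRITERS = {
    "null": lambda datum: b"",
    "boolean": lambda datum: b"\x01" if datum else b"\x00",
    "string": lambda datum: datum.encode("utf-8"),
    "bytes": lambda datum: bytes(datum),
}


class FastEncoder:
    """Hypothetical encoder: one dict lookup, schema type as a plain attribute."""

    def __init__(self, schema_type):
        self.type = schema_type  # plain attribute instead of a property

    def write(self, datum):
        try:
            writer = _WRITERS[self.type]
        except KeyError:
            raise ValueError("unknown type: %s" % self.type)
        return writer(datum)


def bench(encoder_cls, n=100000):
    """Return the seconds taken to encode n small 'bytes' datums."""
    encoder = encoder_cls("bytes")
    return timeit.timeit(lambda: encoder.write(b"abc"), number=n)


if __name__ == "__main__":
    print("if/elif + property: %.3fs" % bench(SlowEncoder))
    print("dict + attribute:   %.3fs" % bench(FastEncoder))
```

The absolute timings from a toy like this say nothing about the real library; the numbers I quoted above come from my own benchmark against the actual patch.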

Performance of the Python client isn't great right now, and these changes would result in a substantial improvement.

Any chance a committer could do a code review?

> Improve python implementation performance
> -----------------------------------------
>
>                 Key: AVRO-1504
>                 URL: https://issues.apache.org/jira/browse/AVRO-1504
>             Project: Avro
>          Issue Type: Improvement
>          Components: python
>    Affects Versions: 1.7.6
>            Reporter: Steven Moy
>              Labels: patch, performance
>         Attachments: AVRO-1504.patch
>
>
> Inspired by https://www.python.org/doc/essays/list2str/, there is some low-hanging fruit for improving the performance of the Python implementation.
> Patch to follow soon:
> https://github.com/smoy/avro/commits/smoy_reader_performance
> Relevant commits:
> * 71220bb4a84c7aa4d42b593a2c0f7cefa8cda82d
> * 542139ce1a40492c9234ee5f84a4410515877af4
> * 2f7a0ef8d02148cf69269f5b59f89481e7c86d34



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)