Posted to dev@avro.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2013/09/10 18:45:32 UTC
[jira] [Commented] (AVRO-1332) Improve C# DatumReader performance
[ https://issues.apache.org/jira/browse/AVRO-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763191#comment-13763191 ]
Doug Cutting commented on AVRO-1332:
------------------------------------
Batch size = 1000 seems like the case we care most about. There, the patch is faster in all cases except when serializing a complex record. Any idea why that case is now slower? Also, I wonder whether some of the resolution can be cached, to keep the smaller batch sizes more competitive too.
> Improve C# DatumReader performance
> ----------------------------------
>
> Key: AVRO-1332
> URL: https://issues.apache.org/jira/browse/AVRO-1332
> Project: Avro
> Issue Type: Improvement
> Components: csharp
> Affects Versions: 1.7.5
> Reporter: David McIntosh
> Priority: Minor
> Labels: performance
> Attachments: AVRO-1332-2.patch, AVRO-1332.patch
>
>
> The current implementations of the C# datum readers perform resolution of the reader and writer schema on every call to Read. In my tests this was causing it to perform poorly when reading a large number of records (slower than parsing the same data from delimited text files). It would be more efficient if the reader only needed to resolve the schemas once.
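
To illustrate the "resolve once" idea from the description, here is a minimal sketch in Python. It is not Avro's C# API and not the attached patch: the class CachingReader, the Schema alias, and the representation of a schema as a tuple of field names are all hypothetical stand-ins. Real Avro schema resolution also covers type promotion, defaults, unions and more; this toy only maps reader fields to writer positions, and caches that mapping after the first read.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical stand-in: a record schema is just an ordered tuple of field names.
Schema = Tuple[str, ...]


class CachingReader:
    """Toy datum reader that resolves writer/reader schemas once and reuses the plan."""

    def __init__(self, writer: Schema, reader: Schema) -> None:
        self.writer = writer
        self.reader = reader
        self.resolve_calls = 0  # counts how often resolution actually runs
        self._plan: Optional[List[Optional[int]]] = None

    def _resolve(self) -> List[Optional[int]]:
        # For each reader field, find its position in the writer's field order
        # (None if the writer never wrote that field).
        self.resolve_calls += 1
        index = {name: i for i, name in enumerate(self.writer)}
        return [index.get(name) for name in self.reader]

    def read(self, row: Sequence[object]) -> Dict[str, object]:
        # Resolve lazily on the first call only; later calls reuse the cached plan.
        if self._plan is None:
            self._plan = self._resolve()
        return {
            name: (row[pos] if pos is not None else None)
            for name, pos in zip(self.reader, self._plan)
        }
```

With this shape, the cost of resolution is paid once per reader instance rather than once per record, so it no longer scales with the number of records read. The same plan could also be cached per (writer, reader) schema pair if readers are short-lived, which is one way the smaller batch sizes might stay competitive.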