Posted to dev@avro.apache.org by "Douglas Creager (JIRA)" <ji...@apache.org> on 2012/12/01 15:57:58 UTC

[jira] [Updated] (AVRO-1089) Avro-C - Penalty 30x to 50x for using resolved writer on arrays

     [ https://issues.apache.org/jira/browse/AVRO-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-1089:
----------------------------------

    Attachment: 0001-AVRO-1089.-Fix-performance-penalty-for-array-resolve.patch

Here's a one-liner patch that fixes this.  The problem was that an internal array wasn't being cleared, and was growing not just with the size of each test case, but with the number of test cases.  Iterating through that array was causing the slowdown.

All tests still pass; running times for the resolved array tests are now comparable with those for the non-resolved array tests.
                
> Avro-C - Penalty 30x to 50x for using resolved writer on arrays
> ---------------------------------------------------------------
>
>                 Key: AVRO-1089
>                 URL: https://issues.apache.org/jira/browse/AVRO-1089
>             Project: Avro
>          Issue Type: Bug
>          Components: c
>    Affects Versions: 1.6.3, 1.7.0
>         Environment: Ubuntu Linux
>            Reporter: Vivek Nadkarni
>         Attachments: 0001-AVRO-1089.-Fix-performance-penalty-for-array-resolve.patch, AVRO-1089-performance.png
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The new performance tests created in AVRO-1088 show that using the
> resolved writer takes 30 to 50 times longer than using no schema
> resolution or using the resolved reader for simple and nested arrays.
> For a simple array, using the resolved writer took ~30x longer than
> using the memory reader that assumed a matching schema. For the nested
> array, using the resolved writer took ~50x longer.
> These results suggest that there is a bug in the resolved writer. I do
> not have a proposed fix at this time.
> **** Running simple array matched schemas ****
>   250000 tests per run
>   Run 1
>   Run 2
>   Run 3
>   Average time: 2.123s
>   Tests/sec:    117739
> **** Running simple array resolved writer ****
>   10000 tests per run
>   Run 1
>   Run 2
>   Run 3
>   Average time: 2.747s
>   Tests/sec:    3641
> **** Running nested array matched schemas ****
>   250000 tests per run
>   Run 1
>   Run 2
>   Run 3
>   Average time: 3.030s
>   Tests/sec:    82508
> **** Running nested array resolved writer ****
>   10000 tests per run
>   Run 1
>   Run 2
>   Run 3
>   Average time: 6.650s
>   Tests/sec:    1504
