Posted to dev@avro.apache.org by "Vivek Nadkarni (JIRA)" <ji...@apache.org> on 2012/05/14 10:13:49 UTC

[jira] [Created] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Vivek Nadkarni created AVRO-1088:
------------------------------------

             Summary: Avro-C - Add performance tests for schema resolution and arrays.
                 Key: AVRO-1088
                 URL: https://issues.apache.org/jira/browse/AVRO-1088
             Project: Avro
          Issue Type: Improvement
          Components: c
    Affects Versions: 1.7.0
         Environment: Ubuntu Linux 11.10
            Reporter: Vivek Nadkarni
             Fix For: 1.7.0


The current performance test in Avro-C measures the performance of
reading and writing Avro values using a complex record schema that
does not contain any arrays.

We add tests to measure performance for simple and nested arrays. We
also replicate all tests to measure the performance of schema
resolution using a resolved reader and a resolved writer.

Specifically we add the following performance tests:

Nested Record
1. Replicating the test "nested record value by index", using a helper
   function. Using helper functions adds a little overhead, but it
   allows us to test various schemas, as well as different modes of
   schema resolution much more easily.
2. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a complex record.
3. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a complex record.

Simple Array
4. Test the performance for reading and writing a simple array.
5. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a simple array.
6. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a simple array.

Nested Array
7. Test the performance for reading and writing a nested array.
8. Using a resolved writer to resolve between (identical) reader and
   writer schemas, while reading a nested array.
9. Using a resolved reader to resolve between (identical) reader and
   writer schemas, while writing a nested array.

Additionally, we fix a minor bug:
1. The return value of avro_value_equal_fast() was not being
   checked. The test now checks this return value and fails if it
   is FALSE.
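
The helper-function approach and the return-value check described
above can be sketched roughly as follows. This is only an
illustration in plain C, not the code from the patch: run_perf_test,
perf_body_fn, struct perf_result, values_equal, roundtrip_body and
failing_body are all hypothetical names, and values_equal merely
stands in for a comparison such as avro_value_equal_fast(). The
sketch also assumes that "Tests/sec" is the number of tests per run
divided by the average run time, which is consistent with the
reported figures (for example, 100000 tests in 2.270s is about 44053
tests per second).

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical signature for one benchmark body: returns 0 on success. */
typedef int (*perf_body_fn)(unsigned long iterations);

struct perf_result {
    double average_seconds;   /* mean CPU time per run */
    double tests_per_second;  /* iterations / average_seconds */
    int    failed;            /* nonzero if any run reported failure */
};

/* Stand-in for a comparison such as avro_value_equal_fast():
 * nonzero means the two values are equal. */
static int values_equal(int a, int b)
{
    return a == b;
}

/* A body whose comparison result is actually checked. */
static int roundtrip_body(unsigned long iterations)
{
    for (unsigned long i = 0; i < iterations; i++) {
        int written = (int)(i & 0xff);
        int read_back = written;  /* a real test would write, then read */
        if (!values_equal(written, read_back))
            return 1;             /* fail instead of ignoring the result */
    }
    return 0;
}

/* A body that always fails, to exercise the failure path. */
static int failing_body(unsigned long iterations)
{
    (void)iterations;
    return 1;
}

/* Shared helper: time `runs` runs of `body` and report the average. */
static struct perf_result
run_perf_test(const char *name, unsigned long iterations, int runs,
              perf_body_fn body)
{
    struct perf_result result = { 0.0, 0.0, 0 };
    double total = 0.0;

    printf("**** Running %s ****\n  %lu tests per run\n", name, iterations);
    for (int run = 1; run <= runs; run++) {
        printf("  Run %d\n", run);
        clock_t start = clock();
        if (body(iterations) != 0) {
            result.failed = 1;
            return result;
        }
        total += (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    result.average_seconds = total / runs;
    if (result.average_seconds > 0.0)
        result.tests_per_second = iterations / result.average_seconds;
    printf("  Average time: %.3fs\n  Tests/sec:    %.0f\n",
           result.average_seconds, result.tests_per_second);
    return result;
}
```

Because every test goes through the same helper, adding a new schema
or resolution mode only requires a new body function, at the cost of
one indirect call per run.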



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Vivek Nadkarni (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1088:
---------------------------------

    Status: Patch Available  (was: Open)

I ran the performance tests and got the results appended below.

The results show that, as expected, there is a slight performance hit
for using a resolved writer or resolved reader for the complex record,
compared to using the matched schemas (roughly 13% and 4% fewer tests
per second, respectively).

However, the results also show that for the simple array and for the
nested array, the penalty for using the resolved writer is
substantial. Measured in tests per second, the resolved writer is
roughly 30 to 55 times slower than using no schema resolution or
using the resolved reader, for both simple and nested arrays.
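
The slowdown can be checked directly from the "Tests/sec" figures in
the results appended to this message. A minimal sketch in C follows;
the function name slowdown_factor is hypothetical and is not part of
Avro-C.

```c
/* Slowdown of one configuration relative to a baseline, computed from
 * "Tests/sec" figures. A lower measured throughput gives a larger
 * factor; a factor of 1.0 means no slowdown. */
static double slowdown_factor(double baseline_tests_per_sec,
                              double measured_tests_per_sec)
{
    return baseline_tests_per_sec / measured_tests_per_sec;
}
```

With the figures from this run, slowdown_factor(117739, 3641) is
about 32 for the simple array with matched schemas, and
slowdown_factor(82508, 1504) is about 55 for the nested array with
matched schemas.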

The performance results indicate that there is likely a bug in the
resolved writer when it resolves simple or nested arrays. This bug
will be reported in a separate Avro JIRA issue.


**** Running refcount ****
  100000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.423s
  Tests/sec:    41265475
**** Running nested record (legacy) ****
  100000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec:    44053
**** Running nested record (value by index) ****
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.077s
  Tests/sec:    481541
**** Running nested record (value by name) ****
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.333s
  Tests/sec:    428571
**** Running nested record (value by index) matched schemas ****
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.147s
  Tests/sec:    465839
**** Running nested record (value by index) resolved writer ****
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.480s
  Tests/sec:    403226
**** Running nested record (value by index) resolved reader ****
  1000000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.230s
  Tests/sec:    448430
**** Running simple array matched schemas ****
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.123s
  Tests/sec:    117739
**** Running simple array resolved writer ****
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.747s
  Tests/sec:    3641
**** Running simple array resolved reader ****
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 2.270s
  Tests/sec:    110132
**** Running nested array matched schemas ****
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.030s
  Tests/sec:    82508
**** Running nested array resolved writer ****
  10000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 6.650s
  Tests/sec:    1504
**** Running simple array resolved reader ****
  250000 tests per run
  Run 1
  Run 2
  Run 3
  Average time: 3.313s
  Tests/sec:    75453


                
> Avro-C - Add performance tests for schema resolution and arrays.
> ----------------------------------------------------------------
>
>                 Key: AVRO-1088
>                 URL: https://issues.apache.org/jira/browse/AVRO-1088
>             Project: Avro
>          Issue Type: Improvement
>          Components: c
>    Affects Versions: 1.7.0
>         Environment: Ubuntu Linux 11.10
>            Reporter: Vivek Nadkarni
>             Fix For: 1.7.0
>
>         Attachments: AVRO-1088.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h


[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Pugachev Maxim (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pugachev Maxim updated AVRO-1088:
---------------------------------

    Attachment: AVRO-1088.patch.2

Compilation warnings fixed (Ubuntu 11.04, gcc 4.5.2)
                

[jira] [Commented] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Pugachev Maxim (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280065#comment-13280065 ] 

Pugachev Maxim commented on AVRO-1088:
--------------------------------------

This test case gives me compilation warnings. I've fixed them in AVRO-1088.patch.2.
                

[jira] [Commented] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280106#comment-13280106 ] 

Douglas Creager commented on AVRO-1088:
---------------------------------------

Thanks for the updated patch; just committed it to SVN.
                

[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Douglas Creager updated AVRO-1088:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to SVN
                

[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for schema resolution and arrays.

Posted by "Vivek Nadkarni (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vivek Nadkarni updated AVRO-1088:
---------------------------------

    Attachment: AVRO-1088.patch

Uploading patch file implementing the new performance tests. 

                