Posted to dev@avro.apache.org by "Vivek Nadkarni (JIRA)" <ji...@apache.org> on 2012/05/14 10:13:49 UTC
[jira] [Created] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Vivek Nadkarni created AVRO-1088:
------------------------------------
Summary: Avro-C - Add performance tests for schema resolution and arrays.
Key: AVRO-1088
URL: https://issues.apache.org/jira/browse/AVRO-1088
Project: Avro
Issue Type: Improvement
Components: c
Affects Versions: 1.7.0
Environment: Ubuntu Linux 11.10
Reporter: Vivek Nadkarni
Fix For: 1.7.0
The current performance test in Avro-C measures the performance of
reading and writing Avro values using a complex record schema that
does not contain any arrays.
We add tests to measure the performance for simple and nested
arrays. We also replicate all tests to measure the performance of the
schema resolution using a resolved reader and a resolved writer.
Specifically we add the following performance tests:
Nested Record
1. Replicating the test "nested record value by index", using a helper
function. Using helper functions adds a little overhead, but it
allows us to test various schemas, as well as different modes of
schema resolution much more easily.
2. Using a resolved writer to resolve between (identical) reader and
writer schemas, while reading a complex record.
3. Using a resolved reader to resolve between (identical) reader and
writer schemas, while writing a complex record.
Simple Array
4. Test the performance for reading and writing a simple array.
5. Using a resolved writer to resolve between (identical) reader and
writer schemas, while reading a simple array.
6. Using a resolved reader to resolve between (identical) reader and
writer schemas, while writing a simple array.
Nested Array
7. Test the performance for reading and writing a nested array.
8. Using a resolved writer to resolve between (identical) reader and
writer schemas, while reading a nested array.
9. Using a resolved reader to resolve between (identical) reader and
writer schemas, while writing a nested array.
Additionally we fix a minor bug:
1. The return value of avro_value_equal_fast() was not being
tested. Test this return value, and fail if it is FALSE.
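The bug fix described above amounts to checking the result of the equality comparison instead of discarding it. A minimal sketch of that pattern follows. It does not link against Avro-C: `values_equal` is a hypothetical stand-in for avro_value_equal_fast(), and `check_round_trip` is an illustrative name, not a function from the patch. The sketch assumes that a non-zero result means "equal".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for avro_value_equal_fast(): returns non-zero
 * when the two buffers hold identical contents. */
int values_equal(const char *a, const char *b, size_t len)
{
    return memcmp(a, b, len) == 0;
}

/* Compare a value that was written and read back against the original.
 * The comparison result is tested rather than ignored: returns 0 on a
 * match and 1 on a mismatch, like a test's exit status. */
int check_round_trip(const char *written, const char *read_back, size_t len)
{
    assert(written != NULL && read_back != NULL);
    if (!values_equal(written, read_back, len)) {
        fprintf(stderr, "Round-tripped value does not match original\n");
        return 1;
    }
    return 0;
}
```

A test that calls `check_round_trip` can return its result directly, so a mismatch makes the run fail instead of passing silently.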
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Vivek Nadkarni (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vivek Nadkarni updated AVRO-1088:
---------------------------------
Status: Patch Available (was: Open)
I ran the performance tests and got the results appended below.
The results show that, as expected, there is a slight performance hit
for using a resolved writer or resolved reader for the complex record,
compared to using the matched schemas.
However, the results also show that for the simple array and for the
nested array, the penalty for using the resolved writer is
substantial. Using the resolved writer takes 30 to 50 times longer
than using no schema resolution or using the resolved reader for
simple and nested arrays.
The performance results indicate that there is a likely bug in the
resolved writer, when it is trying to resolve simple or nested
arrays. This bug will be reported in a separate AVRO-JIRA issue.
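The "30 to 50 times" figure can be checked against the Tests/sec values in the results below by dividing the throughput of the faster configuration by that of the resolved writer. The helper here is only an illustrative sketch; `slowdown` is not part of Avro-C or the patch.

```c
#include <assert.h>

/* Slowdown factor of a slow configuration relative to a baseline,
 * computed from two throughputs given in tests per second. */
double slowdown(double baseline_tests_per_sec, double slow_tests_per_sec)
{
    assert(slow_tests_per_sec > 0.0);
    return baseline_tests_per_sec / slow_tests_per_sec;
}
```

With the reported numbers, `slowdown(117739, 3641)` and `slowdown(110132, 3641)` give roughly 32 and 30 for the simple array, and `slowdown(82508, 1504)` and `slowdown(75453, 1504)` give roughly 55 and 50 for the nested array.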
**** Running refcount ****
100000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.423s
Tests/sec: 41265475
**** Running nested record (legacy) ****
100000 tests per run
Run 1
Run 2
Run 3
Average time: 2.270s
Tests/sec: 44053
**** Running nested record (value by index) ****
1000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.077s
Tests/sec: 481541
**** Running nested record (value by name) ****
1000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.333s
Tests/sec: 428571
**** Running nested record (value by index) matched schemas ****
1000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.147s
Tests/sec: 465839
**** Running nested record (value by index) resolved writer ****
1000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.480s
Tests/sec: 403226
**** Running nested record (value by index) resolved reader ****
1000000 tests per run
Run 1
Run 2
Run 3
Average time: 2.230s
Tests/sec: 448430
**** Running simple array matched schemas ****
250000 tests per run
Run 1
Run 2
Run 3
Average time: 2.123s
Tests/sec: 117739
**** Running simple array resolved writer ****
10000 tests per run
Run 1
Run 2
Run 3
Average time: 2.747s
Tests/sec: 3641
**** Running simple array resolved reader ****
250000 tests per run
Run 1
Run 2
Run 3
Average time: 2.270s
Tests/sec: 110132
**** Running nested array matched schemas ****
250000 tests per run
Run 1
Run 2
Run 3
Average time: 3.030s
Tests/sec: 82508
**** Running nested array resolved writer ****
10000 tests per run
Run 1
Run 2
Run 3
Average time: 6.650s
Tests/sec: 1504
**** Running nested array resolved reader ****
250000 tests per run
Run 1
Run 2
Run 3
Average time: 3.313s
Tests/sec: 75453
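For readers unfamiliar with the shape of these results, the following is a rough sketch of a harness that produces output in the same form: a fixed number of tests per run, three runs, an average time, and a derived tests-per-second figure. It is not the code from the patch. The names `run_benchmark`, `bench_result`, and `spin_workload` are hypothetical, and timing with clock() is an assumption about how such a harness might measure CPU time.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* A test body that performs the given number of iterations. */
typedef void (*test_func)(unsigned long iterations);

typedef struct {
    double average_seconds;  /* mean CPU time per run */
    double tests_per_sec;    /* tests_per_run / average_seconds */
} bench_result;

/* Placeholder workload; a real test would read and write Avro values. */
void spin_workload(unsigned long iterations)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < iterations; i++) {
        sink += i;
    }
}

/* Run fn several times, timing each run, and report the average. */
bench_result run_benchmark(const char *name, test_func fn,
                           unsigned long tests_per_run, int runs)
{
    assert(fn != NULL && tests_per_run > 0 && runs > 0);
    printf("**** Running %s ****\n", name);
    printf("%lu tests per run\n", tests_per_run);

    double total = 0.0;
    for (int i = 1; i <= runs; i++) {
        printf("Run %d\n", i);
        clock_t start = clock();
        fn(tests_per_run);
        clock_t end = clock();
        total += (double)(end - start) / CLOCKS_PER_SEC;
    }

    bench_result r;
    r.average_seconds = total / runs;
    /* Guard against a zero average when the run is below clock resolution. */
    r.tests_per_sec = r.average_seconds > 0.0
        ? (double)tests_per_run / r.average_seconds
        : 0.0;
    printf("Average time: %.3fs\n", r.average_seconds);
    printf("Tests/sec: %.0f\n", r.tests_per_sec);
    return r;
}
```

Calling `run_benchmark("spin", spin_workload, 1000000, 3)` prints a block in the same layout as the results above. Because the tests-per-run count differs between configurations (10000 for the resolved writer on arrays, 250000 otherwise), the Tests/sec line is the figure to compare, not the average time.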
> Avro-C - Add performance tests for schema resolution and arrays.
> ----------------------------------------------------------------
>
> Key: AVRO-1088
> URL: https://issues.apache.org/jira/browse/AVRO-1088
> Project: Avro
> Issue Type: Improvement
> Components: c
> Affects Versions: 1.7.0
> Environment: Ubuntu Linux 11.10
> Reporter: Vivek Nadkarni
> Fix For: 1.7.0
>
> Attachments: AVRO-1088.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Pugachev Maxim (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pugachev Maxim updated AVRO-1088:
---------------------------------
Attachment: AVRO-1088.patch.2
Compilation warnings fixed (Ubuntu 11.04, gcc 4.5.2)
[jira] [Commented] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Pugachev Maxim (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280065#comment-13280065 ]
Pugachev Maxim commented on AVRO-1088:
--------------------------------------
This test case gives me compilation warnings. I've fixed them in AVRO-1088.patch.2.
[jira] [Commented] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280106#comment-13280106 ]
Douglas Creager commented on AVRO-1088:
---------------------------------------
Thanks for the updated patch; just committed it to SVN.
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Douglas Creager (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Douglas Creager updated AVRO-1088:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Committed to SVN
[jira] [Updated] (AVRO-1088) Avro-C - Add performance tests for
schema resolution and arrays.
Posted by "Vivek Nadkarni (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vivek Nadkarni updated AVRO-1088:
---------------------------------
Attachment: AVRO-1088.patch
Uploading patch file implementing the new performance tests.