Posted to dev@avro.apache.org by "Dave Wright (JIRA)" <ji...@apache.org> on 2010/05/31 22:15:41 UTC
[jira] Updated: (AVRO-556) Poor performance for Reader::readBytes can be easily improved
[ https://issues.apache.org/jira/browse/AVRO-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dave Wright updated AVRO-556:
-----------------------------
Description:
The default implementation of Reader::readBytes on 1.3.2 reads bytes into the result vector one-byte-at-a-time. For large byte arrays (~500k or so), this is horrendously slow.
The code can easily be changed to simply do:
{{void readBytes(std::vector<uint8_t> &val) {
    int64_t size = readSize();
    val.resize(size);
    in_.readBytes(&val[0], size);
}}}
..which will copy all the bytes in a single call.
(note: it appears this function has been changed in the trunk, but it still copies byte-by-byte, so the optimization would still apply).
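To make the difference concrete, here is a minimal, self-contained sketch of the two approaches. It does not use Avro's classes: BufferSource is a hypothetical stand-in for in_, and the size is passed in directly instead of being decoded by readSize(). The bulk variant also guards against a zero-length field, since taking &val[0] on an empty vector is undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <stdint.h>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader's underlying input (in_ above).
// This is NOT Avro's stream class; it only mimics a per-byte and a bulk read.
class BufferSource {
public:
    explicit BufferSource(std::vector<uint8_t> data)
        : data_(std::move(data)), pos_(0) {}

    // Return the next byte; throws if the buffer is exhausted.
    uint8_t readByte() {
        if (pos_ >= data_.size()) throw std::runtime_error("read past end");
        return data_[pos_++];
    }

    // Copy n bytes into dst with one memcpy; throws if fewer than n remain.
    void readBytes(uint8_t *dst, std::size_t n) {
        assert(dst != nullptr || n == 0);
        if (n > data_.size() - pos_) throw std::runtime_error("read past end");
        if (n != 0) std::memcpy(dst, data_.data() + pos_, n);
        pos_ += n;
    }

private:
    std::vector<uint8_t> data_;
    std::size_t pos_;
};

// Byte-at-a-time variant, modelled on the behaviour the report describes.
void readBytesSlow(BufferSource &in, int64_t size, std::vector<uint8_t> &val) {
    if (size < 0) throw std::runtime_error("negative size");
    val.clear();
    for (int64_t i = 0; i < size; ++i) val.push_back(in.readByte());
}

// Bulk variant: resize once, then fill the vector with a single call.
void readBytesBulk(BufferSource &in, int64_t size, std::vector<uint8_t> &val) {
    if (size < 0) throw std::runtime_error("negative size");
    val.resize(static_cast<std::size_t>(size));
    if (!val.empty()) in.readBytes(&val[0], val.size());
}
```

Both functions produce the same vector; the bulk one replaces one call per byte with a single resize and a single copy.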
In my testing of serializing/deserializing a message with a 500k byte field in it 1000 times, execution time dropped from 30+ sec to 0.2 sec with this optimization.
The same optimization can easily be applied to readFixed(uint8_t *val...) as well.
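A rough sketch of what the readFixed change could look like, written as a free function over std::istream so that it compiles on its own. The name readFixedBulk and the explicit size parameter are assumptions made for illustration; the real readFixed signature and Avro's input stream type may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <stdint.h>
#include <string>

// Hypothetical bulk version of readFixed: fill the caller's buffer with one
// stream read instead of looping over single bytes.
void readFixedBulk(std::istream &in, uint8_t *val, std::size_t size) {
    if (size == 0) return;  // nothing to copy
    assert(val != nullptr);
    in.read(reinterpret_cast<char *>(val), static_cast<std::streamsize>(size));
    // gcount() reports how many bytes the last read actually delivered.
    if (static_cast<std::size_t>(in.gcount()) != size) {
        throw std::runtime_error("short read while filling fixed field");
    }
}

// Usage sketch: read a 4-byte fixed field from an in-memory stream.
bool demoReadFixed() {
    std::istringstream in(std::string("\x01\x02\x03\x04", 4));
    uint8_t fixed[4] = {0, 0, 0, 0};
    readFixedBulk(in, fixed, sizeof(fixed));
    return fixed[0] == 1 && fixed[1] == 2 && fixed[2] == 3 && fixed[3] == 4;
}
```

Checking gcount() after the read keeps the error handling that a per-byte loop would otherwise get from each individual read.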
was:
The default implementation of Reader::readBytes on 1.3.2 reads bytes into the result vector one-byte-at-a-time. For large byte arrays (~500k or so), this is horrendously slow.
The code can easily be changed to simply do:
void readBytes(std::vector<uint8_t> &val) {
    int64_t size = readSize();
    val.resize(size);
    in_.readBytes(&val[0], size);
}
..which will copy all the bytes in a single call.
(note: it appears this function has been changed in the trunk, but it still copies byte-by-byte, so the optimization would still apply).
In my testing of serializing/deserializing a message with a 500k byte field in it 1000 times, execution time dropped from 30+ sec to 0.2 sec with this optimization.
> Poor performance for Reader::readBytes can be easily improved
> -------------------------------------------------------------
>
> Key: AVRO-556
> URL: https://issues.apache.org/jira/browse/AVRO-556
> Project: Avro
> Issue Type: Improvement
> Components: c++
> Affects Versions: 1.3.2
> Environment: Linux
> Reporter: Dave Wright
>
> The default implementation of Reader::readBytes on 1.3.2 reads bytes into the result vector one-byte-at-a-time. For large byte arrays (~500k or so), this is horrendously slow.
> The code can easily be changed to simply do:
> {{void readBytes(std::vector<uint8_t> &val) {
>     int64_t size = readSize();
>     val.resize(size);
>     in_.readBytes(&val[0], size);
> }}}
> ..which will copy all the bytes in a single call.
> (note: it appears this function has been changed in the trunk, but it still copies byte-by-byte, so the optimization would still apply).
> In my testing of serializing/deserializing a message with a 500k byte field in it 1000 times, execution time dropped from 30+ sec to 0.2 sec with this optimization.
> The same optimization can easily be applied to readFixed(uint8_t *val...) as well.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.