Posted to issues@arrow.apache.org by "Liya Fan (JIRA)" <ji...@apache.org> on 2019/08/08 10:56:00 UTC
[jira] [Created] (ARROW-6172) [Java] Avoid creating value holders repeatedly when reading data from JDBC
Liya Fan created ARROW-6172:
-------------------------------
Summary: [Java] Avoid creating value holders repeatedly when reading data from JDBC
Key: ARROW-6172
URL: https://issues.apache.org/jira/browse/ARROW-6172
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Reporter: Liya Fan
Assignee: Liya Fan
When converting JDBC data to Arrow data, a new value holder is created for every single value. The following code snippet gives an example:
    NullableSmallIntHolder holder = new NullableSmallIntHolder();
    holder.isSet = isNonNull ? 1 : 0;
    if (isNonNull) {
        holder.value = (short) value;
    }
    smallIntVector.setSafe(rowCount, holder);
    smallIntVector.setValueCount(rowCount + 1);
This is inefficient in terms of both memory usage (one holder object is allocated per value) and computation.
For most types, we can improve performance by setting the value on the vector directly, without going through a holder.
For example, benchmarks on IntVector show that setting the int value directly takes about 20% less time per operation than going through a value holder (15.397 vs. 19.198 us/op):
Benchmark                         Mode  Cnt   Score   Error  Units
IntBenchmarks.setIntDirectly      avgt    5  15.397 ± 0.018  us/op
IntBenchmarks.setWithValueHolder  avgt    5  19.198 ± 0.789  us/op
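To illustrate the difference between the two write paths, here is a minimal, self-contained sketch. It does not use the Arrow library: ToyHolder and ToySmallIntVector are hypothetical stand-ins for NullableSmallIntHolder and SmallIntVector, and DirectSetSketch, writeWithHolder and writeDirectly are names made up for this example. The method names setSafe, setNull and setValueCount are chosen to mirror what, to my understanding, the Arrow vector exposes, but the actual change in the JDBC adapter may look different.

```java
import java.util.Arrays;

// Hypothetical stand-in for NullableSmallIntHolder: one object per value.
final class ToyHolder {
    int isSet;
    short value;
}

// Hypothetical stand-in for a nullable 16-bit vector:
// a validity array plus a value array that grow on demand.
final class ToySmallIntVector {
    private boolean[] valid = new boolean[4];
    private short[] values = new short[4];
    private int valueCount;

    private void ensureCapacity(int index) {
        if (index >= values.length) {
            int newSize = Math.max(values.length * 2, index + 1);
            valid = Arrays.copyOf(valid, newSize);
            values = Arrays.copyOf(values, newSize);
        }
    }

    // Holder-based write: the caller has to allocate and fill a holder first.
    void setSafe(int index, ToyHolder holder) {
        ensureCapacity(index);
        valid[index] = holder.isSet != 0;
        values[index] = holder.isSet != 0 ? holder.value : 0;
    }

    // Direct write of a non-null value: no intermediate object.
    void setSafe(int index, short value) {
        ensureCapacity(index);
        valid[index] = true;
        values[index] = value;
    }

    // Direct write of a null.
    void setNull(int index) {
        ensureCapacity(index);
        valid[index] = false;
        values[index] = 0;
    }

    void setValueCount(int count) {
        ensureCapacity(count - 1);
        valueCount = count;
    }

    int getValueCount() { return valueCount; }
    boolean isNull(int index) { return !valid[index]; }
    short get(int index) { return values[index]; }
}

final class DirectSetSketch {
    // Pattern from the issue description: a fresh holder for every value.
    static void writeWithHolder(ToySmallIntVector vector, Integer[] column) {
        for (int row = 0; row < column.length; row++) {
            boolean isNonNull = column[row] != null;
            ToyHolder holder = new ToyHolder();
            holder.isSet = isNonNull ? 1 : 0;
            if (isNonNull) {
                holder.value = column[row].shortValue();
            }
            vector.setSafe(row, holder);
            vector.setValueCount(row + 1);
        }
    }

    // Proposed pattern: set the value (or the null) on the vector directly.
    static void writeDirectly(ToySmallIntVector vector, Integer[] column) {
        for (int row = 0; row < column.length; row++) {
            if (column[row] != null) {
                vector.setSafe(row, column[row].shortValue());
            } else {
                vector.setNull(row);
            }
            vector.setValueCount(row + 1);
        }
    }

    public static void main(String[] args) {
        Integer[] column = {1, null, -7, 300};
        ToySmallIntVector vector = new ToySmallIntVector();
        writeDirectly(vector, column);
        for (int i = 0; i < vector.getValueCount(); i++) {
            System.out.println(i + ": " + (vector.isNull(i) ? "null" : vector.get(i)));
        }
    }
}
```

Both methods leave the vector in the same state; the direct path simply skips the per-value holder allocation and the extra field copies, which is where the saving in the benchmark above would be expected to come from.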