Posted to issues@arrow.apache.org by "Liya Fan (JIRA)" <ji...@apache.org> on 2019/08/08 10:56:00 UTC

[jira] [Created] (ARROW-6172) [Java] Avoid creating value holders repeatedly when reading data from JDBC

Liya Fan created ARROW-6172:
-------------------------------

             Summary: [Java] Avoid creating value holders repeatedly when reading data from JDBC
                 Key: ARROW-6172
                 URL: https://issues.apache.org/jira/browse/ARROW-6172
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Java
            Reporter: Liya Fan
            Assignee: Liya Fan


When converting JDBC data to Arrow data, a value holder is created for every single value. The following code snippet gives an example:

NullableSmallIntHolder holder = new NullableSmallIntHolder();
holder.isSet = isNonNull ? 1 : 0;
if (isNonNull) {
  holder.value = (short) value;
}
smallIntVector.setSafe(rowCount, holder);
smallIntVector.setValueCount(rowCount + 1);

This is inefficient in terms of both memory usage and computation, because a new holder object is allocated for each value.

For most types, we can improve performance by setting the value on the vector directly, without an intermediate holder.
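
As a rough illustration of the difference between the two write paths, here is a minimal, self-contained sketch. ToyHolder and ToySmallIntVector are hypothetical stand-ins written for this example; they are not Arrow classes, and their method names are assumptions rather than the actual Arrow API. The sketch only shows that both paths produce the same vector contents, while the direct path allocates no per-value object.

```java
import java.util.BitSet;

// Hypothetical stand-ins for illustration only; none of these are Arrow classes.
final class DirectSetSketch {

    // Mimics the shape of a nullable holder: a set flag plus a value.
    static final class ToyHolder {
        int isSet;
        short value;
    }

    // Toy vector: a value array plus a validity bitmap.
    static final class ToySmallIntVector {
        final short[] values;
        final BitSet validity = new BitSet();

        ToySmallIntVector(int capacity) {
            values = new short[capacity];
        }

        // Holder-based write path: unpacks the holder, then writes.
        void set(int index, ToyHolder holder) {
            if (holder.isSet != 0) {
                set(index, holder.value);
            } else {
                setNull(index);
            }
        }

        // Direct write path: no intermediate object.
        void set(int index, short value) {
            validity.set(index);
            values[index] = value;
        }

        void setNull(int index) {
            validity.clear(index);
        }

        boolean isNull(int index) {
            return !validity.get(index);
        }

        short get(int index) {
            return values[index];
        }
    }

    // Allocates one holder per value, mirroring the pattern described in the issue.
    static void fillWithHolder(ToySmallIntVector vector, int[] values, boolean[] nonNull) {
        for (int row = 0; row < values.length; row++) {
            ToyHolder holder = new ToyHolder();
            holder.isSet = nonNull[row] ? 1 : 0;
            if (nonNull[row]) {
                holder.value = (short) values[row];
            }
            vector.set(row, holder);
        }
    }

    // Writes each value (or null) straight into the vector.
    static void fillDirectly(ToySmallIntVector vector, int[] values, boolean[] nonNull) {
        for (int row = 0; row < values.length; row++) {
            if (nonNull[row]) {
                vector.set(row, (short) values[row]);
            } else {
                vector.setNull(row);
            }
        }
    }

    public static void main(String[] args) {
        int[] values = {7, 0, -3};
        boolean[] nonNull = {true, false, true};

        ToySmallIntVector viaHolder = new ToySmallIntVector(values.length);
        ToySmallIntVector direct = new ToySmallIntVector(values.length);
        fillWithHolder(viaHolder, values, nonNull);
        fillDirectly(direct, values, nonNull);

        for (int row = 0; row < values.length; row++) {
            String a = viaHolder.isNull(row) ? "null" : String.valueOf(viaHolder.get(row));
            String b = direct.isNull(row) ? "null" : String.valueOf(direct.get(row));
            System.out.println("row " + row + ": holder=" + a + ", direct=" + b);
        }
    }
}
```

Whether the saving matters in practice depends on how well the JIT can eliminate the holder allocation, so any real change should be checked with a benchmark.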

For example, benchmarks on IntVector show that setting the int value directly reduces the average time per operation by about 20% (from 19.198 to 15.397 us/op):

Benchmark                         Mode  Cnt   Score   Error  Units
IntBenchmarks.setIntDirectly      avgt    5  15.397 ± 0.018  us/op
IntBenchmarks.setWithValueHolder  avgt    5  19.198 ± 0.789  us/op

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)