Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/08/08 11:00:00 UTC
[jira] [Updated] (ARROW-6172) [Java] Avoid creating value holders repeatedly when reading data from JDBC
[ https://issues.apache.org/jira/browse/ARROW-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-6172:
----------------------------------
Labels: pull-request-available (was: )
> [Java] Avoid creating value holders repeatedly when reading data from JDBC
> --------------------------------------------------------------------------
>
> Key: ARROW-6172
> URL: https://issues.apache.org/jira/browse/ARROW-6172
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Minor
> Labels: pull-request-available
>
> When converting JDBC data to Arrow data, a value holder is created for every single value. The following code snippet gives an example:
> NullableSmallIntHolder holder = new NullableSmallIntHolder();
> holder.isSet = isNonNull ? 1 : 0;
> if (isNonNull) {
>   holder.value = (short) value;
> }
> smallIntVector.setSafe(rowCount, holder);
> smallIntVector.setValueCount(rowCount + 1);
>
> This is inefficient in terms of both memory usage and computation, because a new holder object is allocated for every value.
> For most types, we can improve the performance by directly setting the value.
> For example, benchmarks on IntVector show that setting the int value directly is about 20% faster than going through a value holder (15.397 vs. 19.198 us/op):
>
> Benchmark                         Mode  Cnt   Score   Error  Units
> IntBenchmarks.setIntDirectly      avgt    5  15.397 ± 0.018  us/op
> IntBenchmarks.setWithValueHolder  avgt    5  19.198 ± 0.789  us/op
>
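To make the difference between the two write paths concrete, here is a minimal, self-contained sketch. It does not use the Arrow library: ToyShortVector and ToyNullableShortHolder are hypothetical stand-ins invented for illustration, and the actual change in the pull request may look different. The holder path allocates one object per value, while the direct path writes the value or marks the slot null without any intermediate object.

```java
import java.util.Arrays;

public class HolderVsDirect {

    /** Hypothetical stand-in for a nullable value holder (not Arrow's class). */
    static final class ToyNullableShortHolder {
        int isSet;
        short value;
    }

    /** Hypothetical stand-in for a nullable fixed-width vector (not Arrow's class). */
    static final class ToyShortVector {
        private short[] values = new short[4];
        private boolean[] valid = new boolean[4];
        private int valueCount;

        // Grow the backing arrays so that `index` is addressable.
        private void ensureCapacity(int index) {
            if (index >= values.length) {
                int newLen = Math.max(index + 1, values.length * 2);
                values = Arrays.copyOf(values, newLen);
                valid = Arrays.copyOf(valid, newLen);
            }
        }

        // Holder-based write: the caller must build a holder for each value.
        void setSafe(int index, ToyNullableShortHolder holder) {
            ensureCapacity(index);
            valid[index] = holder.isSet != 0;
            values[index] = holder.isSet != 0 ? holder.value : 0;
        }

        // Direct write of a non-null value: no intermediate object.
        void setSafe(int index, short value) {
            ensureCapacity(index);
            valid[index] = true;
            values[index] = value;
        }

        // Direct write of a null slot.
        void setNull(int index) {
            ensureCapacity(index);
            valid[index] = false;
            values[index] = 0;
        }

        void setValueCount(int count) {
            ensureCapacity(count - 1);
            valueCount = count;
        }

        int getValueCount() { return valueCount; }
        boolean isNull(int index) { return !valid[index]; }
        short get(int index) { return values[index]; }
    }

    /** Pattern described in the issue: one new holder per value. */
    static void fillWithHolders(ToyShortVector vector, Short[] source) {
        for (int row = 0; row < source.length; row++) {
            ToyNullableShortHolder holder = new ToyNullableShortHolder();
            holder.isSet = source[row] != null ? 1 : 0;
            if (source[row] != null) {
                holder.value = source[row];
            }
            vector.setSafe(row, holder);
            vector.setValueCount(row + 1);
        }
    }

    /** Proposed pattern: set the value (or null) directly. */
    static void fillDirectly(ToyShortVector vector, Short[] source) {
        for (int row = 0; row < source.length; row++) {
            if (source[row] != null) {
                vector.setSafe(row, source[row].shortValue());
            } else {
                vector.setNull(row);
            }
            vector.setValueCount(row + 1);
        }
    }

    public static void main(String[] args) {
        Short[] source = { (short) 7, null, (short) -3 };
        ToyShortVector viaHolder = new ToyShortVector();
        ToyShortVector direct = new ToyShortVector();
        fillWithHolders(viaHolder, source);
        fillDirectly(direct, source);
        for (int i = 0; i < source.length; i++) {
            System.out.println(i + ": holder=" + (viaHolder.isNull(i) ? "null" : viaHolder.get(i))
                + " direct=" + (direct.isNull(i) ? "null" : direct.get(i)));
        }
    }
}
```

Both methods leave the vector in the same state. The only difference is the per-value holder allocation, which is the overhead the benchmark above measures.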
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)