Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/08/08 11:00:00 UTC

[jira] [Updated] (ARROW-6172) [Java] Avoid creating value holders repeatedly when reading data from JDBC

     [ https://issues.apache.org/jira/browse/ARROW-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-6172:
----------------------------------
    Labels: pull-request-available  (was: )

> [Java] Avoid creating value holders repeatedly when reading data from JDBC
> --------------------------------------------------------------------------
>
>                 Key: ARROW-6172
>                 URL: https://issues.apache.org/jira/browse/ARROW-6172
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Minor
>              Labels: pull-request-available
>
> When converting JDBC data to Arrow data, a value holder is created for every single value. The following code snippet gives an example:
> NullableSmallIntHolder holder = new NullableSmallIntHolder();
> holder.isSet = isNonNull ? 1 : 0;
> if (isNonNull) {
>   holder.value = (short) value;
> }
> smallIntVector.setSafe(rowCount, holder);
> smallIntVector.setValueCount(rowCount + 1);
>  
> This is inefficient in terms of both memory usage and computation, because a new holder object is allocated for each value.
> For most types, we can improve performance by setting the value directly on the vector.
> For example, benchmarks on IntVector show that setting the int value directly reduces the average time per operation by about 20%:
>  
> Benchmark                         Mode  Cnt   Score    Error  Units
> IntBenchmarks.setIntDirectly      avgt    5  15.397 ±  0.018  us/op
> IntBenchmarks.setWithValueHolder  avgt    5  19.198 ±  0.789  us/op
>  
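The difference between the two approaches can be sketched without the Arrow dependency. The block below is only an illustration under stated assumptions: `TinySmallIntVector`, `Holder`, `fillWithHolders`, and `fillDirectly` are hypothetical stand-ins, not Arrow classes or the code from the pull request. In Arrow itself, the direct path would presumably use calls such as `SmallIntVector.setSafe(int, short)` and `setNull(int)` instead of a holder.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration only; this is NOT the Arrow API.
class DirectSetSketch {

    /** Stand-in for a nullable value holder such as NullableSmallIntHolder. */
    static final class Holder {
        int isSet;
        short value;
    }

    /** Minimal nullable short vector: one value slot and one validity flag per row. */
    static final class TinySmallIntVector {
        final short[] values;
        final boolean[] valid;
        int valueCount;

        TinySmallIntVector(int capacity) {
            values = new short[capacity];
            valid = new boolean[capacity];
        }

        /** Holder-based setter: copies the holder's fields into the vector. */
        void set(int index, Holder holder) {
            valid[index] = holder.isSet != 0;
            values[index] = valid[index] ? holder.value : 0;
        }

        /** Direct setter: no intermediate object. */
        void set(int index, short value) {
            valid[index] = true;
            values[index] = value;
        }

        void setNull(int index) {
            valid[index] = false;
            values[index] = 0;
        }
    }

    /** Pattern from the issue description: one holder allocated per value. */
    static TinySmallIntVector fillWithHolders(Short[] column) {
        TinySmallIntVector vector = new TinySmallIntVector(column.length);
        for (int row = 0; row < column.length; row++) {
            boolean isNonNull = column[row] != null;
            Holder holder = new Holder();
            holder.isSet = isNonNull ? 1 : 0;
            if (isNonNull) {
                holder.value = column[row];
            }
            vector.set(row, holder);
            vector.valueCount = row + 1;
        }
        return vector;
    }

    /** Proposed pattern: write the value (or null) straight into the vector. */
    static TinySmallIntVector fillDirectly(Short[] column) {
        TinySmallIntVector vector = new TinySmallIntVector(column.length);
        for (int row = 0; row < column.length; row++) {
            if (column[row] != null) {
                vector.set(row, column[row].shortValue());
            } else {
                vector.setNull(row);
            }
        }
        vector.valueCount = column.length;
        return vector;
    }

    public static void main(String[] args) {
        Short[] column = {(short) 1, null, (short) 3};
        TinySmallIntVector a = fillWithHolders(column);
        TinySmallIntVector b = fillDirectly(column);
        // Both paths should leave the vector in the same state.
        System.out.println(Arrays.equals(a.values, b.values)
            && Arrays.equals(a.valid, b.valid));
    }
}
```

Both methods fill the vector identically; the direct version simply skips the per-row holder allocation and the extra field copies, which is the overhead the benchmark above measures.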



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)