Posted to issues@arrow.apache.org by "Animesh Trivedi (JIRA)" <ji...@apache.org> on 2018/10/12 08:35:00 UTC

[jira] [Commented] (ARROW-3496) [Java] Add microbenchmark code to Java

    [ https://issues.apache.org/jira/browse/ARROW-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647653#comment-16647653 ] 

Animesh Trivedi commented on ARROW-3496:
----------------------------------------

To give you an idea of what I have so far, see https://github.com/animeshtrivedi/benchmarking-arrow (its README is outdated). It is a standalone Java program with:

i) a basic data generation template that generates data for integer, long, and binary column types (we can extend it to arbitrary types and schemas)

ii) in-memory data buffers that hold the generated data (in either on-heap or off-heap buffers)

iii) readers that consume the generated data using various APIs (calling get*(), the holder API variant, or custom readers written directly against the direct byte buffers)

The whole benchmark is multi-threaded, and all three steps can run in parallel. Usually it is the last step that is benchmarked. The current code base obviously contains a lot of extra code for my own testing and understanding, but we can clean it up gradually.
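
To make the three steps concrete, here is a minimal sketch of the same shape using only java.nio and java.util.concurrent. It is not the code from the repository above and it does not use the Arrow API; the class name BenchmarkSketch and the methods generateInts, sumInts, and readParallel are hypothetical names chosen for illustration, and the data is a simple 0..n-1 integer sequence rather than a real column.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: plain java.nio buffers stand in for Arrow vectors.
public class BenchmarkSketch {

    // Steps i and ii: generate the ints 0..count-1 and hold them in an
    // on-heap or off-heap (direct) buffer.
    static ByteBuffer generateInts(int count, boolean offHeap) {
        int bytes = count * Integer.BYTES;
        ByteBuffer buf = offHeap ? ByteBuffer.allocateDirect(bytes)
                                 : ByteBuffer.allocate(bytes);
        for (int i = 0; i < count; i++) {
            buf.putInt(i);
        }
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    // Step iii: a hand-written reader that consumes the buffer directly.
    static long sumInts(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // own position, shared content
        long sum = 0;
        while (view.remaining() >= Integer.BYTES) {
            sum += view.getInt();
        }
        return sum;
    }

    // Runs one reader per buffer on a thread pool and returns the combined sum.
    static long readParallel(List<ByteBuffer> buffers) {
        ExecutorService pool = Executors.newFixedThreadPool(buffers.size());
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (ByteBuffer b : buffers) {
                results.add(pool.submit(() -> sumInts(b)));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("reader task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = 4;
        int count = 1_000_000;
        for (boolean offHeap : new boolean[] {false, true}) {
            List<ByteBuffer> buffers = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                buffers.add(generateInts(count, offHeap));
            }
            // Only the read step is timed, as in the description above.
            long start = System.nanoTime();
            long total = readParallel(buffers);
            double elapsedMs = (System.nanoTime() - start) / 1e6;
            System.out.printf("offHeap=%b total=%d readTime=%.2f ms%n",
                    offHeap, total, elapsedMs);
        }
    }
}
```

A single timed pass like this includes JIT warm-up, so the numbers it prints are only a rough indication; a real microbenchmark would need warm-up iterations and repeated measurements.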

Where do we want to put this code, and how should a user run it? Maybe it could be part of the default build process, with the benchmark compiled as a separate jar (something like arrow-java-benchmarks-0.12.jar).

> [Java] Add microbenchmark code to Java
> --------------------------------------
>
>                 Key: ARROW-3496
>                 URL: https://issues.apache.org/jira/browse/ARROW-3496
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Java
>    Affects Versions: 0.11.0
>            Reporter: Li Jin
>            Priority: Major
>
> [~atrivedi] has done some microbenchmarking with the Java API. Let's consider adding them to the codebase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)