Posted to issues@arrow.apache.org by "Liya Fan (Jira)" <ji...@apache.org> on 2019/09/03 04:27:00 UTC

[jira] [Created] (ARROW-6420) [Java] Improve the performance of UnionVector when getting underlying vectors

Liya Fan created ARROW-6420:
-------------------------------

             Summary: [Java] Improve the performance of UnionVector when getting underlying vectors
                 Key: ARROW-6420
                 URL: https://issues.apache.org/jira/browse/ARROW-6420
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Java
            Reporter: Liya Fan
            Assignee: Liya Fan


Getting the underlying vector is a frequent operation for UnionVector, which relies on it to get or set data at each index.

The current implementation is inefficient. In particular, it first gets the minor type at the given index, and then compares it against all possible minor types in a switch statement until a match is found.

We improve the performance by storing the internal vectors in an array indexed by the ordinal of the minor type. Given a minor type, its corresponding underlying vector can then be obtained with a single array access, in O(1) time.
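
The idea can be illustrated with a minimal, self-contained sketch. This is not the actual Arrow code: MinorType, ChildVector and UnionSketch below are hypothetical stand-ins, and the lazy creation of child vectors is an assumption made only to keep the example short.

```java
import java.util.Arrays;

public class OrdinalLookupSketch {

  // Hypothetical stand-in for the set of minor types.
  enum MinorType { INT, BIGINT, FLOAT8, VARCHAR }

  // Hypothetical stand-in for an underlying (child) vector.
  static final class ChildVector {
    final MinorType type;

    ChildVector(MinorType type) {
      this.type = type;
    }
  }

  // Hypothetical union container that keeps one child per minor type.
  static final class UnionSketch {
    // Slot i holds the child for the minor type whose ordinal is i.
    private final ChildVector[] children = new ChildVector[MinorType.values().length];
    // Per-index type id (the ordinal of the minor type); -1 means "not set".
    private final byte[] typeIds;

    UnionSketch(int capacity) {
      typeIds = new byte[capacity];
      Arrays.fill(typeIds, (byte) -1);
    }

    // Records the type at an index, creating the child lazily (an assumption).
    void setType(int index, MinorType type) {
      int ordinal = type.ordinal();
      typeIds[index] = (byte) ordinal;
      if (children[ordinal] == null) {
        children[ordinal] = new ChildVector(type);
      }
    }

    // One array read for the type id, one for the child: no switch over types.
    ChildVector getVector(int index) {
      int ordinal = typeIds[index];
      return ordinal < 0 ? null : children[ordinal];
    }
  }

  public static void main(String[] args) {
    UnionSketch union = new UnionSketch(4);
    union.setType(0, MinorType.VARCHAR);
    union.setType(1, MinorType.INT);
    System.out.println(union.getVector(0).type);
    System.out.println(union.getVector(1).type);
  }
}
```

The lookup cost in this sketch does not depend on how many minor types exist or on where a given type would appear in a switch statement.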

It should be noted that this technique is also applicable to UnionReader and UnionWriter, and support for UnionReader is already implemented.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)