Posted to issues@orc.apache.org by "Ted Xu (Jira)" <ji...@apache.org> on 2020/04/27 15:51:00 UTC

[jira] [Commented] (ORC-625) Improve dictionary lookup by marking common prefixes

    [ https://issues.apache.org/jira/browse/ORC-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093657#comment-17093657 ] 

Ted Xu commented on ORC-625:
----------------------------

Hi folks, I've created a PR: [https://github.com/apache/orc/pull/504]

Could someone please review it?

In a test case taken from our production environment, it shows a ~10% end-to-end improvement.

> Improve dictionary lookup by marking common prefixes
> ----------------------------------------------------
>
>                 Key: ORC-625
>                 URL: https://issues.apache.org/jira/browse/ORC-625
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Ted Xu
>            Priority: Major
>
> Dictionary lookup is slow when the keys share a common prefix, e.g.,
> {code:java}
> http://foo.bar/a1
> http://foo.bar/a2
> http://foo.bar/a3
> {code}
> This is because a dictionary lookup has to compare values from the first byte every time the RedBlack tree visits a new node.
> If the RedBlack tree were able to mark the prefix a node has in common with its parent, we could skip some of the redundant comparisons.
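
To illustrate the idea in the description above, here is a minimal sketch. It is not the code from the PR and not ORC's actual tree: it uses a plain, unbalanced binary search tree instead of a red-black tree, and instead of storing a prefix mark on each node it tracks, during the descent, how many leading bytes the probe shares with the closest smaller and closest larger ancestor seen so far. Every key in the current subtree lies between those two ancestors, so it must share at least the shorter of the two prefixes with the probe, and the comparison can start from that offset. All names (PrefixSkipLookup, add, contains, bytesCompared) are hypothetical and chosen only for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; a plain BST stands in for the red-black tree.
class PrefixSkipLookup {
    static final class Node {
        final byte[] key;
        Node left, right;
        Node(byte[] key) { this.key = key; }
    }

    private Node root;
    long bytesCompared = 0; // counts single-byte comparisons, for demonstration

    static byte[] bytes(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    // Index of the first differing byte, scanning from `from`.
    // The caller guarantees that the first `from` bytes of a and b are equal.
    private int mismatch(byte[] a, byte[] b, int from) {
        int n = Math.min(a.length, b.length);
        int i = from;
        while (i < n) {
            bytesCompared++;
            if (a[i] != b[i]) break;
            i++;
        }
        return i;
    }

    // Unsigned lexicographic order, given the shared prefix length m.
    private static int order(byte[] a, byte[] b, int m) {
        if (m < a.length && m < b.length) return (a[m] & 0xff) - (b[m] & 0xff);
        return a.length - b.length;
    }

    void add(byte[] key) { descend(key, true, true); }

    boolean contains(byte[] key, boolean skipPrefix) { return descend(key, false, skipPrefix); }

    // Walks down the tree; returns true if the key is already present.
    private boolean descend(byte[] key, boolean insert, boolean skipPrefix) {
        if (root == null) {
            if (insert) root = new Node(key);
            return false;
        }
        int lcpLow = 0;  // bytes shared with the closest smaller ancestor (0 if none yet)
        int lcpHigh = 0; // bytes shared with the closest larger ancestor (0 if none yet)
        Node cur = root;
        while (true) {
            // Keys between the two bounding ancestors share at least this many bytes.
            int start = skipPrefix ? Math.min(lcpLow, lcpHigh) : 0;
            int m = mismatch(key, cur.key, start);
            int cmp = order(key, cur.key, m);
            if (cmp == 0) return true;
            Node next;
            if (cmp < 0) { lcpHigh = m; next = cur.left; }
            else         { lcpLow = m;  next = cur.right; }
            if (next == null) {
                if (insert) {
                    if (cmp < 0) cur.left = new Node(key); else cur.right = new Node(key);
                }
                return false;
            }
            cur = next;
        }
    }

    public static void main(String[] args) {
        PrefixSkipLookup tree = new PrefixSkipLookup();
        for (String s : new String[] {"a4", "a2", "a6", "a1", "a3", "a5", "a7"}) {
            tree.add(bytes("http://foo.bar/" + s));
        }
        byte[] probe = bytes("http://foo.bar/a3");
        tree.bytesCompared = 0;
        tree.contains(probe, false);
        long naive = tree.bytesCompared;
        tree.bytesCompared = 0;
        tree.contains(probe, true);
        long skipping = tree.bytesCompared;
        System.out.println("from head: " + naive + " byte comparisons, with prefix skip: " + skipping);
    }
}
```

Note that the skip only pays off once the descent has gone both left and right at least once, because until then one of the two bounds is missing and the safe starting offset is 0. A per-node mark, as proposed in the description, would be a different way to obtain a similar starting offset.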



--
This message was sent by Atlassian Jira
(v8.3.4#803005)