Posted to issues@orc.apache.org by "Hao Fu (Jira)" <ji...@apache.org> on 2022/06/29 19:23:00 UTC

[jira] [Commented] (ORC-625) Improve dictionary lookup by marking common prefixes

    [ https://issues.apache.org/jira/browse/ORC-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560632#comment-17560632 ] 

Hao Fu commented on ORC-625:
----------------------------

The benefit of this change is limited, because comparing DynamicByteArray entries is cheap, and it is unlikely that we would have a large number of nodes that share a long common prefix and also stand in a parent-child relationship.

On the other hand, the cost of this change is high: it adds a field that must be maintained, especially during rotations, and testing it thoroughly is important but complicated. This does not seem like a desirable change.
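
To make the trade-off concrete, here is a minimal sketch of what skipping a known common prefix could look like. It is not ORC's code: the class PrefixSkipCompare and both method names are hypothetical, it works on plain byte[] rather than DynamicByteArray, and unsigned byte ordering is an assumption. The skip is only safe if the prefix length recorded for a child against its parent stays correct, which is exactly the value that rotations would have to keep up to date.

```java
import java.nio.charset.StandardCharsets;

public class PrefixSkipCompare {
    // Hypothetical helper: number of leading bytes that a and b share.
    public static int commonPrefixLength(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    // Lexicographic compare that starts at offset `from`.
    // The caller must guarantee that the first `from` bytes of a and b are equal.
    // Bytes are compared as unsigned values (an assumption of this sketch).
    public static int compareFrom(byte[] a, byte[] b, int from) {
        int n = Math.min(a.length, b.length);
        for (int i = from; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] parent = "http://foo.bar/a1".getBytes(StandardCharsets.UTF_8);
        byte[] child = "http://foo.bar/a2".getBytes(StandardCharsets.UTF_8);
        byte[] key = "http://foo.bar/a3".getBytes(StandardCharsets.UTF_8);

        // The value the proposal would store on the child node.
        int childVsParent = commonPrefixLength(parent, child);
        // In a real lookup this would fall out of the comparison against the parent.
        int keyVsParent = commonPrefixLength(key, parent);

        // key and child agree on at least min(keyVsParent, childVsParent) bytes.
        int skip = Math.min(keyVsParent, childVsParent);

        System.out.println("bytes skipped: " + skip);
        System.out.println("full compare sign: " + Integer.signum(compareFrom(key, child, 0)));
        System.out.println("skip compare sign: " + Integer.signum(compareFrom(key, child, skip)));
    }
}
```

Both comparisons return the same result; the second one only avoids re-reading the shared bytes, which is a small saving when the per-byte comparison is already cheap.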

> Improve dictionary lookup by marking common prefixes
> ----------------------------------------------------
>
>                 Key: ORC-625
>                 URL: https://issues.apache.org/jira/browse/ORC-625
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Ted Xu
>            Priority: Major
>
> Dictionary lookup is slow if there are common prefixes, e.g.,
> {code:java}
> http://foo.bar/a1
> http://foo.bar/a2
> http://foo.bar/a3
> {code}
> This is because dictionary lookup has to compare values from the first byte every time the RedBlack tree visits a new node.
> If the RedBlack tree could mark the prefix a node shares with its parent, we could skip some redundant comparisons.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)