Posted to issues@orc.apache.org by "qingbo jiao (Jira)" <ji...@apache.org> on 2021/10/13 01:04:00 UTC

[jira] [Comment Edited] (ORC-1026) when write string type column,need to traversing the dictionary when flushDictionary method is called,Is there anyway to remove this travesing

    [ https://issues.apache.org/jira/browse/ORC-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427972#comment-17427972 ] 

qingbo jiao edited comment on ORC-1026 at 10/13/21, 1:03 AM:
-------------------------------------------------------------

[~dongjoon]

When we build the dictionary, we already have three streams of information. When the flushDictionary method is called, we traverse the dictionary again, which reduces the efficiency of ORC file writing. Is there any way to remove this traversal?


was (Author: jiaoqb):
When we build the dictionary, we already have three streams of information. When the flushDictionary method is called, we traverse the dictionary again, which will reduce the efficiency of orc file writing. Are there any improvements here to remove this traversal?

> when write string type column,need to traversing the dictionary when flushDictionary method is called,Is there anyway to remove this travesing
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ORC-1026
>                 URL: https://issues.apache.org/jira/browse/ORC-1026
>             Project: ORC
>          Issue Type: Improvement
>          Components: Java
>    Affects Versions: 1.8.0
>            Reporter: qingbo jiao
>            Priority: Major
>
> In the StringBaseTreeWriter class, the flushDictionary() method traverses the dictionary, as shown below:
> {code:java}
> dictionary.visit(new Dictionary.Visitor() {
>   private int currentId = 0;
>   @Override
>   public void visit(Dictionary.VisitorContext context
>   ) throws IOException {
>     context.writeBytes(stringOutput);
>     lengthOutput.write(context.getLength());
>     dumpOrder[context.getOriginalPosition()] = currentId++;
>   }
> });
> {code}
> In the Impl class, we already have some arrays that hold the information needed here.
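
The idea in the description above can be illustrated with a minimal sketch. This is not ORC code: the class InsertionOrderDictionary and its methods are hypothetical, and the sketch assumes that dictionary entries may be emitted in insertion order rather than sorted order. Under that assumption the entry bytes and lengths are already laid out in the order they would be written, so a flush could copy them out directly, and the id assigned at insert time would not need remapping through something like dumpOrder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical toy dictionary (NOT ORC's Dictionary implementation).
 * Assumption: entries may be written in insertion order, so no
 * traversal or id remapping is needed at flush time.
 */
public class InsertionOrderDictionary {
  // Maps each distinct value to the id it received on first insertion.
  private final Map<String, Integer> ids = new HashMap<>();
  // Concatenated UTF-8 bytes of all distinct values, in insertion order.
  private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
  // Byte length of each distinct value, in insertion order.
  private final List<Integer> lengths = new ArrayList<>();

  /** Adds a value if it is new and returns its dictionary id. */
  public int add(String value) {
    Integer existing = ids.get(value);
    if (existing != null) {
      return existing;
    }
    byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
    bytes.write(utf8, 0, utf8.length);
    lengths.add(utf8.length);
    int id = lengths.size() - 1;
    ids.put(value, id);
    return id;
  }

  /** Returns the dictionary bytes exactly as accumulated; no visit needed. */
  public byte[] dictionaryBytes() {
    return bytes.toByteArray();
  }

  /** Returns the per-entry lengths exactly as accumulated. */
  public int[] entryLengths() {
    int[] result = new int[lengths.size()];
    for (int i = 0; i < result.length; i++) {
      result[i] = lengths.get(i);
    }
    return result;
  }
}
```

The trade-off in this sketch is that entries are not sorted. Whether that is acceptable depends on what readers and the dictionary implementation in use expect, which the issue text does not state.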


