Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2016/03/11 04:10:40 UTC

[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

     [ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-9660:
-----------------------------------
    Attachment: HIVE-9660.WIP0.patch

WIP patch that takes care of the reading; the writing is only done for the compressed path and is not done for the string writer yet because its logic is different... whether it works at all is an open question.
Also, my head hurts now... I feel like I did after researching how Kerberos works.


> store end offset of compressed data for RG in RowIndex in ORC
> -------------------------------------------------------------
>
>                 Key: HIVE-9660
>                 URL: https://issues.apache.org/jira/browse/HIVE-9660
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-9660.WIP0.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the number of compressed buffers for each RG, or the end offset, or something similar, to remove this estimation magic.
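
To make the difference concrete, here is a minimal Java sketch; it is not the code from the patch or from ORC itself. It assumes the reader estimates a row group's end as the next row group's start plus a worst-case slop of two compression buffers (each with a small header), and compares that with reading up to an explicitly stored end offset. The class name, method names, the HEADER_SIZE value, and all sample numbers are hypothetical.

```java
// Illustrative sketch only: names and constants are assumptions, not Hive/ORC APIs.
class RgEndOffsetSketch {
    // Assumed size in bytes of the header preceding each compressed buffer.
    static final int HEADER_SIZE = 3;

    // Estimated end: the next RG's start plus worst-case slop, capped at stream length.
    static long estimatedEnd(long nextRgStart, long streamLength,
                             int bufferSize, boolean isLast) {
        if (isLast) {
            return streamLength; // the last RG simply runs to the end of the stream
        }
        long slop = 2L * (HEADER_SIZE + bufferSize);
        return Math.min(streamLength, nextRgStart + slop);
    }

    // Exact end: looked up from a hypothetical per-RG array stored in the index.
    static long storedEnd(long[] endOffsets, int rg) {
        return endOffsets[rg];
    }

    public static void main(String[] args) {
        long rgStart = 400_000L;          // start of the RG we want to read
        long nextRgStart = 1_000_000L;    // start of the following RG
        long streamLength = 50_000_000L;
        int bufferSize = 256 * 1024;      // compression buffer size
        long[] endOffsets = {1_010_000L}; // hypothetical stored end offset for RG 0

        long est = estimatedEnd(nextRgStart, streamLength, bufferSize, false);
        long exact = storedEnd(endOffsets, 0);

        System.out.println("bytes read with estimate:   " + (est - rgStart));
        System.out.println("bytes read with stored end: " + (exact - rgStart));
    }
}
```

With a stored end offset the reader no longer has to pad the range by the worst case, so the extra bytes read shrink to whatever the stored value implies.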



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)