Posted to issues@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2015/04/02 00:37:53 UTC

[jira] [Updated] (HIVE-10114) Split strategies for ORC

     [ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-10114:
-----------------------------------------
    Attachment: HIVE-10114.4.patch

Earlier patches handled the BI, ETL and ACID cases separately, but failed for the BI + ACID and ETL + ACID cases. The v4 patch fixes both.

> Split strategies for ORC
> ------------------------
>
>                 Key: HIVE-10114
>                 URL: https://issues.apache.org/jira/browse/HIVE-10114
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-10114.1.patch, HIVE-10114.2.patch, HIVE-10114.3.patch, HIVE-10114.4.patch
>
>
> ORC split generation does not have clearly defined strategies for different scenarios (many small ORC files, few small ORC files, many large files, etc.). A few strategies, such as storing the file footer in the ORC split or making an entire file a single ORC split, already exist. This JIRA aims to make split generation simpler, to support different strategies for various use cases (BI, ETL, ACID, etc.), and to lay the foundation for HIVE-7428.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)