Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2016/11/10 02:09:58 UTC

[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

     [ https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14453:
------------------------------------
    Attachment: HIVE-14453.02.patch

I'd like to revive this patch for HIVE-15147 (where we want to re-encode parts of a text file to ORC for caching, and cache columns separately from each other).

[~prasanth_j] can you please review? This is a refactoring, so there are no real logic changes as far as I can see.

> refactor physical writing of ORC data and metadata to FS from the logical writers
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-14453
>                 URL: https://issues.apache.org/jira/browse/HIVE-14453
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14453.01.patch, HIVE-14453.02.patch, HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers; it can go somewhere else (e.g. a write-through cache, or an addressable system that doesn't require the stream blocks to be held in memory before writing them all together).
> To that effect, it would be nice to separate the creation of the data block and metadata structures from the physical file concerns.
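
For illustration only, the separation described above could look roughly like the sketch below. It is not taken from the attached patch: the names PhysicalSink, StreamSink, CacheSink and LogicalWriter are hypothetical, the "encoding" is a plain dump of longs rather than real ORC encoding, and the footer bytes are a placeholder. The point is only that the logical writer produces stream and metadata bytes without knowing whether they end up in one sequential file or in a per-column cache.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical destination for encoded bytes; not the actual Hive/ORC interface. */
interface PhysicalSink {
    void writeStream(String streamName, ByteBuffer data) throws IOException;
    void writeMetadata(byte[] footer) throws IOException;
    void close() throws IOException;
}

/** Writes everything sequentially to one OutputStream, as a file on HDFS would be written. */
final class StreamSink implements PhysicalSink {
    private final OutputStream out;

    StreamSink(OutputStream out) { this.out = out; }

    @Override
    public void writeStream(String streamName, ByteBuffer data) throws IOException {
        byte[] bytes = new byte[data.remaining()];
        data.duplicate().get(bytes); // copy without disturbing the caller's position
        out.write(bytes);
    }

    @Override
    public void writeMetadata(byte[] footer) throws IOException { out.write(footer); }

    @Override
    public void close() throws IOException { out.close(); }
}

/** Keeps each stream separately addressable in memory, as a column cache might. */
final class CacheSink implements PhysicalSink {
    final Map<String, List<byte[]>> streams = new LinkedHashMap<>();
    byte[] footer;

    @Override
    public void writeStream(String streamName, ByteBuffer data) {
        byte[] bytes = new byte[data.remaining()];
        data.duplicate().get(bytes);
        streams.computeIfAbsent(streamName, k -> new ArrayList<>()).add(bytes);
    }

    @Override
    public void writeMetadata(byte[] footer) { this.footer = footer.clone(); }

    @Override
    public void close() { /* nothing to release in this sketch */ }
}

/** Logical writer: turns values into bytes and hands them to whatever sink it was given. */
final class LogicalWriter {
    private final PhysicalSink sink;

    LogicalWriter(PhysicalSink sink) { this.sink = sink; }

    void writeLongColumn(String column, long[] values) throws IOException {
        // Stand-in for real ORC encoding: 8 bytes per value, big-endian.
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        buf.flip();
        sink.writeStream(column, buf);
    }

    void finish() throws IOException {
        // Placeholder footer; a real writer would serialize file metadata here.
        sink.writeMetadata("footer".getBytes(StandardCharsets.UTF_8));
        sink.close();
    }
}
```

With this shape, the same LogicalWriter can be constructed with a StreamSink wrapping a file output stream, or with a CacheSink when columns should be cached separately from each other, and only the sink implementation changes.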



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)