Posted to issues@hive.apache.org by "Teddy Choi (JIRA)" <ji...@apache.org> on 2017/02/15 05:50:41 UTC

[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

    [ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867303#comment-15867303 ] 

Teddy Choi commented on HIVE-12631:
-----------------------------------

This draft patch implements the basic idea. OrcAcidEncodedDataConsumer merges the base data from LLAP with the delta data from files before consuming it. The methods and classes shared between OrcAcidEncodedDataConsumer and VectorizedOrcAcidRowBatchReader now live in AcidMergeUtils, which handles both VectorizedRowBatch and ColumnVectorBatch.
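
To illustrate the merge step in isolation, here is a minimal sketch, not the code from the patch. It assumes a base batch is exposed as parallel arrays of row identifiers and that the delete events from the delta files have already been loaded into a set. The names AcidMergeSketch, RowKey and selectSurvivors are hypothetical, and the bucket part of the row key is left out for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class AcidMergeSketch {
    /** Hypothetical row key: (original transaction id, row id). */
    static final class RowKey {
        final long originalTxn;
        final long rowId;

        RowKey(long originalTxn, long rowId) {
            this.originalTxn = originalTxn;
            this.rowId = rowId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RowKey)) {
                return false;
            }
            RowKey other = (RowKey) o;
            return originalTxn == other.originalTxn && rowId == other.rowId;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(originalTxn) * 31 + Long.hashCode(rowId);
        }
    }

    /**
     * Returns the indices of the base rows that are not covered by a delete
     * event, similar in spirit to filling a batch's "selected" array.
     */
    static int[] selectSurvivors(long[] originalTxns, long[] rowIds, Set<RowKey> deleted) {
        int[] selected = new int[rowIds.length];
        int n = 0;
        for (int i = 0; i < rowIds.length; i++) {
            if (!deleted.contains(new RowKey(originalTxns[i], rowIds[i]))) {
                selected[n++] = i;
            }
        }
        return Arrays.copyOf(selected, n);
    }

    public static void main(String[] args) {
        long[] txns = {1, 1, 2};
        long[] rowIds = {0, 1, 0};
        Set<RowKey> deleted = new HashSet<>();
        deleted.add(new RowKey(1, 1)); // a delete event read from a delta file
        System.out.println(Arrays.toString(selectSurvivors(txns, rowIds, deleted))); // prints [0, 2]
    }
}
```

Returning the surviving indices rather than copying rows keeps the base column data untouched, so the same helper could in principle serve either batch representation.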

However, this patch doesn't cache delta data in LLAP yet. I will try to add that in the next patch.

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.1.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and parallelization of reads and processing. This path does not support ACID. As far as I remember, ACID logic is embedded inside the ORC format; we need to refactor it to sit on top of some interface, if practical, or just port it to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is currently low-level (CB-level in ORC), so we could just use it to read bases and deltas (deltas should be cached with higher priority) and merge as usual. We could also cache the merged representation in the future.
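
One way to picture "deltas should be cached with higher priority" is the toy cache below. It is only a sketch under stated assumptions and does not reflect LLAP's actual cache policy: capacity is counted in entries rather than bytes, keys are assumed to be unique across base and delta buffers, and the names PriorityCacheSketch and Kind are hypothetical. Base buffers are evicted first in least-recently-used order, and delta buffers are evicted only when no base buffer remains.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

class PriorityCacheSketch {
    enum Kind { BASE, DELTA }

    private final int capacity;
    // Access-ordered maps: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, byte[]> base = new LinkedHashMap<>(16, 0.75f, true);
    private final LinkedHashMap<String, byte[]> delta = new LinkedHashMap<>(16, 0.75f, true);

    PriorityCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    void put(String key, Kind kind, byte[] data) {
        (kind == Kind.DELTA ? delta : base).put(key, data);
        while (base.size() + delta.size() > capacity) {
            // Evict base buffers first; touch deltas only when no base buffer is left.
            LinkedHashMap<String, byte[]> victim = base.isEmpty() ? delta : base;
            Iterator<String> it = victim.keySet().iterator();
            it.next();
            it.remove();
        }
    }

    /** Returns the cached buffer, or null on a miss; a hit refreshes its recency. */
    byte[] get(String key) {
        byte[] d = delta.get(key);
        return d != null ? d : base.get(key);
    }

    /** Membership check that does not change the recency order. */
    boolean contains(String key) {
        return delta.containsKey(key) || base.containsKey(key);
    }
}
```

With a capacity of two, inserting base "b1", delta "d1" and then base "b2" evicts "b1"; a further delta "d2" evicts "b2" and leaves both deltas cached.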



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)