Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2020/04/02 13:43:00 UTC

[jira] [Resolved] (IMPALA-9484) Milestone 1: properly scan files that has full ACID schema

     [ https://issues.apache.org/jira/browse/IMPALA-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy resolved IMPALA-9484.
---------------------------------------
    Resolution: Fixed

> Milestone 1: properly scan files that has full ACID schema
> ----------------------------------------------------------
>
>                 Key: IMPALA-9484
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9484
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-acid
>
>  
> Full ACID row format looks like this:
> {
>  "operation": 0,
>  "originalTransaction": 1,
>  "bucket": 536870912,
>  "rowId": 0,
>  "currentTransaction": 1,
>  "row": {"i": 1}
> }
> User columns are nested under "row". The frontend should create proper tuples and slot descriptors for the scan nodes to read the files correctly.
> We should be able to query the ACID columns, at least for debugging/testing. Hive uses the special “row__id” identifier for that.
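
To illustrate the layout described above, here is a minimal Python sketch (not Impala code) that splits one full ACID record into its user columns and a row__id-style struct. The function names are hypothetical. The writeid/bucketid/rowid field names and the bit layout used to decode "bucket" are assumptions based on Hive's row__id struct and its version 1 bucket encoding; they are not stated in this issue.

```python
import json

# Metadata fields of the full ACID row format shown in the issue.
ACID_FIELDS = ("operation", "originalTransaction", "bucket",
               "rowId", "currentTransaction")


def decode_bucket(bucket):
    """Decode the "bucket" field.

    Assumption: Hive's version 1 encoding, with the codec version in the
    top 3 bits, the bucket id in bits 16-27 and the statement id in the
    low 12 bits.
    """
    return {
        "version": (bucket >> 29) & 0x7,
        "bucket_id": (bucket >> 16) & 0xFFF,
        "statement_id": bucket & 0xFFF,
    }


def flatten_acid_record(record):
    """Split one full ACID record into (row_id, user_columns)."""
    missing = [f for f in ACID_FIELDS + ("row",) if f not in record]
    if missing:
        raise ValueError(f"not a full ACID record, missing: {missing}")
    # Assumption: these field names mirror Hive's row__id struct.
    row_id = {
        "writeid": record["originalTransaction"],
        "bucketid": record["bucket"],
        "rowid": record["rowId"],
    }
    # User columns are nested under "row".
    return row_id, dict(record["row"])


raw = ('{"operation": 0, "originalTransaction": 1, "bucket": 536870912, '
       '"rowId": 0, "currentTransaction": 1, "row": {"i": 1}}')
row_id, user_columns = flatten_acid_record(json.loads(raw))
bucket_info = decode_bucket(row_id["bucketid"])
```

For the sample record, user_columns holds only the column "i", and the bucket value 536870912 (1 << 29) decodes to codec version 1, bucket id 0, statement id 0 under the stated assumption.
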
> Impala should raise an error if it encounters delete deltas. Directory filtering should skip minor-compacted directories, since the records in them need validation.
> Non-goals in this sub-task:
>  * row validation against validWriteIdList
>  * reading "original files" (files in non-ACID format)
>  * reading delete deltas
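
The directory handling described above could be sketched roughly as follows. This is a hedged Python illustration, not the actual Impala implementation. The function name is hypothetical, the directory naming scheme (base_<writeId>, delta_<min>_<max>[_<stmtId>], delete_delta_<min>_<max>[_<stmtId>]) is assumed from Hive's ACID layout, and treating a delta that spans more than one write id as minor compacted is a simplifying assumption.

```python
import re

# Assumed naming scheme for delta and delete delta directories.
DELTA_RE = re.compile(r"^(delete_)?delta_(\d+)_(\d+)(?:_(\d+))?$")


def select_scannable_dirs(dir_names):
    """Return the directories that this milestone could scan.

    Raises NotImplementedError when a delete delta is present.
    """
    selected = []
    for name in dir_names:
        m = DELTA_RE.match(name)
        if m is None:
            # Anything else (e.g. base directories) passes through unchanged.
            selected.append(name)
            continue
        is_delete = m.group(1) is not None
        min_write_id, max_write_id = int(m.group(2)), int(m.group(3))
        if is_delete:
            raise NotImplementedError(
                f"delete deltas are not supported: {name}")
        if min_write_id != max_write_id:
            # Assumption: a delta covering a range of write ids is the
            # result of a minor compaction, so it is skipped.
            continue
        selected.append(name)
    return selected


dirs = ["base_0000005", "delta_0000006_0000006_0000", "delta_0000007_0000009"]
scannable = select_scannable_dirs(dirs)
```

With the sample input, the base directory and the single-write-id delta are kept, and delta_0000007_0000009 is skipped as a presumed minor-compacted directory.
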



--
This message was sent by Atlassian Jira
(v8.3.4#803005)