Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2018/11/06 19:49:00 UTC

[jira] [Commented] (HIVE-20730) Do delete event filtering even if hive.acid.index is not there

    [ https://issues.apache.org/jira/browse/HIVE-20730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677240#comment-16677240 ] 

Eugene Koifman commented on HIVE-20730:
---------------------------------------

I think {{OrcRecordUpdater.parseKeyIndex(reader);}} will probably throw an NPE if {{hive.acid.key.index}} is missing.  It should first check whether there is anything under the {{hive.acid.key.index}} key.
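
To illustrate the kind of guard meant here, a minimal self-contained sketch follows. It is not the actual {{OrcRecordUpdater}} code: the ORC user metadata is modeled as a plain map, the class and method names are hypothetical, and the assumed value layout (one {{writeId,bucket,rowId;}} entry per stripe) is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: user metadata is a plain map here, not a real ORC Reader.
class KeyIndexSketch {
  static final String ACID_KEY_INDEX = "hive.acid.key.index";

  /**
   * Returns one {writeId, bucket, rowId} triple per stripe, or null when the
   * key is absent so the caller can fall back instead of hitting an NPE.
   * The "writeId,bucket,rowId;" layout is an assumption of this sketch.
   */
  static List<long[]> parseKeyIndex(Map<String, ByteBuffer> userMetadata) {
    ByteBuffer raw = userMetadata.get(ACID_KEY_INDEX);
    if (raw == null) {
      return null; // index missing: nothing to parse
    }
    // duplicate() leaves the caller's buffer position untouched
    String text = StandardCharsets.UTF_8.decode(raw.duplicate()).toString();
    List<long[]> result = new ArrayList<>();
    for (String stripe : text.split(";")) {
      if (stripe.isEmpty()) {
        continue;
      }
      String[] parts = stripe.split(",");
      result.add(new long[] {
          Long.parseLong(parts[0]), Long.parseLong(parts[1]), Long.parseLong(parts[2])});
    }
    return result;
  }
}
```

A caller would treat a null result as "no index available" and pick another way to bound the delete events.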

You could add a property like {{HiveConf.HIVETESTMODEROLLBACKTXN}} that forces {{OrcRecordUpdater}} to skip generating the index.  Aside from this, I can't think of a good way to test this until the query-based compactor is ready.  The query-based compactor won't use {{OrcRecordUpdater}}, which is what causes this issue.
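
A rough sketch of what such a test-only switch could look like is below. Everything in it is hypothetical: the class, the method, and the boolean flag are invented for illustration and do not correspond to an existing {{HiveConf}} property or to the real writer code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are real Hive APIs.
class FooterWriterSketch {
  static final String ACID_KEY_INDEX = "hive.acid.key.index";

  /**
   * Builds the footer user metadata. When skipKeyIndex is true (modeling a
   * test-only config flag), the key index entry is left out so that readers
   * can be exercised against files without it.
   */
  static Map<String, String> buildFooterMetadata(String keyIndex, boolean skipKeyIndex) {
    Map<String, String> footer = new LinkedHashMap<>();
    if (!skipKeyIndex) {
      footer.put(ACID_KEY_INDEX, keyIndex);
    }
    return footer;
  }
}
```

A test could then write a file with the flag on and check that delete event filtering still works when the reader finds no index.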

 

Otherwise LGTM.

> Do delete event filtering even if hive.acid.index is not there
> --------------------------------------------------------------
>
>                 Key: HIVE-20730
>                 URL: https://issues.apache.org/jira/browse/HIVE-20730
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Eugene Koifman
>            Assignee: Saurabh Seth
>            Priority: Major
>         Attachments: HIVE-20730.patch
>
>
> Since HIVE-16812, {{VectorizedOrcAcidRowBatchReader}} filters delete events based on the min/max ROW__ID in the split, which relies on {{hive.acid.index}} being in the ORC footer.  
> There is no way to generate {{hive.acid.index}} from a plain query, as in HIVE-20699, so we need to make sure that we generate a SARG into delete_delta/bucket_x based on stripe stats even if the index is missing.


