Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2018/02/09 20:32:00 UTC
[jira] [Commented] (HIVE-17284) remove OrcRecordUpdater.deleteEventIndexBuilder
[ https://issues.apache.org/jira/browse/HIVE-17284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358915#comment-16358915 ]
Eugene Koifman commented on HIVE-17284:
---------------------------------------
This may not be the right thing to do. ORC flattens structs (such as ROW__ID) and maintains min/max statistics for the individual columns. To filter events we really need the min/max of ROW__ID as a whole.
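
To illustrate the difference, here is a minimal standalone sketch in plain Java. It does not use the ORC or Hive APIs. The RowId class, its field names (writeId, bucket, rowId) and both helper methods are hypothetical stand-ins, and the assumption is that keys are ordered field by field. It shows a key that lies inside every per-column min/max range yet falls outside the min/max of the composite key, so per-column statistics alone could not rule it out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RowIdStatsSketch {

    /** Hypothetical stand-in for ROW__ID; the field names are assumptions. */
    static final class RowId {
        final long writeId;
        final int bucket;
        final long rowId;

        RowId(long writeId, int bucket, long rowId) {
            this.writeId = writeId;
            this.bucket = bucket;
            this.rowId = rowId;
        }
    }

    /** Assumed ordering of the composite key: field by field, left to right. */
    static final Comparator<RowId> ORDER =
        Comparator.<RowId>comparingLong(r -> r.writeId)
            .thenComparingInt(r -> r.bucket)
            .thenComparingLong(r -> r.rowId);

    /** True if each field of probe lies within that field's own [min, max]. */
    static boolean withinPerColumnRanges(List<RowId> keys, RowId probe) {
        long minW = Long.MAX_VALUE, maxW = Long.MIN_VALUE;
        int minB = Integer.MAX_VALUE, maxB = Integer.MIN_VALUE;
        long minR = Long.MAX_VALUE, maxR = Long.MIN_VALUE;
        for (RowId k : keys) {
            minW = Math.min(minW, k.writeId);
            maxW = Math.max(maxW, k.writeId);
            minB = Math.min(minB, k.bucket);
            maxB = Math.max(maxB, k.bucket);
            minR = Math.min(minR, k.rowId);
            maxR = Math.max(maxR, k.rowId);
        }
        return probe.writeId >= minW && probe.writeId <= maxW
            && probe.bucket >= minB && probe.bucket <= maxB
            && probe.rowId >= minR && probe.rowId <= maxR;
    }

    /** True if probe lies within [min, max] of the keys taken as whole tuples. */
    static boolean withinCompositeRange(List<RowId> keys, RowId probe) {
        RowId min = Collections.min(keys, ORDER);
        RowId max = Collections.max(keys, ORDER);
        return ORDER.compare(probe, min) >= 0 && ORDER.compare(probe, max) <= 0;
    }

    public static void main(String[] args) {
        List<RowId> keys = Arrays.asList(new RowId(1, 0, 100), new RowId(2, 0, 5));
        RowId probe = new RowId(1, 0, 50);
        // Per-column ranges: writeId [1,2], bucket [0,0], rowId [5,100].
        System.out.println(withinPerColumnRanges(keys, probe)); // true
        // Composite range: (1,0,100) .. (2,0,5); the probe sorts before the min.
        System.out.println(withinCompositeRange(keys, probe));  // false
    }
}
```

In this sketch the probe (1, 0, 50) cannot be excluded using per-column ranges, while a composite min/max excludes it immediately.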
> remove OrcRecordUpdater.deleteEventIndexBuilder
> -----------------------------------------------
>
> Key: HIVE-17284
> URL: https://issues.apache.org/jira/browse/HIVE-17284
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Minor
>
> There is no point in it. We know from ORC how many rows a delete_delta file has, and they are all the same type, so there is no need for AcidStats.
> hive.acid.key.index has no value, since delete_delta files are never split and are unlikely to have more than 1 stripe because they are very small.
> We can also remove KeyIndexBuilder.acidStats, since we only have 1 type of event per file.
>
> If doing this, make sure to fix {{OrcInputFormat.isOriginal(Reader)}} and {{OrcInputFormat.isOriginal(Footer)}} etc.