Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2015/11/19 02:42:11 UTC
[jira] [Updated] (HIVE-12352) CompactionTxnHandler.markCleaned() may delete too much
[ https://issues.apache.org/jira/browse/HIVE-12352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-12352:
----------------------------------
Description:
The Worker starts with the DB in state X (with respect to this partition).
While it is working, more transactions will happen against the partition it is compacting.
markCleaned() will then delete state up to X and everything recorded since then. New delta files may be created
between the start of compaction and cleaning. These will not be compacted until more
transactions happen. Ideally, markCleaned() should delete state only
up to the TXN_ID that was compacted (i.e. the high-water mark, HWM, in the Worker?). Then it can also run
at READ_COMMITTED. This means we'd want to store the HWM in COMPACTION_QUEUE when
the Worker picks up the job.
> CompactionTxnHandler.markCleaned() may delete too much
> ------------------------------------------------------
>
> Key: HIVE-12352
> URL: https://issues.apache.org/jira/browse/HIVE-12352
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> The Worker starts with the DB in state X (with respect to this partition).
> While it is working, more transactions will happen against the partition it is compacting.
> markCleaned() will then delete state up to X and everything recorded since then. New delta files may be created
> between the start of compaction and cleaning. These will not be compacted until more
> transactions happen. Ideally, markCleaned() should delete state only
> up to the TXN_ID that was compacted (i.e. the high-water mark, HWM, in the Worker?). Then it can also run
> at READ_COMMITTED. This means we'd want to store the HWM in COMPACTION_QUEUE when
> the Worker picks up the job.
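
The difference between the behavior described above and the proposed one can be sketched roughly as follows. This is only an illustrative in-memory model, not Hive's code: the class CleanerSketch, the TxnState rows, and both method names are hypothetical, and the real fix would presumably be a bounded DELETE against the metastore tables using the HWM stored in COMPACTION_QUEUE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for per-transaction state kept by the
// metastore. None of these names come from Hive's actual code or schema.
final class CleanerSketch {

    /** One row of state: which transaction touched which partition. */
    static final class TxnState {
        final long txnId;
        final String partition;

        TxnState(long txnId, String partition) {
            this.txnId = txnId;
            this.partition = partition;
        }
    }

    /** Behavior described in the issue: drop all state for the partition. */
    static void markCleanedUnbounded(List<TxnState> state, String partition) {
        state.removeIf(s -> s.partition.equals(partition));
    }

    /** Proposed behavior: drop only state at or below the recorded HWM. */
    static void markCleanedUpTo(List<TxnState> state, String partition, long hwm) {
        state.removeIf(s -> s.partition.equals(partition) && s.txnId <= hwm);
    }

    public static void main(String[] args) {
        List<TxnState> state = new ArrayList<>(Arrays.asList(
            new TxnState(5, "p=1"),
            new TxnState(7, "p=1"),
            new TxnState(6, "p=2")));

        // Assumed to be recorded when the Worker picks up the job.
        long hwm = 7;

        // A transaction that commits while the compaction is still running.
        state.add(new TxnState(9, "p=1"));

        markCleanedUpTo(state, "p=1", hwm);
        for (TxnState s : state) {
            System.out.println(s.txnId + " " + s.partition);
        }
    }
}
```

With the bounded variant, the row for the transaction that arrived during compaction survives cleaning, so a later compaction can still find it. With the unbounded variant that row is deleted along with the compacted ones.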