Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/11/30 09:10:00 UTC

[jira] [Work logged] (HIVE-24444) compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS

     [ https://issues.apache.org/jira/browse/HIVE-24444?focusedWorklogId=517809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517809 ]

ASF GitHub Bot logged work on HIVE-24444:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Nov/20 09:09
            Start Date: 30/Nov/20 09:09
    Worklog Time Spent: 10m 
      Work Description: klcopp opened a new pull request #1716:
URL: https://github.com/apache/hive/pull/1716


   ### What changes were proposed in this pull request?
   
   ### Why are the changes needed?
   
   ### Does this PR introduce _any_ user-facing change?
   
   See HIVE-24444
   
   ### How was this patch tested?
   Unit test
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 517809)
    Remaining Estimate: 0h
            Time Spent: 10m

> compactor.Cleaner should not set state "mark cleaned" if there are obsolete files in the FS
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24444
>                 URL: https://issues.apache.org/jira/browse/HIVE-24444
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is an improvement on HIVE-24314, in which markCleaned() is called only if +any+ files are deleted by the cleaner. This could cause a problem in the following case:
> Say cleaning for compaction1 on table_1 was blocked by an open txn, and compaction was then run again on the same table (compaction2). Both compaction1 and compaction2 could be in the "ready for cleaning" state at the same time. By then the blocking open txn could have been committed. When the Cleaner runs, one of compaction1 and compaction2 will remain in the "ready for cleaning" state:
> Say compaction2 is picked up by the Cleaner first. The Cleaner deletes all obsolete files. Then compaction1 is picked up by the Cleaner; the Cleaner doesn't remove any files, so markCleaned() is not called and compaction1 stays in the queue in the "ready for cleaning" state.
> HIVE-24291 already solves this issue, but if it isn't usable (for example, if HMS schema changes are out of the question), then HIVE-24314 + this change will fix the issue of the Cleaner not removing all obsolete files.
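
The scenario described above can be modelled with a small, self-contained Java sketch. It is not Hive's Cleaner code: the class `CleanerSketch`, the `Compaction` and `State` types, and the methods `deleteObsolete`, `cleanIfAnyDeleted` and `cleanIfNoneRemain` are all hypothetical names invented for illustration, and the set of "blocked" files is an assumed stand-in for files that an open txn prevents the Cleaner from deleting. The first rule follows the HIVE-24314 behaviour as described in this issue (mark cleaned only if any file was deleted); the second is one reading of this issue's title (mark cleaned only if no obsolete files remain).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CleanerSketch {

    // Hypothetical queue states; the names only mirror the wording of the issue.
    enum State { READY_FOR_CLEANING, CLEANED }

    /** Hypothetical stand-in for one compaction queue entry. */
    static final class Compaction {
        final String name;
        State state = State.READY_FOR_CLEANING;
        Compaction(String name) { this.name = name; }
    }

    /** Deletes every obsolete file that is not blocked; returns how many were deleted. */
    static int deleteObsolete(Set<String> obsolete, Set<String> blocked) {
        Set<String> deletable = new HashSet<>(obsolete);
        deletable.removeAll(blocked);
        obsolete.removeAll(deletable);
        return deletable.size();
    }

    /** Rule as described for HIVE-24314: mark cleaned only if any file was deleted. */
    static void cleanIfAnyDeleted(Compaction c, Set<String> obsolete, Set<String> blocked) {
        if (deleteObsolete(obsolete, blocked) > 0) {
            c.state = State.CLEANED;
        }
    }

    /** Assumed rule for this issue: mark cleaned only if no obsolete files remain. */
    static void cleanIfNoneRemain(Compaction c, Set<String> obsolete, Set<String> blocked) {
        deleteObsolete(obsolete, blocked);
        if (obsolete.isEmpty()) {
            c.state = State.CLEANED;
        }
    }

    public static void main(String[] args) {
        Set<String> noBlockers = new HashSet<>(); // the blocking txn has committed

        // Old rule: compaction2 deletes everything, compaction1 finds nothing to delete.
        Set<String> fs = new HashSet<>(Arrays.asList("delta_1", "delta_2"));
        Compaction c2 = new Compaction("compaction2");
        Compaction c1 = new Compaction("compaction1");
        cleanIfAnyDeleted(c2, fs, noBlockers);
        cleanIfAnyDeleted(c1, fs, noBlockers);
        System.out.println("any-deleted rule: " + c2.name + "=" + c2.state
                + ", " + c1.name + "=" + c1.state);

        // Assumed new rule: both entries leave the "ready for cleaning" state.
        fs = new HashSet<>(Arrays.asList("delta_1", "delta_2"));
        c2 = new Compaction("compaction2");
        c1 = new Compaction("compaction1");
        cleanIfNoneRemain(c2, fs, noBlockers);
        cleanIfNoneRemain(c1, fs, noBlockers);
        System.out.println("none-remain rule: " + c2.name + "=" + c2.state
                + ", " + c1.name + "=" + c1.state);
    }
}
```

Under the first rule, compaction1 stays in READY_FOR_CLEANING because its run deletes nothing; under the second, both entries reach CLEANED, and an entry would stay in READY_FOR_CLEANING only while some obsolete file is still blocked.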



--
This message was sent by Atlassian Jira
(v8.3.4#803005)