Posted to dev@gobblin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/04/24 18:06:00 UTC

[jira] [Work logged] (GOBBLIN-1825) Hive retention job should fail if deleting underlying files fail

     [ https://issues.apache.org/jira/browse/GOBBLIN-1825?focusedWorklogId=858755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-858755 ]

ASF GitHub Bot logged work on GOBBLIN-1825:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Apr/23 18:05
            Start Date: 24/Apr/23 18:05
    Worklog Time Spent: 10m 
      Work Description: meethngala opened a new pull request, #3687:
URL: https://github.com/apache/gobblin/pull/3687

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1825
   
   
   ### Description
   Hive retention previously performed two steps: it dropped the partition first and then deleted the underlying files. If the partition was dropped but the underlying files could not be deleted, the job would still succeed. On a re-run, the job would not find any work, so the leftover HDFS files would never be discovered.
   
   To avoid this situation, I have made the following changes:
   - Delete the underlying files first, and only then try to drop the Hive partition
   - Throw the exceptions and fail the job rather than marking the job status as successful
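
The ordering described above can be illustrated with a minimal sketch. This is not the code from the PR and does not use Gobblin's or Hive's real APIs: `FileDeleter`, `PartitionDropper`, `cleanPartition`, and the partition name and path are all hypothetical stand-ins chosen for illustration. The point is only that the file delete runs first and any exception propagates, so a failed delete leaves the partition registered for the next run to find.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class RetentionOrderSketch {

  /** Hypothetical stand-in for the file system call that removes a partition's data. */
  interface FileDeleter {
    void delete(String location) throws IOException;
  }

  /** Hypothetical stand-in for the metastore call that drops a partition. */
  interface PartitionDropper {
    void drop(String partitionName) throws IOException;
  }

  /**
   * Deletes the data first, then drops the partition. Exceptions are not
   * swallowed, so the caller (the job) fails if either step fails.
   */
  static void cleanPartition(String partitionName, String location,
      FileDeleter files, PartitionDropper metastore) throws IOException {
    // Step 1: delete the underlying files. If this throws, the partition is
    // still registered, so a re-run can discover it and retry.
    files.delete(location);
    // Step 2: drop the partition only after the files are gone.
    metastore.drop(partitionName);
  }

  public static void main(String[] args) {
    List<String> dropped = new ArrayList<>();
    // Simulate a file system that refuses the delete.
    FileDeleter failing = location -> {
      throw new IOException("could not delete " + location);
    };
    try {
      cleanPartition("datepartition=2023-04-01", "/data/db/tbl/2023-04-01",
          failing, dropped::add);
    } catch (IOException e) {
      System.out.println("Job failed: " + e.getMessage());
    }
    // The partition was never dropped, so it remains visible to the next run.
    System.out.println("Partitions dropped: " + dropped);
  }
}
```

With the old order (drop, then delete), the same simulated failure would leave the files on disk with no partition pointing at them, which is the situation this change is meant to prevent.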
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 858755)
    Remaining Estimate: 0h
            Time Spent: 10m

> Hive retention job should fail if deleting underlying files fail
> ----------------------------------------------------------------
>
>                 Key: GOBBLIN-1825
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1825
>             Project: Apache Gobblin
>          Issue Type: New Feature
>            Reporter: Meeth Gala
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)