Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/11/25 15:44:00 UTC

[jira] [Work logged] (HIVE-25561) Killed task should not commit file.

     [ https://issues.apache.org/jira/browse/HIVE-25561?focusedWorklogId=686546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-686546 ]

ASF GitHub Bot logged work on HIVE-25561:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Nov/21 15:43
            Start Date: 25/Nov/21 15:43
    Worklog Time Spent: 10m 
      Work Description: kgyrtkirk commented on pull request #2674:
URL: https://github.com/apache/hive/pull/2674#issuecomment-979313436


   @zhengchenyu : I was about to commit this, but I stopped because of some minor issues.
   Would you mind associating the email address (zheng...@ke.com) used to commit the patch with your GitHub account? Without that, GitHub adds some "Co-authored-by:" stuff.
   May I also ask for your full name? (Alternatively, you could fill out "Full Name" in your Jira or GitHub profile.) However, if you prefer "zhengchenyu001", that's fine as well; just let me know!
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 686546)
    Time Spent: 50m  (was: 40m)

> Killed task should not commit file.
> -----------------------------------
>
>                 Key: HIVE-25561
>                 URL: https://issues.apache.org/jira/browse/HIVE-25561
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 1.2.1, 2.3.8, 2.4.0
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> With the Tez engine in our cluster, I found some duplicate lines, especially when Tez speculation is enabled. In the partition directory, both 000002_0 and 000002_1 exist.
> This is a very low-probability event. HIVE-10429 fixed some bugs related to interrupts, but some exceptions are still not caught.
> In our cluster, a task receives SIGTERM, then ClientFinalizer (a Hadoop class) is called and the HDFS client is closed. An exception is then raised, but abort may not be set to true.
> removeTempOrDuplicateFiles may then fail because of the inconsistency, and the duplicate file is retained.
> (Note: the Driver first lists the directory, then the Task commits its file, then the Driver removes duplicate files. This ordering is the source of the inconsistency.)
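
The ordering described in the report can be illustrated with a small, self-contained Java sketch. It is not Hive's code: AttemptWriter, markAborted, commit, and keepOnePerTask are hypothetical names invented for this illustration, the partition directory is modeled as an in-memory list, and the "first attempt seen wins" rule is an assumption that only stands in for removeTempOrDuplicateFiles. The sketch shows the intended behavior, where a killed attempt always has its abort flag set and therefore never publishes a second file for the same task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KilledTaskCommitSketch {

    /** Hypothetical stand-in for a task attempt's writer; not Hive's actual class. */
    static final class AttemptWriter {
        private final String finalName;          // e.g. "000002_1" (task id, attempt number)
        private volatile boolean abort = false;  // true once the attempt must not commit

        AttemptWriter(String finalName) {
            this.finalName = finalName;
        }

        /** Assumed to be called when the attempt is killed or hits any I/O error. */
        void markAborted() {
            abort = true;
        }

        /** Publishes the output file only if the attempt was never aborted. */
        boolean commit(List<String> partitionDir) {
            if (abort) {
                return false; // a killed attempt leaves nothing in the partition directory
            }
            partitionDir.add(finalName);
            return true;
        }
    }

    /** Keeps one file per task id; a simplified stand-in for duplicate removal. */
    static List<String> keepOnePerTask(List<String> listing) {
        Map<String, String> chosen = new TreeMap<>();
        for (String name : listing) {
            String taskId = name.substring(0, name.indexOf('_'));
            chosen.putIfAbsent(taskId, name); // first attempt seen wins in this sketch
        }
        return new ArrayList<>(chosen.values());
    }

    public static void main(String[] args) {
        List<String> partitionDir = new ArrayList<>();
        new AttemptWriter("000002_0").commit(partitionDir);

        // The driver lists the directory before the speculative attempt finishes.
        List<String> staleListing = new ArrayList<>(partitionDir);

        // The speculative attempt is killed; the flag is set, so commit is a no-op.
        AttemptWriter killed = new AttemptWriter("000002_1");
        killed.markAborted();
        killed.commit(partitionDir);

        System.out.println("driver keeps: " + keepOnePerTask(staleListing));
        System.out.println("directory:    " + partitionDir);
    }
}
```

If the markAborted() call is dropped, the second attempt commits 000002_1 after the driver has already taken its listing, so the cleanup pass never sees it and both files remain, which matches the symptom in the report.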


