Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/01/15 10:22:00 UTC
[jira] [Work logged] (HIVE-24629) Invoke optional output committer in TezProcessor

     [ https://issues.apache.org/jira/browse/HIVE-24629?focusedWorklogId=536405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-536405 ]

ASF GitHub Bot logged work on HIVE-24629:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Jan/21 10:21
            Start Date: 15/Jan/21 10:21
    Worklog Time Spent: 10m 
      Work Description: abstractdog commented on a change in pull request #1857:
URL: https://github.com/apache/hive/pull/1857#discussion_r558200073



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java
##########
@@ -1600,9 +1601,14 @@ public Vertex createVertex(JobConf conf, BaseWork workUnit, Path scratchDir,
     // final vertices need to have at least one output
     boolean endVertex = tezWork.getLeaves().contains(workUnit);
     if (endVertex) {
+      OutputCommitterDescriptor ocd = null;
+      if (conf.get("hive.tez.mapreduce.output.committer.class") != null
+          && conf.get("mapred.output.committer.class") != null) {

Review comment:
       why is the presence of "mapred.output.committer.class" needed?
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 536405)
    Remaining Estimate: 0h
            Time Spent: 10m

> Invoke optional output committer in TezProcessor
> ------------------------------------------------
>
>                 Key: HIVE-24629
>                 URL: https://issues.apache.org/jira/browse/HIVE-24629
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In order to enable Hive to write to Iceberg tables, we need to use an output committer which will fire at the end of each Tez task execution (commitTask) and after the execution of each vertex (commitOutput/commitJob). This output committer will issue a commit containing the written-out data files to the Iceberg table, replacing its previous snapshot pointer with a new one.
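
The two-phase commit described above can be illustrated with a small, self-contained Java sketch. It is not the code from the pull request and uses neither the Hadoop OutputCommitter API nor the Iceberg API: the classes CommitSketch, Table and SketchCommitter, and the file names, are hypothetical stand-ins chosen for illustration. The sketch only assumes that each task reports the data files it wrote at task commit, and that a single job-level commit then swaps the table's snapshot pointer to a new snapshot containing those files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class CommitSketch {

    // Hypothetical stand-in for a table whose visible state is one snapshot pointer.
    static final class Table {
        private final AtomicReference<List<String>> snapshot =
            new AtomicReference<>(List.of());

        List<String> currentSnapshot() {
            return snapshot.get();
        }

        // Swap the pointer only if nobody else committed in the meantime.
        void replaceSnapshot(List<String> expected, List<String> next) {
            if (!snapshot.compareAndSet(expected, next)) {
                throw new IllegalStateException("concurrent commit detected");
            }
        }
    }

    // Hypothetical committer: collects files per task, publishes them once per job.
    static final class SketchCommitter {
        private final Table table;
        private final List<String> pendingFiles = new ArrayList<>();

        SketchCommitter(Table table) {
            this.table = table;
        }

        // Called at the end of each task: record the files that task wrote.
        synchronized void commitTask(List<String> dataFiles) {
            pendingFiles.addAll(dataFiles);
        }

        // Called once after all tasks of the vertex finish: publish one new snapshot.
        synchronized void commitJob() {
            List<String> base = table.currentSnapshot();
            List<String> next = new ArrayList<>(base);
            next.addAll(pendingFiles);
            table.replaceSnapshot(base, List.copyOf(next));
            pendingFiles.clear();
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        SketchCommitter committer = new SketchCommitter(table);

        committer.commitTask(List.of("data-0.parquet"));
        committer.commitTask(List.of("data-1.parquet"));
        // The snapshot is still the old one: task commits alone publish nothing.
        System.out.println(table.currentSnapshot());

        committer.commitJob();
        // Only now does the table point at a snapshot containing both files.
        System.out.println(table.currentSnapshot());
    }
}
```

The point of the split is that readers of the table never observe a partially written result: files from individual tasks become visible only through the single pointer swap at job commit.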



--
This message was sent by Atlassian Jira
(v8.3.4#803005)