Posted to issues@hive.apache.org by "Janaki Lahorani (JIRA)" <ji...@apache.org> on 2018/03/01 19:58:00 UTC

[jira] [Assigned] (HIVE-17941) Don't Re-Create RunningJob Client During Status Checks

     [ https://issues.apache.org/jira/browse/HIVE-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Janaki Lahorani reassigned HIVE-17941:
--------------------------------------

    Assignee: Janaki Lahorani

> Don't Re-Create RunningJob Client During Status Checks
> ------------------------------------------------------
>
>                 Key: HIVE-17941
>                 URL: https://issues.apache.org/jira/browse/HIVE-17941
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 2.3.1
>            Reporter: BELUGA BEHR
>            Assignee: Janaki Lahorani
>            Priority: Major
>
> {code:java|title=org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper}
> while (!rj.isComplete()) {
>   ...
>         RunningJob newRj = jc.getJob(rj.getID());
>         if (newRj == null) {
>           // under exceptional load, hadoop may not be able to look up status
>           // of finished jobs (because it has purged them from memory). From
>           // hive's perspective - it's equivalent to the job having failed.
>           // So raise a meaningful exception
>           throw new IOException("Could not find status of job:" + rj.getID());
>         } else {
>           th.setRunningJob(newRj);
>           rj = newRj;
>         }
>       }
>   ...
> }
> {code}
> https://github.com/apache/hive/blob/a9f25c0e7ad3f81a9f00f601947a161516e33f1b/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java#L295-L306
> Every time we loop here for a status update, we are rebuilding the RunningJob object to test if the Job information is still loaded in YARN.  Rebuilding this RunningJob object is not trivial because it requires that we re-load and parse the Job Configuration XML file every time.
> {code:java|title=Outdated Stacktrace But Same Idea Holds}
> at java.io.FileInputStream.open(Native Method)
>         at java.io.FileInputStream.<init>(FileInputStream.java:120)
>         at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1924)
>         at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1877)
>         at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
>         at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
>         at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1951)
>         at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:398)
>         at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:388)
>         at org.apache.hadoop.mapred.JobClient$NetworkedJob.<init>(JobClient.java:174)
>         at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:655)
>         at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:668)
>         at org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:282)
>         at org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:532)
> {code}
> Maybe we can use {{isRetired()}} instead for this particular check.  We also probably need to be better about checking the return values of the {{RunningJob}} methods, since they can fail or go away at any time if YARN purges the job information.  It seems that perhaps this was an attempt to detect a purged job before exercising the {{RunningJob}} object... even though it can go bad at any point.
> https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/RunningJob.html
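
A minimal sketch of the idea suggested above, not the actual Hive change: the polling loop keeps one job handle for its whole lifetime and asks it whether the job has been retired, instead of calling jc.getJob() on every iteration. To keep the example runnable without Hadoop on the classpath, JobHandle is a hypothetical stand-in for the three RunningJob methods involved (getID, isComplete, isRetired), and FakeJob, pollUntilComplete and pollIntervalMs are illustrative names only. Whether isRetired() reliably reports a job that YARN has purged is an assumption here and would need to be verified against the Hadoop version in use.

```java
import java.io.IOException;

class StatusPollSketch {

  /** Hypothetical stand-in for the subset of org.apache.hadoop.mapred.RunningJob used below. */
  interface JobHandle {
    String getID();
    boolean isComplete() throws IOException;
    boolean isRetired() throws IOException;
  }

  /**
   * Polls the same handle until the job completes and returns the number of
   * status checks made while the job was still running. The handle is never
   * rebuilt, so no job configuration is re-parsed per iteration.
   */
  static int pollUntilComplete(JobHandle rj, long pollIntervalMs)
      throws IOException, InterruptedException {
    int polls = 0;
    while (!rj.isComplete()) {
      polls++;
      // Assumption: a retired job is one whose status can no longer be looked up,
      // which the original loop treated as equivalent to a failed job.
      if (rj.isRetired()) {
        throw new IOException("Could not find status of job:" + rj.getID());
      }
      Thread.sleep(pollIntervalMs);
    }
    return polls;
  }

  /** Illustrative fake: reports completion after a fixed number of isComplete() calls. */
  static final class FakeJob implements JobHandle {
    private final int pollsUntilDone;
    private final boolean retired;
    private int calls = 0;

    FakeJob(int pollsUntilDone, boolean retired) {
      this.pollsUntilDone = pollsUntilDone;
      this.retired = retired;
    }

    @Override public String getID() { return "job_fake_0001"; }
    @Override public boolean isComplete() { return calls++ >= pollsUntilDone; }
    @Override public boolean isRetired() { return retired; }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("polls=" + pollUntilComplete(new FakeJob(3, false), 0L));
    try {
      pollUntilComplete(new FakeJob(1, true), 0L);
    } catch (IOException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

In HadoopJobExecHelper the same shape would apply to the existing rj variable directly. Every other RunningJob call inside the loop would still need its own handling, since the job information can disappear between any two calls.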



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)