Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2013/12/21 03:22:09 UTC

[jira] [Updated] (OOZIE-1643) Oozie doesn't parse Hadoop Job Id from the Hive action

     [ https://issues.apache.org/jira/browse/OOZIE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated OOZIE-1643:
---------------------------------

    Attachment: OOZIE-1643.patch

The patch simply adds {{--hiveconf}} properties to the {{HiveCLI}} args to point Hive at the log4j files.  I checked on a CDH cluster, which has a newer version of Hive, and it worked; against trunk, I still ran into the same problem where it picked up the file from the {{hive-common}} jar.
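
For reference, here is a rough sketch of the kind of change involved (illustrative only, not the literal patch; the class and file names are made up, and the property names assume Hive's standard {{hive.log4j.file}} and {{hive.exec.log4j.file}} settings):

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only (not the literal patch): point HiveCLI at the log4j files that
// HiveMain writes into the working directory, so Hive does not fall back to the
// hive-log4j.properties bundled in hive-common.jar.
public class HiveCliArgsSketch {
    public static List<String> addLog4jArgs(List<String> arguments) {
        arguments.add("--hiveconf");
        arguments.add("hive.log4j.file=" + new File("hive-log4j.properties").getAbsolutePath());
        arguments.add("--hiveconf");
        arguments.add("hive.exec.log4j.file=" + new File("hive-exec-log4j.properties").getAbsolutePath());
        return arguments;
    }

    public static void main(String[] args) {
        List<String> cliArgs = new ArrayList<String>(Arrays.asList("-f", "script.q"));
        System.out.println(addLog4jArgs(cliArgs));
    }
}
{code}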

I propose we commit this patch now; once we upgrade to a newer Hive, it will start working.  Adding these properties had no impact against Hive 0.10, so there's no reason to wait on upgrading Hive.

> Oozie doesn't parse Hadoop Job Id from the Hive action
> ------------------------------------------------------
>
>                 Key: OOZIE-1643
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1643
>             Project: Oozie
>          Issue Type: Bug
>          Components: action
>    Affects Versions: trunk
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-1643.patch
>
>
> I'm not sure how long this has been going on (possibly for quite a while), but the Hive action isn't able to parse the Ids of the Hadoop jobs launched by Hive.
> The way it's supposed to work is that {{HiveMain}} creates a {{hive-log4j.properties}} file that redirects the output from {{HiveCLI}} to the console (for easy viewing in the launcher), and creates a {{hive-exec-log4j.properties}} that redirects the output from one of the {{hive-exec}} classes to a log file; Oozie then parses that log file for the Hadoop Job Ids.
> What's happening instead is that {{HiveCLI}} is picking up the {{hive-log4j.properties}} file from {{hive-common.jar}}, which makes it log everything to {{stderr}}, so Oozie can't parse the Hadoop Job Ids.
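> For illustration, the job-id scan that the intended flow relies on looks roughly like this: read the per-action hive log and pull out anything that looks like a Hadoop job id (e.g. from the "Starting Job = job_..." / "Ended Job = job_..." lines).  The class name, regex, and file handling below are a sketch, not Oozie's exact code.
> {code:java}
> import java.io.BufferedReader;
> import java.io.FileReader;
> import java.util.LinkedHashSet;
> import java.util.Set;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
>
> // Sketch only: scan a Hive log file for Hadoop job ids of the form job_<timestamp>_<seq>.
> public class HiveJobIdScan {
>     private static final Pattern JOB_ID = Pattern.compile("job_\\d+_\\d+");
>
>     public static Set<String> scan(String logFile) throws Exception {
>         Set<String> ids = new LinkedHashSet<String>();
>         BufferedReader in = new BufferedReader(new FileReader(logFile));
>         try {
>             String line;
>             while ((line = in.readLine()) != null) {
>                 Matcher m = JOB_ID.matcher(line);
>                 while (m.find()) {
>                     ids.add(m.group());
>                 }
>             }
>         } finally {
>             in.close();
>         }
>         return ids;
>     }
> }
> {code}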
> {noformat:title=stdout}
> ...
> <<< Invocation of Hive command completed <<<
>  Hadoop Job IDs executed by Hive: 
> <<< Invocation of Main class completed <<<
> Oozie Launcher, capturing output data:
> =======================
> #
> #Mon Dec 16 16:01:34 PST 2013
> hadoopJobs=
> =======================
> {noformat}
> {noformat:title=stderr}
> Picked up _JAVA_OPTIONS: -Djava.awt.headless=true
> 2013-12-16 16:01:20.884 java[59363:1703] Unable to load realm info from SCDynamicStore
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> Logging initialized using configuration in jar:file:/Users/rkanter/dev/hadoop-1.2.0/dirs/mapred/taskTracker/distcache/-4202506229388278450_-1489127056_2111515407/localhost/user/rkanter/share/lib/lib_20131216160106/hive/hive-common-0.10.0.jar!/hive-log4j.properties
> Hive history file=/tmp/rkanter/hive_job_log_rkanter_201312161601_851054619.txt
> OK
> Time taken: 5.444 seconds
> Total MapReduce jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201312161418_0008, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201312161418_0008
> Kill Command = /Users/rkanter/dev/hadoop-1.2.0/libexec/../bin/hadoop job  -kill job_201312161418_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2013-12-16 16:01:33,409 Stage-1 map = 0%,  reduce = 0%
> 2013-12-16 16:01:34,415 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201312161418_0008
> Ended Job = 1084818925, job is filtered out (removed at runtime).
> Ended Job = -956386500, job is filtered out (removed at runtime).
> Moving data to: hdfs://localhost:8020/tmp/hive-rkanter/hive_2013-12-16_16-01-28_168_4802779111653057155/-ext-10000
> Moving data to: /user/rkanter/examples/output-data/hive
> MapReduce Jobs Launched: 
> Job 0:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 6.284 seconds
> Log file: /Users/rkanter/dev/hadoop-1.2.0/dirs/mapred/taskTracker/rkanter/jobcache/job_201312161418_0007/attempt_201312161418_0007_m_000000_0/work/hive-oozie-job_201312161418_0007.log  not present. Therefore no Hadoop jobids found
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)