Posted to user@oozie.apache.org by Phillip Rhodes <pr...@osintegrators.com> on 2014/04/21 21:01:56 UTC

Hive task hangs forever in RUNNING state

Hello Oozie gang:

I am running Oozie 4.0.1 on Elastic MapReduce using the 3.0.4 AMI (Hadoop
2.2.0). I've built Oozie from source, and everything installs and seems to
work correctly, up to the point of scheduling a Hive job. That is, I can
connect to the Web Console, submit and kill jobs using the 'oozie' command,
etc. BUT... when I set up a cron-style coordinator to run a Hive script
automatically on a schedule, I find that the Hive task it spawns hangs in
the RUNNING state and never completes.
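For reference, I'm managing jobs with the standard oozie CLI, along these lines (the server URL below is a placeholder for my actual Oozie endpoint, and <job-id> stands in for the real ID):

```shell
# Submit and start the coordinator from job.properties
oozie job -oozie http://localhost:11000/oozie -config job.properties -run

# Check the status of a submitted job
oozie job -oozie http://localhost:11000/oozie -info <job-id>

# Kill a job
oozie job -oozie http://localhost:11000/oozie -kill <job-id>
```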

My coordinator.xml looks like this:

<?xml version="1.0" ?>
<coordinator-app name="cron-coord" frequency="${coord:minutes(1)}" start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>

and the workflow.xml looks like this:

<?xml version="1.0" ?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>script.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

job.properties has entries like this:

nameNode=hdfs://ip-redacted.internal:9000
jobTracker=ip-redacted.internal:9026

queueName=default
oozieRoot=oozie

oozie.coord.application.path=${nameNode}/user/${user.name}/${oozieRoot}/apps/hive
workflowAppUri=${nameNode}/user/${user.name}/${oozieRoot}/apps/hive

start=2014-04-18T22:11Z
end=2014-04-18T23:59Z

and finally, the script.q file looks like this:

add jar hdfs:///jars/jsonserde.jar;

LOAD DATA INPATH '/inputdata/foo/*.json' INTO TABLE foo;

The table has already been created, and the script runs fine if I execute
it by hand in the Hive shell.

I don't see any messages in the oozie.log file that look particularly
incriminating.
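For the record, this is roughly how I've been inspecting the stuck job with the oozie CLI (again, the server URL and job IDs below are placeholders):

```shell
# List the coordinator's materialized actions and the workflow each one spawned
oozie job -oozie http://localhost:11000/oozie -info <coord-job-id>

# Fetch the Oozie server-side log for the spawned workflow job
oozie job -oozie http://localhost:11000/oozie -log <wf-job-id>
```

The workflow's hive-node action just sits in RUNNING according to -info.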

Any thoughts or advice are much appreciated.



Phil