You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Cheng Su <sc...@gmail.com> on 2012/11/16 10:00:12 UTC

Why hadoop job pending for so long?

Hi, all.

I am running some hive queries. I use hive 0.9.0 + hadoop 0.20.205.0

But some of them are pending for quite a long time, say 30min or even longer.
I have enough free map slots.

In the jobtracker logs:

    2012-11-16 15:00:26,686 WARN org.apache.hadoop.conf.Configuration:
/data/local/mapred/local/jobTracker/job_201211151751_0015.xml:a
attempt to override final parameter: mapred.local.dir;  Ignoring.
    2012-11-16 15:00:26,687 WARN org.apache.hadoop.conf.Configuration:
/data/local/mapred/local/jobTracker/job_201211151751_0015.xml:a
attempt to override final parameter: mapred.job.tracker;  Ignoring.
    2012-11-16 15:00:26,687 INFO
org.apache.hadoop.mapred.JobInProgress: job_201211151751_0015: nMaps=1
nReduces=1 max=-1
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.JobTracker:
Job job_201211151751_0015 added successfully for user 'hadoop' to
queue 'default'
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.AuditLogger:
USER=hadoop  IP=192.168.1.80 OPERATION=SUBMIT_JOB
TARGET=job_201211151751_0015    RESULT=SUCCESS
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.JobTracker:
Initializing job_201211151751_0015
    2012-11-16 15:00:26,687 INFO
org.apache.hadoop.mapred.JobInProgress: Initializing
job_201211151751_0015
    2012-11-16 15:00:26,779 INFO
org.apache.hadoop.mapred.JobInProgress: jobToken generated and stored
with users keys in
/data/local/mapred/system/job_201211151751_0015/jobToken
    2012-11-16 15:00:26,781 INFO
org.apache.hadoop.mapred.JobInProgress: Input size for job
job_201211151751_0015 = 0. Number of splits = 1
    2012-11-16 15:00:26,781 INFO
org.apache.hadoop.mapred.JobInProgress:
tip:task_201211151751_0015_m_000000 has split on
node:/default-rack/hadoop02
    2012-11-16 15:00:26,782 INFO
org.apache.hadoop.mapred.JobInProgress: job_201211151751_0015
LOCALITY_WAIT_FACTOR=0.33333334
    2012-11-16 15:00:26,782 INFO
org.apache.hadoop.mapred.JobInProgress: Job job_201211151751_0015
initialized successfully with 1 map tasks and 1 reduce tasks.
    2012-11-16 15:00:27,840 INFO org.apache.hadoop.mapred.JobTracker:
Adding task (JOB_SETUP) 'attempt_201211151751_0015_m_000002_0' to tip
task_201211151751_0015_m_000002, for tracker
'tracker_hadoop02:localhost/127.0.0.1:29308'
    2012-11-16 15:00:42,852 INFO
org.apache.hadoop.mapred.JobInProgress: Task
'attempt_201211151751_0015_m_000002_0' has completed
task_201211151751_0015_m_000002 successfully.

In the tasktracker logs:

    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction (registerTask): attempt_201211151751_0016_m_000002_0
task's state:UNASSIGNED
    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
Trying to launch : attempt_201211151751_0016_m_000002_0 which needs 1
slots
    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
In TaskLauncher, current free slots : 9 and trying to launch
attempt_201211151751_0016_m_000002_0 which needs 1 slots
    2012-11-16 15:19:09,185 INFO org.apache.hadoop.mapred.JvmManager:
In JvmRunner constructed JVM ID: jvm_201211151751_0016_m_-1007622750
    2012-11-16 15:19:09,186 INFO org.apache.hadoop.mapred.JvmManager:
JVM Runner jvm_201211151751_0016_m_-1007622750 spawned.
    2012-11-16 15:19:09,187 INFO
org.apache.hadoop.mapred.TaskController: Writing commands to
/data/local/mapred/local/ttprivate/taskTracker/hadoop/jobcache/job_201211151751_0016/attempt_201211151751_0016_m_000002_0/taskjvm.sh
    2012-11-16 15:19:10,545 INFO org.apache.hadoop.mapred.TaskTracker:
JVM with ID: jvm_201211151751_0016_m_-1007622750 given task:
attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211151751_0016_m_000002_0 0.0% setup
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
Task attempt_201211151751_0016_m_000002_0 is done.
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
reported output size for attempt_201211151751_0016_m_000002_0  was -1
    2012-11-16 15:19:15,069 INFO org.apache.hadoop.mapred.JvmManager:
JVM : jvm_201211151751_0016_m_-1007622750 exited with exit code 0.
Number of tasks it ran: 1
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.TaskTracker:
Received KillTaskAction for task: attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.TaskTracker:
About to purge task: attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.IndexCache:
Map ID attempt_201211151751_0016_m_000002_0 not found in cache


What's wrong with the job?


-- 

Regards,
Cheng Su

Re: Why hadoop job pending for so long?

Posted by Bejoy KS <be...@yahoo.com>.
Hi Chen

Hope you ensured that you have enough free slots in your queue/pool if you are using fair/capacity scheduler in your cluster.

Some times the job initialization would take some time if there are larger number of partitions and lots of small input files in them.

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Cheng Su <sc...@gmail.com>
Date: Fri, 16 Nov 2012 17:00:12 
To: <us...@hadoop.apache.org>; <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Why hadoop job pending for so long?

Hi, all.

I am running some hive queries. I use hive 0.9.0 + hadoop 0.20.205.0

But some of them are pending for quite a long time, say 30min or even longer.
I have enough free map slots.

In the jobtracker logs:

    2012-11-16 15:00:26,686 WARN org.apache.hadoop.conf.Configuration:
/data/local/mapred/local/jobTracker/job_201211151751_0015.xml:a
attempt to override final parameter: mapred.local.dir;  Ignoring.
    2012-11-16 15:00:26,687 WARN org.apache.hadoop.conf.Configuration:
/data/local/mapred/local/jobTracker/job_201211151751_0015.xml:a
attempt to override final parameter: mapred.job.tracker;  Ignoring.
    2012-11-16 15:00:26,687 INFO
org.apache.hadoop.mapred.JobInProgress: job_201211151751_0015: nMaps=1
nReduces=1 max=-1
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.JobTracker:
Job job_201211151751_0015 added successfully for user 'hadoop' to
queue 'default'
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.AuditLogger:
USER=hadoop  IP=192.168.1.80 OPERATION=SUBMIT_JOB
TARGET=job_201211151751_0015    RESULT=SUCCESS
    2012-11-16 15:00:26,687 INFO org.apache.hadoop.mapred.JobTracker:
Initializing job_201211151751_0015
    2012-11-16 15:00:26,687 INFO
org.apache.hadoop.mapred.JobInProgress: Initializing
job_201211151751_0015
    2012-11-16 15:00:26,779 INFO
org.apache.hadoop.mapred.JobInProgress: jobToken generated and stored
with users keys in
/data/local/mapred/system/job_201211151751_0015/jobToken
    2012-11-16 15:00:26,781 INFO
org.apache.hadoop.mapred.JobInProgress: Input size for job
job_201211151751_0015 = 0. Number of splits = 1
    2012-11-16 15:00:26,781 INFO
org.apache.hadoop.mapred.JobInProgress:
tip:task_201211151751_0015_m_000000 has split on
node:/default-rack/hadoop02
    2012-11-16 15:00:26,782 INFO
org.apache.hadoop.mapred.JobInProgress: job_201211151751_0015
LOCALITY_WAIT_FACTOR=0.33333334
    2012-11-16 15:00:26,782 INFO
org.apache.hadoop.mapred.JobInProgress: Job job_201211151751_0015
initialized successfully with 1 map tasks and 1 reduce tasks.
    2012-11-16 15:00:27,840 INFO org.apache.hadoop.mapred.JobTracker:
Adding task (JOB_SETUP) 'attempt_201211151751_0015_m_000002_0' to tip
task_201211151751_0015_m_000002, for tracker
'tracker_hadoop02:localhost/127.0.0.1:29308'
    2012-11-16 15:00:42,852 INFO
org.apache.hadoop.mapred.JobInProgress: Task
'attempt_201211151751_0015_m_000002_0' has completed
task_201211151751_0015_m_000002 successfully.

In the tasktracker logs:

    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction (registerTask): attempt_201211151751_0016_m_000002_0
task's state:UNASSIGNED
    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
Trying to launch : attempt_201211151751_0016_m_000002_0 which needs 1
slots
    2012-11-16 15:18:59,740 INFO org.apache.hadoop.mapred.TaskTracker:
In TaskLauncher, current free slots : 9 and trying to launch
attempt_201211151751_0016_m_000002_0 which needs 1 slots
    2012-11-16 15:19:09,185 INFO org.apache.hadoop.mapred.JvmManager:
In JvmRunner constructed JVM ID: jvm_201211151751_0016_m_-1007622750
    2012-11-16 15:19:09,186 INFO org.apache.hadoop.mapred.JvmManager:
JVM Runner jvm_201211151751_0016_m_-1007622750 spawned.
    2012-11-16 15:19:09,187 INFO
org.apache.hadoop.mapred.TaskController: Writing commands to
/data/local/mapred/local/ttprivate/taskTracker/hadoop/jobcache/job_201211151751_0016/attempt_201211151751_0016_m_000002_0/taskjvm.sh
    2012-11-16 15:19:10,545 INFO org.apache.hadoop.mapred.TaskTracker:
JVM with ID: jvm_201211151751_0016_m_-1007622750 given task:
attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211151751_0016_m_000002_0 0.0% setup
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
Task attempt_201211151751_0016_m_000002_0 is done.
    2012-11-16 15:19:14,746 INFO org.apache.hadoop.mapred.TaskTracker:
reported output size for attempt_201211151751_0016_m_000002_0  was -1
    2012-11-16 15:19:15,069 INFO org.apache.hadoop.mapred.JvmManager:
JVM : jvm_201211151751_0016_m_-1007622750 exited with exit code 0.
Number of tasks it ran: 1
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.TaskTracker:
Received KillTaskAction for task: attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.TaskTracker:
About to purge task: attempt_201211151751_0016_m_000002_0
    2012-11-16 15:19:17,747 INFO org.apache.hadoop.mapred.IndexCache:
Map ID attempt_201211151751_0016_m_000002_0 not found in cache


What's wrong with the job?


-- 

Regards,
Cheng Su