Posted to user@hive.apache.org by Павел Мезенцев <pa...@mezentsev.org> on 2013/02/21 12:44:09 UTC

Hive 0.7.1 Query hangs

Hello!

I use Hive 0.7.1 over Hadoop 0.20.2 (CDH3u3) on a 70-node cluster.
I have trouble with a query like this:

FROM (
  SELECT id, {expressions} FROM table1
  WHERE day='2013-02-16' AND ({conditions1})
  UNION ALL
  SELECT id, {expressions} FROM table2
  WHERE day='2013-02-16' AND (conditions)
  UNION ALL
  SELECT id, {expressions} FROM table3
  WHERE day='2013-02-16' AND (conditions)
  UNION ALL
  SELECT id, {expressions} FROM table4
  WHERE day='2013-02-16' AND (conditions)
) union_tmp
INSERT OVERWRITE TABLE result_table PARTITION (day='2013-02-16')
SELECT id, transformations (expressions)
GROUP BY id;

It had 4865 map tasks and 100 reduce tasks.
The first 4780 map tasks completed successfully and the last
85 tasks hang.

All of these tasks hang with no progress, and no task attempts are listed for any of them.

One hour after the hang started, the job failed with an exception:

2013-02-20 20:02:02,000 Stage-1 map = 0%,  reduce = 0%
2013-02-20 20:02:40,679 Stage-1 map = 1%,  reduce = 0%
2013-02-20 20:02:54,022 Stage-1 map = 2%,  reduce = 0%
2013-02-20 20:03:14,129 Stage-1 map = 3%,  reduce = 0%
......................................................
2013-02-20 21:18:00,361 Stage-1 map = 98%,  reduce = 22%
2013-02-20 21:18:05,691 Stage-1 map = 98%,  reduce = 23%
java.io.IOException: Call to statlabjt/10.6.0.55:8021 failed on local
exception: java.io.EOFException
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
	at org.apache.hadoop.ipc.Client.call(Client.java:1110)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
	at org.apache.hadoop.mapred.$Proxy8.getJobStatus(Unknown Source)
	at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:1053)
	at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:1065)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.progress(ExecDriver.java:351)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:672)
	at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:425)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:815)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:724)
Ended Job = job_201302152355_4764 with exception
'java.io.IOException(Call to statlabjt/10.6.0.55:8021 failed on local
exception: java.io.EOFException)'
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MapRedTask
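
A minimal sketch, assuming the console output above has been saved as text, for pulling out the progress lines and finding when the map percentage last advanced. The function names are hypothetical; only the line format is taken from the log above.

```python
import re
from datetime import datetime

# Matches progress lines such as:
# 2013-02-20 21:18:05,691 Stage-1 map = 98%,  reduce = 23%
PROGRESS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ (Stage-\d+) "
    r"map = (\d+)%,\s+reduce = (\d+)%"
)


def parse_progress(lines):
    """Yield (timestamp, stage, map_pct, reduce_pct) for each progress line."""
    for line in lines:
        m = PROGRESS_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            yield ts, m.group(2), int(m.group(3)), int(m.group(4))


def last_map_advance(lines):
    """Return (timestamp, map_pct) of the last line where map progress rose."""
    last_ts, last_pct = None, None
    for ts, _stage, map_pct, _reduce_pct in parse_progress(lines):
        if last_pct is None or map_pct > last_pct:
            last_ts, last_pct = ts, map_pct
    return last_ts, last_pct
```

Comparing the returned timestamp with the time of the exception gives a rough idea of how long the map phase sat without progress before the client call failed. Lines that are not progress lines (the stack trace, for instance) are skipped.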

How can I find the reasons for such a situation?
How can I prevent such situations in the future?

Best regards
Mezentsev Pavel

Re: Hive 0.7.1 Query hangs

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi sir,
the root cause of your issue seems to be the java.io.EOFException, which according to its Javadoc description means the following:

  "Signals that an end of file or end of stream has been reached unexpectedly during input."

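What is the health status of the box with IP 10.6.0.55? Is it by any chance having issues? The failing call in the stack trace is JobClient.getJob, which talks to the JobTracker, so port 8021 appears to be the JobTracker RPC port rather than a TaskTracker port; I would start by connecting to that box and checking the JobTracker logs.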
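
As a first check before logging in to that box, a minimal sketch of a TCP reachability probe. The function name port_is_open is hypothetical, and a successful connection only shows that something accepts connections on the port, not that the daemon behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

With the host and port from the exception message, the call would be port_is_open("statlabjt", 8021). If it returns False from the machine running the Hive CLI, the problem is likely with the daemon or the network rather than with the query itself.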

Jarcec
