Posted to user@hive.apache.org by David Morel <da...@amakuru.net> on 2012/11/30 11:10:47 UTC

Skew join failure

Hi,

I am trying to solve the "last reducer hangs because of GC because of 
truckloads of data" issue that I have on some queries by using 
SET hive.optimize.skewjoin=true. Unfortunately, every time I try this, I 
encounter an error of the form:
...
2012-11-30 10:42:39,181 Stage-10 map = 100%,  reduce = 100%, Cumulative CPU 406984.1 sec
MapReduce Total cumulative CPU time: 4 days 17 hours 3 minutes 4 seconds 100 msec
Ended Job = job_201211281801_0463
java.io.FileNotFoundException: File hdfs://nameservice1/tmp/hive-dmorel/hive_2012-11-30_09-23-00_375_8178040921995939301/-mr-10014/hive_skew_join_bigkeys_0 does not exist.
         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:365)
         at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:96)
         at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
...

Googling didn't turn up anything on how to debug or solve this, so 
I'd be glad to get any pointers on where to start looking.

I'm using CMF4.0 currently, so Hive 0.8.1.

Thanks a lot!

David Morel

Re: Skew join failure

Posted by Mark Grover <gr...@gmail.com>.

Hey David,
Sure thing. Play around with that property's value, see if that makes any
difference.

Also, could you check whether a file with a name like
hive_skew_join_bigkeys exists on HDFS? Perhaps Hive is looking at a
different path. If so, we can figure out how to fix that.

Mark
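
A minimal sketch of the search suggested above, run from the Hive CLI with its built-in dfs command. The directory is the per-user scratch path taken from the error message in the original post; adjust it if your scratch directory differs. Run it while the failing query is still executing, since Hive normally cleans up its scratch directory once the query ends. The -R flag assumes a Hadoop 2 style shell (older releases spell it dfs -lsr).

```sql
-- List the scratch directories of the user's Hive sessions.
dfs -ls /tmp/hive-dmorel/;

-- Walk the tree recursively and look in the output for any
-- entry whose name starts with hive_skew_join_bigkeys.
dfs -ls -R /tmp/hive-dmorel/;
```

If an entry shows up under a different -mr-NNNNN directory than the one in the exception, that would support the "different path" theory.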

On Mon, Dec 3, 2012 at 12:25 PM, David Morel <da...@amakuru.net> wrote:

> On 30 Nov 2012, at 16:46, Mark Grover wrote:
>
>> Hi David, It seems like Hive is unable to find the skewed keys on
>> HDFS. Did you set the hive.skewjoin.key property? If so, to what value?
>>
>
> Hey Mark,
>
> thanks for answering!
>
> I didn't set it to anything, but left it at its default value (100,000
> IIRC). I should probably have set it to a much lower value (I guess?)
> but I fail to understand why not meeting the threshold would break the
> whole thing. I guess I have to inspect the logs more closely? Do you
> have real-life examples of skewjoin parameter settings? The docs are really
> sparse about it...
>
> thanks!
>
> David
>
>
>> Mark

Re: Skew join failure

Posted by David Morel <da...@amakuru.net>.

On 30 Nov 2012, at 16:46, Mark Grover wrote:

> Hi David, It seems like Hive is unable to find the skewed keys on
> HDFS. Did you set the hive.skewjoin.key property? If so, to what value?

Hey Mark,

thanks for answering!

I didn't set it to anything, but left it at its default value (100,000
IIRC). I should probably have set it to a much lower value (I guess?)
but I fail to understand why not meeting the threshold would break the
whole thing. I guess I have to inspect the logs more closely? Do you
have real-life examples of skewjoin parameter settings? The docs are really
sparse about it...

thanks!

David
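
For reference, a hedged sketch of the skew join settings discussed in this thread. The property names are existing Hive settings; the lowered threshold is purely illustrative, and big_table and join_key are hypothetical names. Nothing here is confirmed to resolve the FileNotFoundException above. It only shows which knobs exist and one way to check whether any key actually crosses the threshold.

```sql
-- Turn on runtime skew join handling (the setting from the original post).
SET hive.optimize.skewjoin=true;

-- A join key is treated as skewed once more than this many rows share it.
-- The default is 100000; 50000 is an illustrative value, not a recommendation.
SET hive.skewjoin.key=50000;

-- Tuning for the follow-up map join that processes the skewed keys
-- (the values shown are the defaults).
SET hive.skewjoin.mapjoin.map.tasks=10000;
SET hive.skewjoin.mapjoin.min.split=33554432;

-- Hypothetical check: which join keys carry the most rows?
SELECT join_key, COUNT(*) AS cnt
FROM big_table
GROUP BY join_key
ORDER BY cnt DESC
LIMIT 20;
```

Comparing the largest cnt values against hive.skewjoin.key shows whether any key would be handled as skewed at all.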


Re: Skew join failure

Posted by Mark Grover <gr...@gmail.com>.

Hi David,
It seems like Hive is unable to find the skewed keys on HDFS.
Did you set the hive.skewjoin.key property? If so, to what value?

Mark
