Posted to user@hive.apache.org by Ayon Sinha <ay...@yahoo.com> on 2011/11/17 03:03:23 UTC

Severely hit by "curse of last reducer"

Hi,
Where do I find a log of which reducer key is causing the last reducer to run for hours? The reducer logs don't say much about the key it's processing. Is there a way to enable a debug mode that would log the key it's processing?

My query looks like:

select partner_name, dates, sum(coins_granted) from table1 u join table2 p on u.partner_id=p.id group by partner_name, dates


The uncompressed size of table1 is about 30 GB.
 
-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.

Re: Severely hit by "curse of last reducer"

Posted by Ayon Sinha <ay...@yahoo.com>.
Skew join did seem to work, but I'm thinking other strategies would work better, like partitioning the table and/or changing the query. I got distracted with other things, though.
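For what it's worth, here is a sketch of the kind of query change I have in mind (untested, and it assumes p.id is unique per partner): pre-aggregating table1 by (partner_id, dates) collapses the skewed key before the join, so the join sees at most one row per key per day.

-- pre-aggregate the big table, then join the small one
select p.partner_name, t.dates, sum(t.coins)
from (
  select partner_id, dates, sum(coins_granted) as coins
  from table1
  group by partner_id, dates
) t
join table2 p on t.partner_id = p.id
group by p.partner_name, t.dates;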
 
-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.




Re: Severely hit by "curse of last reducer"

Posted by Mohit Gupta <su...@gmail.com>.
Hi Ayon,
Were you able to solve this issue? I am facing the same problem. The last
reducer of my query has been running for more than 2 hours now.

Thanks
Mohit


Re: Severely hit by "curse of last reducer"

Posted by Mark Grover <mg...@oanda.com>.
Rohan,
I took a look at the source code and wanted to share a couple of things:
1) Make sure the following 2 properties are being set to true (they are false by default):
hive.optimize.skewjoin
hive.auto.convert.join

2) The Hive source code that is causing the exception is:
        String path = entry.getKey();
        Path dirPath = new Path(path);
        FileSystem inpFs = dirPath.getFileSystem(conf);
        FileStatus[] fstatus = inpFs.listStatus(dirPath);
        if (fstatus.length > 0) {

It's the last line that throws the exception. Looking at the above code and the Hadoop source code (in particular, FileSystem.java, Path.java and PathFilter.java, all under org.apache.hadoop.fs), it seems the file status listing for the directory path being provided to the job is coming back bad. That could happen because the expected directory path is actually a file, or because the directory does not exist or is empty. So, when the job fails, check whether the path under consideration exists.

I don't know if it's a bug in the code. If so, perhaps we should be checking for both fstatus != null and fstatus.length > 0. If you are zealous, you can try making that change and recompiling your Hive, or alternatively implementing your own ConditionalResolverSkewJoin1 (which has the same implementation as ConditionalResolverSkewJoin but with this extra check) and plugging this new class in.
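As a sketch, here is my untested reading of what that guarded version would look like (not a committed Hive patch):

        String path = entry.getKey();
        Path dirPath = new Path(path);
        FileSystem inpFs = dirPath.getFileSystem(conf);
        // listStatus() may return null when the path does not exist,
        // so guard against null before checking the length
        FileStatus[] fstatus = inpFs.listStatus(dirPath);
        if (fstatus != null && fstatus.length > 0) {
          // ... resolve the conditional task as before
        }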

Sorry for the long-winded answer,
Mark


Re: Severely hit by "curse of last reducer"

Posted by rohan monga <mo...@gmail.com>.
Hi Mark,
Apologies for the thin details on the query :)
Here is the error log: http://pastebin.com/pqxh4d1u (the job tracker
doesn't show any errors).
I am using hive-0.7. I did set a threshold for the query, and sadly I
couldn't find any more documentation on skew joins other than the wiki.

Thanks,
--
Rohan Monga




Re: Severely hit by "curse of last reducer"

Posted by Mark Grover <mg...@oanda.com>.
Rohan,
The short answer is: I don't know :-) If you could paste the log, I or someone else on the mailing list may be able to help.

BTW, what version of Hive are you using? Did you set the threshold before running the query? Try to find some documentation online about which properties need to be set for a skew join; my understanding was that the 2 properties I mentioned below should suffice.

Mark


Re: Severely hit by "curse of last reducer"

Posted by rohan monga <mo...@gmail.com>.
Hi Mark,
I have tried setting hive.optimize.skewjoin=true, but I get a
NullPointerException after the first stage of the query completes.
Why does that happen?

Thanks,
--
Rohan Monga




Re: Severely hit by "curse of last reducer"

Posted by Mark Grover <mg...@oanda.com>.
Ayon,
I see. From what you explained, skew join seems like what you want. Have you tried that already?

Details on how skew join works are in this presentation. Jump to the 15-minute mark if you want to listen just to the part about skew joins.
http://www.youtube.com/watch?v=OB4H3Yt5VWM

I bet you could also find something related to skew join in the mailing list archives.

In a nutshell (from the video),
set hive.optimize.skewjoin=true
set hive.skewjoin.key=<Threshold>

should do the trick for you. The threshold, I believe, is the number of records per key beyond which you consider the key large enough to defer till later.
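Applied to the query at the top of this thread, that would look something like the following (the threshold of 100000 is only an illustrative guess, not a recommendation):

set hive.optimize.skewjoin=true;
set hive.skewjoin.key=100000;

-- keys with more rows than hive.skewjoin.key on the big side are set
-- aside during the join and handled in a follow-up map join job
select partner_name, dates, sum(coins_granted)
from table1 u join table2 p on u.partner_id = p.id
group by partner_name, dates;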

Good luck!
Mark




Re: Severely hit by "curse of last reducer"

Posted by Ayon Sinha <ay...@yahoo.com>.
Only one reducer is always stuck. My table2 is small, but using a map join makes my mappers run out of memory. My max reducers is 32 (also the max reduce capacity). I tried setting the number of reducers to a higher number (even 6000, which is approximately the number of date & name combinations I have), only to get lots of reducers with no data.
So I am quite sure it is some key in stage-1 that is causing this.
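One sanity check (a sketch, reusing the column names from my query) would be to count rows per join key in the big table and see which partner_id dominates:

-- the top few counts should point at the skewed key
select partner_id, count(*) as cnt
from table1
group by partner_id
order by cnt desc
limit 20;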
 
-Ayon
See My Photos on Flickr
Also check out my Blog for answers to commonly asked questions.




Re: Severely hit by "curse of last reducer"

Posted by Mark Grover <mg...@oanda.com>.
Hi Ayon,
Is it one particular reduce task that is slow, or the entire reduce phase? How many reduce tasks did you have, anyway?

Looking into what the reducer key was might only make sense if a particular reduce task was slow.

If your table2 is small enough to fit in memory, you might want to try a map join.
More details at:
http://www.facebook.com/note.php?note_id=470667928919
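For example, something like this (a sketch; it assumes table2 really does fit in each mapper's memory):

-- the MAPJOIN hint loads table2 (alias p) into memory and joins map-side;
-- the group by still runs a reduce, but without the skewed join key
select /*+ MAPJOIN(p) */ partner_name, dates, sum(coins_granted)
from table1 u join table2 p on u.partner_id = p.id
group by partner_name, dates;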

Let me know what you find.

Mark
