Posted to dev@pig.apache.org by Gang Luo <lg...@yahoo.com.cn> on 2010/06/16 17:36:48 UTC

skew join in pig

Hi,
there is something that confuses me about the skew join (http://wiki.apache.org/pig/PigSkewedJoinSpec):
1. Does the sampling job sample and build a histogram on both tables, or just one table (and in that case, which one)?
2. The join job still takes the two tables as inputs, shuffles tuples from the partitioned table to a particular reducer (one tuple to one reducer), and shuffles tuples from the streamed table to all reducers associated with one partition (one tuple to multiple reducers). Is that correct?
3. Hot keys need more than one reducer. Are these reducers dedicated to that key only, or could they also take other keys at the same time?
4. For non-hot keys, my understanding is that they are shuffled to reducers by the default hash partitioner. However, it could happen that all the keys shuffled to one reducer together cause skew even though none of them is skewed individually.

Can someone give me some ideas on these? Thanks.

-Gang


      

Re: skew join in pig

Posted by Gang Luo <lg...@yahoo.com.cn>.
Thanks Dmitriy. I haven't seen any performance issue yet; I'm asking this to compare skew join strategies. I'd like to share what I learn once I have a clearer picture.

Thanks,
-Gang







      

Re: skew join in pig

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
It's just whatever the hash function happens to do. By the time the "hot"
keys are slotted to be spread among multiple reducers, they are no longer
hot, so it doesn't matter if you put a few of the partitions in the same
reducer. Remember, we mostly care about things we have to keep in memory.
Since values for different keys don't have to be held in memory at the same
time, dropping them on the same reducer is not too big a deal.

Are you seeing slowness in the skew join that you are trying to fix? Can you
share the details of your specific use case and cluster setup?

-Dmitriy
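
To make that concrete, here is a minimal sketch of that kind of slot assignment; the class, the method names, and the random spray are illustrative assumptions, not Pig's actual partitioner code:

import java.util.Random;

// Illustrative sketch only: routes each left-side tuple of a hot key to one
// of the reducer slots reserved for that key. Names and scheme are
// hypothetical, not Pig's actual partitioner.
public class HotKeySlotSketch {
    // Suppose sampling decided this key needs 'slots' reducers, starting at
    // whatever index the hash function picked for the key.
    static int reducerFor(Object key, int slots, int totalReducers, Random rng) {
        int base = (key.hashCode() & Integer.MAX_VALUE) % totalReducers;
        // Spray the key's tuples uniformly over its slots. Slots of
        // different hot keys may land on the same reducer, which is fine
        // because a reducer holds only one key's values in memory at a time.
        return (base + rng.nextInt(slots)) % totalReducers;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // key1 gets 3 slots, key2 gets 4; with 5 reducers total their slot
        // ranges can overlap, matching the point above.
        for (int i = 0; i < 3; i++) {
            System.out.println("key1 tuple -> reducer " + reducerFor("key1", 3, 5, rng));
            System.out.println("key2 tuple -> reducer " + reducerFor("key2", 4, 5, rng));
        }
    }
}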


Re: skew join in pig

Posted by Gang Luo <lg...@yahoo.com.cn>.
OK. I know how many reducers to allocate for one hot key. But how will the reducers allocated for key1 overlap with the reducers for key2? Say key1 needs 3 reducers and is allocated reducers 0, 1, 2, and key2 needs 4 reducers: can we allocate reducers 0, 1, 2, 3 for key2, or 3, 4, 5, 6? What is the idea behind the allocation across all of the hot keys?

Thanks,
-Gang






      

Re: skew join in pig

Posted by Alan Gates <ga...@yahoo-inc.com>.
Are you asking how many reducers are used to split a hot key?  If so,
the answer is as many as we estimate it will take to make the records
for the key fit into memory.  For example, if we have a key which we
estimate has 10 million records, each record being about 100 bytes, and
for each reduce task we have 400M available, then the key holds roughly
10M x 100 bytes = 1G of data, so we will allocate ceil(1G / 400M) = 3
reducers for that hot key.  We do not need to take into account any
other keys sent to these reducers because reducers process rows one key
at a time.

Alan.
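
As a back-of-the-envelope version of that estimate (the names here are hypothetical, not Pig's actual code):

// Sketch of the hot-key reducer estimate described above; hypothetical
// names, not Pig's implementation.
public class HotKeyReducerEstimate {
    static int reducersForKey(long estimatedRecords, long bytesPerRecord,
                              long memoryPerReducerBytes) {
        long totalBytes = estimatedRecords * bytesPerRecord;
        // Round up: each reducer's share of the key must fit in its memory.
        return (int) ((totalBytes + memoryPerReducerBytes - 1) / memoryPerReducerBytes);
    }

    public static void main(String[] args) {
        // The example above: 10M records * 100 bytes = 1G; 400M per reducer -> 3.
        System.out.println(reducersForKey(10_000_000L, 100L, 400L * 1024 * 1024));
    }
}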



Re: skew join in pig

Posted by Gang Luo <lg...@yahoo.com.cn>.
Thanks for replying. It is much clearer now. One more thing to ask about the third question: how are reducers allocated to several hot keys? By hashing? Further, Pig doesn't divide the reducers into hot-key reducers and non-hot-key reducers, is that right?

Thanks,
-Gang




      

Re: skew join in pig

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
On Wed, Jun 16, 2010 at 9:16 AM, Alan Gates <ga...@yahoo-inc.com> wrote:

>> 4. For non-hot keys, my understanding is that they are shuffled to
>> reducers by the default hash partitioner. However, it could happen that
>> all the keys shuffled to one reducer together cause skew even though
>> none of them is skewed individually.
>
> This is always the case in map reduce, though a good hash function should
> minimize the occurrences of this.


Plus this isn't the sort of skew that kills jobs. What we are worried about
with skew join is the amount of memory needed to buffer all records for a
single key -- if that amount is greater than available memory, we start
swapping and things get very bad. If all that's happening is that a
partition has a few more non-skewed keys than other partitions, it will
still be able to make progress perfectly fine, it'll just run a bit longer;
not a big deal in the grand scheme of things.
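
Put differently, the check that matters is per key, not per partition. A minimal sketch of that distinction, with hypothetical names (not Pig's code):

// Sketch of the distinction above; hypothetical names, not Pig's code.
public class SkewCheckSketch {
    // A key is "hot" only if buffering all of its records would exceed the
    // memory available to one reduce task. A partition that merely has a
    // few more ordinary keys than its peers never trips this check.
    static boolean isHotKey(long estimatedBytesForKey, long memoryPerReducerBytes) {
        return estimatedBytesForKey > memoryPerReducerBytes;
    }

    public static void main(String[] args) {
        long mem = 400L * 1024 * 1024;                     // 400M per reduce task
        System.out.println(isHotKey(1_000_000_000L, mem)); // one 1G key -> true
        System.out.println(isHotKey(50_000_000L, mem));    // 50M key -> false
    }
}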

Re: skew join in pig

Posted by Alan Gates <ga...@yahoo-inc.com>.
On Jun 16, 2010, at 8:36 AM, Gang Luo wrote:

> Hi,
> there is something that confuses me about the skew join
> (http://wiki.apache.org/pig/PigSkewedJoinSpec):
> 1. Does the sampling job sample and build a histogram on both tables,
> or just one table (and in that case, which one)?
Just the left one.

> 2. The join job still takes the two tables as inputs, shuffles tuples
> from the partitioned table to a particular reducer (one tuple to one
> reducer), and shuffles tuples from the streamed table to all reducers
> associated with one partition (one tuple to multiple reducers). Is
> that correct?
Keys with small enough values to fit in memory are shuffled to
reducers as normal.  Keys that are too large are split between
reducers on the left side, and replicated to all of those reducers
that have the splits (not all reducers) on the right side.  Does that
answer your question?
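
A rough sketch of that split-and-replicate routing for one hot key (the names and structure are assumptions for illustration, not Pig's actual implementation):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the routing described above for one hot key; hypothetical
// names, not Pig's implementation.
public class SkewedRoutingSketch {
    final int firstSlot;          // first reducer reserved for this hot key
    final int numSlots;           // how many reducers the sampler gave it
    final Random rng = new Random();

    SkewedRoutingSketch(int firstSlot, int numSlots) {
        this.firstSlot = firstSlot;
        this.numSlots = numSlots;
    }

    // Left (split) side: each tuple goes to exactly one of the key's reducers.
    int routeLeft() {
        return firstSlot + rng.nextInt(numSlots);
    }

    // Right (replicated) side: each tuple is copied to every reducer holding
    // a split of the key -- but not to all reducers in the job.
    List<Integer> routeRight() {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numSlots; i++) {
            targets.add(firstSlot + i);
        }
        return targets;
    }

    public static void main(String[] args) {
        SkewedRoutingSketch hotKey = new SkewedRoutingSketch(2, 3); // reducers 2, 3, 4
        System.out.println("left tuple  -> reducer  " + hotKey.routeLeft());
        System.out.println("right tuple -> reducers " + hotKey.routeRight());
    }
}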

> 3. Hot keys need more than one reducer. Are these reducers
> dedicated to this key only? Could they also take other keys at the
> same time?
They take other keys at the same time.

> 4. For non-hot keys, my understanding is that they are shuffled to
> reducers by the default hash partitioner. However, it could happen
> that all the keys shuffled to one reducer together cause skew even
> though none of them is skewed individually.
This is always the case in map reduce, though a good hash function
should minimize the occurrences of this.

>
> Can someone give me some ideas on these? Thanks.
>
> -Gang
>
>
>
Alan.