Posted to user@hadoop.apache.org by xeonmailinglist-gmail <xe...@gmail.com> on 2015/03/18 10:30:31 UTC

trying to understand HashPartitioner

Hi,

I am trying to understand how |HashPartitioner.java| works, so I ran a
MapReduce job with 5 reducers and 5 input files. I thought that the
output of |getPartition(K2 key, V2 value, int numReduceTasks)| was the
number of the reduce task that will process |K2| and |V2|. Is this correct?




Re: trying to understand HashPartitioner

Posted by 杨浩 <ya...@gmail.com>.

It's not the number of the reduce task, but the ID of the reduce task.
A given <k2, v2> pair will be handled by exactly one reduce task.

In MRv2, each reduce task has an ID, such as 0, 1, 2, 3, 4. The return
value is that reduce ID, and the <k2, v2> pair will be processed by that
reduce task.
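
As a rough illustration of the point above, here is a minimal standalone Java sketch of the arithmetic that Hadoop's HashPartitioner applies: mask off the sign bit of the key's hashCode, then take the remainder modulo the number of reduce tasks. The class name HashPartitionSketch is made up for this example, the value argument is left out because the default partitioner does not use it, and plain String keys stand in for Writable keys such as Text (whose hashCode differs), so the IDs printed here will not necessarily match a real job.

```java
public class HashPartitionSketch {

    // Same arithmetic as Hadoop's HashPartitioner: clear the sign bit so the
    // result is never negative, then take the remainder. The result is a
    // reduce task ID in the range 0 .. numReduceTasks - 1.
    public static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 5; // as in the job from the question
        String[] keys = {"apple", "banana", "cherry", "apple"};
        for (String key : keys) {
            // The same key always prints the same reduce task ID.
            System.out.println(key + " -> reduce task " + getPartition(key, numReduceTasks));
        }
    }
}
```

With 5 reducers the result is always one of 0, 1, 2, 3, 4, and partition i is the input of reduce task i.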

2015-03-19 7:27 GMT+08:00 Jianfeng (Jeff) Zhang <jz...@hortonworks.com>:

> You can think of it as similar to Java's HashMap: it uses the hashCode of
> an object to distribute it into different buckets.
>
> Best regards,
> Jeff Zhang
>
> From: xeonmailinglist-gmail <xe...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Wednesday, March 18, 2015 at 7:08 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: trying to understand HashPartitioner
>
> What determines which partition will run on which reduce task?
>
> On 18-03-2015 09:30, xeonmailinglist-gmail wrote:
>
> Hi,
>
> I am trying to understand how HashPartitioner.java works, so I ran a
> MapReduce job with 5 reducers and 5 input files. I thought that the output
> of getPartition(K2 key, V2 value, int numReduceTasks) was the number of
> the reduce task that will process K2 and V2. Is this correct?

Re: trying to understand HashPartitioner

Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.

You can think of it as similar to Java's HashMap: it uses the hashCode of an object to distribute it into different buckets.

Best regards,
Jeff Zhang
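
To make the bucket analogy concrete, the following is a small sketch that groups keys into a fixed number of buckets by hashCode, with bucket i standing in for reduce task i. The class name BucketSketch and the method distribute are invented for this example. It is only an analogy: java.util.HashMap computes its bucket index differently internally, and a real job hashes Writable keys rather than Strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BucketSketch {

    // Group keys into numBuckets buckets by hashCode. Each bucket collects
    // every key that would be sent to the reduce task with the same index.
    public static Map<Integer, List<String>> distribute(List<String> keys, int numBuckets) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : keys) {
            int bucket = (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
            buckets.computeIfAbsent(bucket, b -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "a", "b", "a");
        // Every copy of the same key lands in the same bucket.
        System.out.println(distribute(keys, 5));
    }
}
```

Because the bucket depends only on the key's hashCode and the bucket count, all occurrences of a key end up together, which is what lets a single reduce task see every value for that key.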


From: xeonmailinglist-gmail <xe...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Wednesday, March 18, 2015 at 7:08 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: trying to understand HashPartitioner

What determines which partition will run on which reduce task?

On 18-03-2015 09:30, xeonmailinglist-gmail wrote:

Hi,

I am trying to understand how HashPartitioner.java works, so I ran a MapReduce job with 5 reducers and 5 input files. I thought that the output of getPartition(K2 key, V2 value, int numReduceTasks) was the number of the reduce task that will process K2 and V2. Is this correct?




Re: trying to understand HashPartitioner

Posted by xeonmailinglist-gmail <xe...@gmail.com>.

What determines which partition will run on which reduce task?

On 18-03-2015 09:30, xeonmailinglist-gmail wrote:
>
> Hi,
>
> I am trying to understand how |HashPartitioner.java| works, so I ran
> a MapReduce job with 5 reducers and 5 input files. I thought that the
> output of |getPartition(K2 key, V2 value, int numReduceTasks)| was the
> number of the reduce task that will process |K2| and |V2|. Is this
> correct?
