Posted to common-dev@hadoop.apache.org by 朱 偉民 <xi...@tsm.kddilabs.jp> on 2008/09/05 08:08:05 UTC

I found a bug of the source NetworkTopology::pseudoSortByDistance

Hello, can you help me?

I am a Hadoop user. I found a bug in
NetworkTopology::pseudoSortByDistance in version 0.18.0.
The bug is:
1. When the local node is not found but a local rack node is found and that
node's position in the node array is 0, a random node is put at position 0.
2. When neither the local node nor a local rack node is found but a local
datacenter node is found, a random node is put at position 0.
3. When two or more data nodes are tied as the nearest, Hadoop cannot pick
one of them at random to read from.

I changed the source code to fix the bug, but I cannot submit the source
code file to the Hadoop server.

The source is:

  public synchronized void pseudoSortByDistance(Node reader, Node[] nodes) {
    if (nodes.length == 0) return;

    if (reader != null) {
      int[] distances = new int[nodes.length];
      // get each node's distance to the reader
      for (int i = 0; i < nodes.length; i++) {
        distances[i] = getDistance(reader, nodes[i]);
      }
      // sort the nodes array by distance to the reader; distances must be
      // swapped together with nodes so that the two arrays stay aligned
      for (int i = 0; i < distances.length; i++) {
        for (int j = i + 1; j < distances.length; j++) {
          if (distances[i] > distances[j]) {
            swap(nodes, i, j);
            int tmp = distances[i];
            distances[i] = distances[j];
            distances[j] = tmp;
          }
        }
      }
      // count the nodes whose distance to the reader equals the first
      // (nearest) node's distance, then put a random one of them at position 0
      int i;
      for (i = 0; i < distances.length; i++) {
        if (distances[i] != distances[0]) break;
      }
      if (i != 0) swap(nodes, 0, r.nextInt(i));
    } else { // put a random node at position 0 if reader is null
      swap(nodes, 0, r.nextInt(nodes.length));
    }
  }
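
For readers who want to try the idea outside Hadoop, here is a minimal self-contained sketch of the same approach on plain arrays. It assumes the distances are already computed as integers; the class name NearestFirstSketch and the method nearestFirst are hypothetical and are not part of Hadoop.

```java
import java.util.Random;

class NearestFirstSketch {
    /**
     * Sorts labels by ascending distance, then moves a randomly chosen
     * member of the nearest (tied) group to position 0.
     */
    static void nearestFirst(String[] labels, int[] distances, Random r) {
        if (labels.length == 0) return;
        // exchange sort; both arrays are swapped together so they stay aligned
        for (int i = 0; i < distances.length; i++) {
            for (int j = i + 1; j < distances.length; j++) {
                if (distances[i] > distances[j]) {
                    String tmpLabel = labels[i];
                    labels[i] = labels[j];
                    labels[j] = tmpLabel;
                    int tmpDist = distances[i];
                    distances[i] = distances[j];
                    distances[j] = tmpDist;
                }
            }
        }
        // count how many entries tie with the nearest distance
        int ties = 1;
        while (ties < distances.length && distances[ties] == distances[0]) {
            ties++;
        }
        // pick one of the tied entries at random for position 0
        int pick = r.nextInt(ties);
        String first = labels[0];
        labels[0] = labels[pick];
        labels[pick] = first;
    }

    public static void main(String[] args) {
        String[] labels = {"far", "near", "mid"};
        int[] distances = {4, 0, 2};
        nearestFirst(labels, distances, new Random());
        System.out.println(labels[0]); // prints "near", the only nearest entry
    }
}
```

When several entries share the smallest distance, repeated calls may return different entries at position 0, which is the behavior item 3 of the report asks for.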



Re: I found a bug of the source NetworkTopology::pseudoSortByDistance

Posted by Owen O'Malley <om...@apache.org>.
Please file a JIRA issue and submit the patch. Here are the directions for
how to submit a patch:

http://wiki.apache.org/hadoop/HowToContribute

-- Owen

RE: I found a bug of the source NetworkTopology::pseudoSortByDistance

Posted by 朱 偉民 <xi...@tsm.kddilabs.jp>.
Hi Hairong,

I understand that NetworkTopology.pseudoSortByDistance aims to be simple
and fast, but there really is a bug.

Test case: the local node is not found, only one local rack node is found,
and that local rack node's position in the node array is 0.

In this case localRackNode is 0 and tempIndex is 0, so
(localRackNode != -1 && localRackNode != tempIndex) evaluates to false.

Therefore a random node is put at the beginning of the array, even though
the local rack node was already there.
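
The condition in this test case can be checked with a small stand-alone model. This is only a sketch: the names localRackNode and tempIndex come from this message, while the class RackSwapCheck, the method fallsThroughToRandom, and the assumption that the random swap happens whenever tempIndex is still 0 afterwards are illustrative and are not copied from the Hadoop source.

```java
class RackSwapCheck {
    /**
     * Simplified model of the end of the method: returns true when the
     * logic would fall through to "put a random node at position 0".
     */
    static boolean fallsThroughToRandom(int localRackNode, int tempIndex) {
        if (localRackNode != -1 && localRackNode != tempIndex) {
            // the local rack node is moved to position tempIndex
            tempIndex++;
        }
        // assumption: the random swap happens when nothing was placed
        return tempIndex == 0;
    }

    public static void main(String[] args) {
        // local rack node already at position 0: treated as "nothing found"
        System.out.println(fallsThroughToRandom(0, 0)); // prints "true"
        // local rack node at position 2: placed correctly
        System.out.println(fallsThroughToRandom(2, 0)); // prints "false"
    }
}
```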

Let's fix it as follows:

      // tempIndex is 0 and the local rack node is already at position 0:
      // return without doing anything
      if(tempIndex == 0 && localRackNode == 0) {
        return;
      }
      
      // swap the local rack node and the node at position tempIndex
      if(localRackNode != -1 && localRackNode != tempIndex ) {

Thanks very much.

-----Original Message-----
From: Hairong Kuang [mailto:hairong@yahoo-inc.com] 
Sent: Saturday, September 06, 2008 2:24 AM
To: hadoop-dev
Subject: Re: I found a bug of the source
NetworkTopology::pseudoSortByDistance

NetworkTopology.pseudoSortByDistance aims to be simple and fast because it
is used by every open operation. Despite what the method name suggests, it is
not intended to sort the nodes. Instead, it searches for the local node and a
local rack node and puts them at the beginning of the array. If neither is
found, it puts a random node there. Since most map/reduce jobs read from local
nodes or local rack nodes, this works out fine.
 
Hairong






Re: I found a bug of the source NetworkTopology::pseudoSortByDistance

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
NetworkTopology.pseudoSortByDistance aims to be simple and fast because it
is used by every open operation. Despite what the method name suggests, it is
not intended to sort the nodes. Instead, it searches for the local node and a
local rack node and puts them at the beginning of the array. If neither is
found, it puts a random node there. Since most map/reduce jobs read from local
nodes or local rack nodes, this works out fine.
 
Hairong
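
The strategy described in this reply (local node first, then a local rack node, otherwise a random node) could be sketched as follows. This is an illustrative stand-alone version and not the Hadoop implementation: SimpleNode, PseudoSortSketch and pseudoSort are hypothetical names, and nodes are modeled as a host name plus a rack name.

```java
import java.util.Random;

class PseudoSortSketch {
    // Hypothetical stand-in for a topology node: a host name plus its rack.
    static final class SimpleNode {
        final String host;
        final String rack;
        SimpleNode(String host, String rack) {
            this.host = host;
            this.rack = rack;
        }
    }

    static void swap(SimpleNode[] nodes, int i, int j) {
        SimpleNode tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }

    static void pseudoSort(SimpleNode reader, SimpleNode[] nodes, Random r) {
        if (nodes.length == 0) return;
        int next = 0; // next free slot at the front of the array
        if (reader != null) {
            // 1. move the local node (same host as the reader) to the front
            for (int i = 0; i < nodes.length; i++) {
                if (nodes[i].host.equals(reader.host)) {
                    swap(nodes, next, i);
                    next++;
                    break;
                }
            }
            // 2. move the first remaining local rack node behind it; a node
            //    that is already in the right slot still counts as placed
            for (int i = next; i < nodes.length; i++) {
                if (nodes[i].rack.equals(reader.rack)) {
                    swap(nodes, next, i);
                    next++;
                    break;
                }
            }
        }
        // 3. nothing local was found: put a random node at position 0
        if (next == 0) {
            swap(nodes, 0, r.nextInt(nodes.length));
        }
    }
}
```

Each step is a linear scan with an early exit, so the method stays cheap, and because the slot counter advances even when the matching node is already in place, a local rack node at position 0 is never replaced by a random one.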

