Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2006/04/26 23:26:04 UTC

[jira] Created: (HADOOP-170) setReplication and related bug fixes

setReplication and related bug fixes
------------------------------------

         Key: HADOOP-170
         URL: http://issues.apache.org/jira/browse/HADOOP-170
     Project: Hadoop
        Type: Improvement

  Components: fs, dfs  
    Versions: 0.1.1    
    Reporter: Konstantin Shvachko
 Assigned to: Konstantin Shvachko 


With variable replication in place (HADOOP-51), it is natural to be able to
change the replication of existing files. This patch introduces that functionality.
Here is a detailed list of the issues addressed by the patch.

1) setReplication() and getReplication() methods are implemented.
2) DFSShell prints the replication of every file it lists.
3) Bug fix. FSDirectory.delete() logged the delete operation even when the delete was not successful.
4) Bug fix. This is a distributed race.
Suppose a file's replication is 3, and a client reduces it to 1.
Two data nodes are chosen to remove their copies, and they do so.
Some time later they report to the name node that the copies have actually been deleted;
until they report, the name node assumes the copies still exist.
Now suppose the client increases the replication back to 3 BEFORE the data nodes
report the deletions. The name node may then choose one of the data nodes that it
thinks still holds a copy as the source for replicating the block to new data nodes.
This scenario is unusual, but it is possible even without variable replication.
5) Logging for the name node and data nodes is improved in several cases;
e.g., data nodes previously never logged that they had deleted a block.
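
For illustration, a minimal client-side sketch of the new calls. The signatures are assumptions based on the description above, not copied from the patch:

    // Hypothetical usage of the new replication calls (assumed signatures).
    import java.io.File;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class ReplicationDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            File file = new File("/user/data/part-00000");   // example DFS path

            short before = fs.getReplication(file);   // target stored by the name node
            fs.setReplication(file, (short) 1);       // ask for a single copy
            fs.setReplication(file, before);          // ...and raise it back
            // Both calls only update the name node's target count; data nodes
            // delete or create copies asynchronously, which is exactly the
            // window in which the race described in item 4 can occur.
        }
    }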





[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376587 ] 

Doug Cutting commented on HADOOP-170:
-------------------------------------

Can you please add proper javadoc to the new public methods in FileSystem.java?  Thanks.

Also, as mentioned above, an easy way to indicate wide replication would be a welcome addition.  I'll certainly commit this without that, but I'd really like to be able to use this feature, and cannot yet.
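
For illustration, javadoc along these lines would do; the wording and signature here are a hypothetical sketch, not the committed text:

    /**
     * Set the replication of an existing file, i.e. the number of copies
     * of each of its blocks that the name node should maintain.
     *
     * @param src         the file whose replication is to be changed
     * @param replication the new number of copies to keep per block
     * @return true if the new replication was accepted by the name node
     * @throws IOException if the file does not exist
     */
    public boolean setReplication(File src, short replication) throws IOException;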



[jira] Updated: (HADOOP-170) setReplication and related bug fixes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-170?page=all ]

Konstantin Shvachko updated HADOOP-170:
---------------------------------------

    Attachment:     (was: setReplication.patch)



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Runping Qi (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376614 ] 

Runping Qi commented on HADOOP-170:
-----------------------------------


It would be more effective if, after assigning a map task to a specific node, the jobtracker could request the name node to replicate the blocks of that task's split onto that node.



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376581 ] 

Doug Cutting commented on HADOOP-170:
-------------------------------------

>  This probably has to be called smartReplication(). 

What's the matter with Integer.MAX_VALUE?

This is one of the most important applications for variable replication counts.  We have them now, but they're not yet easy to use in the most obvious and needed application.  That's why I'm asking about this.

> we will still not see the desired locality because of the crc files

The .crc files are very tiny, much less than 1% of the data.  If all of the data except the .crc files is read locally, throughput will be much faster and switches will not be the bottleneck.
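
(For scale, a rough calculation of my own, assuming the usual one 4-byte CRC-32 per 512-byte checksum chunk: 4 / 512 ≈ 0.8%, so even if every .crc file is fetched remotely, the non-local traffic stays under one percent of the total.)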




[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376598 ] 

Konstantin Shvachko commented on HADOOP-170:
--------------------------------------------

- Replication can be set to the maximum if the number of nodes is known;
that is, the block will be replicated on all or almost all nodes.
- Replication can be set to n/10, for whatever n you choose.
- We currently cannot achieve one replica per rack, since, as is well known, there is no notion of racks yet.
How exactly do you want to use Integer.MAX_VALUE?



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376593 ] 

Yoram Arnon commented on HADOOP-170:
------------------------------------

The optimal replication factor for distributing a file in 2 hops is sqrt(cluster size): the writer pushes sqrt(n) copies in the first hop, and each copy then serves roughly sqrt(n) readers in the second. The worst-case delivery time is 2*sqrt(n) times a single transfer time.

Replicating to all the nodes of the dfs cluster has two disadvantages:
1. It's slower than replicating to sqrt(n), with a worst-case delivery time of n transfer times.
2. If the dfs cluster is large but multiple smaller map-reduce clusters use it concurrently, there's little value in over-replicating to every node of the cluster when only a subset requires the file.

+1 on asking the job tracker for the size of the cluster and replicating based on that size.
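
A back-of-the-envelope sketch of that rule; the helper below is my own illustration, not part of the patch:

    // Hypothetical helper: choose a replication factor of ceil(sqrt(n)) so a
    // file can reach all n nodes in two hops, as described above.
    static short twoHopReplication(int clusterSize) {
        int r = (int) Math.ceil(Math.sqrt(clusterSize));
        return (short) Math.min(r, Short.MAX_VALUE);
    }
    // e.g. clusterSize = 900 gives r = 30; worst-case delivery is about
    // 2 * 30 = 60 transfer times, versus up to 900 when every node must be
    // fed from the original copies one transfer at a time.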



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376558 ] 

Doug Cutting commented on HADOOP-170:
-------------------------------------

We should start setting a high replication count in MapReduce for submitted job files, which are read by every node.  But how should we use this?  Can we simply do something like setReplication(Integer.MAX_VALUE) and have a file replicated as widely as the fs thinks is useful (once per node, once per rack, once per ten nodes, whichever it chooses)?
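
One hedged way to express that contract; the sentinel and the policy below are hypothetical, and since replication is stored as a short, Integer.MAX_VALUE itself would not fit:

    // Hypothetical sentinel meaning "replicate as widely as is useful".
    public static final short REPLICATE_WIDELY = Short.MAX_VALUE;

    // Hypothetical name-node-side interpretation: map the sentinel onto
    // whatever policy the fs prefers (here, one copy per live data node).
    static short effectiveReplication(short requested, int liveDataNodes) {
        if (requested == REPLICATE_WIDELY) {
            return (short) Math.min(liveDataNodes, Short.MAX_VALUE);
        }
        return requested;   // ordinary requests pass through unchanged
    }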



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376576 ] 

Benjamin Reed commented on HADOOP-170:
--------------------------------------

It's really the JobTracker, not the fs, that knows how high to set the replication count, since the JobTracker knows the number of map tasks. The JobTracker also knows when the job has finished and can set the replication count back to the original number. I guess in most cases the job files will just get deleted anyway at the end of the job.

When you say "submitted job files", you are just talking about the job conf and jar file that correspond to a job, correct?
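
A sketch of that lifecycle; every name below is hypothetical, and only setReplication/getReplication come from this patch:

    // Hypothetical JobTracker-side handling of the job conf and jar files.
    import java.io.File;
    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;

    class JobFileReplication {
        private short originalReplication;

        void onJobSubmit(FileSystem fs, File jobFile, int numMapTasks) throws IOException {
            originalReplication = fs.getReplication(jobFile);  // remember the old value
            short wide = (short) Math.min(numMapTasks, Short.MAX_VALUE);
            fs.setReplication(jobFile, wide);                  // fan out before tasks start
        }

        void onJobComplete(FileSystem fs, File jobFile) throws IOException {
            // In most cases the job files are simply deleted at this point;
            // restoring the original replication is the fallback if they stay.
            fs.setReplication(jobFile, originalReplication);
        }
    }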



[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376574 ] 

Konstantin Shvachko commented on HADOOP-170:
--------------------------------------------

This probably has to be called smartReplication().

Right now, even if the data blocks are placed "smart" by the namenode,
we will still not see the desired locality because of the crc files, whose
placement is completely independent of the data files.



[jira] Updated: (HADOOP-170) setReplication and related bug fixes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-170?page=all ]

Konstantin Shvachko updated HADOOP-170:
---------------------------------------

    Attachment: setReplication.patch



[jira] Resolved: (HADOOP-170) setReplication and related bug fixes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-170?page=all ]
     
Doug Cutting resolved HADOOP-170:
---------------------------------

    Fix Version: 0.2
     Resolution: Fixed

I just committed this.  Thanks, Konstantin!



[jira] Updated: (HADOOP-170) setReplication and related bug fixes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-170?page=all ]

Konstantin Shvachko updated HADOOP-170:
---------------------------------------

    Attachment: setReplication.patch

Javadoc has been added for the new public methods.




[jira] Commented: (HADOOP-170) setReplication and related bug fixes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376596 ] 

Doug Cutting commented on HADOOP-170:
-------------------------------------

> +1 on asking the job tracker for the size of the cluster and replicating based on that size.

I think you mean "namenode", not "job tracker".  Is that right?  If so, this sounds fine.  I don't care whether we get it exactly right the first time, just that the API have some standard way to say "replicate highly": we can improve the implementation later, but we should only have to change client applications once.


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira