Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2007/01/16 19:40:27 UTC

[jira] Created: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

dead datanode set should be maintained in the file handle or file system for hdfs
---------------------------------------------------------------------------------

                 Key: HADOOP-893
                 URL: https://issues.apache.org/jira/browse/HADOOP-893
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
            Reporter: Owen O'Malley
         Assigned To: Sameer Paranjpye


Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.
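
For illustration only, here is a rough Java sketch of the two scopes being compared; the class and names below are invented for the example and are not the actual DFSClient code:

import java.util.HashSet;
import java.util.Set;

// Sketch only: contrast between a per-read set and a per-handle set of
// datanodes that have already failed for this client.
class DeadNodeScopeSketch {

    // Current behaviour: every read builds its own throwaway set, so a
    // datanode that failed on the previous read gets probed again.
    void readWithPerCallSet() {
        Set<String> deadNodes = new HashSet<String>();  // discarded when this call returns
        // ... locate the block, try replicas, add failures to deadNodes ...
    }

    // Proposed behaviour: the set belongs to the open file handle (or the
    // FileSystem), so failures seen on earlier reads steer later reads
    // away from the same nodes.
    private final Set<String> handleDeadNodes = new HashSet<String>();

    void readWithHandleSet() {
        // ... try replicas, skipping anything already in handleDeadNodes ...
    }
}

Keeping the set at the FileSystem level would presumably share it across all open files of that client, at the cost of needing some way to expire entries.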

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Attachment: HADOOP-893-1.patch


HADOOP-893-1.patch is attached. DFSClient.DFSInputStream now maintains a deadNodes member, and clears the list when no live datanode can be found for a block.
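
As a rough sketch of that clearing rule (with invented names, not the patch itself):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the retry rule described above: if every replica of a block is
// already in deadNodes, assume our knowledge is stale, clear the set, and
// try the replicas again.
class DeadNodeRetrySketch {

    private final Set<String> deadNodes = new HashSet<String>();

    String chooseNode(List<String> replicas) {
        for (String r : replicas) {
            if (!deadNodes.contains(r)) {
                return r;                  // first replica not known to be dead
            }
        }
        deadNodes.clear();                 // nothing looked alive: forget old failures
        return replicas.isEmpty() ? null : replicas.get(0);
    }

    void markDead(String node) {
        deadNodes.add(node);               // remembered for later reads on this stream
    }
}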


> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Work started: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-893 started by Raghu Angadi.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470383 ] 

dhruba borthakur commented on HADOOP-893:
-----------------------------------------

It could be helpful to write a simple test using MiniDFSCluster to exercise this functionality, especially to ensure that alive/dead status is re-determined when all datanodes for a block are found to be "dead".
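
A real test would presumably start a MiniDFSCluster, write a file, take the block's datanodes down, and read the file back. As a stand-in for the interesting assertion, here is a plain JUnit check of the clear-and-retry rule in isolation (all names invented, nothing here touches a running DFS):

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import junit.framework.TestCase;

// Placeholder for the suggested MiniDFSCluster test: it only checks the
// clear-and-retry rule with plain collections.
public class TestDeadNodeReset extends TestCase {

    public void testResetWhenAllReplicasDead() {
        List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");
        Set<String> deadNodes = new HashSet<String>(replicas);  // pretend every replica failed

        String picked = null;
        for (String r : replicas) {
            if (!deadNodes.contains(r)) {
                picked = r;
                break;
            }
        }
        if (picked == null) {             // nothing alive: clear the set and retry
            deadNodes.clear();
            picked = replicas.get(0);
        }

        assertEquals("dn1", picked);
        assertTrue(deadNodes.isEmpty());
    }
}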

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Status: Open  (was: Patch Available)

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by Nigel Daley <nd...@yahoo-inc.com>.
This patch hasn't been code reviewed. HADOOP-692 should take priority.


On Feb 2, 2007, at 1:24 PM, Doug Cutting (JIRA) wrote:

>
>     [ https://issues.apache.org/jira/browse/HADOOP-893? 
> page=com.atlassian.jira.plugin.system.issuetabpanels:comment- 
> tabpanel#action_12469876 ]
>
> Doug Cutting commented on HADOOP-893:
> -------------------------------------
>
> This patch conflicts with HADOOP-692.  Which should take priority?
>
>> dead datanode set should be maintained in the file handle or file  
>> system for hdfs
>> --------------------------------------------------------------------- 
>> ------------
>>
>>                 Key: HADOOP-893
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>>             Project: Hadoop
>>          Issue Type: Bug
>>          Components: dfs
>>            Reporter: Owen O'Malley
>>         Assigned To: Raghu Angadi
>>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>>
>>
>> Currently each call to read is creating a new set of dead  
>> datanodes. It seems like it would be more useful to keep a set of  
>> dead datanodes at either the file handle or file system level  
>> since dead datanodes are probably not quite that transient.
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469876 ] 

Doug Cutting commented on HADOOP-893:
-------------------------------------

This patch conflicts with HADOOP-692.  Which should take priority?

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Attachment: HADOOP-893-3.patch


Corrected the patch after the rack-aware changes. The change is that this patch no longer modifies DFSClient.bestNode(), and bestNodes() no longer selects a random datanode.
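
For illustration of the behavioural difference only (this is not the real bestNode()/bestNodes() code, and the names are invented):

import java.util.List;
import java.util.Random;
import java.util.Set;

// Sketch contrasting the old and new selection styles described above.
class ReplicaPickSketch {

    private static final Random RAND = new Random();

    // Roughly the old style: a random choice among the block's replicas.
    static String pickRandom(List<String> replicas) {
        return replicas.get(RAND.nextInt(replicas.size()));
    }

    // Roughly the new style: walk the replica list in the order it was
    // returned and take the first one not known to be dead.
    static String pickFirstLive(List<String> replicas, Set<String> deadNodes) {
        for (String r : replicas) {
            if (!deadNodes.contains(r)) {
                return r;
            }
        }
        return null;   // caller clears deadNodes and retries, as in the earlier sketch
    }
}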

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch, HADOOP-893-3.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471197 ] 

Hadoop QA commented on HADOOP-893:
----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12350586/HADOOP-893-3.patch applied and successfully tested against trunk revision r504682.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch, HADOOP-893-3.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Status: Patch Available  (was: Open)

Tested with various cases of datanodes going down.


> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch, HADOOP-893-3.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470381 ] 

dhruba borthakur commented on HADOOP-893:
-----------------------------------------

+1. Code looks good.

The patch re-determines alive/dead datanodes only when all the replicas of a block reside on dead datanodes. Suppose an application opens a file and keeps the handle active for an extended period of time, during which datanodes can go down and come back up. Also, the randomization behaviour of selecting a random remote node is now different from before.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469878 ] 

Owen O'Malley commented on HADOOP-893:
--------------------------------------

I vote for HADOOP-692. It has been much better reviewed and is bigger.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-893:
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.12.0
           Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Raghu!

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>             Fix For: 0.12.0
>
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch, HADOOP-893-3.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469882 ] 

Raghu Angadi commented on HADOOP-893:
-------------------------------------


Yes, HADOOP-692. I will update this patch after that. Also, Dhruba still hasn't gotten around to reviewing this patch.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469637 ] 

Hadoop QA commented on HADOOP-893:
----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12350178/HADOOP-893-2.patch applied and successfully tested against trunk revision r502402.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470394 ] 

Raghu Angadi commented on HADOOP-893:
-------------------------------------


I am planning to do manual tests of various cases, including the case where all the nodes are unresponsive. I will look into how we can shut down nodes in MiniDFSCluster.

Yes, it does change the randomization behavior from before. This is one of the changes from the existing code.

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Attachment: HADOOP-893-2.patch

2.patch contains a small correction.


> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-893:
--------------------------------

    Status: Patch Available  (was: In Progress)

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>         Attachments: HADOOP-893-1.patch, HADOOP-893-2.patch
>
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-893) dead datanode set should be maintained in the file handle or file system for hdfs

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi reassigned HADOOP-893:
-----------------------------------

    Assignee: Raghu Angadi  (was: Sameer Paranjpye)

> dead datanode set should be maintained in the file handle or file system for hdfs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-893
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>
> Currently each call to read is creating a new set of dead datanodes. It seems like it would be more useful to keep a set of dead datanodes at either the file handle or file system level since dead datanodes are probably not quite that transient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.