You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-dev@hadoop.apache.org by "Hairong Kuang (JIRA)" <ji...@apache.org> on 2009/01/14 20:31:02 UTC

[jira] Created: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
--------------------------------------------------------------------------------------------------------

Key: HADOOP-5034
URL: https://issues.apache.org/jira/browse/HADOOP-5034
Project: Hadoop Core
Issue Type: New Feature
Components: dfs
Reporter: Hairong Kuang
Fix For: 0.21.0

Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request.

This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.

I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665972#action_12665972 ] 

Raghu Angadi commented on HADOOP-5034:
--------------------------------------

+1 for the solution and I think this should go into 0.19 and up.. if not 0.18. Main reason is that when we are affected by this issue, there is no work around. It is pretty much a dead-lock : e.g. :  if the cluster is almost ful, the replications don't succeed, but if admin wants make room by deleting file, blocks won't be deleted since there are replication pending.

Regd implementation, the patch adds new datanode heartbeat command that combines invalidates and replications. I think it will simpler if we make heartbeat reply to contain list or array of commands. This way we don't need new command types.


> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment: blockTransferInvalidate2.patch

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment: blockTransferInvalidate1.patch

This patch reflects Raghu's suggestion.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Status: Patch Available  (was: Open)

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment:     (was: blockTransferInvalidate2.patch)

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665984#action_12665984 ] 

Raghu Angadi commented on HADOOP-5034:
--------------------------------------

+1 for 0.19.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment: blockTransferInvalidate3.patch

The previous patch does not include the test case.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665975#action_12665975 ] 

dhruba borthakur commented on HADOOP-5034:
------------------------------------------

+1 for putting this into 0.19. Very helpful patch.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-5034:
---------------------------------

    Affects Version/s: 0.18.0
         Hadoop Flags: [Reviewed]

+1 for the patch. Could we change the fix version to 0.19?

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668723#action_12668723 ] 

Hadoop QA commented on HADOOP-5034:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12398963/blockTransferInvalidate3.patch
  against trunk revision 738944.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3776/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3776/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3776/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3776/console

This message is automatically generated.

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664635#action_12664635 ] 

Raghu Angadi commented on HADOOP-5034:
--------------------------------------

> I think we should first try to fine-tune the number of blocks we send for deletion. Currently we send 100 [...]

I think this limit is related but different issue. Even when it was implemented it was supposed to be a work around for how DN handles deletion. We should either remove or have a very large limit at NN and let DN handle deleting large number of blocks properly (say in a separate thread from heart beat thread). This was fix was proposed quite a few times but we didn't fix it. Trying to fine tune it only prolongs the problem.

> If fine-tuning will not solve the problem we can go on with the modifications to the code. 

It does not look like changing this limit won't fix the issue since NN never get to send any block to delete. Logically I don't see any reason why NN can not send both replication and deletion requests in the same response to DN.


> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang reassigned HADOOP-5034:
-------------------------------------

    Assignee: Hairong Kuang

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

      Resolution: Fixed
    Release Note: This patch changes the DatanodeProtocoal version number from 18 to 19. The patch allows NameNode to send both block replication and deletion request to a DataNode in response to a heartbeat.
          Status: Resolved  (was: Patch Available)

I've just committed this.

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Fix Version/s:     (was: 0.21.0)
                   0.19.1
          Summary: NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat  (was: NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat)

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment: blockTransferInvalidate2.patch

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673944#action_12673944 ] 

Hudson commented on HADOOP-5034:
--------------------------------

Integrated in Hadoop-trunk #756 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/756/])
    

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665983#action_12665983 ] 

Hairong Kuang commented on HADOOP-5034:
---------------------------------------

> I think it will simpler if we make heartbeat reply to contain list or array of commands. This way we don't need new command types.
I like this idea a lot. This makes the solution simpler and make the protocol easy to extend later on.

> putting this into 0.19. 
Not a problem for me. This patch has a protocol change. If the community wants it in 0.19, maybe we can make an exception. :-)

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Attachment: blockTransferInvalidate.patch

Here is the patch that does the following:
1. ReplicationMonitor schedules both block replication and deletion in one iteration;
2. Heartbeat picks up both replication and deletion requests if both are scheduled;
3. BlockCommand is enhanced so that a command can carry both replication and deletion requests;
4. Datanode handles both replication and deletion requests when receive a heartbeat reply.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockTransferInvalidate.patch
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664368#action_12664368 ] 

Robert Chansler commented on HADOOP-5034:
-----------------------------------------

I first thought that just switching the priority to deletions rather than replications would be satisfactory, but Hairong explained that since processing is time-sliced, if the higher priority task occurs at the very modest rate of once per heartbeat, the lower priority task will be starved. Starving deletions can make replication impossible. The reverse is not true, but it is difficult to make the statistical argument that starving replications for a while is OK. You can't starve replication too long by doing deletions as there are only so many replicas to delete, but is it OK for replications to wait a minute? An hour? It is perhaps best to just follow Hairong's suggestion.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664730#action_12664730 ] 

Hairong Kuang commented on HADOOP-5034:
---------------------------------------

I do not think fine-tuning the limit will solve the problem. What happened was that block replication starved block deletion. Block deletion was not observed at all.  When I ReplicationMonitor code, I found out that block deletion does not get scheduled for any datanode even if there is only one replication work scheduled for the whole cluster. This explains why no block deletion was observed at all.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5034) NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664380#action_12664380 ] 

Konstantin Shvachko commented on HADOOP-5034:
---------------------------------------------

I think we should first try to fine-tune the number of blocks we send for deletion. Currently we send 100, which means that data-node can not  delete more  than a 100 blocks  every 3 seconds. This seems to be low on large clusters, but I don't what the optimum is.
If fine-tuning will not solve the problem we can go on with the modifications to the code.

> NameNode should send both both replication and deletion requests to DataNode in one reply to a heartbeat
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>
> Currently NameNode favors block replication requests over deletion requests. On reply to a heartbeat, NameNode does not send a block deletion request unless there is no block replication request. 
> This brings a problem when a near-full cluster loses a bunch of DataNodes. In react to the DataNode loss, NameNode starts to replicate blocks. However, replication takes a lot of cpu and a lot of replications fail because of the lack of disk space. So the administrator tries to delete some DFS files to free up space. However, block deletion requests get delayed for very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose to let NameNode to send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.