Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2008/03/12 01:50:46 UTC

[jira] Created: (HADOOP-3002) HDFS should not remove blocks while in safemode.

HDFS should not remove blocks while in safemode.
------------------------------------------------

                 Key: HADOOP-3002
                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: Konstantin Shvachko
            Priority: Critical
             Fix For: 0.16.2, 0.17.0


I noticed that data-nodes are removing blocks during a rather prolonged distributed upgrade while the name-node is in safe mode.
This happened on my experimental cluster with an accelerated block report rate.
By definition, in safe mode the name-node should not
- accept client requests to change the namespace state, and
- schedule block replications and/or block removal for the data-nodes.

We don't want any unnecessary replications until all blocks are reported during startup.
We also don't want to remove blocks if safe mode is entered manually.
In heartbeat processing we explicitly verify that the name-node is in safe-mode and do not return any block commands to the data-nodes.
Block reports can also return block commands, which should be banned during safe mode.
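
The rule described above can be sketched in a few lines of Java. This is a minimal illustration and not the actual name-node code: the class SafeModeNameNodeSketch, its BlockCommand type and every method name are hypothetical. It only shows the intended invariant, which is that both the heartbeat reply path and the block report reply path return no block commands while safe mode is on, and that scheduled work stays queued until safe mode is left.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the invariant; none of these names come from Hadoop.
class SafeModeNameNodeSketch {
    enum Action { REPLICATE, INVALIDATE }

    static final class BlockCommand {
        final Action action;
        final long blockId;

        BlockCommand(Action action, long blockId) {
            this.action = action;
            this.blockId = blockId;
        }
    }

    // The sketch starts in safe mode, as a name-node does during startup.
    private boolean safeMode = true;
    private final List<BlockCommand> pending = new ArrayList<>();

    void setSafeMode(boolean on) {
        safeMode = on;
    }

    // Queue work for the data-nodes; nothing is sent yet.
    void schedule(Action action, long blockId) {
        pending.add(new BlockCommand(action, blockId));
    }

    // Shared gate: in safe mode nothing is handed out and nothing is dropped.
    private List<BlockCommand> drainUnlessSafeMode() {
        if (safeMode) {
            return Collections.emptyList();
        }
        List<BlockCommand> out = new ArrayList<>(pending);
        pending.clear();
        return out;
    }

    // Both reply paths go through the same gate.
    List<BlockCommand> handleHeartbeat() {
        return drainUnlessSafeMode();
    }

    List<BlockCommand> handleBlockReport() {
        return drainUnlessSafeMode();
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Routing both reply paths through one gate is a design choice of this sketch; it makes it harder for one path to be guarded while the other is forgotten.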


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler reassigned HADOOP-3002:
---------------------------------------

    Assignee: Konstantin Shvachko

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.19.0



[jira] Commented: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611405#action_12611405 ] 

Konstantin Shvachko commented on HADOOP-3002:
---------------------------------------------

I reverted the changes that had been committed. The global lock leads to a potential deadlock.

Thanks, Dhruba. I overlooked the global lock, which we did not have before; it was introduced in 0.18 by HADOOP-1985.
I'll submit another patch.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Attachment: DelBlocksInSafeMode-017.patch
                DelBlocksInSafeMode-018.patch

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-017.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Fix Version/s:     (was: 0.19.0)
                   0.17.0
                   0.18.0
           Status: Patch Available  (was: Open)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.18.0, 0.17.0
>
>         Attachments: DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sameer Paranjpye updated HADOOP-3002:
-------------------------------------

    Affects Version/s: 0.16.0
        Fix Version/s:     (was: 0.17.0)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko



[jira] Commented: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611264#action_12611264 ] 

Hadoop QA commented on HADOOP-3002:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12385265/DelBlocksInSafeMode.patch
  against trunk revision 674442.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2799/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2799/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2799/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2799/console

This message is automatically generated.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3002:
--------------------------------

    Fix Version/s:     (was: 0.17.0)
                   0.17.2

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Attachment: DelBlocksInSafeMode-018.patch

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sameer Paranjpye updated HADOOP-3002:
-------------------------------------

    Fix Version/s:     (was: 0.16.2)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Konstantin Shvachko
>            Priority: Critical
>             Fix For: 0.17.0



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Attachment: DelBlocksInSafeMode.patch

This patch postpones the removal of blocks until safe mode is off.
The main reason for the deletions was that block report processing removed blocks that do not belong
to any file directly, bypassing the regular mechanism that first adds invalid blocks to recentInvalidateSets
and then schedules them for deletion via heartbeats.
# I changed block report processing to just place invalid blocks into recentInvalidateSets
and not return any commands to data-nodes. This also optimizes processReport(), because it
no longer scans the block report a second time looking for invalid blocks.
# I changed heartbeat processing, because it never checked safe mode and would schedule
replications or deletions if there were any in the pending lists.
During startup the pending lists are empty, but in manual safe mode that may not be the case.
So now the only commands allowed while safe mode is on are requests for block reports
and distributed upgrade commands.
It is not clear why some of the code in handleHeartbeat() is inside the synchronized section and some is not,
so I placed everything inside.
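
The two changes can be illustrated with a small Java sketch. It is not taken from the patch: the class DeferredInvalidationSketch and its method signatures are hypothetical, and only the names recentInvalidateSets, processReport() and handleHeartbeat() echo the description above. The sketch assumes that a block report is a plain array of block ids and that a heartbeat reply is simply a list of block ids to delete.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of deferred invalidation; not the real FSNamesystem.
class DeferredInvalidationSketch {
    // Ids of blocks that belong to some file in the namespace.
    private final Set<Long> blocksOwnedByFiles = new HashSet<>();
    // Per data-node sets of blocks waiting to be deleted.
    private final Map<String, Set<Long>> recentInvalidateSets = new HashMap<>();
    // Start in safe mode, as during name-node startup.
    private boolean safeMode = true;

    void addFileBlock(long blockId) {
        blocksOwnedByFiles.add(blockId);
    }

    void leaveSafeMode() {
        safeMode = false;
    }

    // Block report: one pass over the report, queue invalid blocks,
    // and return no commands to the data-node.
    void processReport(String dataNode, long[] reportedBlocks) {
        for (long id : reportedBlocks) {
            if (!blocksOwnedByFiles.contains(id)) {
                recentInvalidateSets
                    .computeIfAbsent(dataNode, k -> new HashSet<>())
                    .add(id);
            }
        }
    }

    // Heartbeat: deletions are handed out only when safe mode is off.
    List<Long> handleHeartbeat(String dataNode) {
        if (safeMode) {
            return new ArrayList<>();
        }
        Set<Long> queued = recentInvalidateSets.remove(dataNode);
        return queued == null ? new ArrayList<>() : new ArrayList<>(queued);
    }
}
```

In this sketch a data-node that reports blocks 1, 2 and 3, of which only block 1 belongs to a file, receives nothing from heartbeats until leaveSafeMode() is called; the next heartbeat then returns blocks 2 and 3 for deletion.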



> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.0, 0.18.0
>
>         Attachments: DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Attachment: DelBlocksInSafeMode.patch

This is a new patch, which does not change heartbeat processing.
The global lock issue will be taken care of by HADOOP-3620.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sameer Paranjpye updated HADOOP-3002:
-------------------------------------

    Priority: Major  (was: Blocker)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Konstantin Shvachko
>             Fix For: 0.17.0



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Status: Open  (was: Patch Available)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch



[jira] Commented: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611830#action_12611830 ] 

Hadoop QA commented on HADOOP-3002:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12385454/DelBlocksInSafeMode.patch
  against trunk revision 674932.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2816/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2816/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2816/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2816/console

This message is automatically generated.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

    Status: Patch Available  (was: Open)

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-3002:
------------------------------------

    Priority: Blocker  (was: Major)

If we fix this, 3677 can be demoted.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.19.0



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3002:
----------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.18.0)
           Status: Resolved  (was: Patch Available)

I just committed this.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2
>
>         Attachments: DelBlocksInSafeMode-017.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Updated: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Lohit Vijayarenu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated HADOOP-3002:
-------------------------------------

    Fix Version/s: 0.19.0

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.19.0



[jira] Commented: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624763#action_12624763 ] 

Hudson commented on HADOOP-3002:
--------------------------------

Integrated in Hadoop-trunk #581 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/])

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2
>
>         Attachments: DelBlocksInSafeMode-017.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch, DelBlocksInSafeMode.patch



[jira] Commented: (HADOOP-3002) HDFS should not remove blocks while in safemode.

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611387#action_12611387 ] 

dhruba borthakur commented on HADOOP-3002:
------------------------------------------

Hi Konstantin, I opened HADOOP-3709 to document the lock-hierarchy violation in processing heartbeats. The goal is to avoid acquiring the global FSNamesystem lock to process every heartbeat. Maybe the patch you provided for this issue already fixes HADOOP-3709.

> HDFS should not remove blocks while in safemode.
> ------------------------------------------------
>
>                 Key: HADOOP-3002
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3002
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.17.2, 0.18.0
>
>         Attachments: DelBlocksInSafeMode-018.patch, DelBlocksInSafeMode.patch
