You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2009/01/14 03:14:59 UTC

[jira] Created: (HADOOP-5027) Block report processing should compare gneration stamp

Block report processing should compare gneration stamp
------------------------------------------------------

                 Key: HADOOP-5027
                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: Tsz Wo (Nicholas), SZE


If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664378#action_12664378 ] 

Raghu Angadi commented on HADOOP-5027:
--------------------------------------

> The fix is described in the description and Dhruba's comment. 

all right. 

>> Btw, the patch just ignores such a block with a warning..., should it also delete such a block?
>This case should not happen. If it happens, it indicates that there are bugs somewhere. Without knowing what is going on, it is better not delete the block.

adding something like "this is unexpected" to the message might make admins and users report it back, if they notice it.



> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-5027:
--------------------------------

    Fix Version/s:     (was: 0.19.1)
                   0.19.2

As discussed on core-dev@ (http://www.nabble.com/Hadoop-0.19.1-td21739202.html) we will disable append in 0.19.1.  Moving these append related issues to 0.19.2.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.19.2
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668104#action_12668104 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

May be it will work, but I would prefer to avoid complicating name-node logic if it can be handled by data-nodes.
My [comment|http://issues.apache.org/jira/browse/HADOOP-4663#action_12663891] proposes some alternatives.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-5027:
-------------------------------------------

    Attachment: 5027_20090114.patch

5027_20090114.patch: check generation stamps in block report processing.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664030#action_12664030 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

This code looks good to me. There is already a unit test TestFleAppend2 that tests "append". Any clues on why this bug is not caught by that unit test?

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668136#action_12668136 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

I see at least 6 currently open issues, which seem to be different, but actually being about the same problem.
HADOOP-4663
HADOOP-5027 (states 2 problems)
HADOOP-5133
HADOOP-4379
HADOOP-4692
The problem is *consistent handling incomplete blocks* or another way to say it *consistent handling block replication*.
Trying to fix the issues separately is imHo a mistake.
I am advocating to step back, get a general view, design a proposal, make sure it works and then fix issues.
Gradually changing the code may destabilize the main hdfs functionality, which is not related to synchs or appends.
Although I am not personally in favor of gradual changes in this particular case, you can still go this path 
e.g. in a separate branch with a potential to merging it back to trunk when it is done.
I think people should have a choice to sacrifice certain functionality in favor of reliability if they want to.
Gradual changes close this alternative.
I'll link this comment from the related issues.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-5027:
------------------------------------

    Priority: Major  (was: Blocker)

Demoted: 20 shouldn't have to wait for this.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.2
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-5027:
-------------------------------------

    Fix Version/s: 0.19.1

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664355#action_12664355 ] 

szetszwo edited comment on HADOOP-5027 at 1/15/09 5:34 PM:
-------------------------------------------------------------------------

> May be the problem is not with the block report, but rather with nextGenerationStamp(Block), ...

nextGenerationStamp(Block) should not update the block information in the NameNode since *the update may fail*.  If the NameNode block gs is updated but the DataNode block gs is not and the update fails, the the block in the DataNode will be considered as invalid.

> I also would like to see a unit test catching this bug.

Sure, working on one.

      was (Author: szetszwo):
    > May be the problem is not with the block report, but rather with nextGenerationStamp(Block), ...

nextGenerationStamp(Block) should not update the block information in the NameNode since *the update may fail*.  If the NameNode block gs is updated but the DataNode block gs is not and the update fails.  The the block in the DataNode will be consider as invalid.

> I also would like to see a unit test catching this bug.

Sure, working on one.
  
> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664372#action_12664372 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

> I meant all the changes go through NN.

It is currently going through NN.  BTW, you may want to take a look the javadoc in LeaseManager.  It describes how the generation stamps are updated.

> Note that a patch was attached but the fix is not described anywhere in the jira. Finally I read the patch.

The fix is described in the description and [Dhruba's comment|https://issues.apache.org/jira/browse/HADOOP-5027?focusedCommentId=12663862#action_12663862].  If you need every detail of the fix, I think reading the patch is the best way.  I am glad that you have already done so.

> Looking at the patch now, it looks like you do (correctly) consider it an error if reported gs is larger than stored gs.From the description of the jira, I thought NN might consider new reported gs as the correct one. 

We are considering the case that the blocks are being written.  Sorry that it is not clear.

> Btw, the patch just ignores such a block with a warning..., should it also delete such a block?

This case should not happen.  If it happens, it indicates that there are bugs somewhere.  Without knowing what is going on, it is better not delete the block.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664356#action_12664356 ] 

Raghu Angadi commented on HADOOP-5027:
--------------------------------------

> The client data-node should call name-node to obtain a new generation stamp any way, I believe this what Raghu meant?

Yes. I thought all the gen-stamp changes are done by NN. 

It is a bit scary if generation can be changed by clients and datanodes. That would be very different from rest of HDFS metadata management.


> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667771#action_12667771 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

>When I open a file for append the last block of the file is copied from the main storage to a tmp directory to be tre

The block also gets a new generation stamp. 

> And the name-node should mark it as under-replicated triggering replication process

Namenode does not trigger re-replication for blocks that belong to files that are being written (or appended to).

Let me try to explian this with an example. Suppose a file has one block blk_10_25. The generation stamp of this block is 25. Suppose this block is not full and is the only block in the file. Suppose a client wants to append a few bytes to this file. The client instructs the datanode to stamp this block with a generation stamp of 26. The block now becomes blk_10_26 and is moved from the real block directory to "blockBeingWritten" directory (see HADOOP-4663). The namenode metadat is also updated to record the block for this file to be blk_10_26. Block reports from this datanode will not have the block blk_10_25, but this is not a problem blk_25_10 is not a valid block anymore. These details are explained in th design doc for appends (HADOOP-1700).

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664362#action_12664362 ] 

Raghu Angadi commented on HADOOP-5027:
--------------------------------------

> What do you mean "done by NN"?

I meant all the changes go through NN. 

Note that a patch was attached but the fix is not described anywhere in the jira. Finally I read the patch.

Looking at the patch now, it looks like you do (correctly) consider it  an error if reported gs is larger than stored gs.From the description of the jira, I thought NN might consider new reported gs as the correct one.  Btw, the patch just ignores such a block with a warning..., should it also delete such a block?



> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665346#action_12665346 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

When a client is creating a block for the first time, the block is not being replicated by the NN. Only when the client is done writing to that block and the block is finalized does the NN start to replicate the block (if needed).

The current design is to apply the above design to pre-existing block that is being opened for "append". The NN will not replicate this block until the appending-client is completely done with that block.

I agree with Rob that the availability guarantees of a block being re-opened for write is not the same as a block that nobody is writing to. But "append" is a new feature and HADOOP-1700 states that this guarantee is acceptable. Rob's request is valid, but we can address that as an improvement over the current design and attack it in a seperate JIRA. Does that sound reasonable?

This patch is imperative to make "appends" work as designed for 0.19. 

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663875#action_12663875 ] 

Raghu Angadi commented on HADOOP-5027:
--------------------------------------

Doesn't NN already know what the latest genstamp should be? All the genstamp changes are made by NN.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664415#action_12664415 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

I like Raghu's comment to make the warning message something like "This is an Unexpected error, please report back to  the Hadoop development community" or something like that,

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664774#action_12664774 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

As suggested by Rob, if the reported block is ignored in NN, the data available guarantee will not be as good as before.   Specifically, the existing data stored in the last block being reopened may not be replicated.  I think we need to work on the design first.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-5027:
-------------------------------------

    Priority: Blocker  (was: Major)

I would like to add this to the Blocker list so that we can get a fix for this one in the next release of 0.19.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667782#action_12667782 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

> Block reports from this datanode will not have the block blk_10_25

The block report from the data-node will not have neither blk_10_25 nor blk_10_26 or any block with id 10.
But the name-node has this block in its block map. It will compare the new block report with what it thinks the data-node should have and will find out that the data-node does not have this replica anymore see reportDiff(), and will remove that replica location. Do I miss anything? 
I am looking at the code: NN calls {{removeStoredBlock()}} for this block, which removes it from the {{DatanodeDescriptor}} and calls {{updateNeededReplications()}}.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664351#action_12664351 ] 

shv edited comment on HADOOP-5027 at 1/15/09 5:29 PM:
----------------------------------------------------------------------

May be the problem is not with the block report, but rather with {{nextGenerationStamp(Block)}}, which does not update generation stamp for the block. The -client- _data-node_ should call name-node to obtain a new generation stamp any way, I believe this what Raghu meant? So why do we not changing the gen stamp of the block.
I also would like to see a unit test catching this bug.

      was (Author: shv):
    May be the problem is not with the block report, but rather with {{nextGenerationStamp(Block)}}, which does not update generation stamp for the block. The client should call name-node to obtain a new generation stamp any way, I believe this what Raghu meant? So why do we not changing the gen stamp of the block.
I also would like to see a unit test catching this bug.
  
> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663605#action_12663605 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

This is not a problem if both sync and append are not used.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663862#action_12663862 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

A block-report processing should ignore blocks if it is the last block in an file that is under construction. 

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664355#action_12664355 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

> May be the problem is not with the block report, but rather with nextGenerationStamp(Block), ...

nextGenerationStamp(Block) should not update the block information in the NameNode since *the update may fail*.  If the NameNode block gs is updated but the DataNode block gs is not and the update fails.  The the block in the DataNode will be consider as invalid.

> I also would like to see a unit test catching this bug.

Sure, working on one.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664351#action_12664351 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

May be the problem is not with the block report, but rather with {{nextGenerationStamp(Block)}}, which does not update generation stamp for the block. The client should call name-node to obtain a new generation stamp any way, I believe this what Raghu meant? So why do we not changing the gen stamp of the block.
I also would like to see a unit test catching this bug.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664369#action_12664369 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

> nextGenerationStamp(Block) should not update the block information in the NameNode since the update may fail.

The update has already failed, that is why you do the recovery in the first place, right? So NN already may have a wrong generation stamp. I still don't understand why nextGenerationStamp(Block) cannot change Block's gen stamp.

Another thing if you change {{blockReport}} to understand generation stamps, then the same should be done for {{blockReceived()}}.

I have always been under the impression that generation stamp plays role only during lease recovery, and that blockReport and blockReceived distinguish blocks by their {{id = <block#> + <genStamp>}}. I would rather keep it that simple if possible.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663895#action_12663895 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

> Doesn't NN already know what the latest genstamp should be?

When a block is being updated in lease recovery, some replicators might have a gs larger then the one stored in the NameNode.  If the update fails, the latest gs may not be committed to the NameNode.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664358#action_12664358 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5027:
------------------------------------------------

> I thought all the gen-stamp changes are done by NN. 

What do you mean "done by NN"?

> It is a bit scary if generation can be changed by clients and datanodes...

Clients cannot change gs.  They only can request a update.  Then, the selected primary datanode will request a generation stamp from the namenode.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667893#action_12667893 ] 

dhruba borthakur commented on HADOOP-5027:
------------------------------------------

I see your point. Thanks.

One solution may be to make reportDiff() add a block to the "toRemove" data structure only if it belongs to a file that is not under construction. Will that work?

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5027) Block report processing should compare gneration stamp

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667731#action_12667731 ] 

Konstantin Shvachko commented on HADOOP-5027:
---------------------------------------------

Since this is about block reports in the context of appends the following question should be appropriate here. May be also related to "tmp" directories discussion in HADOOP-4663.
When I open a file for append the last block of the file is copied from the main storage to a tmp directory to be treated as if it is just being created. Then on the next blockReport since the block is not in the main storage it will not be reported to the name-node. And the name-node should mark it as under-replicated triggering replication process. Replication will not be able to proceed as all replicas on other machines are in the same state, that is in tmp directory. Is that a problem?
The main problem here I think is that the name-node removes the location of the replica if it was not reported during a block report. If all 3 nodes report that they don't have the block, the block will become not readable, that is the name-node cannot give clients any locations of the block.

> Block report processing should compare gneration stamp
> ------------------------------------------------------
>
>                 Key: HADOOP-5027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>             Fix For: 0.19.1
>
>         Attachments: 5027_20090114.patch
>
>
> If a reported block has a different generation stamp then the one stored in the NameNode, the reported block will be considered as invalid.  This is incorrect since blocks with larger generation stamp are valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.