Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2008/11/15 01:56:44 UTC

[jira] Created: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Datanode should delete files under tmp when upgraded from 0.17
--------------------------------------------------------------

                 Key: HADOOP-4663
                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.18.0
            Reporter: Raghu Angadi
            Assignee: dhruba borthakur
            Priority: Blocker
             Fix For: 0.18.3



Before 0.18, when the Datanode restarts, it deletes files under the data-dir/tmp directory, since these files are not valid anymore. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:

- remove the tmp files during upgrade, or
- if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.

Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.

Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4663:
-------------------------------------

    Attachment: deleteTmp.patch

Here is the first version of the patch. 



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668637#action_12668637 ] 

Hairong Kuang commented on HADOOP-4663:
---------------------------------------

I still feel very uncomfortable about promoting blocks under "blocksBeingWritten". Basically this changes the state of these blocks: the DataNode no longer knows that they are being written. In the most recent patch to HADOOP-4692, block replication behaves differently depending on the state of blocks. If a block being written is promoted to a permanent block, replication may wrongly delete an on-disk block whose length does not match the length recorded at the NN.



[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4663:
-------------------------------------

    Attachment: appendQuestions.txt

This document, appendQuestions.txt, attempts to answer some questions related to the "Append" feature for HDFS.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650078#action_12650078 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Hi Raghu, do you have any feedback on the patch I posted earlier? Thanks



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661690#action_12661690 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

Creating multiple "tmp" directories seems to be an indication of a design flaw.
The question is why we need to place blocks that will be recovered to the main storage no matter what in a temporary directory, instead of storing them in the main storage from the very beginning.
Also, I don't see any "proof of correctness" here. How do we know this will not lead to the same problems the initial code did, or to other ones?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661738#action_12661738 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

The namenode always discards replicas that are smaller in size than other replicas. It always keeps the largest-size replica as the valid one. So, can this bug affect data integrity? 
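
As a toy rendering of that rule (invented names and values, not the NN's actual code path):

    import java.util.Arrays;

    // Toy example: the largest reported replica length is kept as valid;
    // anything shorter would be discarded. Purely illustrative.
    public class LargestReplicaWins {
      public static void main(String[] args) {
        long[] replicaLengths = {4096, 4608, 4095};
        long valid = Arrays.stream(replicaLengths).max().getAsLong();
        System.out.println("valid replica length = " + valid);
      }
    }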



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661732#action_12661732 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

> In fact, the blocks here are not temporary any more.

This is the key. They are not temporary, so why do we keep them in the tmp directory? Why can't DNs report them to the name-node? It is just a matter of reporting the right length, right?

> disable "sync" on 0.18.3, will be an API incompatibility from 0.18.2.

This is not an incompatibility. The feature does not work correctly in 0.18.2, and it will still not work correctly in 0.18.3, with the advantage of not causing all these problems.
I think removing everything under tmp, as Raghu proposes, is the right solution for 0.18.3.
We should talk about the "real" fix in terms of 0.19 and up. This seems to be the consensus among colleagues around me.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662104#action_12662104 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Filed HADOOP-4997 for a temporary workaround.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668266#action_12668266 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

An offline discussion with Hairong led us to conclude that a special-purpose block report (for blocks in the blocksBeingWritten directory) needs to be sent by the datanode at start-up time. The Namenode has to process this report specially: it should not insert these blocks into blocksMap; instead, it should update the targets of the last blocks of filesUnderConstruction.

Given the "special" needs of the above, I think it is better if we promote the blocks from the "blocksBeingWritten" directory into the main data directory (after matching/truncating the sizes of block files and their crc files).

So, I propose that we do the following:
At start-up, the DN matches the blocks in the "blocksBeingWritten" directory against their meta files. If the sizes do not match, the data file is truncated to match the length described by the CRCs in the metafile. This ensures that the block is likely to be a valid one. Then, these blocks are promoted to the main block directory. (The generation-stamp protocol will detect inconsistent replicas during lease recovery.)
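
To make the length-matching step concrete, here is a minimal sketch, assuming a CRC32 meta file with a small fixed header and one 4-byte checksum per 512-byte chunk; the header size, chunk size, and class name are assumptions, not the patch's code:

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Sketch: truncate a block file to the largest length covered by the
    // checksums in its meta file. All constants below are assumptions.
    public class TmpBlockReconciler {
      static final int META_HEADER_SIZE = 7;     // assumed meta header bytes
      static final int BYTES_PER_CHECKSUM = 512; // assumed chunk size
      static final int CHECKSUM_SIZE = 4;        // CRC32

      static void reconcile(File blockFile, File metaFile) throws IOException {
        long chunks = (metaFile.length() - META_HEADER_SIZE) / CHECKSUM_SIZE;
        long maxValidLength = chunks * BYTES_PER_CHECKSUM;
        if (blockFile.length() > maxValidLength) {
          RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
          try {
            raf.setLength(maxValidLength); // drop bytes with no checksum
          } finally {
            raf.close();
          }
        }
      }
    }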



[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-4663:
------------------------------------

    Fix Version/s:     (was: 0.18.3)
                   0.19.1

Broad consensus (Dhruba, Nicholas, Sanjay, Konstantin, Rob) that this is not appropriate for 0.18.3. HADOOP-4997 will substitute for this in 0.18.3.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668137#action_12668137 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

[See related comment here|https://issues.apache.org/jira/browse/HADOOP-5027#action_12668136]



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668203#action_12668203 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

@Raghu: The hard-limit lease timeout is currently one hour. The NN will start lease recovery after 1 hour. The lease recovery process locates the right replicas (with the same size), bumps up the generation stamp on all those datanode(s), and persists the new block id (which has the new generation stamp) into the Inode. Since the inode map now has a new block id (because of the change in gen stamp), all old replicas with the old generation stamp do not belong to any inode. If such an old block checks in with the NN via a block report, it will get deleted. Does that sound right?
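
A toy model of that bookkeeping may help (the class, keys, and map here are invented, not the NN's actual data structures):

    import java.util.HashMap;
    import java.util.Map;

    // Toy model: the inode references its block by (id, generation stamp).
    // After lease recovery bumps the stamp to 2, a replica still reporting
    // stamp 1 matches no inode and would be scheduled for deletion.
    public class StaleGenStampDemo {
      public static void main(String[] args) {
        Map<String, String> blocksMap = new HashMap<String, String>();
        blocksMap.put("blk_42_gs2", "inode:/user/foo/bar"); // new gen stamp

        String reported = "blk_42_gs1"; // replica with the old gen stamp
        if (!blocksMap.containsKey(reported)) {
          System.out.println(reported + " belongs to no inode -> delete");
        }
      }
    }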

@Hairong: I get your proposal. I am starting to like it a lot!
> When a DataNode starts up, it reads blocks under tmp and puts them into the OngoingCreates data structure
Do we really need to do this? Even if we leave the blocks in the tmp directory, it will be OK, won't it? A block report won't contain such a block, but why does that matter, given that NN block-report processing always ignores the last block of a file under construction? Then, when lease recovery occurs, the block could move from the tmp directory to the real block directory.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669101#action_12669101 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I have been thinking about Hairong's suggestion as well. I agree that it makes sense for the blocks inside the "blocksBeingWritten" directory not to be auto-promoted, and instead to have lease recovery promote them on demand. The only problem I had with this approach is that a "special" block report is needed.

What if a block report always carries two counters up front? The first counter lists the number of normal blocks in the block report; the second gives the number of blocks that were picked up from the "blocksBeingWritten" directory. While processing a block report, the NN would first look at the first counter and process that many blocks as usual, then look at the second counter and special-process that many blocks.
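
A rough sketch of that framing (field names and encoding are invented for illustration; this is not the actual DatanodeProtocol):

    import java.io.DataOutput;
    import java.io.IOException;

    // Sketch: a block report carrying two counts up front, followed by the
    // finalized block ids and then the blocksBeingWritten block ids.
    public class TwoCounterBlockReport {
      static void write(DataOutput out, long[] finalized, long[] beingWritten)
          throws IOException {
        out.writeInt(finalized.length);    // counter 1: normal blocks
        out.writeInt(beingWritten.length); // counter 2: bbw blocks
        for (long id : finalized) {
          out.writeLong(id);               // NN processes these as usual
        }
        for (long id : beingWritten) {
          out.writeLong(id);               // NN special-processes these
        }
      }
    }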



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651101#action_12651101 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4663:
------------------------------------------------

Hi Raghu, I have not been following this issue closely enough. Why does the "datanode not know which files under tmp are invalid since 0.18"?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661673#action_12661673 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

A few comments:

- Since only the blocks that were part of writes could be valid, is it better to create them in a "tmp_valid" or "tmp_writes" directory?
- The rest of the blocks in tmp will be deleted automatically. This fixes the rebalancer issue as well. We cannot leave any possible corruption.

- It is crucial that any leftover files not cause the corruption we have already seen.
    -- A simpler and better alternative for 0.18.3 might be to just say sync() is not supported and remove everything under tmp (same as 0.17).
    -- This way, it gives this jira more time to make sure there are no cases where it could cause corruption.

- The patch moves all the files to the top-level directory. But datanode data directories are organized to limit each directory to 64 blocks and 64 subdirectories (see the sketch after this list). The current policy violates that; over time, datanodes could end up with a very large number of files in the top-level directory.
   -- This is not very important to fix in 0.18, but it should be fixed in 0.19 or another line. A separate jira might help.
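
The fan-out policy referenced above works roughly like this sketch (the 64/64 limits match the numbers mentioned; the directory-picking logic itself is a simplification, not the DataNode's code):

    import java.io.File;
    import java.io.FilenameFilter;

    // Sketch: keep at most 64 block files per directory by descending into
    // one of up to 64 subdirectories when the current level is full.
    public class BlockDirPicker {
      static final int MAX_BLOCKS_PER_DIR = 64;
      static final int MAX_SUBDIRS = 64;

      static File pickDir(File dir, long blockId) {
        File[] blocks = dir.listFiles(new FilenameFilter() {
          public boolean accept(File d, String name) {
            return name.startsWith("blk_");
          }
        });
        if (blocks == null || blocks.length < MAX_BLOCKS_PER_DIR) {
          return dir;                                 // room at this level
        }
        File sub = new File(dir, "subdir" + (blockId % MAX_SUBDIRS));
        sub.mkdirs();
        return pickDir(sub, blockId);                 // recurse until room
      }
    }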



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649157#action_12649157 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Some description of what the patch does would be helpful for review.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661788#action_12661788 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

+1 from me for the 0.18.3 fix. We might need to disable one or two unit tests.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661809#action_12661809 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

Maybe it makes sense to create a separate jira for the 0.18.3 fix, so that we can close it when it's done and continue the discussion about the real fix here. Or vice versa.



[jira] Issue Comment Edited: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668255#action_12668255 ] 

dhruba edited comment on HADOOP-4663 at 1/28/09 4:16 PM:
-------------------------------------------------------------------

An offline discussion with Raghu resulted in this proposal (a slight modification of Hairong's proposal):

The Datanode, on startup, verifies the length of each block in the "blocksBeingWritten" directory against the length of its corresponding meta file (and truncates the block file if necessary to match the meta file). It inserts these blocks into ongoingCreates, then leaves them for the lease recovery process to promote. A block report does not contain these blocks, but that is OK: NN block-report processing always ignores the last block of a file under construction.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664338#action_12664338 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

I think the assumption Dhruba is making about my motivation is not true.
The problem is that the proposed *solution does not work*. 
And I said that in the first line of that long comment above.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650871#action_12650871 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

The other bug is HADOOP-4702.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649170#action_12649170 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

When a Datanode restarts, it attempts to move all files from the "tmp" directory to the block directory.

What this patch does: if a file does not have a generation stamp associated with it, it is not moved. The generation stamp is extracted from the corresponding metafile. The metafile could be in the tmp directory itself, or it could already be in the block directory.
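
For illustration, here is a hedged sketch of that check, assuming metafiles are named "blk_<id>.meta" before 0.18 and "blk_<id>_<genstamp>.meta" from 0.18 on (the naming and helper are assumptions, not the patch itself):

    // Sketch: a metafile name without a second underscore has no generation
    // stamp, so it is in the pre-0.18 format and should not be moved.
    public class GenStampCheck {
      static final long NO_GEN_STAMP = -1;

      static long parseGenStamp(String metaName) {
        String base = metaName.substring(0, metaName.length() - ".meta".length());
        int first = base.indexOf('_');
        int second = base.indexOf('_', first + 1);
        if (second < 0) {
          return NO_GEN_STAMP;  // pre-0.18 format: candidate for deletion
        }
        return Long.parseLong(base.substring(second + 1));
      }

      public static void main(String[] args) {
        System.out.println(parseGenStamp("blk_12345.meta"));      // -1
        System.out.println(parseGenStamp("blk_12345_1001.meta")); // 1001
      }
    }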



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651105#action_12651105 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> Why "datanode to not know which files under tmp are invalid since 0.18"? 
It just does not. Thats why it moves them to normal directory (this jira and HADOOP-4702), and incorrectly 'recovers' (HADOOP-3574) etc. 

Why? Because it does not have anything to distinguish between valid and invalid. Note that before 0.18 everything there is invalid.. Since you guys implemented it, it is better for you to explain :-), I guess.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668208#action_12668208 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

bq. @Raghu: The hard-limit lease timeout is currently one hour. The NN will start lease recovery after 1 hour. The lease recovery process locates the right replicas (with the same size), bumps up the generation stamp on all those datanode(s), and persists the new block id (which has the new generation stamp) into the Inode. Since the inode map now has a new block id (because of the change in gen stamp), all old replicas with the old generation stamp do not belong to any inode. If such an old block checks in with the NN via a block report, it will get deleted. Does that sound right?

From the scenario I outlined above, will it work? All the blocks have the same gen-stamp and (slightly) different lengths. For that scenario, could you tell how it ends up with a valid block? Sorry, you might have already answered this in the comment above, but I could not easily see it.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668232#action_12668232 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> The primary datanode then stamps all those "valid" replicas with a new generation stamp and updates the namenode inode list with this new generation stamp for this block.

Thanks, that is useful. I am still trying to deduce whether you are implying there will be corruption or a valid replica.

As I see it, in this case there will be one replica of 'x+5' bytes with the new gen-stamp. But given that the checksum does not match (no sync to the native filesystem is done, and all three datanodes are killed while they are in the write() system call), it looks like we end up with corruption, right? Am I missing something?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648279#action_12648279 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Block verification failure was actually a fortunate side effect that detected this bug. It is not a fix. Marking invalid blocks as valid looks to me like real data corruption and should not be allowed at all.




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669096#action_12669096 ] 

Sanjay Radia commented on HADOOP-4663:
--------------------------------------

I really like Hairong's suggestion ( https://issues.apache.org/jira/browse/HADOOP-4663?focusedCommentId=12668127#action_12668127 ) to *keep* the DN tmp blocks as ongoingCreates and send the special BR to the NN. This is symmetric to the inodes-under-construction and lease recovery on the NN.


The main issue I had with the older approaches was the inconsistency:
* we added the notion of tmp because we didn't want to send these blocks as part of a BR to the NN,
* but if the DN restarted, all blocks in tmp were moved to the main directory and included in the BR anyway.

Hairong's suggestion keeps the semantics of tmp the same across reboots of the DN. This is very clean, even though it adds additional code and a new "blocks under construction" BR. Furthermore, it allows us to verify that these blocks match those under construction on the NN side. Our past attempts at sync/append and at fixing this bug have been unsuccessful because, I think, we were trying to be too clever.

The problem I have with Dhruba's suggestion is that it retains the inconsistency I mention above and somehow appears to be trying to avoid the special BR. If, on a reboot, the DN moves some blocks from tmp to main (after doing the validations Dhruba suggested), why have them in tmp in the first place?

One could consider never sending this special BR at all. This does not work, because the blocks in tmp may never get cleaned in some circumstances. For example, if a NN is restarted from an older fsimage, the tmp files in the DNs will never be removed.

So +1 for Hairong's suggestion.
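
A rough sketch of what the split report could look like, assuming a hypothetical two-part RPC (NamenodeStub and both method names are invented for illustration, not the actual DatanodeProtocol):

    interface NamenodeStub {
      void blockReport(long[] blockIds);                   // normal BR
      void blocksUnderConstructionReport(long[] blockIds); // special BR
    }

    class BlockReportSketch {
      long[] finalized;           // from the main block directory
      long[] underConstruction;   // kept in tmp / ongoingCreates

      // Finalized blocks go in the normal BR; tmp blocks go in the
      // special BR so the NN can match them against its
      // inodes-under-construction.
      void send(NamenodeStub nn) {
        nn.blockReport(finalized);
        nn.blocksUnderConstructionReport(underConstruction);
      }
    }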




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652485#action_12652485 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Regarding the patch:

- Did you test the patch?
- It looks like it still moves meta files to the data directory (while removing the actual block file).
- Not sure why we check the datanode's data directory for the metadata file; even if it exists, it would mostly be in some subdirectory under the data dir.
     -- Related to the above: from the logic, it looks like this might leave more than one metadata file in



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662154#action_12662154 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------


Michael Stack from HBase confirmed that HBase on 0.18 does not use sync.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665493#action_12665493 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------


> The data corruption you have seen occurred because the generation-stamp-update protocol is not triggered
> during a block transfer request. This patch correctly handles block-transfer requests and should prevent the data
> corruption issue from occurring.

Many different types of data corruption occurred recently with 0.18, mainly because of a combination of bugs.

The corruption caused by the issue in this jira has little to do with generation stamps for transfers. The primary cause is this:

# The DN promotes all files created in 0.17 from the /tmp directory, which it should never have done.
# When it moved the files, it did not generate a gen stamp for the metadata files.
# The DN reports those blocks as valid to the NN.
# Later, the DN marks these files as corrupt since there is no metadata.

The above is one of the biggest sources of corruption. Various other bugs contributed as well; many of them were fixed in 0.18.3.
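
For reference, step 2 is visible in the metadata file names themselves. A sketch of the pre-0.18 check suggested in the description, assuming the usual naming convention (pre-0.18: "blk_<id>.meta"; 0.18 onwards: "blk_<id>_<genstamp>.meta"); the regexes are illustrative:

    import java.util.regex.Pattern;

    class MetaNameSketch {
      static final Pattern PRE_18 =
          Pattern.compile("blk_(-?\\d+)\\.meta");
      static final Pattern WITH_GEN_STAMP =
          Pattern.compile("blk_(-?\\d+)_(\\d+)\\.meta");

      // A tmp file in pre-0.18 format has no generation stamp and, per
      // this jira, should be deleted rather than promoted.
      static boolean isPre18Meta(String fileName) {
        return PRE_18.matcher(fileName).matches();
      }
    }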




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661698#action_12661698 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

When a client is writing to a block on the datanode, it is good to create it in the "tmp" directory instead of the real block directory. This means that the datanode has a record that these blocks are not yet finalized and might need some recovery. In the typical case, the existence of a block in the "tmp" directory means that the block is not yet confirmed to the namenode. In the current code, it is the client that actually triggers the recovery.

Like Raghu proposed, it might be a good idea to rename the original "tmp" directory to "tmp_writes". In fact, the blocks here are not temporary any more; it could be named "blocks_inflight_dir" or something like that. However, if we change it now, won't we need an upgrade that can handle the case where the "tmp" directory existed?

@Raghu: if we disable "sync" in 0.18.3, that will be an API incompatibility with 0.18.2. Is that acceptable?





[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4663:
-------------------------------------

    Attachment: deleteTmp_0.18.patch

Patch for 0.18. It throws an exception from the fsync() call and deletes all blocks from the "tmp" directory of the data dir.
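
A minimal sketch of the two pieces described above (method placement and messages are illustrative, not the patch itself):

    import java.io.File;
    import java.io.IOException;

    class DeleteTmp018Sketch {
      // Stopgap sync(): fail fast rather than promise persistence that
      // the 0.18 datanode cannot deliver.
      void sync() throws IOException {
        throw new IOException("sync() is disabled in this release");
      }

      // On startup, wipe data-dir/tmp instead of promoting its files.
      static void clearTmp(File dataDir) {
        File[] files = new File(dataDir, "tmp").listFiles();
        if (files == null) {
          return;              // no tmp directory; nothing to clean
        }
        for (File f : files) {
          f.delete();          // real code would log failures here
        }
      }
    }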



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668255#action_12668255 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

An offline discussion with Raghu resulted in this proposal (a slight modification to Hairong's proposal):

The Datanode, on startup, verifies the length of each block in the "blocksBeingWritten" directory against the corresponding meta file's length (and truncates the block file if necessary to match the meta file). It inserts these blocks into ongoingCreates. It then leaves those blocks for the lease recovery process to promote them. A block report does not contain these blocks, but that is OK: the NN block report processing always ignores the last block of a file under construction.
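
A sketch of the length reconciliation, assuming the 0.18-era meta file layout (a 7-byte header followed by one 4-byte CRC32 per 512-byte chunk; all three constants are assumptions, since the real values come from the meta header):

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    class TruncateSketch {
      static final int HEADER_LEN = 7;            // assumed header size
      static final int CHECKSUM_LEN = 4;          // CRC32 per chunk
      static final int BYTES_PER_CHECKSUM = 512;  // assumed chunk size

      // Truncate the block file so that it never claims more data than
      // the meta file has checksums for.
      static void reconcile(File blockFile, File metaFile)
          throws IOException {
        long chunks = (metaFile.length() - HEADER_LEN) / CHECKSUM_LEN;
        long maxDataLen = chunks * BYTES_PER_CHECKSUM;
        if (blockFile.length() > maxDataLen) {
          RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
          try {
            raf.setLength(maxDataLen);
          } finally {
            raf.close();
          }
        }
      }
    }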



[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4663:
-------------------------------------

    Attachment: handleTmp1.patch

This patch stores all blocks that are part of a client write request in the blocksBeingWritten directory. Blocks that are part of replication requests are created in the blocksBeingReplicated directory.

A datanode restart removes all blocks from the "tmp" and blocksBeingReplicated directories and moves all blocks from the blocksBeingWritten directory into the real block directory.
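
In outline, the restart behaviour would look something like this (directory names as in the patch description, everything else illustrative):

    import java.io.File;

    class HandleTmpSketch {
      // Replication scratch space is discarded; client writes survive.
      static void onStartup(File dataDir) {
        deleteContents(new File(dataDir, "tmp"));
        deleteContents(new File(dataDir, "blocksBeingReplicated"));
        File bbw = new File(dataDir, "blocksBeingWritten");
        File current = new File(dataDir, "current");
        File[] files = bbw.listFiles();
        if (files == null) {
          return;
        }
        for (File f : files) {
          // Promote client-written blocks into the real block directory;
          // the real code also fixes up lengths and generation stamps.
          f.renameTo(new File(current, f.getName()));
        }
      }

      static void deleteContents(File dir) {
        File[] files = dir.listFiles();
        if (files != null) {
          for (File f : files) {
            f.delete();
          }
        }
      }
    }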



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666379#action_12666379 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

> Ok, so it is not about correctness, but rather a performance question.

It is about both. You did not answer my questions.
My point is that if the processing were correct, then incomplete blocks, whether they are generated by clients or during block replication, would be handled correctly; but they were not.
So fixing the problem for some of them (those generated during replication) will not solve the problem completely, because the client-generated incomplete blocks are still there.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674490#action_12674490 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

An offline discussion with Hairong, Sanjay, Rob Chansler, Raghu and Konstantin resulted in these observations. "bbw" refers to the "blocksBeingWritten" directory.

1. Leave blocks in the bbw directory even when the datanode restarts. Only when the block is finalized (when the user closes the block or lease recovery occurs) does the block move to the real block directory.
2. The first block report (following a datanode registration) sends all blocks, including blocks in bbw.
3. The block report processing on the namenode ignores blocks that are under construction.
4. Lease recovery should verify the CRC of a block before it gets promoted from bbw to the real block directory (see the sketch after this list).
5. When lease recovery occurs, the datanode should terminate writer threads before returning the length of the block.
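
A sketch of the CRC verification in observation 4, reusing the assumed meta layout from earlier in this thread (7-byte header, 4-byte CRC32 per 512-byte chunk; none of this is the actual datanode code):

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.zip.CRC32;

    class CrcCheckSketch {
      static final int HEADER_LEN = 7;            // assumed header size
      static final int BYTES_PER_CHECKSUM = 512;  // assumed chunk size

      // Compare each chunk of the block with its stored CRC32 before
      // the block is promoted out of bbw.
      static boolean blockMatchesMeta(File blockFile, File metaFile)
          throws IOException {
        DataInputStream meta =
            new DataInputStream(new FileInputStream(metaFile));
        DataInputStream data =
            new DataInputStream(new FileInputStream(blockFile));
        try {
          meta.skipBytes(HEADER_LEN);
          byte[] buf = new byte[BYTES_PER_CHECKSUM];
          long remaining = blockFile.length();
          while (remaining > 0) {
            int n = (int) Math.min(remaining, BYTES_PER_CHECKSUM);
            data.readFully(buf, 0, n);
            CRC32 crc = new CRC32();
            crc.update(buf, 0, n);
            if ((int) crc.getValue() != meta.readInt()) {
              return false;                       // corrupt chunk
            }
            remaining -= n;
          }
          return true;
        } finally {
          meta.close();
          data.close();
        }
      }
    }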





[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651094#action_12651094 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------


Nicholas has more details on the problem and thinks HADOOP-3574 would fix it. But I am not sure this approach of fixing each of the side effects as they are detected is a good one.

It looks very error-prone and wrong that, since 0.18, the datanode does not know which files under tmp are invalid. Do you not see that as a real problem?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668156#action_12668156 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

@Hairong: Your proposal should mostly work.

But I still do not understand the problem that can result from promoting blocks from the "blocksBeingWritten" directory when the datanode restarts. The generation-stamp protocol takes care of distinguishing bad replicas from good ones.

The design is that the NN block report processing (i.e., addStoredBlock) should completely ignore blocks that are under construction.
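
In pseudo-Java, the stated design is just an early-out in the report path (INode and BlockMap here are stand-ins for the NN's data structures):

    class AddStoredBlockSketch {
      interface INode {
        boolean isUnderConstruction();
      }
      interface BlockMap {
        INode getINode(long blockId);   // file the block belongs to
      }

      void addStoredBlock(BlockMap map, long blockId, String datanode) {
        INode file = map.getINode(blockId);
        if (file == null) {
          return;   // orphaned replica: schedule deletion (not shown)
        }
        if (file.isUnderConstruction()) {
          return;   // ignore entirely; lease recovery owns this block
        }
        // ... normal processing: update replica count, check
        // over/under-replication, etc.
      }
    }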



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661791#action_12661791 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Yes, some unit tests have to be disabled for 0.18.

I am still worried that the fix for 0.18.3 causes an API change; maybe HBase has to use it. Although I made the patch for 0.18, I am -1 on putting it into the general 0.18 release. I can put in config variables to disable "fsync" if needed.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668209#action_12668209 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> If such an old block checks in with the NN via a block report, it will get deleted. Does it sound right?

So we do lose that block. Isn't that a problem?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668127#action_12668127 ] 

Hairong Kuang commented on HADOOP-4663:
---------------------------------------

Is it possible for the DataNode to leave the blocks under tmp untouched at startup time? Instead, it leaves those blocks for the lease recovery process to promote them. When a DataNode starts up, it reads the blocks under tmp and puts them into the ongoingCreates data structure. It then reports them to the NN. If the NN sees a tmp block that is not the last block of an under-construction file, it marks it as corrupt; otherwise, this is really an under-construction block and the NN adds it to the targets set of the file. Later, when the file's lease expires, the NN will close the file and those blocks will be finalized.

The idea is to restart the DataNode in the same state it was in when it went down. Promoting blocks at startup time opens the possibility of polluting dfs data.
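
The startup path in this proposal might look roughly like this (ongoingCreates and the block-file naming are illustrative assumptions):

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;

    class OngoingCreatesSketch {
      final Map<Long, File> ongoingCreates = new HashMap<Long, File>();

      // Re-register tmp blocks as ongoing creates instead of promoting
      // or deleting them; the files themselves stay under tmp.
      void restoreTmpBlocks(File tmpDir) {
        File[] files = tmpDir.listFiles();
        if (files == null) {
          return;
        }
        for (File f : files) {
          String name = f.getName();
          if (name.startsWith("blk_") && !name.endsWith(".meta")) {
            long blockId = Long.parseLong(name.substring(4));
            ongoingCreates.put(blockId, f);
          }
        }
      }
    }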




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668200#action_12668200 ] 

Hairong Kuang commented on HADOOP-4663:
---------------------------------------

Dhruba, with the idea that I proposed, we still preserve the data under tmp. Block finalization is delayed until the file's lease expires; the NN will initiate lease recovery and close the file. The proposal also tries to restore the data structures on both the NN and DN side to what they were when dfs went down. So any synced data should be available to readers if it was available to a reader before dfs went down.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648264#action_12648264 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Actually, on second thought, this does not seem to be a bug. For example, let's say we have a cluster running the 0.20 release of Hadoop. When you restart a datanode, all files in the "tmp" directory need to be moved to the real block directory, the reason being that a FileSystem.sync() call demands that data once written to the datanode should persist. Thus, the problem you are seeing is not related to cluster upgrade.

> these files end up failing block verification and eventually get deleted. But cause incorrect over-replication at the namenode before that.

Maybe we should try to improve the situation here. Is it possible to put the blocks that got moved from "tmp" to the real block directory at the head of the block-to-verify list, so that their CRC validation occurs first?

Another optimization (that I do not particularly like) is for the client to tell the datanode whether it has invoked FileSystem.sync(). The Datanode can persist this information. At datanode startup time, the Datanode then moves only those blocks that were marked as "synced" earlier.
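
The first optimization is essentially a priority-queue tweak. A toy sketch of the idea (the block scanner's actual data structure is different):

    import java.util.ArrayDeque;
    import java.util.Deque;

    class VerifyQueueSketch {
      final Deque<Long> toVerify = new ArrayDeque<Long>();

      void scheduleNormally(long blockId) {
        toVerify.addLast(blockId);    // regular periodic order
      }

      // Blocks promoted from "tmp" jump the queue so that bad ones are
      // caught and deleted quickly instead of lingering as valid.
      void schedulePromoted(long blockId) {
        toVerify.addFirst(blockId);
      }
    }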




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668188#action_12668188 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Hairong, I understand your point of view. A bug in the generation-stamp protocol could cause data corruption.

The alternative is to not promote blocks to the main storage when the datanode restarts. In this case, the data that was written by the client to that datanode is lost. This is data written by an application, and it would be nice if HDFS made every effort to recover this data instead of deleting it. This is true of most other file systems. For example, if an application writes some data to an ext3 file and then dies before closing/fsyncing it, the OS/FS does not delete the data that was cached in the OS pages; it makes every effort to persist it. If we can have similar semantics for HDFS, it will be a more powerful system, won't it? An HDFS application that does not issue an "fsync" cannot rely on all the data it has written being persisted, but as long as HDFS makes a good effort to keep all the data, that will be nice, won't it?

So, this issue boils down to the tradeoff between "a file system that never persists data unless the writer explicitly invoked fsync" versus "complexity in the namenode, thereby introducing buggy code".

I can vote for one application that we run in-house that would definitely like the behaviour where HDFS makes every effort to persist data (rather than invoking sync frequently). HBase could use this feature too (in the future) to recover HBase transactions that lie beyond the sync point (good to have, not a hard requirement).





[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668196#action_12668196 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

@Raghu: In your test case, the NN knows that the file is under construction. Neither addStoredBlock() nor a block report should touch this block at all. The existence of a lease record implies that the NN has relinquished control of this file to the writer/datanode pair. Hence, the NN cannot make any decisions (corrupt block? replicate block? etc.) for this block.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664640#action_12664640 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Hi Konstantin, can you please explain (sorry, I was not able to understand it from your description even after reading it many times) why the proposed solution and patch do not work?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650706#action_12650706 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

I will review this today. Did you get a chance to file the bug on the rest of the issues related to files under the tmp directory? Could you link that bug to this one? I think these are serious issues, since they are essentially data corruptions.




[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659907#action_12659907 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Thanks Raghu. I will restart work on this one.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665356#action_12665356 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

In the "append world" (i.e 0.19), the semantics that we provide is that data written by a writer may be seen by other readers even if the writer has not invoked "sync". HDFS makes every effort (within reasonable cost limits) to persist data that is written to a file. Moreover, the "append" protocol uses the generation stamp of the block to accurately determine stale blocks, thus it is safe to promote blocks from the "blocksBeingWritten" directory to the real block directory.

The data corruption you have seen occured because the generation-stamp-update-procotol is not triggered during a block transfer request. This patch correctly handles block-trasfer-requests and should prevent the data corruption issue from occuring.






[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663958#action_12663958 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I think the issue that Konstantin is raising is based on the assumption that data that is not "synced" by a client *should not* appear in a file. This assumption is not true, as specified in the design document for Appends. The guarantee is that data written prior to a "sync" call will be seen by all new readers of the file.

Just as in any UNIX-y system, data that has not been "sync"-ed can still be seen by other concurrent readers.
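
A minimal sketch of that contract, using the 0.18/0.19-era client API (the path and the records written here are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SyncVisibility {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/sync-demo"));
        out.write("record 1\n".getBytes());
        out.sync();  // everything written before sync() is visible to new readers
        out.write("record 2\n".getBytes());
        // "record 2" may also be seen by concurrent readers even though it was
        // never sync-ed, but its survival across failures is not guaranteed.
        out.close();
      }
    }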



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650877#action_12650877 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I agree that we need to solve HADOOP-4702. 

> There is another case where, after an error, lease recovery succeeds one hour later on the datanode since the DN thinks its temporary files are good, resulting in another corrupt block.

Can you please explain this scenario in greater detail? If the client encountered an error and exited, then the write was never completed. How did it create a corrupt block?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665744#action_12665744 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

The append design explicitly states that the system should make every effort to persist data written to a file even if sync is not invoked. "sync" has some cost associated with it, and an application might not want to incur that cost on every write, but it would definitely like HDFS to make a best effort to persist the data it has written.

> it turned out to be error-prone

The design is not error-prone; it was the implementation that had some shortcomings. In particular, blocks that are part of replication requests should not have been promoted. This is fixed by this patch. If you are aware of other bugs in this area that can cause problems, it would be really nice if you could list them or create a unit test to trigger the bug. I understand that the "append" protocol is complex, but when implemented correctly it should be fool-proof.

> it adds performance overhead when sync or append is not used.

Are you concerned that renaming the block file from the blocksBeingWritten directory to the main block directory adds overhead to the write pipeline?



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668258#action_12668258 ] 

Hairong Kuang commented on HADOOP-4663:
---------------------------------------

This sounds good to me. In addition, the DN still needs to report to the NN all blocks in the tmp directory, and the NN either instructs the DN to delete a reported tmp block if it is not under construction or adds it to the target set of the file inode it belongs to. I assume that the NN does not persist the "targets" of an under-construction file.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661700#action_12661700 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> @Raghu: if we disable "sync" on 0.18.3, that will be an API incompatibility from 0.18.2. Is that acceptable?

I think that is ok. Of course we don't want to do such a thing often, but I am pretty sure most users would agree it is better to be safe w.r.t. data. At Y!, no one is advised to use sync(), and we are running with such a fix.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668204#action_12668204 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Hairong's proposal:

The DataNode leaves the blocks under tmp untouched at startup time; instead, it leaves those blocks for the lease recovery process to promote. When a DataNode starts up, it reads the blocks under tmp and puts them into the OngoingCreates data structure. A block report does not contain these blocks, but that is ok: NN block report processing always ignores the last block of a file under construction.
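
A rough sketch of how a DataNode might do this at startup; the ongoingCreates/ActiveFile names echo the proposal, but everything below is hypothetical, not the actual DataNode code:

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;

    class TmpBlockScan {
      static class ActiveFile {
        final File file;
        ActiveFile(File f) { file = f; }
      }

      final Map<String, ActiveFile> ongoingCreates = new HashMap<String, ActiveFile>();

      // On restart, record tmp blocks as still under construction instead of
      // finalizing them (0.18 behavior) or deleting them (pre-0.18 behavior).
      void scanTmp(File tmpDir) {
        File[] files = tmpDir.listFiles();
        if (files == null) {
          return;
        }
        for (File f : files) {
          String name = f.getName();
          if (name.startsWith("blk_") && !name.endsWith(".meta")) {
            ongoingCreates.put(name, new ActiveFile(f));
          }
        }
        // Block reports are built only from the finalized-block map, so these
        // blocks stay invisible to the NN until lease recovery promotes them.
      }
    }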



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668218#action_12668218 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

> From the scenario I outlined above, will it work? All the blocks have the same gen-stamp and (slightly) different lengths. For the given scenario above, could you tell how it ends up with a valid block?

This is the core of the generation-stamp recovery logic. The primary datanode contacts each of the datanodes in the pipeline and retrieves the generation stamp and the length of each replica. It picks as valid only those replicas whose generation stamp equals or exceeds the generation stamp stored in the NN. From these "valid" replicas, it picks only the replica(s) that have the smallest size. These are the "real valid" replicas of the block; the other replicas can be deleted.

The primary datanode then stamps all those "valid" replicas with a new generation stamp and updates the namenode inode list with this new generation stamp for this block.
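
A condensed sketch of that selection rule (the Replica holder below is a stand-in for illustration, not the actual recovery code):

    import java.util.ArrayList;
    import java.util.List;

    class ReplicaSelection {
      static class Replica {
        final long genStamp;
        final long length;
        Replica(long genStamp, long length) {
          this.genStamp = genStamp;
          this.length = length;
        }
      }

      // Keep replicas whose generation stamp equals or exceeds the stamp the
      // NN knows, then keep only those with the smallest length among them.
      static List<Replica> selectValid(List<Replica> reported, long nnGenStamp) {
        long minLen = Long.MAX_VALUE;
        for (Replica r : reported) {
          if (r.genStamp >= nnGenStamp && r.length < minLen) {
            minLen = r.length;
          }
        }
        List<Replica> valid = new ArrayList<Replica>();
        for (Replica r : reported) {
          if (r.genStamp >= nnGenStamp && r.length == minLen) {
            valid.add(r);  // these get stamped with a fresh generation stamp
          }
        }
        return valid;
      }
    }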



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647888#action_12647888 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Why not simply delete the block files that are not in the 'genstamp' format? This does not depend on the upgrade at all. This is how the DN handles assigning the default generation number initially.
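
A sketch of that check, assuming the post-0.18 on-disk convention that the meta file name blk_<id>_<genstamp>.meta carries the stamp (the helper below is illustrative, not existing DataNode code):

    import java.io.File;

    class TmpCleaner {
      // Delete tmp blocks whose meta files lack a generation stamp; pre-0.18
      // meta files are named blk_<id>.meta, post-0.18 ones blk_<id>_<genstamp>.meta.
      static void deletePre18TmpBlocks(File tmpDir) {
        File[] files = tmpDir.listFiles();
        if (files == null) {
          return;
        }
        for (File f : files) {
          String name = f.getName();
          if (name.matches("blk_-?\\d+\\.meta")) {   // old format: no genstamp
            String dataName = name.substring(0, name.length() - ".meta".length());
            new File(tmpDir, dataName).delete();     // the block data file
            f.delete();                              // the old-format meta file
          }
        }
      }
    }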



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665950#action_12665950 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

> The append design explicitly states that the system should make every effort to persist data

This is listed under the section "The non-goals of this design are:".
Which design are we talking about anyway? The document attached to HADOOP-1700 is 8 months behind the patch.

> In particular, blocks that are part of replication requests should not have been promoted.

Why? What makes them different from incomplete blocks that are a part of client creates? Same blocks.

> it adds performance overhead when sync or append is not used.

I am concerned that incomplete blocks will be promoted, then sent (reported) to the name-node, then processed there, and finally most of them will be removed. It is the name-node overhead that is the concern, not the data-node.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647779#action_12647779 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I would probably take the first approach of deleting the tmp files during the upgrade. Related JIRAs are HADOOP-3677 and HADOOP-2656



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648271#action_12648271 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

Dhruba, these files are invalid at least on 0.17, and should not be used on the next restart, no?



[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-4663:
---------------------------------

    Component/s: dfs


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665542#action_12665542 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

Append or not, if we want files to be visible to other clients, there should be logic which reads files in the tmp directory. And this is not related to promoting incomplete files. We do not have guarantees that data will survive crashes if it has not been sync-ed, even if another client has seen it.

> the "append" protocol uses the generation stamp to accurately determine stale blocks
> it is safe to promote blocks from the "blocksBeingWritten" directory

If the protocol worked correctly, it would be safe to promote any incomplete blocks, including those in blocksBeingReplicated, but it wasn't.

I would rather not promote unsynced files because
# it turned out to be error-prone
# it adds performance overhead when sync or append is not used.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666358#action_12666358 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

> Which design are we talking about anyway? The document attached to HADOOP-1700 is 8 months behind the patch.

The design document for Appends is still a valid document. It is true that the patch took a long time to develop.

> I am concerned that incomplete blocks will be promoted, then sent (reported) to the name-node, then processed there, and finally most of them will be removed. It is the name-node overhead that is the concern, not the data-node.

Ok, so it is not about correctness, but rather a performance question. I will run some tests on how much overhead this can add and will report my findings soon. The reason I like promoting blocks to the real directory (only when the datanode crashes) is that this is data that an application has written, and I would rather save it than delete it. From my viewpoint, the system should make every effort to persist this data, rather than saying "ok, you did not invoke sync, so you lose your data". (I remember a discussion with Sameer saying that it would be nice to have every new block allocation at the namenode be persisted, and persisting the block list at the namenode is useless if the datanode deletes blocks that were not closed anyway.)



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661802#action_12661802 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

If you don't throw an exception from fsync() then there is no API change. In this case fsync() will work as long as data-nodes/clients don't fail; it's just that some sync-ed data may not survive cluster restarts. So HBase people will be able to write their programs with fsync(), and it will be guaranteed to work when they upgrade to newer versions where this issue is fixed.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648295#action_12648295 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Yeah, I agree. In the general case, the generation stamp incompatibility will cause these invalid blocks to get deleted. But the 0.18 upgrade allocates a generation stamp of 0 for all old blocks. This is a bug that should be fixed.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650875#action_12650875 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

hmm.. these are all side effects of the basic problem: the Datanode does not know which files under /tmp are good and which are not. Before 0.18, all the files were bad.

My preference would be to fix this problem (in addition to HADOOP-4702), since it affects other things. There is another case where, after an error, lease recovery succeeds one hour later on the datanode since the DN thinks its temporary files are good, resulting in another corrupt block. There could be more.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668171#action_12668171 ] 

Hairong Kuang commented on HADOOP-4663:
---------------------------------------

I do believe that the generation-stamp protocol should work. But I do not like the idea of promoting blocks under the tmp directory. Blocks under tmp are under construction. Blindly finalizing them introduces polluted blocks into the system and depends entirely on the NN to clean them up. This design does not seem clean to me. Any minor error on the NN side might cost a lot. Bugs like HADOOP-4810 caused Yahoo to lose quite an amount of data and introduced problems like HADOOP-4692 and other problems whose cause we have not yet identified. I think it would be nice if DataNodes did not introduce pollution in the first place.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661742#action_12661742 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I will make a patch for 0.18, as Raghu/Konstantin suggested, that disables the sync API. Will port it shortly.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663042#action_12663042 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

Now that the 0.18.3 issues are handled by HADOOP-4997, let's discuss what we need to do for 0.19 and above. I propose that blocks created by client writes be created in the "tmp" directory, whereas blocks created by replication requests be created in the tmp_replication directory. On datanode restart, the blocks in the "tmp" directory are reclaimed, whereas the blocks in the tmp_replication directory are discarded.

The reason I propose to start client-generated blocks in the "tmp" directory (instead of the real block directory) is that these blocks are not yet confirmed to the namenode. They are still being written and ideally should not be included in any block reports.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647803#action_12647803 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

recoverTransitionRead invokes doUpgrade(). I plan to put a check in doUpgrade() to remove the tmp directories from all storage directories. It is not very elegant, because doUpgrade() now needs to know that each storage dir contains a "tmp" directory; if anybody has any other suggestion for the code layout, please let me know.
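
A minimal sketch of that check, assuming the upgrade code can iterate the storage directories (FileUtil.fullyDelete is the real helper; the surrounding method is illustrative):

    import java.io.File;
    import java.io.IOException;
    import org.apache.hadoop.fs.FileUtil;

    class UpgradeTmpCleanup {
      // Called from doUpgrade(): wipe the tmp directory of every storage dir,
      // since pre-0.18 tmp blocks are invalid after a restart anyway.
      static void deleteTmpDirs(File[] storageDirs) throws IOException {
        for (File sd : storageDirs) {
          File tmp = new File(sd, "tmp");
          if (tmp.exists() && !FileUtil.fullyDelete(tmp)) {
            throw new IOException("Unable to delete tmp directory " + tmp);
          }
        }
      }
    }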



[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-4663:
-------------------------------------

    Attachment: deleteTmp2.patch

When a datanode receives a replication request, it starts the block in a separate directory. These blocks will be cleaned up when the datanode restarts. The general algorithm is as follows:

1. When a datanode receives a block from a client, it starts the block in the datadir/tmp directory.
2. When a datanode receives a block as part of a replication request, it starts the block in the datadir/tmp_replication directory.
3. When a block is finalized, it moves from either of the above two temporary directories to its final location.
4. When a datanode restarts, it moves all the blocks from the datadir/tmp directory to their final location. These blocks were part of writes by clients, and the client-namenode-genstamp protocol will ensure that the most up-to-date replicas of the block are maintained.
5. When a datanode restarts, the blocks in the datadir/tmp_replication directory are removed. No recovery is needed for these blocks.

The question that remains is what we need to do for blocks that are moved by the rebalancer. I think the blocks that the rebalancer moves should go into tmp_replication as well; a sketch of the overall scheme follows.
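
A schematic of this lifecycle (directory names from the description above; the method bodies are illustrative, not the patch itself):

    import java.io.File;

    class BlockDirs {
      final File current;         // final block directory
      final File tmp;             // client writes (step 1)
      final File tmpReplication;  // replication / rebalancing (step 2)

      BlockDirs(File dataDir) {
        current = new File(dataDir, "current");
        tmp = new File(dataDir, "tmp");
        tmpReplication = new File(dataDir, "tmp_replication");
      }

      // Step 3: finalizing moves a block from its staging dir to the final location.
      boolean finalizeBlock(File staged) {
        return staged.renameTo(new File(current, staged.getName()));
      }

      // Steps 4 and 5: on restart, recover client blocks, drop replication blocks.
      void onRestart() {
        for (File f : listOrEmpty(tmp)) {
          // genstamp recovery will later sort out stale replicas among these
          f.renameTo(new File(current, f.getName()));
        }
        for (File f : listOrEmpty(tmpReplication)) {
          f.delete();  // the NN will simply re-schedule the replication
        }
      }

      private static File[] listOrEmpty(File dir) {
        File[] fs = dir.listFiles();
        return fs == null ? new File[0] : fs;
      }
    }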




[jira] Updated: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-4663:
--------------------------------

    Fix Version/s:     (was: 0.19.1)
                   0.19.2

As discussed on core-dev@ (http://www.nabble.com/Hadoop-0.19.1-td21739202.html) we will disable append in 0.19.1. Moving these append-related issues to 0.19.2.



[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667953#action_12667953 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

> will not solve the problem completely because the client-generated incomplete blocks are still there

@Konstantin: it appears that you feel that applying this patch might still leave a hole, but I am unable to visualize a scenario where this could cause a bug. I think the generation-stamp protocol is adequate. I would really appreciate it if you could write a unit test that demonstrates this bug (on trunk plus the patch attached to this JIRA).
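
For reference, a skeleton of the kind of test being requested; every name in it is hypothetical, since the real test would be written against the project's mini-cluster test harness, whose API differs across versions:

{code}
import junit.framework.TestCase;

// Skeleton only: outlines the scenario under discussion without
// committing to a specific test-harness API. All names are hypothetical.
public class TestTmpBlockPromotion extends TestCase {
    public void testUnsyncedBlockIsNotPromoted() throws Exception {
        // 1. start a small cluster and begin writing a block, never calling sync()
        // 2. kill the datanode mid-write, leaving a partial replica in
        //    the blocks-being-written area
        // 3. restart the datanode
        // 4. assert that the partial replica was deleted rather than
        //    promoted and reported to the namenode as a valid block
    }
}
{code}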

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662146#action_12662146 ] 

Owen O'Malley commented on HADOOP-4663:
---------------------------------------

For 0.18.3, we absolutely can *not* throw on fsync. It would be much better to silently do nothing. There are too many applications that throwing would break.
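
Concretely, something like the following no-op behavior (a hedged sketch; the class and method names are invented, not the real client code):

{code}
import java.io.IOException;

// Hedged sketch of the suggested 0.18.3 behavior: when sync support is
// off, fsync() silently does nothing instead of throwing, so existing
// applications that call it keep working. All names here are invented.
class OutputStreamSketch {
    private final boolean syncSupported;

    OutputStreamSketch(boolean syncSupported) {
        this.syncSupported = syncSupported;
    }

    void fsync() throws IOException {
        if (!syncSupported) {
            return; // silent no-op rather than an exception
        }
        flushToPipeline();
    }

    private void flushToPipeline() throws IOException {
        // stand-in for the real datanode-pipeline flush, out of scope here
    }
}
{code}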

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.18.3
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665224#action_12665224 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

Hi Dhruba. There were several issues (we actually had a data loss) caused by promoting blocks from tmp to the main storage. One of them is HADOOP-4702, which was probably filed while you were on vacation.
The problem was that partial blocks were promoted to the main storage even though they were transient.
I am arguing that your solution does not eliminate this condition: blocksBeingWritten can still contain transient blocks, and they will still be promoted to the main storage. A simple example is when you write a block (without using sync()) and the DN fails in the middle, leaving an incomplete (transient) block in the blocksBeingWritten directory. This block will be moved to the main storage upon DN restart, although it should be treated exactly as the blocks in the blocksBeingReplicated directory are, because neither the block nor any part of it was ever finalized.
Does that make more sense?
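
To make the distinction concrete, a sketch (none of these names come from the actual code, and how the datanode would know a replica was ever sync-ed is exactly the open question, so it is a placeholder here):

{code}
import java.io.File;

// Illustrative sketch: a replica in blocksBeingWritten deserves to
// survive a restart only if some part of it was actually sync-ed;
// otherwise it is just as transient as a half-replicated block.
class BlocksBeingWrittenPolicy {
    interface SyncLog {
        boolean wasEverSynced(File replica); // hypothetical persistent record
    }

    static void applyOnRestart(File blocksBeingWritten, File mainStorage, SyncLog log) {
        File[] replicas = blocksBeingWritten.listFiles();
        if (replicas == null) return;
        for (File replica : replicas) {
            if (log.wasEverSynced(replica)) {
                replica.renameTo(new File(mainStorage, replica.getName())); // promote
            } else {
                replica.delete(); // never sync-ed: treat like blocksBeingReplicated
            }
        }
    }
}
{code}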

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668177#action_12668177 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

@Dhruba, could you check whether the following analysis of a possible corruption is correct?

Say a block is being written and the generation stamp is consistent across all three datanodes (the common case):

 * Say the block sizes after a cluster restart are x+5, x+10, and x+15 (on the three datanodes respectively). But this does not mean the checksum file is correct, since a DataNode can be killed at any time and even the OS can restart.
 * When the datanodes join the cluster, the blocks on D1 and D2 will be deleted since they are smaller than the block on D3. Later the block on D3 will be reported as corrupt since the checksums don't match. This is hard corruption.
 * Note that we will lose any data on the block that was synced earlier.

> I am unable to visualize a scenario where this could cause a bug.

That does not imply it is correct. The most important job of HDFS is to keep data intact; that should take priority over new features or schedules. IMHO the current approach of _"it is correct until proven wrong"_ does not really suit critical parts. For example, a couple of months back there were no known corruption issues, but we later saw many such issues.

I am not saying we can prove everything. But we should be conservative and do only what is already known to be correct. In this case, the block files are valid only up to the last sync, so truncating to the last sync (or to some length known to be good) is better.
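
For illustration, a minimal sketch of that conservative truncate-to-the-last-sync recovery. All names are hypothetical, and a real implementation would also have to truncate the checksum (.meta) file to the matching number of checksum chunks:

{code}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hedged sketch of truncating a replica back to the last length known
// to have been synced, discarding any bytes whose integrity is unknown.
class BlockTruncator {
    static void truncateToLastSync(File blockFile, long lastSyncedLength)
            throws IOException {
        RandomAccessFile raf = new RandomAccessFile(blockFile, "rw");
        try {
            if (raf.length() > lastSyncedLength) {
                raf.setLength(lastSyncedLength); // drop bytes past the last sync
            }
        } finally {
            raf.close();
        }
    }
}
{code}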



> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663891#action_12663891 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

I think the proposed approach does not solve the problem.
It improves the current state in a sense, but it does not eliminate the problem completely.

The idea of promoting blocks from "tmp" to the real storage is necessary to support sync(). We are trying to cover the following sequence of events:
# a client starts writing to a block and calls sync();
# the data-node does the write and the sync() and then fails;
# the data-node restarts, and the sync-ed block should appear as a valid block on this node, because the semantics of sync demand that sync-ed data survive failures of clients, data-nodes, and the name-node.

It seems natural to promote blocks from tmp to the real storage during startup, but this caused problems because, together with the sync-ed blocks, the data-node also promotes all other potentially incomplete blocks from tmp.

One source of incomplete blocks in tmp is the internal block replication initiated by the name-node but not completed on the data-node due to a failure. 

(I) The proposal is to divide tmp into 2 directories 
"blocks_being_replicated" and "blocks_being_written". This excludes partially replicated 
blocks from being promoted to the real storage during data-node restarts.

But this does not cover another source of incomplete blocks, which is the regular block writing by a client. It also can fail, remain incomplete, and will be promoted to the main storage during data-node restart.
The question is: why do we not care about these blocks?
Why can't they cause the same problems as the ones that are being replicated?

Suppose I do not use sync at all. The incomplete blocks will still be promoted to the real storage and reported to the name-node, and the name-node will have to process them and finally remove most of them. Isn't that a degradation in performance?

(II) I'd rather consider dividing into two directories, one holding "transient" and the other holding "persistent" blocks. The transient blocks should be removed during startup, and the persistent ones should be promoted into the real storage. Once sync-ed, a block should be moved into the persistent directory.

(III) Another variation is to promote sync-ed (persistent) blocks directly to the main storage. The problem here is dealing with blockReceived and blockReports, which I feel can be solved.

(IV) Yet another approach is to keep the storage structure unchanged and write each sync-ed (finalized) portion of the block into the main storage by appending that portion to the real block file. This will require extra disk io for merging files (instead of renaming) but will let us discard everything that is in tmp, as we do now.

What we are trying to do with the directories is to assign properties to the block replicas and make those properties survive crashes. Previously there were just 2 properties, final and transient, so we had 2 directories: tmp and main. Now we need a third property, which says the block is persistent but not finalized yet, so we tend to introduce yet another directory. I think this is too straightforward, because there are plenty of other approaches to implementing boolean properties on entities.
Why are we not considering them?

Also worth noting: this issue directly affects appends, because a replica being appended is first copied to the tmp directory and then treated as described above.
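
As one concrete instance of implementing such boolean properties differently, here is a hedged sketch in which the persistent-but-not-finalized property is a marker file beside the replica rather than a third directory (every name is made up, not actual Hadoop code):

{code}
import java.io.File;
import java.io.IOException;

// Hedged sketch: represent replica state as a property (an empty marker
// file beside the block file) instead of encoding it in the directory
// layout. All names are hypothetical.
enum ReplicaState { TRANSIENT, PERSISTENT_UNFINALIZED, FINALIZED }

class ReplicaStateStore {
    // A sync() would flip TRANSIENT to PERSISTENT_UNFINALIZED by creating
    // the marker; restart logic deletes any replica that has no marker.
    static void markPersistent(File blockFile) throws IOException {
        new File(blockFile.getParent(), blockFile.getName() + ".synced")
            .createNewFile();
    }

    static ReplicaState stateOf(File blockFile, boolean finalized) {
        if (finalized) {
            return ReplicaState.FINALIZED;
        }
        File marker = new File(blockFile.getParent(), blockFile.getName() + ".synced");
        return marker.exists() ? ReplicaState.PERSISTENT_UNFINALIZED
                               : ReplicaState.TRANSIENT;
    }
}
{code}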


> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.18.3
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663850#action_12663850 ] 

dhruba borthakur commented on HADOOP-4663:
------------------------------------------

I had an offline discussion with Sanjay, Rob Chansler, Nicholas and partly with Konstantin. Here is the summary:

This JIRA will go into 0.19. (For 0.18.3, the equivalent work will be done via HADOOP-4997).

The proposal is that blocks created by client writes go into the "blocks_being_written" directory, whereas blocks created by replication requests go into the "blocks_being_replicated" directory. On datanode restart, the blocks in the "tmp" and "blocks_being_replicated" directories are deleted, whereas the blocks in the "blocks_being_written" directory are recovered and promoted to the real block directory.
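
For illustration, a minimal sketch of the write-side routing this proposal implies. Only the two directory names come from the proposal above; the class and method names are hypothetical:

{code}
import java.io.File;

// Hedged sketch: route a new replica into blocks_being_written or
// blocks_being_replicated depending on who initiated the write.
class TmpDirectoryRouter {
    private final File dataDir;

    TmpDirectoryRouter(File dataDir) {
        this.dataDir = dataDir;
    }

    // Client writes are recoverable after a restart; replication and
    // rebalancing transfers are not, so they can simply be deleted.
    File tmpDirFor(boolean isClientWrite) {
        return new File(dataDir,
            isClientWrite ? "blocks_being_written" : "blocks_being_replicated");
    }
}
{code}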


> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.18.3
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668198#action_12668198 ] 

Raghu Angadi commented on HADOOP-4663:
--------------------------------------

> In your test case, the NN knows that the file is under construction.

Assuming the client is also restarted, when does the lease expire? Does the corruption happen after the expiry?

thanks.

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarted, it deleted files under the data-dir/tmp directory, since those files were no longer valid. But in 0.18 it moves these files to the normal directory, incorrectly making them valid blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation stamp), delete them.
> Currently the effect of this bug is that these files end up failing block verification and eventually get deleted, but they cause incorrect over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp needs to be defined better. Right now there are probably one or two more bugs with it. Dhruba, please file them if you remember.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.