Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2007/10/19 23:58:51 UTC

[jira] Issue Comment Edited: (HADOOP-2073) Datanode corruption if machine dies while writing VERSION file

    [ https://issues.apache.org/jira/browse/HADOOP-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536354 ] 

shv edited comment on HADOOP-2073 at 10/19/07 2:58 PM:
-----------------------------------------------------------------------

> Does Windows allow changing the length when the file is open?
Yes.

> It still leaves the problem with multiple data directories?
What is that problem? This is not intended to solve all reliability problems, just one.

> We exit if different VERSION files are inconsistent, right?
For data-nodes, inconsistent file values will cause an exception; for the name-node, we choose the most recently updated directory.

> datanode could die while the file is being rewritten, or before the file is resized.
Yes, the perfect solution would be to write and resize in memory and then flush and close all at once.
Here our problem is that Properties.store() writes the data and flushes. So if the data-node dies at this point, the version file will have extra data
if the new data size is less than the old one. The only extra data we write to the version file now is related to distributed upgrades.
Even if the upgrade fields remain in the version file, the data-node will restart and detect that the upgrade has been completed.
So you never end up with an empty version file: we always have either the new or the old data in it, and never a mixture of the two.
The point is that although this approach does not work for arbitrary file modifications, it works for what we do with the version file.
Unless proven otherwise, of course.
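
For illustration, the write-then-trim idea could look roughly like the sketch below. This is not necessarily what versionFileSize.patch does; the helper name and the in-memory buffer are assumptions. The point it shows is serializing the properties first and only then overwriting and trimming the file, so the file is never truncated to zero before the new contents exist.
{{{
// A minimal sketch (not the actual patch; names are illustrative):
// build the new contents in memory, then overwrite the file from the
// start and trim its length afterwards.
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Properties;

class VersionFileWriteSketch {
  static void writeVersionFile(File to, Properties props) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    props.store(buf, null);                 // serialize the new contents in memory
    byte[] data = buf.toByteArray();

    RandomAccessFile file = new RandomAccessFile(to, "rws");
    try {
      file.seek(0);
      file.write(data);                     // overwrite starting at offset 0
      file.setLength(data.length);          // drop any leftover bytes of the old contents
    } finally {
      file.close();
    }
  }
}
}}}
If the process dies between write() and setLength(), the worst case is the new properties followed by a tail of the old file, which, per the argument above, the data-node can tolerate.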

> Datanode corruption if machine dies while writing VERSION file
> --------------------------------------------------------------
>
>                 Key: HADOOP-2073
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2073
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Raghu Angadi
>         Attachments: versionFileSize.patch
>
>
> Yesterday, due to a bad mapreduce job, some of my machines went on OOM killing sprees and killed a bunch of datanodes, among other processes.  Since my monitoring software kept trying to bring up the datanodes, only to have the kernel kill them off again, each machine's datanode was probably killed many times.  A large percentage of these datanodes will not come up now, and write this message to the logs:
> 2007-10-18 00:23:28,076 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory /hadoop/dfs/data is in an inconsistent state: file VERSION is invalid.
> When I check, /hadoop/dfs/data/current/VERSION is an empty file.  Consequently, I have to delete all the blocks on the datanode and start over.  Since the OOM killing sprees happened simultaneously on several datanodes, this could have crippled my DFS cluster.
> I checked the hadoop code, and in org.apache.hadoop.dfs.Storage, I see this:
> {{{
>     /**
>      * Write version file.
>      * 
>      * @throws IOException
>      */
>     void write() throws IOException {
>       corruptPreUpgradeStorage(root);
>       write(getVersionFile());
>     }
>     void write(File to) throws IOException {
>       Properties props = new Properties();
>       setFields(props, this);
>       RandomAccessFile file = new RandomAccessFile(to, "rws");
>       FileOutputStream out = null;
>       try {
>         file.setLength(0);
>         file.seek(0);
>         out = new FileOutputStream(file.getFD());
>         props.store(out, null);
>       } finally {
>         if (out != null) {
>           out.close();
>         }
>         file.close();
>       }
>     }
> }}}
> So if the datanode dies after file.setLength(0), but before props.store(out, null), the VERSION file will get trashed in the corrupted state I see.  Maybe it would be better if this method created a temporary file VERSION.tmp, and then copied it to VERSION, then deleted VERSION.tmp?  That way, if VERSION was detected to be corrupt, the datanode could look at VERSION.tmp to recover the data.
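
The VERSION.tmp suggestion in the description could look roughly like the sketch below. It is not committed code: the class and method names are made up, and it uses a rename in place of the copy-then-delete sequence described, which is a common variant of the same idea.
{{{
// A hedged sketch of the temporary-file idea (illustrative names only):
// write the new properties to VERSION.tmp, sync it to disk, then rename
// it over VERSION so a crash leaves either the old file or the complete
// new one, never an empty or half-written VERSION.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Properties;

class TmpVersionFileSketch {
  static void writeViaTmp(File versionFile, Properties props) throws IOException {
    File tmp = new File(versionFile.getParentFile(), versionFile.getName() + ".tmp");
    FileOutputStream out = new FileOutputStream(tmp);
    try {
      props.store(out, null);     // write the complete new contents to the tmp file
      out.getFD().sync();         // force the bytes to disk before the rename
    } finally {
      out.close();
    }
    // File.renameTo() may refuse to replace an existing file on some platforms,
    // so fall back to deleting the old file first.
    if (!tmp.renameTo(versionFile)) {
      versionFile.delete();
      if (!tmp.renameTo(versionFile)) {
        throw new IOException("Could not rename " + tmp + " to " + versionFile);
      }
    }
  }
}
}}}
On startup the data-node could then check for a leftover VERSION.tmp and either finish the rename or discard it, which covers the recovery case mentioned above.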

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.