Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2008/03/03 19:14:51 UTC

[jira] Created: (HADOOP-2926) Ignoring IOExceptions on close

Ignoring IOExceptions on close
------------------------------

                 Key: HADOOP-2926
                 URL: https://issues.apache.org/jira/browse/HADOOP-2926
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.16.0
            Reporter: Owen O'Malley
            Assignee: dhruba borthakur
            Priority: Critical
             Fix For: 0.16.1


Currently in HDFS there are a lot of calls to IOUtils.closeStream made from finally blocks. I'm worried that this can lead to data corruption in the file system. Take the first instance in DataNode.copyBlock: it writes the block and then calls closeStream on the output stream. If there is an error at the end of the file that is detected in the close, it will be *completely* ignored. Note that logging the error is not enough; the error should be thrown so that the client knows the failure happened.

{code}
   try {
     file1.write(...);
     file2.write(...);
   } finally {
     IOUtils.closeStream(file1);
     IOUtils.closeStream(file2);
   }
{code}

is *bad*. It must be rewritten as:

{code}
   try {
     file1.write(...);
     file2.write(...);
     file1.close();
     file2.close();
   } catch (IOException ie) {
     IOUtils.closeStream(file1);
     IOUtils.closeStream(file2);
     throw ie;
   }
{code}

I also think that IOUtils.closeStream should be renamed IOUtils.cleanupFailedStream or something to make it clear it can only be used after the write operation has failed and is being cleaned up.
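For reference, the current closeStream body is presumably something close to the following minimal sketch (an assumption based on this discussion; a later comment notes that the catch block is empty), which is why a failure in close() simply vanishes:

{code}
// Sketch only: the exact signature and body in the 0.16 code base may differ.
public static void closeStream(java.io.Closeable stream) {
  if (stream != null) {
    try {
      stream.close();
    } catch (java.io.IOException e) {
      // The exception from close() is silently swallowed, which is the
      // problem described above.
    }
  }
}
{code}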

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574704#action_12574704 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

> Then shouldn't we use it, instead of using IOUtils.closeStream() at all?

Maybe. I am not sure if we should enforce it everywhere. IMHO the readability becomes really bad when a lot of streams are involved, as in the DataNode (two sockets and their input/output streams, and two file channels).
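To illustrate the readability concern, here is a hedged sketch (hypothetical method and variable names, not actual DataNode code) of what the standard nested try/finally idiom looks like with just one socket and one file; with two sockets and two file channels it nests two levels deeper still:

{code}
// Illustration only (hypothetical names); assumes java.io.* and java.net.Socket
// are imported. One nested try/finally per stream, as the standard idiom requires.
static void copyWithNestedTry(Socket sock, File blockFile) throws IOException {
  InputStream sockIn = sock.getInputStream();
  try {
    OutputStream sockOut = sock.getOutputStream();
    try {
      FileOutputStream blockOut = new FileOutputStream(blockFile);
      try {
        // ... copy bytes between the streams ...
      } finally {
        blockOut.close();
      }
    } finally {
      sockOut.close();
    }
  } finally {
    sockIn.close();
  }
}
{code}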





[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574651#action_12574651 ] 

dhruba borthakur commented on HADOOP-2926:
------------------------------------------

Yes, that is a good catch. Should be fixed for 0.16 as well.



[jira] Issue Comment Edited: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574753#action_12574753 ] 

cutting edited comment on HADOOP-2926 at 3/3/08 1:46 PM:
--------------------------------------------------------------

The standard idiom for multiple streams, as mentioned above, is nested try blocks, e.g.:

{noformat}
OutputStream out1 = fs.open(...);
try {
  OutputStream out2 = fs.open(...);
  try {
    out1.write(...);
    out2.write(...);
  } finally {
    out2.close();
  }
} finally {
  out1.close();
}
{noformat}


      was (Author: cutting):
    The standard idiom for multiple streams, as mentioned above, is nested try blocks, e.g.:

{noformat}
OutputStream out1 = fs.open(...);
try {
  OutputStream out2 = fs.open(...);
  try {
    out1.write(...);
    out2.write(...);
  } finally {
    out2.close();
  }
  out1.close();
}
{noformat}

  


[jira] Updated: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-2926:
-------------------------------------

    Fix Version/s:     (was: 0.16.1)

Looking at the code more closely, it appears that there isn't a bug that this patch addresses. The DataNode correctly ignores exceptions. This issue is left open to address coding-style issues. One suggestion is to make closeStream log an error message when the close throws an exception. Another suggestion is to rename this method to closeIgnoreExceptions().
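A minimal sketch of those two suggestions combined, assuming a commons-logging LOG field is available in IOUtils (the name closeIgnoreExceptions is only the proposal from this comment, not existing API):

{code}
public static void closeIgnoreExceptions(java.io.Closeable stream) {
  if (stream != null) {
    try {
      stream.close();
    } catch (java.io.IOException e) {
      // At least record the failure instead of dropping it silently.
      LOG.warn("Exception while closing " + stream, e);
    }
  }
}
{code}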



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575531#action_12575531 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------


Why is this a blocker? What is the specific bug that this patch fixes?



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575571#action_12575571 ] 

dhruba borthakur commented on HADOOP-2926:
------------------------------------------

This bug means that errors writing to a block file can go undetected, which can (theoretically) result in a higher probability of data corruption.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574664#action_12574664 ] 

Doug Cutting commented on HADOOP-2926:
--------------------------------------

It would be good if folks could use standard Java I/O idioms with Hadoop.

http://www.ibm.com/developerworks/java/library/j-jtp03216.html

{noformat}
OutputStream out = fs.open(...);
try {
  out.write(...);
} finally {
  out.close();
}
{noformat}

When multiple files are involved the best thing is to nest the try blocks.

Shouldn't we try to make this idiom work well with HDFS?




[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575580#action_12575580 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

Specifically, I am trying to see which part of the patch fixes the potential corruption.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575597#action_12575597 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------


I don't think this is a fix for trunk either (this does not change even if streams need to be closed). 



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574699#action_12574699 ] 

Doug Cutting commented on HADOOP-2926:
--------------------------------------

> I am not sure why this would not work now..

Then shouldn't we use it, instead of using IOUtils.closeStream() at all?



[jira] Updated: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-2926:
-------------------------------------

    Status: Patch Available  (was: Open)

Triggering Hadoop QA tests.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575576#action_12575576 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

> This bug has the potential that errors-writing-to-block file go undetected. This can (theoretically) result in a higher probability of data corruption.

But every close() added in the patch is for a socket or a socket stream. Strictly speaking, socket streams don't even need to be closed. Am I missing something?



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574670#action_12574670 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-2926:
------------------------------------------------

- The catch block in IOUtils.closeStream is currently empty.  We should at least log a message.

- BTW, IOUtils.closeSocket has a similar problem.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575589#action_12575589 ] 

dhruba borthakur commented on HADOOP-2926:
------------------------------------------

Now that you point out that the socket streams need not be closed, I agree with you. I cannot see a case where this can cause data corruption. If that is the case, this patch should not go into 0.16.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574660#action_12574660 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

i.e. closeStream() should be used if and only if the user wants to ignore the IOException and it's OK for the reference to be null.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574668#action_12574668 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

> Shouldn't we try to make this idiom work well with HDFS?

I am not sure why this would not work now..





[jira] Updated: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-2926:
-------------------------------------

    Status: Open  (was: Patch Available)

This would need some rework if we want it for 0.17.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575545#action_12575545 ] 

Hadoop QA commented on HADOOP-2926:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12377211/closeStream.patch
against trunk revision 619744.

    @author +1.  The patch does not contain any @author tags.

    tests included -1.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs -1.  The patch appears to introduce 2 new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1899/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1899/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1899/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1899/console

This message is automatically generated.



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574657#action_12574657 ] 

Raghu Angadi commented on HADOOP-2926:
--------------------------------------

You are right. The example above shows an incorrect use of closeStream().

> it clear it can only be used after the write operation has failed and is being cleaned up.
But this is not true. Many times it is used to close sockets where the socket needs to be closed because of an unrelated error (say, I/O failed on a different stream).




[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575070#action_12575070 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-2926:
------------------------------------------------

> The standard idiom for multiple streams, as mentioned above, is nested try blocks
+1 Nested try blocks probably work in most situations.

However, they do not work when the number of streams is unknown at compile time, e.g.
{code}
OutputStream[] out = new OutputStream[n];
...
{code}
Also, they make the code ugly when the number of streams is large, say > 3.
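For the variable-length case, one correct shape (a hedged sketch only; this helper does not exist in IOUtils) is to attempt every close() and rethrow the first failure afterwards:

{code}
// Sketch: close every stream, remember the first failure, keep going, then rethrow.
static void closeAll(java.io.Closeable[] streams) throws java.io.IOException {
  java.io.IOException first = null;
  for (java.io.Closeable c : streams) {
    if (c == null) {
      continue;
    }
    try {
      c.close();
    } catch (java.io.IOException e) {
      if (first == null) {
        first = e;  // remember the first failure but still close the rest
      }
    }
  }
  if (first != null) {
    throw first;    // surface at least one close() failure to the caller
  }
}
{code}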



[jira] Updated: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-2926:
-------------------------------------

    Attachment: closeStream.patch



[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574747#action_12574747 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-2926:
------------------------------------------------

> Shouldn't we try to make this idiom work well with HDFS?

This idiom is not obvious for multiple streams.  For example, the following code cannot handle exceptions correctly:
{code}
OutputStream out1 = fs.open(...);
OutputStream out2 = fs.open(...);
try {
  out1.write(...);
  out2.write(...);
} finally {
  out1.close();
  out2.close();  //not called if the previous line throws an exception
}
{code}

We still need something like IOUtils.closeStream to do the try-catch if there is more than one stream.
I suggest we define the following method in IOUtils:
{code}
public static void closeIO(Closeable... io) throws IOException {
  List<IOException> ioexceptions = new ArrayList<IOException>();
  for(Closeable c : io) {
    try {c.close();}
    catch(IOException e) {ioexceptions.add(e);}
  }
  if (!ioexceptions.isEmpty()) {
    throw new IOException(...); //construct an IOException with the list
  }
}
{code}

Then, multiple streams can be closed together:
{code}
OutputStream out1 = fs.open(...);
OutputStream out2 = fs.open(...);
try {
  out1.write(...);
  out2.write(...);
} finally {
 IOUtils.closeIO(out1, out2);
}
{code}
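To make the closeIO sketch above concrete, one possible way to build the combined exception (an illustration only; the message format, the null check, and the use of initCause are arbitrary choices, not part of the proposal) is:

{code}
// Completed sketch of the proposed closeIO (illustration only).
public static void closeIO(Closeable... io) throws IOException {
  List<IOException> ioexceptions = new ArrayList<IOException>();
  for (Closeable c : io) {
    if (c == null) {
      continue;
    }
    try {
      c.close();
    } catch (IOException e) {
      ioexceptions.add(e);
    }
  }
  if (!ioexceptions.isEmpty()) {
    IOException combined = new IOException("Failed to close "
        + ioexceptions.size() + " of " + io.length + " streams: " + ioexceptions);
    combined.initCause(ioexceptions.get(0)); // keep the first stack trace
    throw combined;
  }
}
{code}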





[jira] Commented: (HADOOP-2926) Ignoring IOExceptions on close

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574753#action_12574753 ] 

Doug Cutting commented on HADOOP-2926:
--------------------------------------

The standard idiom for multiple streams, as mentioned above, is nested try blocks, e.g.:

{noformat}
OutputStream out1 = fs.open(...);
try {
  OutputStream out2 = fs.open(...);
  try {
    out1.write(...);
    out2.write(...);
  } finally {
    out2.close();
  }
  out1.close();
}
{noformat}

