Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2008/04/04 22:47:25 UTC

[jira] Created: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Expose DFSOutputStream.fsync API though the FileSystem interface
----------------------------------------------------------------

                 Key: HADOOP-3177
                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
             Project: Hadoop Core
          Issue Type: Improvement
          Components: dfs
            Reporter: dhruba borthakur


In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in org.apache.hadoop.fs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601783#action_12601783 ] 

dhruba borthakur commented on HADOOP-3177:
------------------------------------------

I am saying the same thing as you: let's add a new method FSDataOutputStream.fsync().


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592855#action_12592855 ] 

dhruba borthakur commented on HADOOP-3177:
------------------------------------------

I think it is sufficient (for now) for an application to cast an OutputStream to a DFSOutputStream and then invoke fsync on it. However, it is better if we expose this API on the generic FileSystem API so that an application can work the same way on all FileSystems.


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602235#action_12602235 ] 

Hadoop QA commented on HADOOP-3177:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12383320/3177_20080603.patch
  against trunk revision 662976.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2567/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2567/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2567/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2567/console

This message is automatically generated.


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602075#action_12602075 ] 

dhruba borthakur commented on HADOOP-3177:
------------------------------------------

+1. Code looks good.


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592863#action_12592863 ] 

Doug Cutting commented on HADOOP-3177:
--------------------------------------

I think sync is a fairly generic FileSystem operation.

http://java.sun.com/javase/6/docs/api/java/io/FileDescriptor.html#sync()




[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601996#action_12601996 ] 

dhruba borthakur commented on HADOOP-3177:
------------------------------------------

The semantics of FileSystem.fsync is that the system makes every effort to put data on persistent storage. In the case of local file system, it will invoke the sync on the local file system. In the case of DFS, it will ensure that data has been flushed to OS buffers on all datanode(s) in the pipeline.

If a file system does not support fsync, then it should be a no-op rather than an exception. This is similar to other calls like setReplication, which returns success on local file systems even though there isn't any replication for a local file system.


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592851#action_12592851 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3177:
------------------------------------------------

Do we really need to add this method to the FileSystem API?  fsync seems DFS specific.


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601786#action_12601786 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3177:
------------------------------------------------

The class for the wrapperStream is OutputStream.  I think it should be java.io.FileOutputStream since we are doing FileSystem.  Then, the new method FSDataOutputStream.fsync() just has to call wrapperStream.getFD().sync().  This will work for all FileSystems.

For DFS, we need to define a new class, say DfsFileDescriptor, extending java.io.FileDescriptor and make DfsFileDescriptor.sync() call DFSOutputStream.fsync().

For other FileSystem subclasses, if getFD() is not defined, we could throw IOException("not supported").
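
As a rough illustration of the java.io call this proposal builds on, the sketch below writes to a local file and then calls getFD().sync() on a plain FileOutputStream. The class name LocalSyncExample and the helper writeAndSync are made up for this note; nothing here is Hadoop code, and it only covers the local-file case.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of Hadoop.
class LocalSyncExample {

  /** Writes data to a local file and asks the OS to push it to the device. */
  static void writeAndSync(File file, byte[] data) throws IOException {
    try (FileOutputStream out = new FileOutputStream(file)) {
      out.write(data);
      out.flush();         // hand any stream-level buffers to the OS
      out.getFD().sync();  // throws java.io.SyncFailedException if the sync fails
    }
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("sync-example", ".txt");
    f.deleteOnExit();
    writeAndSync(f, "hello".getBytes(StandardCharsets.UTF_8));
    System.out.println("wrote " + f.length() + " bytes");
  }
}
```

Note that this relies on the stream really being a java.io.FileOutputStream, which is the assumption the proposal would need to hold for every FileSystem.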


[jira] Issue Comment Edited: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601544#action_12601544 ] 

dhruba edited comment on HADOOP-3177 at 6/2/08 12:48 AM:
-------------------------------------------------------------------

This issue blocks h-1700 because DFSOutputStream.fsync() is not a public API yet. More work is needed to make it accessible from an application.

One option would be to introduce an FSDataOutputStream.fsync() API. It would use reflection to see if the wrapperStream has a method named "fsync". If so, then it will invoke wrapperStream.fsync(), otherwise it will invoke wrapperStream.flush(). Does this sound reasonable?

      was (Author: dhruba):
    This issue blocks h-1700 because DFSOutputStream.fsync() is not a public API yet. More work is needed to make it accessible from an application.

One option would be to introduce  FSDataOutputStream.fsync() API. It would invoke reflection to see if the wrapperStream as han a methid named "fsync". If so, then it will invoke wrapperStream.fsync(), otherwise it will invoke wrapperStream.flush(). Does this sound reasonable?
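
For readers who want to see what the reflection option might look like, here is a minimal sketch under stated assumptions. ReflectiveSync, fsyncOrFlush and FakeDfsStream are hypothetical names invented for this note; FakeDfsStream only stands in for a stream that happens to have a public no-argument fsync() and is not the real DFSOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper; not part of Hadoop.
class ReflectiveSync {

  /**
   * Invokes a public no-arg fsync() on the stream if one exists,
   * otherwise falls back to flush(). Returns true if fsync() was called.
   */
  static boolean fsyncOrFlush(OutputStream wrapped) throws IOException {
    Method fsync;
    try {
      fsync = wrapped.getClass().getMethod("fsync");
    } catch (NoSuchMethodException e) {
      wrapped.flush();  // no fsync() available: best effort only
      return false;
    }
    try {
      fsync.invoke(wrapped);
      return true;
    } catch (InvocationTargetException e) {
      // Unwrap the exception thrown by fsync() itself.
      Throwable cause = e.getCause();
      if (cause instanceof IOException) {
        throw (IOException) cause;
      }
      throw new IOException("fsync failed", cause);
    } catch (IllegalAccessException e) {
      throw new IOException("fsync is not accessible", e);
    }
  }
}

// Hypothetical stand-in for a stream with fsync(); counts calls for the demo.
class FakeDfsStream extends ByteArrayOutputStream {
  int fsyncCalls = 0;

  public void fsync() throws IOException {
    fsyncCalls++;
  }
}
```

The sketch also shows the cost of this route: the method name is matched as a string, and every failure mode of the reflective call has to be translated back into an IOException by hand.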
  

[jira] Updated: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-3177:
-------------------------------------------

    Attachment: 3177_20080603.patch

3177_20080603.patch: added Syncable interface


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tom White (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601922#action_12601922 ] 

Tom White commented on HADOOP-3177:
-----------------------------------

bq. I think it should be java.io.FileOutputStream since we are doing FileSystem.

But FileOutputStream is tied to Java's File abstraction which isn't general enough for Hadoop FileSystems. Furthermore FileOutputStream#getFD is final, as is FileDescriptor, so we can't use it here.

How about an interface:

{code}
public interface Syncable {
  void sync() throws IOException;
}
{code}

(Or should it be "Synchable"?) Then make DFSOutputStream implement Syncable, so FSDataOutputStream - which is also a Syncable - can see if it can call sync() on the underlying stream.

What are the semantics of sync()? I think the expectation is that after sync returns the system has successfully sync'ed buffers to disk. So if this is not true, sync() should throw an exception. This is what java.io.FileDescriptor does. Using a subclass of IOException (java.io.SyncFailedException?) would make this easier for callers. I realize that this description is at odds with the current contract for DFSOutputStream#fsync, which doesn't guarantee that the data has been flushed to persistent storage, but I wondered whether DFSOutputStream could be strengthened to make this guarantee? 

If the FileSystem doesn't support sync then do we get an exception when calling sync(), or is it a no op?
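
The delegation described above could be sketched roughly as follows. SyncableWrapper is a hypothetical stand-in for FSDataOutputStream and CountingSyncStream for a stream such as DFSOutputStream; neither is the real Hadoop class. For the open question of what to do when the wrapped stream is not Syncable, this sketch assumes a no-op, and throwing an IOException would be the alternative.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

interface Syncable {
  void sync() throws IOException;
}

// Hypothetical stand-in for FSDataOutputStream.
class SyncableWrapper extends FilterOutputStream implements Syncable {

  SyncableWrapper(OutputStream wrapped) {
    super(wrapped);
  }

  @Override
  public void sync() throws IOException {
    // 'out' is the wrapped stream held by FilterOutputStream.
    if (out instanceof Syncable) {
      ((Syncable) out).sync();
    }
    // Assumption: silently do nothing when the wrapped stream cannot sync.
  }
}

// Hypothetical underlying stream that supports sync(); counts calls for the demo.
class CountingSyncStream extends ByteArrayOutputStream implements Syncable {
  int syncCalls = 0;

  @Override
  public void sync() {
    syncCalls++;
  }
}
```

Compared with reflection, the instanceof check is verified by the compiler, and a stream opts in simply by implementing the interface.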


[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602011#action_12602011 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-3177:
------------------------------------------------

> Furthermore FileOutputStream#getFD is final, as is FileDescriptor, so we can't use it here.

Oops, I missed this point.  We definitely cannot use FileDescriptor.

I think the Syncable interface idea is good.


[jira] Updated: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-3177:
-------------------------------------------

        Assignee: Tsz Wo (Nicholas), SZE  (was: dhruba borthakur)
    Release Note: Added a new public interface Syncable which declares the sync() operation.  FSDataOutputStream implements Syncable.  If the wrappedStream in FSDataOutputStream is Syncable, calling FSDataOutputStream.sync() is equivalent to calling wrappedStream.sync().  Otherwise, FSDataOutputStream.sync() is a no-op.  Both DistributedFileSystem and LocalFileSystem support the sync() operation.
    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3177:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.18.0
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks Nicholas!


[jira] Assigned: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-3177:
----------------------------------------

    Assignee: dhruba borthakur


Re: [jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by Nigel Daley <nd...@yahoo-inc.com>.

Thanks Stack.  This happens after a reboot. I've been fixing this  
myself for the past month+ but that's not sustainable.  I finally  
filed an INFRA ticket to get it fixed properly.

thx,
nige

On Jun 3, 2008, at 7:41 PM, stack wrote:
> All tests are failing because they can't write /tmp:
>
>     [exec]  
> ======================================================================
>     [exec]  
> ======================================================================
>     [exec]     Applying patch.
>     [exec]  
> ======================================================================
>     [exec]  
> ======================================================================
>     [exec]
>     [exec]
>     [exec] (Stripping trailing CRs from patch.)
>     [exec] patch: **** Can't create file /tmp/po2hai.T : Permission  
> denied
>     [exec] PATCH APPLICATION FAILED
>
> When I look at /tmp, I see this:
>
> -bash-3.00$ ls -lad /tmp
> drwxr-xr-x   3 root     root         259 Jun  3 18:34 /tmp
>
> I set it so its writable by all:
>
> -bash-3.00$ ls -lad /tmp
> drwxrwxrwx   3 root     root         259 Jun  3 22:41 /tmp
>
> Perhaps it'll stick?  And tests will start to work again?
>
> St.Ack
>
>
> Hadoop QA (JIRA) wrote:
>>     [ https://issues.apache.org/jira/browse/HADOOP-3177? 
>> page=com.atlassian.jira.plugin.system.issuetabpanels:comment- 
>> tabpanel&focusedCommentId=12602139#action_12602139 ]
>> Hadoop QA commented on HADOOP-3177:
>> -----------------------------------
>>
>> -1 overall.  Here are the results of testing the latest  
>> attachment   http://issues.apache.org/jira/secure/attachment/ 
>> 12383320/3177_20080603.patch
>>   against trunk revision 662913.
>>
>>     +1 @author.  The patch does not contain any @author tags.
>>
>>     +1 tests included.  The patch appears to include 9 new or  
>> modified tests.
>>
>>     -1 patch.  The patch command could not apply the patch.
>>
>> Console output: http://hudson.zones.apache.org/hudson/job/Hadoop- 
>> Patch/2561/console
>>
>> This message is automatically generated.
>>
>>
>>> Expose DFSOutputStream.fsync API though the FileSystem interface
>>> ----------------------------------------------------------------
>>>
>>>                 Key: HADOOP-3177
>>>                 URL: https://issues.apache.org/jira/browse/ 
>>> HADOOP-3177
>>>             Project: Hadoop Core
>>>          Issue Type: Improvement
>>>          Components: dfs
>>>            Reporter: dhruba borthakur
>>>            Assignee: Tsz Wo (Nicholas), SZE
>>>         Attachments: 3177_20080603.patch
>>>
>>>
>>> In the current code, there is a DFSOutputStream.fsync() API that  
>>> allows a client to flush all buffered data to the datanodes and  
>>> also persist block locations on the namenode. This API should be  
>>> exposed through the generic API in the org.hadoop.fs.
>>>
>>
>>
>

Re: [jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by stack <st...@duboce.net>.

All tests are failing because they can't write /tmp:

     [exec] 
======================================================================
     [exec] 
======================================================================
     [exec]     Applying patch.
     [exec] 
======================================================================
     [exec] 
======================================================================
     [exec]
     [exec]
     [exec] (Stripping trailing CRs from patch.)
     [exec] patch: **** Can't create file /tmp/po2hai.T : Permission denied
     [exec] PATCH APPLICATION FAILED

When I look at /tmp, I see this:

-bash-3.00$ ls -lad /tmp
drwxr-xr-x   3 root     root         259 Jun  3 18:34 /tmp

I set it so it's writable by all:

-bash-3.00$ ls -lad /tmp
drwxrwxrwx   3 root     root         259 Jun  3 22:41 /tmp

Perhaps it'll stick?  And tests will start to work again?

St.Ack


Hadoop QA (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602139#action_12602139 ] 
>
> Hadoop QA commented on HADOOP-3177:
> -----------------------------------
>
> -1 overall.  Here are the results of testing the latest attachment 
>   http://issues.apache.org/jira/secure/attachment/12383320/3177_20080603.patch
>   against trunk revision 662913.
>
>     +1 @author.  The patch does not contain any @author tags.
>
>     +1 tests included.  The patch appears to include 9 new or modified tests.
>
>     -1 patch.  The patch command could not apply the patch.
>
> Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2561/console
>
> This message is automatically generated.
>
>   
>> Expose DFSOutputStream.fsync API though the FileSystem interface
>> ----------------------------------------------------------------
>>
>>                 Key: HADOOP-3177
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
>>             Project: Hadoop Core
>>          Issue Type: Improvement
>>          Components: dfs
>>            Reporter: dhruba borthakur
>>            Assignee: Tsz Wo (Nicholas), SZE
>>         Attachments: 3177_20080603.patch
>>
>>
>> In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in the org.hadoop.fs.
>>     
>
>

[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602139#action_12602139 ] 

Hadoop QA commented on HADOOP-3177:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12383320/3177_20080603.patch
  against trunk revision 662913.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2561/console

This message is automatically generated.

> Expose DFSOutputStream.fsync API though the FileSystem interface
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3177
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: 3177_20080603.patch
>
>
> In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in the org.hadoop.fs.

[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601692#action_12601692 ] 

Doug Cutting commented on HADOOP-3177:
--------------------------------------

> It would invoke reflection to see if the wrapperStream has a method named "fsync".

Yuk.  What is the reason not to add a sync() method to FileSystem and/or FSDataOutputStream?
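
To make the two options in this exchange concrete, here is a minimal sketch contrasting a reflective lookup of "fsync" with a typed sync() method on the wrapper. Every name in it (SyncableStream, BufferingStream, WrapperStream, durableSize) is a hypothetical stand-in invented for illustration. It is not the Hadoop API and not the patch attached to this issue, and the in-memory "durable" buffer only imitates flushing to datanodes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.reflect.Method;

// Hypothetical interface, assumed for illustration only.
interface SyncableStream {
    void sync() throws IOException;
}

// Hypothetical stand-in for a filesystem-specific stream such as DFSOutputStream.
class BufferingStream extends OutputStream implements SyncableStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final ByteArrayOutputStream durable = new ByteArrayOutputStream();

    @Override
    public void write(int b) {
        buffer.write(b);
    }

    // Moves buffered bytes into the "durable" store (imitates a flush to datanodes).
    @Override
    public void sync() throws IOException {
        buffer.writeTo(durable);
        buffer.reset();
    }

    // Same operation under the name the reflective lookup searches for.
    public void fsync() throws IOException {
        sync();
    }

    int durableSize() {
        return durable.size();
    }
}

// Hypothetical stand-in for the generic wrapper stream discussed in the thread.
class WrapperStream extends OutputStream {
    private final OutputStream wrapped;

    WrapperStream(OutputStream wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void write(int b) throws IOException {
        wrapped.write(b);
    }

    // Reflection-based option: look up a public no-argument "fsync" at run time.
    // Returns false if the wrapped stream has no such method.
    boolean syncViaReflection() throws IOException {
        try {
            Method m = wrapped.getClass().getMethod("fsync");
            m.invoke(wrapped);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IOException(e);
        }
    }

    // Typed option: the capability is part of a declared interface, so the
    // compiler checks the call. Falls back to flush() and returns false otherwise.
    boolean sync() throws IOException {
        if (wrapped instanceof SyncableStream) {
            ((SyncableStream) wrapped).sync();
            return true;
        }
        wrapped.flush();
        return false;
    }
}
```

In this sketch the typed variant fails at compile time if the method is renamed, while the reflective variant silently stops finding it; that difference is one plausible reading of the objection above.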

> Expose DFSOutputStream.fsync API though the FileSystem interface
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3177
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in the org.hadoop.fs.

[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602073#action_12602073 ] 

Doug Cutting commented on HADOOP-3177:
--------------------------------------

This API looks fine to me.

> Expose DFSOutputStream.fsync API though the FileSystem interface
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3177
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: 3177_20080603.patch
>
>
> In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in the org.hadoop.fs.

[jira] Commented: (HADOOP-3177) Expose DFSOutputStream.fsync API though the FileSystem interface

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602127#action_12602127 ] 

Hadoop QA commented on HADOOP-3177:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12383320/3177_20080603.patch
  against trunk revision 662913.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2558/console

This message is automatically generated.

> Expose DFSOutputStream.fsync API though the FileSystem interface
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3177
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3177
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: 3177_20080603.patch
>
>
> In the current code, there is a DFSOutputStream.fsync() API that allows a client to flush all buffered data to the datanodes and also persist block locations on the namenode. This API should be exposed through the generic API in the org.hadoop.fs.
