Posted to common-user@hadoop.apache.org by Christoph Graf <ch...@fatwire.com> on 2008/09/24 14:59:06 UTC

HDFS, FSDataOutputStream, flush(), sync(), close()

Hi everybody,

I have a simple test case which creates a file, writes two lines into
the FSDataOutputStream and then flushes, syncs and closes the stream. I
am using Hadoop 0.18.0 with Cygwin.

What I observe (in contrast to using java.io.DataOutputStream) is that
the lines get written to the disk only after close() has been executed.
This behaviour occurs with the local filesystem as well as with HDFS.

Can you point me to my mistake?

Thanks,

Christoph



        final String PATH = "file:///C:/tmp/test.txt";
       
        Configuration conf = new Configuration();
        conf.set("fs.default.name", PATH);
        FileSystem fs = FileSystem.get(conf);
           
        FSDataOutputStream out = fs.create(new Path(PATH), true);
       
        out.writeBytes("Test line 1 \n");
        out.writeBytes("Test line 2 \n");
       
        out.flush();
        out.sync();
        out.close();
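
For comparison, the java.io behaviour mentioned above can be reproduced without Hadoop. The following is a minimal sketch and not part of the original test case: the class name FlushVisibilityDemo and its helper methods are made up for illustration, and it assumes a local filesystem on which a second reader may open the file while it is still being written. It counts how many bytes an independent reader sees before flush(), after flush() and after close().

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; none of these names come from Hadoop.
public class FlushVisibilityDemo {

    // Number of bytes a second, independent reader can read right now.
    static int visibleBytes(Path file) throws IOException {
        return Files.readAllBytes(file).length;
    }

    // Writes two lines and records what a reader sees at each stage:
    // [0] before flush(), [1] after flush(), [2] after close().
    static int[] observe(Path file) throws IOException {
        int[] seen = new int[3];
        FileOutputStream raw = new FileOutputStream(file.toFile());
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(raw));
        try {
            out.writeBytes("Test line 1 \n");
            out.writeBytes("Test line 2 \n");
            seen[0] = visibleBytes(file); // data is still in the stream's own buffer
            out.flush();
            seen[1] = visibleBytes(file); // data has been handed to the operating system
        } finally {
            out.close();
        }
        seen[2] = visibleBytes(file);
        return seen;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("flush-demo", ".txt");
        try {
            int[] seen = observe(file);
            System.out.println("before flush: " + seen[0]);
            System.out.println("after flush:  " + seen[1]);
            System.out.println("after close:  " + seen[2]);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Reading the file through a separate handle, as visibleBytes() does, checks what a reader can actually obtain rather than what a directory listing reports. Whether the same check behaves identically against FSDataOutputStream is exactly the open question of this thread.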

Re: HDFS, FSDataOutputStream, flush(), sync(), close()

Posted by Christoph Graf <ch...@fatwire.com>.
   
Hi Raghu,

I checked RawLocalFileSystem.java and it contains a class
LocalFSFileOutputStream which, as far as I can see, implements both
flush() and sync(). I have not looked into it in much detail yet, but to
me that does not seem to be the problem here. Could you confirm?

Thanks,

Christoph


Raghu Angadi wrote:
>
> Hmm.. neither of these filesystems seems to implement flush(). Can you
> file a JIRA on it?
>
> HDFS implements sync() and data should be on the disk after that, but
> might not be available to a reader yet. How did you test whether the
> data is on the disk?
>
> Raghu.


Re: HDFS, FSDataOutputStream, flush(), sync(), close()

Posted by Christoph Graf <ch...@fatwire.com>.
My test used the web frontend's filesystem browser. It showed an empty
file until close() was executed. Afterwards the content was
browsable. What would be the correct way to test the availability of the
data after sync()?

Thanks,

Christoph


Raghu Angadi wrote:
>
> Hmm.. neither of these filesystems seems to implement flush(). Can you
> file a JIRA on it?
>
> HDFS implements sync() and data should be on the disk after that, but
> might not be available to a reader yet. How did you test whether the
> data is on the disk?
>
> Raghu.


Re: HDFS, FSDataOutputStream, flush(), sync(), close()

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Hmm.. neither of these filesystems seems to implement flush(). Can you
file a JIRA on it?

HDFS implements sync() and data should be on the disk after that, but
might not be available to a reader yet. How did you test whether the
data is on the disk?

Raghu.
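
The difference between handing data to the filesystem and getting it onto the disk has a rough local analogue in plain java.io, sketched below under stated assumptions. This is not Hadoop code: SyncDemo and writeDurably are hypothetical names, and FileDescriptor.sync() only stands in for the role that sync() is described as playing in HDFS. Here flush() empties the stream's own buffer into the operating system, while FileDescriptor.sync() asks the operating system to force its buffers to the storage device.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical demo class; a local-file analogy, not the HDFS implementation.
public class SyncDemo {

    // Writes the given lines, flushes the stream, then forces the data to
    // the device. Returns the number of bytes written through the stream.
    static int writeDurably(Path file, String... lines) throws IOException {
        try (FileOutputStream raw = new FileOutputStream(file.toFile());
             DataOutputStream out = new DataOutputStream(new BufferedOutputStream(raw))) {
            for (String line : lines) {
                out.writeBytes(line);
            }
            out.flush();         // stream buffer -> operating system
            raw.getFD().sync();  // operating system buffers -> storage device
            return out.size();   // bytes written to this DataOutputStream so far
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = java.nio.file.Files.createTempFile("sync-demo", ".txt");
        try {
            int written = writeDurably(file, "Test line 1 \n", "Test line 2 \n");
            System.out.println("bytes written and synced: " + written);
        } finally {
            java.nio.file.Files.deleteIfExists(file);
        }
    }
}
```

In this local sketch a reader can see the data as soon as flush() returns, and sync() adds only durability. The message above describes a different situation for HDFS, where data may already be on disk after sync() and yet not be available to a reader.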
