Posted to common-user@hadoop.apache.org by Phantom <gh...@gmail.com> on 2007/06/22 05:55:10 UTC

HDFS remote client

Is it possible to keep a file open for, say, 1 hour, write to it every once
in a while, and then close it? I consistently get the same error when I
attempt to close the file handle after I am done with my writes. I can see
all the writes and flushes happening without any problems. I am using the
native C API for this. Does it have anything to do with the client losing
some kind of a lease? How do I go about fixing this problem?
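
Concretely, the pattern I have in mind looks roughly like the sketch below
(not my actual code; the namenode host, port, path, timing and record
contents are placeholders):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include "hdfs.h"

int main(void)
{
    /* Placeholder namenode host/port, not my real configuration. */
    hdfsFS fs = hdfsConnect("namenode.example.com", 9000);
    if (!fs) { fprintf(stderr, "hdfsConnect failed\n"); return 1; }

    hdfsFile out = hdfsOpenFile(fs, "/tmp/long-lived.log", O_WRONLY, 0, 0, 0);
    if (!out) { fprintf(stderr, "hdfsOpenFile failed\n"); return 1; }

    /* Keep the file open for about an hour, writing every once in a while. */
    for (int i = 0; i < 60; i++) {
        const char *rec = "some record\n";
        if (hdfsWrite(fs, out, rec, (tSize)strlen(rec)) == -1 ||
            hdfsFlush(fs, out) == -1) {
            fprintf(stderr, "write/flush failed\n");
            break;
        }
        sleep(60);
    }

    /* This is the call that fails for me with "Filesystem closed". */
    if (hdfsCloseFile(fs, out) == -1)
        fprintf(stderr, "hdfsCloseFile failed\n");

    hdfsDisconnect(fs);
    return 0;
}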


This is the exception I get when I try to close the file. I am writing into
HDFS from a remote machine. What am I doing wrong?

Exception in thread "main" java.io.IOException: Filesystem closed
        at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:168)
        at org.apache.hadoop.dfs.DFSClient.access$200(DFSClient.java:48)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.write(DFSClient.java:1245)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:402)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:91)
Call to org/apache/hadoop/fs/FSDataOutputStream::close failed!
[Thu Jun 21 18:24:42 2007] "File closed -1"

Thanks
Avinash

Re: HDFS remote client

Posted by Phantom <gh...@gmail.com>.
No, I am not using a multi-threaded application. I am using the native HDFS
C API to write into HDFS. I do not have access to the namenode log at the
moment; I will look at it first thing in the morning and send out what I see.

Could you please tell me the intuition behind why the volatile would help?
Writes work fine if I constantly write in a loop and then call
hdfsCloseFile(). However, if I perform intermittent writes and then call
hdfsCloseFile(), it seems that the FileSystem has already been closed. That
is what I do not understand: why would the FileSystem close if
hdfsCloseFile() has not been called explicitly?
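
To make the difference concrete, the two patterns look roughly like this
(a sketch, not my actual code; the iteration count and idle time are made-up
numbers, and fs/out come from hdfsConnect()/hdfsOpenFile() as usual):

#include <string.h>
#include <unistd.h>
#include "hdfs.h"

/* Pattern that works for me: write continuously in a loop, then close. */
static int write_in_loop(hdfsFS fs, hdfsFile out)
{
    const char *rec = "record\n";
    for (int i = 0; i < 100000; i++)
        if (hdfsWrite(fs, out, rec, (tSize)strlen(rec)) == -1)
            return -1;
    return hdfsCloseFile(fs, out);      /* returns 0 */
}

/* Pattern that fails for me: write, sit idle, write again, then close. */
static int write_intermittently(hdfsFS fs, hdfsFile out)
{
    const char *rec = "record\n";
    hdfsWrite(fs, out, rec, (tSize)strlen(rec));
    hdfsFlush(fs, out);
    sleep(15 * 60);                     /* file stays open but idle */
    hdfsWrite(fs, out, rec, (tSize)strlen(rec));
    return hdfsCloseFile(fs, out);      /* returns -1, "Filesystem closed" */
}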

Thanks
Avinash

On 6/21/07, Dhruba Borthakur <dh...@yahoo-inc.com> wrote:
>
> Hi Avinash,
>
> The only suspicious thing I see is that DFSClient.running is defined as
> boolean running = true;
>
> Can you change this to
> volatile boolean running = true;
>
> And see if your problem gets fixed? Also, can you please let me know about
> the following:
>
> 1. Are you using a multi-threaded application?
> 2. Do you use a DFSClient instance to access more than one file?
> Simultaneously or sequentially?
> 3. Do you see any other types of exceptions in the Namenode log?
>
> Thanks,
> dhruba

RE: HDFS remote client

Posted by Dhruba Borthakur <dh...@yahoo-inc.com>.
Hi Avinash,

The only suspicious thing I see is that DFSClient.running is defined as
boolean running = true;

Can you change this to
volatile boolean running = true;

And see if your problem gets fixed? Also, can you please let me know about
the following:

1. Are you using a multi-threaded application?
2. Do you use a DFSClient instance to access more than one file?
Simultaneously or sequentially? (See the sketch after this list for what I
mean.)
3. Do you see any other types of exceptions in the Namenode log?
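
Regarding question 2, this is the kind of usage I mean, expressed in terms
of the C API (a hypothetical sketch, assuming the hdfsFS handle returned by
hdfsConnect stands in for a single client instance; the paths are made up):

#include <fcntl.h>
#include "hdfs.h"

void two_files_one_client(void)
{
    hdfsFS fs = hdfsConnect("default", 0);     /* one connection / client */

    /* Two files open at the same time through the same handle. */
    hdfsFile a = hdfsOpenFile(fs, "/tmp/a.dat", O_WRONLY, 0, 0, 0);
    hdfsFile b = hdfsOpenFile(fs, "/tmp/b.dat", O_WRONLY, 0, 0, 0);

    /* ... writes to a and b interleaved here ... */

    hdfsCloseFile(fs, a);
    hdfsCloseFile(fs, b);

    /* After this, anything still open on fs fails with "Filesystem closed". */
    hdfsDisconnect(fs);
}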

Thanks,
dhruba

