Posted to common-user@hadoop.apache.org by Guillaume Perrot <gp...@ubikod.com> on 2013/01/17 12:36:34 UTC
Read sequence files that are being written
Hi everyone,
I am using Hadoop 1.0.3.
I write logs to a Hadoop sequence file in HDFS. I call syncFS() after
each batch of logs, but I never close the file (except when I perform
the daily roll).
What I want to guarantee is that the file is available to readers while it
is still being written.
I can read the bytes of the sequence file via FSDataInputStream, but if I
try to use SequenceFile.Reader.next(key,val), it returns false on the first
call.
I know the data is in the file, since I can read it with FSDataInputStream
or with the cat command, and I am 100% sure that syncFS() is called.
I checked the namenode and datanode logs: no errors or warnings. fsck shows
no corruption.
Why is SequenceFile.Reader unable to read a file that is still being written?