Posted to mapreduce-user@hadoop.apache.org by Hassen Riahi <ha...@cern.ch> on 2012/03/17 15:09:19 UTC

random seeks during write in HDFS

Hi,

We are trying to run a mapper that performs random access while
writing files. It seems that HDFS supports random seeks only during
reads, not during writes (nor does it support modifying existing
files). Is that right? We are using hadoop-0.20. If so, is there any
plan to support this in the future?

This limitation causes the mapper to fail when writing files. Are
there any suggestions for working around it, such as writing the files
to a temporary area and then copying them to HDFS?

Thanks for the help,
Hassen
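
The staging idea raised in the question can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested Hadoop recipe: the function name build_locally_then_stream and the (offset, data) chunk layout are hypothetical, and an in-memory buffer stands in for the HDFS output stream. In a real job the final sequential copy would be done with something like "hadoop fs -put" or FileSystem.copyFromLocalFile.

```python
import io
import shutil
import tempfile


def build_locally_then_stream(chunks, sink):
    """Write (offset, data) chunks in any order to a local temp file,
    then copy the finished file to `sink` in one sequential pass."""
    with tempfile.TemporaryFile() as staging:
        for offset, data in chunks:
            staging.seek(offset)   # random seek is fine on local disk
            staging.write(data)
        staging.seek(0)
        shutil.copyfileobj(staging, sink)  # sequential, write-once copy


# `sink` is a stand-in for a sequential-only HDFS output stream.
sink = io.BytesIO()
build_locally_then_stream([(5, b"world"), (0, b"hello")], sink)
result = sink.getvalue()
```

The random seeks happen only on the local file; the destination sees a single front-to-back write, which is the only write pattern this approach relies on.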


Re: random seeks during write in HDFS

Posted by Brock Noland <br...@cloudera.com>.
Hi,

This question is for hdfs-user, not mapreduce-user; as such, I have removed the latter.

Yes, HDFS does not allow random writes. I suggest you read this doc:
http://hadoop.apache.org/common/docs/current/hdfs_design.html

Specifically the "Assumptions and Goals" section.

Here are two ways to get around this design assumption:

1) Write updated copies of the record with a new timestamp and then
deduplicate based on a unique key and timestamp.
2) Use HBase
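
Option 1 can be sketched roughly as below. This is only an illustration under assumptions: the (key, timestamp, value) record layout and the function name latest_versions are hypothetical, and in a MapReduce job the same logic would most likely live in a reducer that receives all versions of one key.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (unique key, timestamp, value).
Record = Tuple[str, int, str]


def latest_versions(records: Iterable[Record]) -> List[Record]:
    """Keep only the newest record for each key."""
    newest: Dict[str, Record] = {}
    for record in records:
        key, timestamp, _ = record
        current = newest.get(key)
        if current is None or timestamp > current[1]:
            newest[key] = record
    return sorted(newest.values())


# Updates are appended as new records instead of rewriting old ones.
appended = [
    ("row-1", 100, "a"),
    ("row-2", 100, "b"),
    ("row-1", 200, "a-updated"),
]
deduped = latest_versions(appended)
```

Nothing is ever modified in place: an update is just another appended record, and readers (or a periodic compaction job) pick the newest version per key.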

Cheers,
Brock

On Sat, Mar 17, 2012 at 9:09 AM, Hassen Riahi <ha...@cern.ch> wrote:
> Hi,
>
> We are trying to execute a mapper making a random access during writing
> files. It seems that HDFS supports only random seek during read and not
> during write (neither the file modification). Is it right? we are using
> hadoop-0.20. If it is the case, is there any plan to support it in the
> future?
>
> The limitation described above makes the mapper failing to write files. Is
> there any suggestions to bypass this limitation? such as write files in a
> temp area and copying them then to HDFS?
>
> Thanks for the help,
> Hassen
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/