Posted to common-user@hadoop.apache.org by "Jim R. Wilson" <wi...@gmail.com> on 2008/05/16 20:43:35 UTC

Re: Mirroring data to a non-Hadoop FS

There was some chatter on the HBase list about a dual HDFS/S3 driver
class which would write to both but read only from HDFS.  Of course,
having this functionality at the Hadoop level would be better than in
a subsidiary project.

Maybe the ability to specify a secondary filesystem in
hadoop-site.xml?  Candidates might include S3, NFS, or, of course,
another HDFS in a geographically isolated location.
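The "write to both, read from the primary" idea could be sketched roughly as below. This is a hypothetical illustration, not the driver class discussed on the HBase list: local temp directories stand in for HDFS (primary) and S3/NFS (secondary), and the class name DualWriteStore is made up. A real version would wrap two org.apache.hadoop.fs.FileSystem instances instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of a dual-write store: every write is mirrored to a secondary
// location, while reads are served from the primary only. Local
// directories stand in for the two filesystems for illustration.
class DualWriteStore {
    private final Path primary;
    private final Path secondary;

    DualWriteStore(Path primary, Path secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    // Writes go to both stores, so the secondary stays a full mirror.
    void write(String name, byte[] data) throws IOException {
        Files.write(primary.resolve(name), data);
        Files.write(secondary.resolve(name), data);
    }

    // Reads come from the primary only, matching the HBase-list proposal.
    byte[] read(String name) throws IOException {
        return Files.readAllBytes(primary.resolve(name));
    }
}

public class DualWriteDemo {
    public static void main(String[] args) throws IOException {
        Path primary = Files.createTempDirectory("primary");
        Path secondary = Files.createTempDirectory("secondary");
        DualWriteStore store = new DualWriteStore(primary, secondary);

        store.write("block0", "payload".getBytes());
        System.out.println(new String(store.read("block0")));
        System.out.println(Files.exists(secondary.resolve("block0")));
    }
}
```

A secondary-filesystem setting in hadoop-site.xml could then select what the mirror target is, without the client code changing.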

-- Jim R. Wilson (jimbojw)

On Fri, May 16, 2008 at 12:06 PM, Ted Dunning <td...@veoh.com> wrote:
>
> Why not go to the next step and use a second cluster as the backup?
>
>
> On 5/16/08 6:33 AM, "Robert Krüger" <kr...@signal7.de> wrote:
>
>>
>> Hi,
>>
>> What are the options to keep a copy of data from an HDFS instance in
>> sync with a backup file system which is not HDFS? Are there rsync-like
>> tools that transfer only deltas, or would one have to implement
>> that oneself (e.g. by writing a Java program that accesses both
>> filesystems)?
>>
>> Thanks in advance,
>>
>> Robert
>>
>> P.S.: Why would one want that? E.g. to have a completely redundant copy
>> which, in case of a systematic failure (e.g. data corruption due to a
>> bug), offers a backup not affected by that problem.
>
>