Posted to common-user@hadoop.apache.org by W <wi...@gmail.com> on 2009/06/16 11:44:11 UTC

Hadoop as Cloud Storage

Dear Hadoop Gurus,

After some googling, I found some information on using Hadoop as
long-term cloud storage.
I need to maintain a lot of data (around 50 TB), much of it
TV commercials (video files).

I know the best solution for long-term file archiving is tape
backup, but I'm just curious: can Hadoop
be used as a 'data archiving' platform?

Thanks!

Warm Regards,
Wildan
---
OpenThink Labs
http://openthink-labs.tobethink.com/

Making IT, Business and Education in Harmony

>> 087884599249

Y! : hawking_123
LinkedIn : http://www.linkedin.com/in/wildanmaulana

Re: Hadoop as Cloud Storage

Posted by Alex Loddengaard <al...@cloudera.com>.
Hey Wildan,

HDFS is successfully storing well over 50 TB on a single cluster.  It's
meant to store data that will be analyzed in a MapReduce (MR) job, but it
can be used for archival storage.  You'd probably want to deploy nodes with
lots of disk space rather than lots of RAM and processor power.  You'll want
to do a cost analysis to determine whether tape or HDFS is cheaper.
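
A rough sketch of the capacity side of that cost analysis, in Python.  It
assumes the HDFS default replication factor of 3; the 25% free-space
headroom, the 8 TB of disk per node, and both prices per TB are
hypothetical placeholders (none of them come from this thread), so
substitute your own numbers.

```python
import math


def raw_capacity_tb(logical_tb, replication=3, headroom=0.25):
    """Raw disk (TB) needed to hold logical_tb of data.

    replication: copies HDFS keeps of each block (3 is the HDFS default).
    headroom: fraction of raw disk kept free (assumed value, not from HDFS).
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return logical_tb * replication / (1.0 - headroom)


def nodes_needed(logical_tb, disk_per_node_tb, replication=3, headroom=0.25):
    """Number of storage nodes, rounded up to a whole node."""
    raw = raw_capacity_tb(logical_tb, replication, headroom)
    return math.ceil(raw / disk_per_node_tb)


def storage_cost(capacity_tb, price_per_tb):
    """Media cost only; ignores power, staff, tape drives, and servers."""
    return capacity_tb * price_per_tb


if __name__ == "__main__":
    logical = 50  # TB, the figure from the original question
    raw = raw_capacity_tb(logical)
    print("raw HDFS capacity (TB):", raw)
    print("nodes at 8 TB each (hypothetical):", nodes_needed(logical, 8))
    # Both prices below are made-up inputs for illustration only.
    print("HDFS disk cost:", storage_cost(raw, price_per_tb=100.0))
    print("tape cost, 2 copies:", storage_cost(logical * 2, price_per_tb=30.0))
```

The point of the sketch is that replication multiplies the raw disk you
have to buy, which a per-TB comparison against tape has to account for.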

That said, you should know a few things about HDFS:

   - Its read path is optimized for high throughput, and doesn't care as
   much about latency (read: it's got high latency relative to other file
   systems)
   - It's not meant for small files, so ideally your video files will be at
   least ~100MB each
   - It requires that the machines that make up your cluster be running
   whenever you want to access or store data.  (Note that HDFS survives if a
   small percentage of your nodes go down; it's built with fault tolerance in
   mind)
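
To check the small-file point against an existing archive before loading
it, a minimal Python sketch like the following could list files below the
~100MB guideline.  The function name and the example path are
hypothetical, and the scan runs against a local directory, not HDFS.

```python
import os

# ~100MB guideline from the list above, expressed in bytes.
MIN_SIZE_BYTES = 100 * 1024 * 1024


def find_small_files(root, min_size=MIN_SIZE_BYTES):
    """Return (path, size_in_bytes) for every file under root below min_size."""
    small = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size < min_size:
                small.append((path, size))
    return small


if __name__ == "__main__":
    # "/data/commercials" is a placeholder path; point it at your archive.
    for path, size in find_small_files("/data/commercials"):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

If a large share of the files falls below the threshold, it may be worth
bundling them into larger containers before storing them in HDFS.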

I hope this clears things up.  Let me know if you have any other questions.

Alex
