Posted to hdfs-dev@hadoop.apache.org by Charles Earl <ch...@me.com> on 2011/09/10 15:41:17 UTC

Using virtual disk as HDFS storage

Hi,
I am exploring ways in which HDFS could be run in an environment supporting virtualization. For the particular application, I would at least like the map tasks to run in virtual containers (perhaps even lightweight containers for example LXC), but actually not duplicate the storage per container. That is, one or more map task in the same container, as opposed to virtualized datanode.
One option seems to be to place hadoop.tmp.dir on a virtual disk, use guestfs or a similar tool for management, and have the respective map tasks share this disk.
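Roughly the wiring I have in mind, as a rough sketch (the mount point /mnt/hdfs-scratch and its layout are just placeholders): bind-mount the host's view of the virtual disk into each LXC container, then point hadoop.tmp.dir at it so the containers share one scratch area instead of each having its own:

    # LXC container config: bind-mount the host mount point of the virtual disk
    # (path names are examples only)
    lxc.mount.entry = /mnt/hdfs-scratch mnt/hdfs-scratch none bind 0 0

    <!-- core-site.xml inside each container: put Hadoop's scratch storage
         on the shared disk instead of the default /tmp/hadoop-${user.name} -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/mnt/hdfs-scratch/hadoop-${user.name}</value>
    </property>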
I'm eliding many details but just wanted some initial feedback, or pointers to similar efforts. I'm aware that Mesos (perhaps Hadoop NG?) has support for LXC.
Charles