Posted to general@hadoop.apache.org by vincent cai <cl...@gmail.com> on 2010/09/16 09:51:24 UTC

question about zip and extract jobs on hadoop

Hi all

   I've been thinking about ELT extract jobs.

   Is it possible to deploy them on a Hadoop cluster?

   If the zip or unzip commands can be run on the Hadoop datanodes, network
bandwidth will be the only bottleneck.

   Millions of zip files could be distributed to the datanodes and zipped or
unzipped there. If possible, a "super datanode" could be built from VMs and a SAN.

   That could act as a "super fast SAN".
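
   To make the idea above concrete, here is a minimal sketch of the per-node
unzip step, written as a Hadoop Streaming style mapper that reads one zip path
per input line. It is an illustration under assumptions, not a tested Hadoop
job: the function names (extract_archive, run_mapper) and the "extracted"
output directory are hypothetical, and it assumes the archives are already
readable as local files on the worker (fetching them from HDFS is left out).

```python
import os
import sys
import zipfile


def extract_archive(zip_path, out_root):
    """Extract one zip archive into its own directory under out_root.

    Returns the number of regular files in the archive.
    """
    name = os.path.splitext(os.path.basename(zip_path))[0]
    target = os.path.join(out_root, name)
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        members = [m for m in zf.infolist() if not m.is_dir()]
        zf.extractall(target)
    return len(members)


def run_mapper(lines, out_root):
    """Streaming-style mapper: one local zip path per input line.

    Yields "path<TAB>file_count" on success and "path<TAB>ERROR ..." on
    failure, so one bad archive does not stop the rest of the batch.
    """
    for line in lines:
        zip_path = line.strip()
        if not zip_path:
            continue  # skip blank input lines
        try:
            count = extract_archive(zip_path, out_root)
            yield f"{zip_path}\t{count}"
        except (OSError, zipfile.BadZipFile) as exc:
            yield f"{zip_path}\tERROR {exc}"


if __name__ == "__main__":
    # out_root is a hypothetical local scratch directory on the worker node.
    out_root = sys.argv[1] if len(sys.argv) > 1 else "extracted"
    for record in run_mapper(sys.stdin, out_root):
        print(record)
```

   Feeding the mapper a list of paths rather than the archives themselves
means each task decides locally what to unpack; whether that actually keeps the
work next to the data depends on how the files are stored and scheduled.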

   It looks like the FileUtil class contains some methods that call the
Linux gzip or untar commands, but the hadoop fs manual does not document them.

   Please let me know if you have any comments.

   I'm just thinking about the possibility.

   thanks

Best Regards
Vincent
Skype , cailibing1