Posted to common-dev@hadoop.apache.org by Steve Gao <st...@yahoo.com> on 2008/10/02 21:17:20 UTC

How to concatenate hadoop files to a single hadoop file

Suppose I have 3 files in Hadoop that I want to "cat" into a single file. I know it can be done by running "hadoop dfs -cat" into a local file and uploading that file back to Hadoop, but this is very expensive for large files. Is there an internal way to do this within Hadoop itself? Thanks.
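
For reference, here is a minimal sketch of one way to skip the local temporary file mentioned above. It pipes the output of "hadoop dfs -cat" straight into "hadoop dfs -put", assuming that "-put" accepts "-" as its source to read from standard input (the FS shell documentation describes this, but it is worth checking on your version). The helper names build_commands and concat_in_hdfs and the /data/... paths are hypothetical, chosen only for illustration. The bytes still travel through the client machine, so this removes the local disk copy but is not an in-cluster concatenation.

```python
import subprocess
from typing import List, Sequence, Tuple


def build_commands(
    sources: Sequence[str], dest: str, hadoop_bin: str = "hadoop"
) -> Tuple[List[str], List[str]]:
    """Build the two argument lists for: hadoop dfs -cat SRC... | hadoop dfs -put - DEST."""
    if not sources:
        raise ValueError("need at least one source path")
    # -cat accepts several paths and writes them to stdout in the order given.
    cat_cmd = [hadoop_bin, "dfs", "-cat", *sources]
    # Assumption: "-" as the source tells -put to read from stdin.
    put_cmd = [hadoop_bin, "dfs", "-put", "-", dest]
    return cat_cmd, put_cmd


def concat_in_hdfs(
    sources: Sequence[str], dest: str, hadoop_bin: str = "hadoop"
) -> None:
    """Stream the sources into dest without writing a local temporary file."""
    cat_cmd, put_cmd = build_commands(sources, dest, hadoop_bin)
    cat = subprocess.Popen(cat_cmd, stdout=subprocess.PIPE)
    put = subprocess.Popen(put_cmd, stdin=cat.stdout)
    # Close our copy of the pipe so cat sees a broken pipe if put exits early.
    cat.stdout.close()
    put_rc = put.wait()
    cat_rc = cat.wait()
    if cat_rc != 0 or put_rc != 0:
        raise RuntimeError(
            "concatenation failed (cat exit %d, put exit %d)" % (cat_rc, put_rc)
        )
```

With a working Hadoop client on the PATH, a call such as concat_in_hdfs(["/data/a", "/data/b", "/data/c"], "/data/merged") would run the pipeline; the destination must not already exist, since -put does not overwrite by default.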



      

Re: How to concatenate hadoop files to a single hadoop file

Posted by Francesco Salbaroli <fr...@ie.ibm.com>.
Maybe you can write a simple map/reduce task to do it.

Best Regards,
      Francesco

-------------------------------------------------------------------------
Francesco Salbaroli
Stud./Intern @ Innovation Centre
IBM Technology Campus,
Damastown Industrial Estate, Mulhuddart, Dublin 15
E-mail: francesco.salbaroli(at)ie.ibm.com
Tel.: +353 01 815 5625

Graduate Student - Università di Bologna
E-mail: francesco.salbaroli(at)studio.unibo.it
-------------------------------------------------------------------------


Re: How to concatenate hadoop files to a single hadoop file

Posted by Michael Andrews <ma...@liveops.com>.
You might be able to use Hadoop archives (HARs):

http://hadoop.apache.org/core/docs/current/hadoop_archives.html

On 10/2/08 2:51 PM, "Steve Gao" <st...@yahoo.com> wrote:

Does anybody know? Thanks a lot.

--- On Thu, 10/2/08, Steve Gao <st...@yahoo.com> wrote:
From: Steve Gao <st...@yahoo.com>
Subject: How to concatenate hadoop files to a single hadoop file
To: core-user@hadoop.apache.org
Cc: core-dev@hadoop.apache.org
Date: Thursday, October 2, 2008, 3:17 PM

Suppose I have 3 files in Hadoop that I want to "cat" into a single
file. I know it can be done by running "hadoop dfs -cat" into a local file
and uploading that file back to Hadoop, but this is very expensive for large
files. Is there an internal way to do this within Hadoop itself? Thanks.








Re: How to concatenate hadoop files to a single hadoop file

Posted by Steve Gao <st...@yahoo.com>.
Does anybody know? Thanks a lot.

--- On Thu, 10/2/08, Steve Gao <st...@yahoo.com> wrote:
From: Steve Gao <st...@yahoo.com>
Subject: How to concatenate hadoop files to a single hadoop file
To: core-user@hadoop.apache.org
Cc: core-dev@hadoop.apache.org
Date: Thursday, October 2, 2008, 3:17 PM

Suppose I have 3 files in Hadoop that I want to "cat" into a single
file. I know it can be done by running "hadoop dfs -cat" into a local file
and uploading that file back to Hadoop, but this is very expensive for large
files. Is there an internal way to do this within Hadoop itself? Thanks.



      


      
