Posted to common-dev@hadoop.apache.org by Steve Gao <st...@yahoo.com> on 2008/10/02 21:17:20 UTC
How to concatenate hadoop files to a single hadoop file
Suppose I have 3 files in Hadoop that I want to "cat" into a single file. I know it can be done by running "hadoop dfs -cat" into a local file and uploading that file back to Hadoop, but that is very expensive for large files. Is there an internal way to do this in Hadoop itself? Thanks.
Re: How to concatenate hadoop files to a single hadoop file
Posted by Francesco Salbaroli <fr...@ie.ibm.com>.
Maybe you can write a simple map/reduce task to do it.
Best Regards,
Francesco
-------------------------------------------------------------------------
Francesco Salbaroli
Stud./Intern @ Innovation Centre
IBM Technology Campus,
Damastown Industrial Estate, Mulhuddart, Dublin 15
E-mail: francesco.salbaroli(at)ie.ibm.com
Tel.: +353 01 815 5625
Graduate Student - Università di Bologna
E-mail: francesco.salbaroli(at)studio.unibo.it
-------------------------------------------------------------------------
Re: How to concatenate hadoop files to a single hadoop file
Posted by Michael Andrews <ma...@liveops.com>.
You might be able to use Hadoop archives (HARs):
http://hadoop.apache.org/core/docs/current/hadoop_archives.html
On 10/2/08 2:51 PM, "Steve Gao" <st...@yahoo.com> wrote:
Does anybody know? Thanks a lot.
--- On Thu, 10/2/08, Steve Gao <st...@yahoo.com> wrote:
From: Steve Gao <st...@yahoo.com>
Subject: How to concatenate hadoop files to a single hadoop file
To: core-user@hadoop.apache.org
Cc: core-dev@hadoop.apache.org
Date: Thursday, October 2, 2008, 3:17 PM
Suppose I have 3 files in Hadoop that I want to "cat" into a single
file. I know it can be done by running "hadoop dfs -cat" into a local file
and uploading that file back to Hadoop, but that is very expensive for large
files. Is there an internal way to do this in Hadoop itself? Thanks.
Re: How to concatenate hadoop files to a single hadoop file
Posted by Steve Gao <st...@yahoo.com>.
Does anybody know? Thanks a lot.
--- On Thu, 10/2/08, Steve Gao <st...@yahoo.com> wrote:
From: Steve Gao <st...@yahoo.com>
Subject: How to concatenate hadoop files to a single hadoop file
To: core-user@hadoop.apache.org
Cc: core-dev@hadoop.apache.org
Date: Thursday, October 2, 2008, 3:17 PM
Suppose I have 3 files in Hadoop that I want to "cat" into a single
file. I know it can be done by running "hadoop dfs -cat" into a local file
and uploading that file back to Hadoop, but that is very expensive for large
files. Is there an internal way to do this in Hadoop itself? Thanks.
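One way to avoid the local temporary file mentioned in the question is to pipe the output of "hadoop dfs -cat" straight into "hadoop dfs -put -", which reads from standard input. The sketch below is not from the thread: the function name hdfs_concat, the HADOOP_CMD variable, and the example paths are hypothetical, and it assumes a Hadoop release whose -put accepts "-" as its source. The data still streams through the client machine, so this saves local disk space rather than network traffic.

```shell
#!/bin/sh
# Sketch: concatenate several HDFS files into one HDFS file
# without writing an intermediate copy to the local disk.
# HADOOP_CMD is a hypothetical override for the launcher name.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Usage: hdfs_concat DEST SRC [SRC ...]
hdfs_concat() {
    dest="$1"
    shift
    # "-cat" writes the sources to stdout in argument order;
    # "-put - DEST" reads stdin and writes it to DEST.
    "$HADOOP_CMD" dfs -cat "$@" | "$HADOOP_CMD" dfs -put - "$dest"
}

# Example call (paths are made up for illustration):
# hdfs_concat /user/steve/merged /user/steve/part1 /user/steve/part2 /user/steve/part3
```

Because the two commands run as a pipeline, a failure in the -cat step can still leave a partial destination file behind, so it may be worth checking the result before deleting the sources.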