Posted to mapreduce-user@hadoop.apache.org by Jeremy Cunningham <je...@statefarm.com> on 2011/06/28 15:19:04 UTC

Emit an entire file

I have lots of binary files stored in HDFS. I read them using Apache POI and can search them with no problems. I want to search for keywords (which I can already do) and then copy each file that contains the text to a different location. That location can be in HDFS; I just need one place that contains all the files that meet my criteria.

Thanks,
Jeremy


Re: Emit an entire file

Posted by Allen Wittenauer <aw...@apache.org>.
On Jun 28, 2011, at 6:19 AM, Jeremy Cunningham wrote:

> I have lots of binary files stored in HDFS. I read them using Apache POI and can search them with no problems. I want to search for keywords (which I can already do) and then copy each file that contains the text to a different location. That location can be in HDFS; I just need one place that contains all the files that meet my criteria.

	There is an entire file system API that lets you read files from and write files to HDFS. Additionally, the user specifies where the output of a map-reduce job is written.

	So... is there a specific question that you need answered?
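
For readers landing on this thread, here is a minimal sketch of how the file system API mentioned above could be used to copy matching files into one directory. It is not taken from either poster. The class name `MatchingFileCopier`, the `Matcher` interface, and the `copyMatches` method are hypothetical names introduced for illustration; the existing POI-based keyword search would be plugged in behind `Matcher`. The sketch assumes a Hadoop release where `FileStatus.isDirectory()` is available (older releases expose `isDir()` instead), and it only looks at files directly under the source directory.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class MatchingFileCopier {

    /** Hypothetical hook: the existing POI-based keyword search would go here. */
    public interface Matcher {
        boolean matches(FileSystem fs, Path file) throws IOException;
    }

    /**
     * Copies every regular file directly under srcDir that satisfies the
     * matcher into dstDir, leaving the originals in place.
     * Returns the number of files copied.
     */
    public static int copyMatches(Configuration conf, Path srcDir, Path dstDir,
                                  Matcher matcher) throws IOException {
        // Resolve the file system from the path (HDFS, local, ...).
        FileSystem fs = srcDir.getFileSystem(conf);
        fs.mkdirs(dstDir);

        int copied = 0;
        for (FileStatus status : fs.listStatus(srcDir)) {
            if (status.isDirectory()) {
                continue; // skip subdirectories, including dstDir if nested
            }
            Path src = status.getPath();
            if (matcher.matches(fs, src)) {
                Path dst = new Path(dstDir, src.getName());
                // deleteSource = false: this is a copy, not a move.
                FileUtil.copy(fs, src, fs, dst, false, conf);
                copied++;
            }
        }
        return copied;
    }
}
```

The same `FileUtil.copy` call could presumably be issued from inside a map task whenever the keyword search hits, with the destination directory passed in through the job configuration; whether that or a standalone client like the one above fits better depends on how many files there are and how expensive the search is.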