Posted to common-user@hadoop.apache.org by Richard Zhang <ri...@gmail.com> on 2008/06/28 04:42:37 UTC

Are there any large data sets for testing a MapReduce program on Hadoop?

Hello Folks:
I wrote a MapReduce program for analyzing text files, and I would like to test
its performance on a large data set of text files. Are there any available
text data sets that can be used to test programs on Hadoop? If you know of
one, please let me know.
Thanks.
Richard

Re: Are there any large data sets for testing a MapReduce program on Hadoop?

Posted by tim robertson <ti...@gmail.com>.
Perhaps you could use something like RandomTextWriter to generate the input?
http://hadoop.apache.org/core/docs/r0.17.0/api/org/apache/hadoop/examples/RandomTextWriter.html
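To try the same idea on a small scale before running a job on the cluster, a rough local stand-in can be sketched as below. This is not the RandomTextWriter implementation, only an illustration of generating random text up to a target size. The word list, the function name write_random_text, and its parameters are all assumptions made for the example.

```python
import random

# Hypothetical vocabulary for illustration only; it is not the word list
# that Hadoop's RandomTextWriter uses.
WORDS = [
    "hadoop", "map", "reduce", "cluster", "node", "block",
    "record", "key", "value", "split", "shuffle", "sort",
]


def write_random_text(path, target_bytes, seed=0, min_words=5, max_words=15):
    """Write lines of random words to `path` until at least `target_bytes`
    bytes have been written. Returns the number of bytes written."""
    rng = random.Random(seed)  # seeded so a test run can be reproduced
    written = 0
    # ASCII encoding and a fixed "\n" newline keep character count equal
    # to byte count on every platform.
    with open(path, "w", encoding="ascii", newline="\n") as out:
        while written < target_bytes:
            count = rng.randint(min_words, max_words)
            line = " ".join(rng.choice(WORDS) for _ in range(count)) + "\n"
            out.write(line)
            written += len(line)
    return written
```

For example, write_random_text("random_text.txt", 10 * 1024 * 1024) would produce a file of roughly 10 MB, which could then be copied into HDFS as job input. For genuinely large inputs, generating the data on the cluster itself is likely the better route.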

Cheers

Tim


