Posted to mapreduce-user@hadoop.apache.org by Jean <la...@yahoo.fr> on 2014/12/19 22:59:28 UTC
Fast grep on HDFS files
Hello,
I want to be able to grep custom strings in a large number of files stored in HDFS.
The data to grep is at least 500 GB to 2 TB in total, split across ~50-200 files.
What would be the best way to get the fastest results for:
- the matching lines
- the names of the files containing the matched lines
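To make the two outputs concrete, here is a minimal single-machine sketch in Python of the result shape described above. It assumes local text files stand in for the HDFS files, and the function name grep_files is hypothetical; it says nothing about how to make this fast on a cluster.

```python
import re
from collections import defaultdict


def grep_files(paths, pattern):
    """Return {filename: [matching lines]} for every file with at least one match.

    Assumption: `paths` are local text files standing in for files on HDFS.
    """
    regex = re.compile(pattern)
    matches = defaultdict(list)
    for path in paths:
        # errors="replace" keeps the scan going if a file contains invalid bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if regex.search(line):
                    matches[path].append(line.rstrip("\n"))
    return dict(matches)
```

The keys of the returned dictionary are the filenames containing a match, and the values are the matching lines from each file.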
I tested a MapReduce grep, but it is too slow for interactive use.
Do I need to index everything in Hive or Solr?
Would Spark be faster than MapReduce?
Thanks