Posted to dev@nutch.apache.org by Mehdi Alemi <al...@comp.iust.ac.ir> on 2011/01/26 13:22:00 UTC
How nutch-2.0 is handled by hbase
Dear developers,
I would like to know how you use HBase in Nutch 2.0. Of course, I know that
Gora stores data persistently, but I am confused about how its MapReduce
part works.
What is the number of reducers? What is the number of mappers?
The mapper prepares a document for the reducer, and the reducer indexes it
with Solr. Solr indexes that document along with other documents and stores
it with Gora. How is the merging of indexes resolved? For each term in the
inverted index, there should be one posting list that identifies the
documents containing that term. Is the update of each term in the inverted
index performed by merging the new posting list with the old posting list?
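
To make the last question concrete, here is a minimal sketch of what I mean
by merging posting lists. It is not Nutch, Solr, or Gora code; the function
names (merge_postings, update_index) and the dict-based index are
hypothetical, and I am assuming each posting list is a sorted list of
integer document IDs.

```python
def merge_postings(old, new):
    """Merge two sorted lists of document IDs into one sorted list
    without duplicates (a plain two-pointer merge)."""
    merged = []
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            # Same document in both lists: keep a single entry.
            merged.append(old[i])
            i += 1
            j += 1
        elif old[i] < new[j]:
            merged.append(old[i])
            i += 1
        else:
            merged.append(new[j])
            j += 1
    # At most one of the two lists still has entries left.
    merged.extend(old[i:])
    merged.extend(new[j:])
    return merged


def update_index(index, new_postings):
    """Update a term -> posting list mapping in place (hypothetical
    in-memory index) by merging each new list with the old one."""
    for term, docs in new_postings.items():
        index[term] = merge_postings(index.get(term, []), docs)
    return index
```

For example, update_index({"hbase": [1, 4]}, {"hbase": [2, 4], "gora": [7]})
would leave the index with "hbase" mapped to [1, 2, 4] and "gora" mapped to
[7]. My question is whether something equivalent to this happens per term
when new documents are indexed.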
Best Regards,
Mehdi Alemi