Posted to user@nutch.apache.org by Alex Luya <al...@gmail.com> on 2010/08/17 16:49:44 UTC

How to get a map from Nutch crawl results?

Hello:
	I want to use a web extractor (webharvest) to extract content from HTML pages, based on a map<url, webpage (formatted as HTML)>. So first:
1. How can I read out the link graph database?
2. How do I convert the results of a Nutch crawl back to HTML?
3. How do I link them to construct the map<url, webpage (formatted as HTML)>?