Posted to user@nutch.apache.org by jeffersonzhou <je...@gmail.com> on 2011/07/12 13:59:07 UTC

Can I create my own segment containing specific URLs and other information?

Hi,

I want to write my own parser and put all the interesting URLs into a new
segment, separate from Nutch's default segments. Can I do that, and if so, how?

Thanks.


Re: Can I create my own segment containing specific URLs and other information?

Posted by Markus Jelsma <ma...@openindex.io>.
Simple: use scripts to operate on different segments (and/or crawldbs and
configurations). I have setups with multiple NUTCH_HOMEs, each with an
isolated crawl.
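
As a rough illustration of the scripting approach, here is a minimal Python
sketch that drives one isolated crawl with its own crawldb and segments
directory. It assumes the Nutch 1.x command-line steps (inject, generate,
fetch, parse, updatedb). The function names and all paths are hypothetical,
and this is not necessarily how the setups mentioned above are built.

```python
import subprocess
from pathlib import Path


def setup_commands(nutch_home, crawl_dir, seed_dir):
    """Build the commands that seed a crawldb and generate a segment from it."""
    nutch = str(Path(nutch_home) / "bin" / "nutch")
    crawldb = str(Path(crawl_dir) / "crawldb")
    segments = str(Path(crawl_dir) / "segments")
    return [
        [nutch, "inject", crawldb, str(seed_dir)],
        [nutch, "generate", crawldb, segments],
    ]


def segment_commands(nutch_home, crawl_dir, segment):
    """Build the commands that fetch and parse one segment, then update the crawldb."""
    nutch = str(Path(nutch_home) / "bin" / "nutch")
    crawldb = str(Path(crawl_dir) / "crawldb")
    return [
        [nutch, "fetch", str(segment)],
        [nutch, "parse", str(segment)],
        [nutch, "updatedb", crawldb, str(segment)],
    ]


def newest_segment(crawl_dir):
    """Return the most recent segment directory.

    Assumes segment directories are named by timestamp, so the
    lexicographically largest name is the newest one.
    """
    segments = Path(crawl_dir) / "segments"
    return max((p for p in segments.iterdir() if p.is_dir()), key=lambda p: p.name)


def run_isolated_crawl(nutch_home, crawl_dir, seed_dir):
    """Run one crawl round that touches only crawl_dir (hypothetical layout)."""
    for cmd in setup_commands(nutch_home, crawl_dir, seed_dir):
        subprocess.run(cmd, check=True)
    segment = newest_segment(crawl_dir)
    for cmd in segment_commands(nutch_home, crawl_dir, segment):
        subprocess.run(cmd, check=True)
```

Calling run_isolated_crawl once per (nutch_home, crawl_dir, seed_dir) triple,
for example one triple for the "interesting" URLs and one for everything else,
keeps each crawl's segments and crawldb apart. Pointing nutch_home at a
different installation also picks up that installation's own configuration.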


-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350