Posted to user@nutch.apache.org by jeffersonzhou <je...@gmail.com> on 2011/07/12 13:59:07 UTC
Can I create my own segment containing specific URLs and other information?
Hi,
I want to write my own parser and put all the interesting URLs into a new
segment, separate from Nutch’s default segments. Can I do that? How?
Thanks.
Re: Can I create my own segment containing specific URLs and other information?
Posted by Markus Jelsma <ma...@openindex.io>.
That is simple: use scripts to operate on different segments (and/or crawldbs
and configurations). I have setups with multiple NUTCH_HOMEs, each with an
isolated crawl.
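
As a rough illustration of the scripted approach, here is a minimal Python
sketch that drives one crawl cycle against its own crawldb and segments
directory. It assumes a Nutch 1.x layout with the usual bin/nutch commands
(inject, generate, fetch, parse, updatedb). The function names, the example
paths and the crawldb/segments layout under crawl_dir are hypothetical and not
taken from this thread, so adjust them to your setup.

```python
import os
import subprocess
from pathlib import Path


def prepare_commands(nutch_home, crawl_dir, seed_dir, top_n=1000):
    """Build the inject and generate command lines for one isolated crawl.

    Assumption: each crawl keeps its own crawldb/ and segments/ under crawl_dir.
    """
    nutch = str(Path(nutch_home) / "bin" / "nutch")
    crawldb = str(Path(crawl_dir) / "crawldb")
    segments = str(Path(crawl_dir) / "segments")
    return [
        [nutch, "inject", crawldb, str(seed_dir)],
        [nutch, "generate", crawldb, segments, "-topN", str(top_n)],
    ]


def newest_segment(crawl_dir):
    """Return the most recent segment directory (segment names are timestamps)."""
    segments = Path(crawl_dir) / "segments"
    return max((p for p in segments.iterdir() if p.is_dir()), key=lambda p: p.name)


def run_cycle(nutch_home, crawl_dir, seed_dir, top_n=1000):
    """Run one inject/generate/fetch/parse/updatedb cycle for a single crawl."""
    # Each crawl gets its own NUTCH_HOME, and therefore its own configuration.
    env = dict(os.environ, NUTCH_HOME=str(nutch_home))
    for cmd in prepare_commands(nutch_home, crawl_dir, seed_dir, top_n):
        subprocess.run(cmd, check=True, env=env)

    nutch = str(Path(nutch_home) / "bin" / "nutch")
    crawldb = str(Path(crawl_dir) / "crawldb")
    segment = str(newest_segment(crawl_dir))
    # Drop the separate parse step if your fetcher is configured to parse.
    for step in (["fetch", segment], ["parse", segment], ["updatedb", crawldb, segment]):
        subprocess.run([nutch, *step], check=True, env=env)
```

Calling run_cycle once per installation, for example with
("/opt/nutch-a", "/data/crawl-a", "/data/seeds-a") and then with a second set
of paths, keeps the segments of each crawl apart, so the URLs you consider
interesting can go into a seed list and crawl of their own.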
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350