Posted to user@nutch.apache.org by Feng Ji <fe...@gmail.com> on 2006/09/06 03:45:47 UTC
filter urls from search result
Hi there,
I want to filter particular URLs out of the search results.
I tried to use the segment merger to do it.
First, I put the target URLs in regex-urlfilter.txt and automaton-urlfilter.txt,
as "-http://abc.com/".
Then I ran "nutch/mergesegs" and "nutch/index", but the search page still
shows the URLs I want to filter out.
Any idea which step I missed?
thanks,
Michael,
Re: filter urls from search result
Posted by steveb <sr...@gmail.com>.
Michael,
Did you remember to specify the -filter option when you ran nutch/mergesegs?
    -filter    filter out URL-s prohibited by current URLFilters
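For illustration, here is a rough Python sketch of how a rule list in the style of regex-urlfilter.txt is usually understood to be evaluated. It is a model under stated assumptions, not Nutch's actual code: it assumes each rule is a "+" or "-" sign followed by a regular expression, that rules are tried in order, that the first matching rule decides, and that a URL matching no rule is rejected. The names load_rules and accepts are hypothetical. As the option text above says, these filters are only applied during merging when -filter is given.

```python
import re

def load_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Return True if the URL is kept. Assumption: the first matching rule wins."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumption: no rule matched, so the URL is dropped

# Hypothetical rule file: exclude abc.com, accept everything else.
RULES = load_rules(r"""
# skip everything under abc.com
-^http://abc\.com/
# accept anything else
+.
""")

print(accepts(RULES, "http://abc.com/page.html"))
print(accepts(RULES, "http://example.org/"))
```

Under these assumptions the order of the rules matters: if the catch-all "+." line came before the "-" line, the exclusion would never be reached.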