Posted to user@nutch.apache.org by Manoharam Reddy <ma...@gmail.com> on 2007/05/31 12:41:50 UTC

Any URL filter available for search.jsp?

I want to use two filters: one for crawling and another for searching
through search.jsp.

I am currently using regex-urlfilter.txt for the generate, fetch, update
cycle. But when a user searches, I do not want them to see certain
crawled sites in the results.

How can this be achieved?

Re[2]: Any URL filter available for search.jsp?

Posted by Scam <sc...@inbox.ru>.
Hello Andrzej,

Friday, June 15, 2007, 1:25, you wrote:

>> Thursday, May 31, 2007, 14:41, you wrote:
>> 
>> MR> I want to use two filters: one for crawling and another for searching
>> MR> through search.jsp.
>> 
>> MR> I am currently using regex-urlfilter.txt for the generate, fetch, update
>> MR> cycle. But when a user searches, I do not want them to see certain
>> MR> crawled sites in the results.
>> 
>> MR> How can this be achieved?
>> 
>> Has anyone solved this problem? I need this filter too. What is the
>> best way to do it in Nutch 0.9?
>> Any thoughts?
>> 

AB> http://issues.apache.org/jira/browse/NUTCH-477

I applied the patch to trunk and built Nutch, but search results are still not filtered.

For example, I added the line "-amazon" to the
WEB-INF/classes/regex-urlfilter.txt
file, but nothing changed. Results are not filtered (search results
still contain links with the word "amazon").
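
To illustrate the behavior expected here, the following is a minimal
standalone Java sketch of applying regex-urlfilter.txt-style rules to
result URLs at search time. It is not the NUTCH-477 patch and does not use
any Nutch API; the class SearchUrlFilterSketch, its methods, and the sample
URLs are hypothetical. It assumes that each rule is a '+' or '-' sign
followed by a regular expression, that the first matching rule decides, and
that a URL matching no rule is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: applies +/- regex rules to hit URLs.
class SearchUrlFilterSketch {

    // One rule: accept ('+') or reject ('-') URLs in which the regex is found.
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    // Parses rule lines; blank lines and lines starting with '#' are skipped.
    static List<Rule> parseRules(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + trimmed);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
        return rules;
    }

    // Assumption: the first rule whose regex is found in the URL decides,
    // and a URL that matches no rule is dropped.
    static boolean accepts(List<Rule> rules, String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    // Keeps only the hit URLs that the rules accept, preserving their order.
    static List<String> filterHits(List<Rule> rules, List<String> hitUrls) {
        List<String> kept = new ArrayList<>();
        for (String url : hitUrls) {
            if (accepts(rules, url)) {
                kept.add(url);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // "-amazon" hides matching URLs; "+." then accepts everything else.
        List<Rule> rules = parseRules(Arrays.asList("# hide amazon", "-amazon", "+."));
        List<String> hits = Arrays.asList(
                "http://www.amazon.com/books",
                "http://lucene.apache.org/nutch/");
        System.out.println(filterHits(rules, hits));
    }
}
```

Under these assumptions, a rule list containing only "-amazon" would hide
every hit, which is why the sketch ends the list with a catch-all "+." rule.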

Could you give me instructions on what to do to make it work?


-- 
Best regards,
 Scam                            mailto:scam@inbox.ru


Re: Any URL filter available for search.jsp?

Posted by Andrzej Bialecki <ab...@getopt.org>.
Scam wrote:
> Hello Manoharam,
> 
> Thursday, May 31, 2007, 14:41, you wrote:
> 
> MR> I want to use two filters: one for crawling and another for searching
> MR> through search.jsp.
> 
> MR> I am currently using regex-urlfilter.txt for the generate, fetch, update
> MR> cycle. But when a user searches, I do not want them to see certain
> MR> crawled sites in the results.
> 
> MR> How can this be achieved?
> 
> Has anyone solved this problem? I need this filter too. What is the
> best way to do it in Nutch 0.9?
> Any thoughts?
> 

http://issues.apache.org/jira/browse/NUTCH-477


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Any URL filter available for search.jsp?

Posted by Scam <sc...@inbox.ru>.
Hello Manoharam,

Thursday, May 31, 2007, 14:41, you wrote:

MR> I want to use two filters: one for crawling and another for searching
MR> through search.jsp.

MR> I am currently using regex-urlfilter.txt for the generate, fetch, update
MR> cycle. But when a user searches, I do not want them to see certain
MR> crawled sites in the results.

MR> How can this be achieved?

Has anyone solved this problem? I need this filter too. What is the
best way to do it in Nutch 0.9?
Any thoughts?

-- 
Best regards,
 Scam                            mailto:scam@inbox.ru