Posted to user@nutch.apache.org by Manoharam Reddy <ma...@gmail.com> on 2007/05/31 12:41:50 UTC
Any URL filter available for search.jsp?
I want to use two filters: one for crawling and another for searching
through search.jsp.
I am currently using regex-urlfilter.txt for the generate, fetch, update
cycle. But when a user searches the sites, I don't want him to see
certain crawled sites in the results.
How can this be achieved?
Re[2]: Any URL filter available for search.jsp?
Posted by Scam <sc...@inbox.ru>.
Hello Andrzej,
Friday, June 15, 2007, 1:25, you wrote:
>> Thursday, May 31, 2007, 14:41, you wrote:
>>
>> MR> I want to use two filters one for crawling and another for searching
>> MR> through search.jsp.
>>
>> MR> I am currently using regex-urlfilter.txt for generate, fetch, update
>> MR> cycle. But when a user searches the sites, I want him not to see
>> MR> certain sites in the results that have been crawled.
>>
>> MR> How can this be achieved?
>>
>> Anyone solve this problem? I need this filter too. How to do it in the
>> best way in nutch 0.9?
>> Any thoughts?
>>
AB> http://issues.apache.org/jira/browse/NUTCH-477
I applied the patch to trunk and built Nutch, but the search results are still not filtered.
As a test, I placed the string "-amazon" in the
WEB-INF/classes/regex-urlfilter.txt
file, but nothing changed. The results are not filtered (they still
contain links with the word "amazon").
Could you give me instructions on what to do to make it work?
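To make the expected behaviour concrete, here is a rough sketch of what a "-amazon" rule would do to a list of result URLs. It assumes the search-side filter reads the same "+regex" / "-regex" line format as the crawl filter, that the first matching rule decides, and that a URL matching no rule is dropped. The class SearchResultUrlFilter and its methods are made up for illustration; this is not the NUTCH-477 patch and not a Nutch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Nutch): applies +/- regex rules to result URLs. */
final class SearchResultUrlFilter {

    /** One rule: accept ('+') or reject ('-') plus a regular expression. */
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Parses "+regex" / "-regex" lines; blank lines and '#' comments are skipped. */
    SearchResultUrlFilter(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(line.substring(1))));
        }
    }

    /** Assumed semantics: the first rule found anywhere in the URL decides; no match means reject. */
    boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    /** Returns only the URLs that pass the rules, in their original order. */
    List<String> filter(List<String> urls) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            if (accepts(url)) {
                kept.add(url);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        SearchResultUrlFilter filter = new SearchResultUrlFilter(Arrays.asList(
                "# hide amazon links from search results",
                "-amazon",
                "+."));
        List<String> hits = Arrays.asList(
                "http://www.amazon.com/books",
                "http://lucene.apache.org/nutch/");
        System.out.println(filter.filter(hits)); // prints [http://lucene.apache.org/nutch/]
    }
}
```

Under these assumed semantics, a rule file containing only "-amazon" with no trailing "+." line would reject every URL, so the catch-all accept rule at the end matters.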
--
Best regards,
Scam mailto:scam@inbox.ru
Re: Any URL filter available for search.jsp?
Posted by Andrzej Bialecki <ab...@getopt.org>.
Scam wrote:
> Hello Manoharam,
>
> Thursday, May 31, 2007, 14:41, you wrote:
>
> MR> I want to use two filters one for crawling and another for searching
> MR> through search.jsp.
>
> MR> I am currently using regex-urlfilter.txt for generate, fetch, update
> MR> cycle. But when a user searches the sites, I want him not to see
> MR> certain sites in the results that have been crawled.
>
> MR> How can this be achieved?
>
> Anyone solve this problem? I need this filter too. How to do it in the
> best way in nutch 0.9?
> Any thoughts?
>
http://issues.apache.org/jira/browse/NUTCH-477
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: Any URL filter available for search.jsp?
Posted by Scam <sc...@inbox.ru>.
Hello Manoharam,
Thursday, May 31, 2007, 14:41, you wrote:
MR> I want to use two filters one for crawling and another for searching
MR> through search.jsp.
MR> I am currently using regex-urlfilter.txt for generate, fetch, update
MR> cycle. But when a user searches the sites, I want him not to see
MR> certain sites in the results that have been crawled.
MR> How can this be achieved?
Has anyone solved this problem? I need this filter too. What is the
best way to do it in Nutch 0.9?
Any thoughts?
--
Best regards,
Scam mailto:scam@inbox.ru