Posted to user@nutch.apache.org by Bartek <ba...@o2.pl> on 2009/02/17 19:39:03 UTC

Trying to understand how webapp works

Hello,

I am trying to figure out how the webapp part works.

I've installed Nutch and crawled a site. Then I deployed the .war file and,
in {tomcat.dir}/nutch/WEB-INF/classes/nutch-site.xml, set searcher.dir to
the correct path, in my case /usr/local/nutch/crawls/site1

Everything works fine, but...

When I removed the whole crawls directory (/usr/local/nutch/crawls), the web
application kept working. Searching still works (but the cache does not;
I assume it can't find the segments).

So could someone explain to me why it is still working?

Any hints?

P.S. bin/nutch org.apache.nutch.searcher.NutchBean phrase does not
work (as expected)

Regards,
Bartosz Gadzimski

Re: Trying to understand how webapp works

Posted by Bartek <ba...@o2.pl>.

Sami Siren wrote:
> Bartek wrote:
>> Hello,
>>
>> I am trying to figure out how webapp part is working.
>>
>> I've installed nutch and crawled some site. Then deployed .war file 
>> and in file {tomcat.dir}/nutch/WEB-INF/classes/nutch-site.xml
>> I've put correct searcher.dir, in my case /usr/local/nutch/crawls/site1
>>
>> Everything is working fine but...
>>
>> When I removed whole crawls dir (/usr/local/nutch/crawls) web 
>> application is still working fine. Searching is working (but not 
>> cache - I assume that it can't find segments)
>>
>> So could someone explain to me why it is still working?
> You didn't restart tomcat after killing the directory did you? It 
> might be working because the webapp still has references to all files 
> it needs. Restart tomcat and it should work no more.
>
> -- 
> Sami Siren
>
>
Thanks, that explains it a bit. It's still strange, though, because the
crawl directory was more than 1 GB, so all references should be gone.

Regards,
Bartosz Gadzimski

Re: Trying to understand how webapp works

Posted by Sami Siren <ss...@gmail.com>.

Bartek wrote:
> Hello,
>
> I am trying to figure out how webapp part is working.
>
> I've installed nutch and crawled some site. Then deployed .war file 
> and in file {tomcat.dir}/nutch/WEB-INF/classes/nutch-site.xml
> I've put correct searcher.dir, in my case /usr/local/nutch/crawls/site1
>
> Everything is working fine but...
>
> When I removed whole crawls dir (/usr/local/nutch/crawls) web 
> application is still working fine. Searching is working (but not cache 
> - I assume that it can't find segments)
>
> So could someone explain to me why it is still working?
You didn't restart Tomcat after deleting the directory, did you? It might
be working because the webapp still holds references to all the files it
needs. Restart Tomcat and it should stop working.
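
To illustrate the kind of behaviour described above, here is a minimal sketch in plain Java. It is not Nutch code; the class name DeletedFileDemo and the method readAfterDelete are made up for this example. It assumes a POSIX-style filesystem (e.g. Linux) and Java 9 or later. It opens a file, deletes it from disk, and then reads it through the handle that is still open, which is presumably similar to what happens when a running webapp has index files open:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeletedFileDemo {

    /**
     * Writes content to a temp file, opens it, deletes it from disk,
     * and then reads it back through the stream that is still open.
     */
    static String readAfterDelete(String content) throws IOException {
        Path file = Files.createTempFile("index-demo", ".bin");
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));

        try (InputStream in = Files.newInputStream(file)) {
            // Remove the directory entry. On POSIX systems the data stays
            // on disk until the last open handle is closed.
            Files.delete(file);

            // The path is gone, but the open stream can still be read.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterDelete("still readable"));
    }
}
```

On such a system the size of the deleted data does not matter: the space is only released once the process closes the files or exits, which would be consistent with a restart making the search stop working.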

--
 Sami Siren