Posted to agent@nutch.apache.org by Mahesh Raman <ra...@gmail.com> on 2005/05/03 18:07:25 UTC
Ability to search while crawling
I am currently using the following command to crawl my intranet:
bin/nutch crawl urls -dir crawl.test -depth 5 >& crawl.log
I also modified my nutch-default.xml at
/opt/tomcat/webapps/ROOT/WEB-INF/classes/nutch-default.xml
to point to the crawl.test directory. Here is the property:
<property>
<name>searcher.dir</name>
<value>/home/mahesh/nutch-0.6/crawl.test</value>
<description>
</description>
</property>
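A minimal sketch of how one might check what currently exists under the directory that searcher.dir points to. The function name check_crawl_dir is hypothetical, and the subdirectory names "index" and "segments" are assumptions about the crawl directory layout; neither is confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: report which index-related subdirectories exist
# under a crawl directory. The names "index" and "segments" are assumed
# for illustration and may not match the actual layout.
check_crawl_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    for sub in index segments; do
        if [ -d "$dir/$sub" ]; then
            echo "found: $sub"
        else
            echo "absent: $sub"
        fi
    done
}

# Usage: check_crawl_dir /home/mahesh/nutch-0.6/crawl.test
```

If the searcher expects an index that is only written at the end of the crawl, an "absent: index" line here would be consistent with the empty results, though that is a guess rather than a diagnosis.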
The crawl is running without errors and has fetched around 250,000
web pages so far. It is not complete and is still running. However,
when I search from localhost:8080, I get no results (Hits
0-0). I have two questions:
1. Is there a way to view the search results before the
intranet crawl is complete?
2. How do I stop the crawl partway through, and if I did stop it, would I
still have access to the 250,000 crawled pages?
Thanks.