Posted to agent@nutch.apache.org by Mahesh Raman <ra...@gmail.com> on 2005/05/03 18:07:25 UTC

Ability to search while crawling

I am currently using the following command to crawl my intranet:

bin/nutch crawl urls -dir crawl.test -depth 5 >& crawl.log

I also modified my nutch-default.xml file
(/opt/tomcat/webapps/ROOT/WEB-INF/classes/nutch-default.xml)
to point to the crawl.test directory. Here is the property I changed:

<property> 
  <name>searcher.dir</name>
  <value>/home/mahesh/nutch-0.6/crawl.test</value>
  <description>
  </description>
</property>

The crawl is working and has crawled around 250,000 web pages so
far. It is not complete and is still running. However, when I do a
search from localhost:8080, I do not get any results (Hits 0-0). I
have two questions:

1. Is there a way to view search results before the
intranet crawl is complete?
2. How do I stop the crawl partway through, and if I do stop it, will I
still have access to the 250,000 crawled pages?

Thanks.