Posted to user@nutch.apache.org by inet-fan <mi...@mail2.co.il> on 2008/06/23 00:44:39 UTC
No search results - Nutch 0.9 on FreeBSD
Hello!
I don't get any search results after crawling and need help, please.
I installed the Java Development Kit 1.5, tomcat-5.5, apache-ant-1.7 and
nutch-0.9 in a FreeBSD jail,
in the directories /usr/local/diablo-jdk1.5.0, /usr/local/tomcat5.5 and
/usr/local/nutch.
I ran:
setenv JAVA_HOME /usr/local/diablo-jdk1.5.0
sh bin/nutch inject crawl/crawldb urls
sh bin/nutch generate crawl/crawldb crawl/segments
sh bin/nutch fetch crawl/segments/20080622124704
sh bin/nutch updatedb crawl/crawldb crawl/segments/20080622124704
sh bin/nutch invertlinks crawl/linkdb -dir crawl/segments/20080622124704
sh bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/20080622124704
Then I ran:
sh bin/nutch org.apache.nutch.searcher.NutchBean wiki
Total hits: 0
I start the Tomcat server from the rc.conf file of the jail.
The nutch.war file is deployed as /usr/local/tomcat5.5/webapps/ROOT.war.
What should the value of searcher.dir be?
What else did I do wrong?
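On the searcher.dir question: as far as I know, NutchBean in Nutch 0.9 looks under searcher.dir for segments and linkdb plus either an index or an indexes directory, so the value would normally be the crawl directory itself. A minimal sketch that checks a directory for that layout (check_crawl_dir is a made-up helper name, and the expected layout is an assumption about 0.9, not something confirmed in this thread):

```sh
#!/bin/sh
# Hypothetical helper: report whether a directory looks like something
# searcher.dir can point at. Assumes the Nutch 0.9 layout: segments/ and
# linkdb/ plus either index/ (merged) or indexes/ (unmerged).
check_crawl_dir() {
    dir=$1
    status=0
    for sub in segments linkdb; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing: $dir/$sub"
            status=1
        fi
    done
    if [ ! -d "$dir/index" ] && [ ! -d "$dir/indexes" ]; then
        echo "missing: $dir/index or $dir/indexes"
        status=1
    fi
    return $status
}

# Example call with the path from the post; prints nothing if all is present.
# check_crawl_dir /usr/local/nutch/crawl
```

The function returns non-zero and names each missing piece, so it can be run before touching the Tomcat side at all.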
The config file
/jail/jvm/usr/local/tomcat5.5/webapps/search/WEB-INF/classes/nutch-site.xml
looks like this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>http.agent.name</name>
<value>example.com</value>
<description>HTTP 'User-Agent' request header. MUST NOT be empty -
please set this to a single word uniquely related to your organization.
NOTE: You should also check other related properties:
http.robots.agents
http.agent.description
http.agent.url
http.agent.email
http.agent.version
and set their values appropriately.
</description>
</property>
<property>
<name>http.agent.description</name>
<value>search agent</value>
<description>Further description of our bot- this text is used in
the User-Agent header. It appears in parenthesis after the agent name.
</description>
</property>
<property>
<name>http://search.example.com</name>
<value></value>
<description>A URL to advertise in the User-Agent header. This will
appear in parenthesis after the agent name. Custom dictates that this
should be a URL of a page explaining the purpose and behavior of this
crawler.
</description>
</property>
<property>
<name>info@example.com</name>
<value></value>
<description>An email address to advertise in the HTTP 'From' request
header and User-Agent header. A good practice is to mangle this
address (e.g. 'info at example dot com') to avoid spamming.
</description>
</property>
<property>
<name>searcher.dir</name>
<value>/usr/local/nutch/crawl</value>
</property>
</configuration>
thx!
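In the file above, two <name> elements hold a URL and an email address while their <value> elements are empty. Going by the descriptions, those look like the values meant for http.agent.url and http.agent.email. A small sketch that flags such names, assuming one <name> element per line as in the posted file (suspicious_names is a made-up helper, and the pattern is only a heuristic):

```sh
#!/bin/sh
# Hypothetical helper: print every <name> value that contains "://" or "@",
# which suggests a URL or email address sits where the property key belongs.
# Assumes each <name>...</name> element is on a single line.
suspicious_names() {
    sed -n 's:.*<name>\(.*\)</name>.*:\1:p' "$1" | grep -e '://' -e '@'
}

# Example call (path taken from the post):
# suspicious_names /jail/jvm/usr/local/tomcat5.5/webapps/search/WEB-INF/classes/nutch-site.xml
```

The function prints nothing and returns non-zero when no name matches, which is the outcome one would hope for.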
--
View this message in context: http://www.nabble.com/No-search-results---Nutch-0.9-on-FreeBSD-tp18060064p18060064.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: No search results - Nutch 0.9 on FreeBSD
Posted by inet-fan <mi...@mail2.co.il>.
After I restarted Tomcat with /usr/local/tomcat5.5/bin/catalina.sh start, it works.
thx!
Re: No search results - Nutch 0.9 on FreeBSD
Posted by inet-fan <mi...@mail2.co.il>.
Hello!
After I created a new nutch-site.xml file, searching works with:
bin/nutch org.apache.nutch.searcher.NutchBean txt
but not on the website, where I get:
Hits 0-0 (out of about 0 total matching pages):
What is wrong? The searcher.dir property now looks like this:
<property>
<name>searcher.dir</name>
<value>/usr/local/nutch/crawl</value>
</property>
Thanks!
miki
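One thing that may be worth checking when the command line finds hits and the website does not: the command-line NutchBean and the web application each read their own copy of nutch-site.xml, so the two searcher.dir values can differ, and a webapp generally picks up a changed file only after it is reloaded or Tomcat is restarted. A sketch that prints searcher.dir from a given file so the two copies can be compared (searcher_dir is a made-up helper; it assumes <name> and <value> sit on separate lines as in the snippet above):

```sh
#!/bin/sh
# Hypothetical helper: print the <value> that follows
# <name>searcher.dir</name> in a Nutch config file.
# Assumes <name> and <value> are on separate lines.
searcher_dir() {
    awk '
        /<name>searcher\.dir<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Example comparison. The first path is an assumption about where the
# command-line config lives; the second is the webapp copy inside the jail.
# searcher_dir /usr/local/nutch/conf/nutch-site.xml
# searcher_dir /usr/local/tomcat5.5/webapps/search/WEB-INF/classes/nutch-site.xml
```

If the two calls print different paths, or the webapp copy prints nothing, that would be a plausible reason for the 0 hits on the website.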