Posted to user@nutch.apache.org by Michael Levy <Lu...@gmail.com> on 2006/04/14 15:40:29 UTC
NullPointerException due to nonexistent (mis-pointed) segments directory
Just in case this is helpful to anyone else just getting started with Nutch:
When I tried to run a Nutch search I got an HTTP Status 500 error
message in my browser. The server log file entry indicating
NullPointerException (copied below) was not particularly helpful to me
in understanding what happened. Fortunately I happened to notice the
innocuous looking line below in the catalina log file:
INFO: opening segment indexes in /export/home/www/virtual/wiki/doc_root/nutch-0.7.2/segments
...and I knew that when I had crawled my site it had not created a
directory named 'segments'. Renaming the directory fixed this problem.
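A minimal sketch of checking for this up front, assuming a POSIX shell. The function name check_segments is made up for illustration and is not part of Nutch; the argument is whatever directory the "opening segment indexes in" log line reports.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): report whether the directory
# named in the "opening segment indexes in" log line actually exists.
check_segments() {
    dir=$1
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}
```

Calling it with the path from the log line above, for example check_segments /export/home/www/virtual/wiki/doc_root/nutch-0.7.2/segments, prints "missing: ..." and returns a non-zero status when the directory is not there.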
Below are the contents of the localhost.2006-04-14.log file. I
wasn't able to get anything meaningful out of it.
Apr 14, 2006 8:51:00 AM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
at org.apache.nutch.searcher.NutchBean.init(NutchBean.java:96)
at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:82)
at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:72)
at org.apache.nutch.searcher.NutchBean.get(NutchBean.java:64)
at org.apache.jsp.search_jsp._jspService(search_jsp.java:112)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
at java.lang.Thread.run(Thread.java:595)
Re: NullPointerException due to nonexistent (mis-pointed) segments directory
Posted by Michael Levy <Lu...@gmail.com>.
I hope someone can help me with this problem.
This works fine:
#bin/nutch crawl urls.txt
and it creates a directory named something like crawl-20060418105008,
with a working index.
However, if I add any parameters beyond the root_url_file
parameter, I get the output below. I'm really stumped. The following
does not create a directory named FOO, but it does create a directory
named something like crawl-20060418105500, so apparently it ignores the
-dir FOO parameter.
Help, anyone? This happens under Solaris. This works fine on my PC
using cygwin but I want to run this on Solaris. TIA!
## bin/nutch crawl urls.txt -dir FOO
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-default.xml
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/crawl-tool.xml
060418 105308 parsing file:/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/conf/nutch-site.xml
060418 105308 No FS indicated, using default:local
060418 105308 crawl started in: crawl-20060418105308
060418 105308 rootUrlFile = urls.txt -dir FOO
060418 105308 threads = 10
060418 105308 depth = 5
060418 105310 Created webdb at LocalFS,/export/home/www/virtual/wiki/doc_root/nutch-0.7.2/crawl-20060418105308/db
Exception in thread "main" java.io.FileNotFoundException: urls.txt -dir FOO (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at java.io.FileReader.<init>(FileReader.java:55)
at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:372)
at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
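Both the rootUrlFile line and the exception message show "urls.txt -dir FOO" as a single string, which suggests the three words reached Java as one argument instead of three. Why that would happen on Solaris is not established here. The sketch below, assuming a POSIX shell, only shows how to see what a command line actually delivers; count_args is a made-up helper, not part of Nutch.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print how many arguments were
# received, then each one in brackets, so word splitting is visible.
count_args() {
    echo "$# argument(s)"
    for arg in "$@"; do
        echo "[$arg]"
    done
}

count_args urls.txt -dir FOO      # three separate words
count_args "urls.txt -dir FOO"    # one quoted string, as in the log above
```

The first call reports 3 arguments and the second reports 1, with the whole string in a single pair of brackets. The second case matches the file name the exception complains about.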