Posted to user@nutch.apache.org by Wolfgang Woerndl <wo...@informatik.tu-muenchen.de> on 2007/10/04 15:42:12 UTC
NullPointerException when trying to init NutchBean
Hello,
I installed Nutch 0.8.1, crawled some Web pages, and get (meaningful) results
when calling
bin/nutch org.apache.nutch.searcher.NutchBean test
from the Nutch directory.
Now, I'm trying to integrate searching in a larger Servlet application:
public String fulltextSearch(String querystring, int maxhits)
{
    Configuration conf = NutchConfiguration.create();
    NutchBean bean = new NutchBean(conf, new Path("/path/to/nutch/"));
    Query query = Query.parse(querystring, conf);
    ...
It compiles but I get a NullPointerException at runtime:
----------
[exec] java.lang.NullPointerException
[exec]     at java.io.Reader.<init>(Reader.java:61)
[exec]     at java.io.BufferedReader.<init>(BufferedReader.java:76)
[exec]     at java.io.BufferedReader.<init>(BufferedReader.java:91)
[exec]     at org.apache.nutch.analysis.CommonGrams.init(CommonGrams.java:151)
[exec]     at org.apache.nutch.analysis.CommonGrams.<init>(CommonGrams.java:51)
[exec]     at org.apache.nutch.searcher.FieldQueryFilter.setConf(FieldQueryFilter.java:107)
[exec]     at org.apache.nutch.searcher.url.URLQueryFilter.setConf(URLQueryFilter.java:33)
[exec]     at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:153)
[exec]     at org.apache.nutch.searcher.QueryFilters.<init>(QueryFilters.java:75)
[exec]     at org.apache.nutch.searcher.IndexSearcher.init(IndexSearcher.java:78)
[exec]     at org.apache.nutch.searcher.IndexSearcher.<init>(IndexSearcher.java:62)
[exec]     at org.apache.nutch.searcher.NutchBean.init(NutchBean.java:139)
[exec]     at org.apache.nutch.searcher.NutchBean.<init>(NutchBean.java:105)
----------
conf.toString() gives:
"Configuration: defaults: hadoop-default.xml , nutch-default.xmlfinal:
hadoop-site.xml , nutch-site.xml"
I assume that Nutch can't find its configuration files. That's why I'm trying to
tell it the correct path when calling new NutchBean(...).
Does anybody know what my problem might be, or even a possible solution? :)
Thanks in advance.
Wolfgang
Re: NullPointerException when trying to init NutchBean
Posted by Dennis Kubes <ku...@apache.org>.
My guess, from the stack trace in your message, is that you didn't copy
common-terms.utf8 or other needed files from the Nutch conf directory
into the classpath of your web application.
Dennis Kubes
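One way to check this guess from inside the web application is to ask the class loader directly whether it can see the files. The following is only a rough sketch: the class NutchClasspathCheck and its methods are hypothetical helpers, not part of Nutch, and the file list is taken from the names mentioned in this thread. It also shows that wrapping a null Reader in a BufferedReader fails in Reader.<init>, which matches the top of the stack trace and is what one would expect if the lookup of common-terms.utf8 came back empty.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.net.URL;

// Hypothetical diagnostic helper; not part of Nutch.
class NutchClasspathCheck {

    // Returns true if the named resource is visible to the given class loader.
    static boolean isVisible(ClassLoader loader, String resource) {
        URL url = loader.getResource(resource);
        return url != null;
    }

    // Wrapping a null Reader throws a NullPointerException in Reader.<init>,
    // the same frames that appear at the top of the reported stack trace.
    static boolean nullReaderThrowsNpe() {
        try {
            new BufferedReader((Reader) null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // In a servlet container the context class loader is usually the webapp's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = NutchClasspathCheck.class.getClassLoader();
        }
        // File names as mentioned in this thread.
        String[] needed = {"nutch-default.xml", "nutch-site.xml", "common-terms.utf8"};
        for (String name : needed) {
            System.out.println(name + " -> " + (isVisible(loader, name) ? "found" : "MISSING"));
        }
    }
}
```

Calling isVisible from the servlet itself, rather than from a standalone main, is what tells you about the web application's classpath.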
Re: NullPointerException when trying to init NutchBean
Posted by Wolfgang Woerndl <wo...@informatik.tu-muenchen.de>.
Thanks, now it works. Some feedback for everybody:
- Including the Nutch conf directory in the classpath solved the NPE.
- I really do need to set the path to the index dir in the NutchBean constructor;
otherwise I get 0 hits (despite having a searcher.dir property with the path in
nutch-site.xml).
Wolfgang
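To illustrate the first point, here is a minimal sketch of what "a directory on the classpath" means for resource lookup. It uses only the JDK; the class ConfDirOnClasspath and the temporary directory are made up for the demonstration and stand in for the real Nutch conf directory. In a servlet container the more usual route is to copy the files into WEB-INF/classes, which the webapp class loader already searches.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

// Hypothetical demonstration class; not part of Nutch.
class ConfDirOnClasspath {

    // Builds a class loader whose search path includes the given directory.
    static URLClassLoader loaderWithDir(File confDir) {
        try {
            // For an existing directory, toURI() yields a URL ending in "/",
            // which URLClassLoader treats as a directory of resources.
            URL url = confDir.toURI().toURL();
            return new URLClassLoader(new URL[] {url},
                                      ConfDirOnClasspath.class.getClassLoader());
        } catch (IOException e) { // MalformedURLException is an IOException
            throw new UncheckedIOException(e);
        }
    }

    // Creates a stand-in conf directory with an empty common-terms.utf8 and
    // reports whether the file can then be found by name through the loader.
    static boolean demo() {
        try {
            File confDir = Files.createTempDirectory("nutch-conf").toFile();
            new File(confDir, "common-terms.utf8").createNewFile();
            try (URLClassLoader loader = loaderWithDir(confDir)) {
                return loader.getResource("common-terms.utf8") != null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("common-terms.utf8 visible: " + demo());
    }
}
```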
Sagar Naik wrote:
> Hey,
> I would like to mention two points:
> - The Nutch config files should be in the classpath.
> - The second argument of the NutchBean constructor is the path to the index dir.
>
> I guess this should solve the NPE.
>
--
Dr. Wolfgang Woerndl, Institut fuer Informatik, TU Muenchen
Boltzmannstr. 3, 85748 Garching. Buero: Raum 01.05.043
Tel: +49 89 289-18686, Fax: -18657
http://www.in.tum.de/~woerndl/, Email: woerndl@in.tum.de
Re: NullPointerException when trying to init NutchBean
Posted by Sagar Naik <sa...@visvo.com>.
Hey,
I would like to mention two points:
- The Nutch config files should be in the classpath.
- The second argument of the NutchBean constructor is the path to the index dir.
I guess this should solve the NPE.