Posted to user@nutch.apache.org by Zaheed Haque <za...@gmail.com> on 2006/04/07 09:33:41 UTC

is this a bug?

Hi:

SVN version -
When using the following class I get results.
root@tokyo:/usr/local/src/nutch-0.8-dev# bin/nutch org.apache.nutch.searcher.NutchBean cnn
060407 092041 10 parsing jar:file:/usr/local/src/nutch-0.8-dev/lib/hadoop-0.1.0.jar!/hadoop-default.xml
060407 092042 10 parsing file:/usr/local/src/nutch-0.8-dev/conf/nutch-default.xml
060407 092043 10 parsing file:/usr/local/src/nutch-0.8-dev/conf/nutch-site.xml
060407 092043 10 parsing file:/usr/local/src/nutch-0.8-dev/conf/hadoop-site.xml
060407 092043 11 Client connection to 127.0.0.1:50000: starting
060407 092043 10 opening merged index in /user/root/index
060407 092045 10 Plugins: looking in: /usr/local/src/nutch-0.8-dev/plugins

So it opens the merged "index", and it works!

But via the JSP pages it tries to open "indexes" instead. No, there are no
searcher.dir property errors. Why? From catalina.out:

INFO: Server startup in 10021 ms
060407 092846 parsing jar:file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/lib/hadoop-0.1.0.jar!/hadoop-default.xml
060407 092847 parsing file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/nutch-default.xml
060407 092847 parsing file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/nutch-site.xml
060407 092847 parsing file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/hadoop-site.xml
060407 092847 Plugins: looking in: /usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/plugins
060407 092849 Plugin Auto-activation mode: [true]
060407 092849 Registered Plugins:
060407 092849   CyberNeko HTML Parser (lib-nekohtml)
060407 092849   Site Query Filter (query-site)
060407 092849   Html Parse Plug-in (parse-html)
060407 092849   Regex URL Filter Framework (lib-regex-filter)
060407 092849   Basic Indexing Filter (index-basic)
060407 092849   Text Parse Plug-in (parse-text)
060407 092849   JavaScript Parser (parse-js)
060407 092849   Basic Query Filter (query-basic)
060407 092849   Regex URL Filter (urlfilter-regex)
060407 092849   HTTP Framework (lib-http)
060407 092849   URL Query Filter (query-url)
060407 092849   Http Protocol Plug-in (protocol-http)
060407 092849   the nutch core extension points (nutch-extensionpoints)
060407 092849 Registered Extension-Points:
060407 092849   Nutch Protocol (org.apache.nutch.protocol.Protocol)
060407 092849   Nutch URL Filter (org.apache.nutch.net.URLFilter)
060407 092849   HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
060407 092849   Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
060407 092849   Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
060407 092849   Nutch Content Parser (org.apache.nutch.parse.Parser)
060407 092849   Ontology Model Loader (org.apache.nutch.ontology.Ontology)
060407 092850   Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
060407 092850   Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
060407 092850 11 creating new bean
060407 092850 11 opening indexes in /user/root/indexes
060407 092850 11 opening segments in /user/root/segments
060407 092851 11 found resource common-terms.utf8 at file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/common-terms.utf8
060407 092851 11 opening linkdb in /user/root/linkdb

Is something wrong with my config somewhere? I can't seem to find it.

Cheers
Zaheed

Re: is this a bug?

Posted by Zaheed Haque <za...@gmail.com>.
Hmm, interesting: running locally without Hadoop gives the following
in catalina.out:

060407 101908 11 creating new bean
060407 101908 11 opening merged index in /usr/local/src/nutch-0.8-dev/bin/seed/index
060407 101909 11 opening segments in /usr/local/src/nutch-0.8-dev/bin/seed/segments
060407 101909 11 found resource common-terms.utf8 at file:/usr/local/java/jakarta-tomcat-5.5.9/webapps/ROOT/WEB-INF/classes/common-terms.utf8


Which works and is correct. Any ideas?

Cheers
