Posted to dev@nutch.apache.org by William Surowiec <ws...@gmail.com> on 2006/07/17 04:40:13 UTC
Possible problem in WebAppModule
I checked out a copy of nutch this morning from svn. After adding two
jars referenced in an archived email, I still had a build problem. It
occurs in org.apache.nutch.webapp.common.WebAppModule. Guessing, I
changed lines 161/162 to:
String servletName = ((CharacterData)servlet).getData().trim();
String urlPattern = ((CharacterData)pattern).getData().trim();
Additionally, during nutch startup I am seeing a problem similar to what others have reported with hadoop:
Exception in thread "main" java.io.IOException: Input directory e:/apps/nutch/crawlTest/urls in local is invalid.
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)
I have made the change to Injector suggested in a posting:
// mergeJob.addInputPath(tempDir);
mergeJob.setInputPath(tempDir);
It appears not to help.
I have not been able to get past this point. Any suggestions are welcome.
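One thing that may be worth ruling out for the "Input directory ... is invalid" message is whether the path exists as a directory containing at least one file when seen from the JVM's working drive. The following is only a minimal diagnostic sketch in plain Java; the class InputDirCheck and its describe method are hypothetical and not part of Nutch or Hadoop, and it is an assumption that the exception is related to the path not resolving as expected.

```java
import java.io.File;

public class InputDirCheck {

    // Hypothetical helper (not a Nutch or Hadoop API): reports how the
    // local filesystem sees the given path.
    static String describe(File dir) {
        if (!dir.exists()) {
            return "missing";
        }
        if (!dir.isDirectory()) {
            return "not a directory";
        }
        String[] entries = dir.list();
        if (entries == null || entries.length == 0) {
            return "empty";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // The default is the path from the stack trace; pass another path
        // as the first argument to check it instead.
        String path = args.length > 0 ? args[0] : "e:/apps/nutch/crawlTest/urls";
        File dir = new File(path);
        System.out.println(dir.getAbsolutePath() + ": " + describe(dir));
    }
}
```

Printing the absolute path shows how a relative or drive-less path is actually resolved, which can differ from what was typed on the command line.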
And since this is my first post, I would be ungrateful if I did not express my thanks and appreciation for this awesome body of work.
Bill
Re: Possible problem in WebAppModule
Posted by William Surowiec <ws...@gmail.com>.
Sami Siren wrote:
> William Surowiec wrote:
>
>> I checked out a copy of nutch this morning from svn. After adding two
>> jars referenced in an archived email, I still had a build problem. It
>> occurs in org.apache.nutch.webapp.common.WebAppModule. Guessing, I
>> changed lines 161/162 to:
>>
>> String servletName = ((CharacterData)servlet).getData().trim();
>> String urlPattern = ((CharacterData)pattern).getData().trim();
>>
>>
>>
> What is the compilation error you are seeing and in what environment
> (os, jvm)?
>
> --
> Sami Siren
>
After commenting out my revision and switching back to the original
code, I receive the following compilation errors within eclipse (sorry
for the lack of alignment):
Severity and Description: The method getTextContent() is undefined for the type Element
Path: nutch dist/contrib/web2/src/main/java/org/apache/nutch/webapp/common
Resource: WebAppModule.java
Location: line 163
Creation Time: 1153141616750
Id: 30768

Severity and Description: The method getTextContent() is undefined for the type Element
Path: nutch dist/contrib/web2/src/main/java/org/apache/nutch/webapp/common
Resource: WebAppModule.java
Location: line 164
Creation Time: 1153141616750
Id: 30769
I have crawled a website with nutch and built an index (neat stuff) but
have not been able to test my change, other than confirming that it
compiles.
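For reference, getTextContent() was added to org.w3c.dom.Node in DOM Level 3, so the error suggests that the DOM classes on the build path predate it (an older JDK or an older XML API jar; this is an assumption, since the environment is not stated here). The cast to CharacterData compiles because both types are interfaces, but an Element is not normally a CharacterData, so it may fail with a ClassCastException at runtime. A minimal sketch of an alternative that uses only DOM Level 2 calls follows; the class DomTextSketch, its textOf and parse helpers, and the sample XML are hypothetical and not taken from WebAppModule.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomTextSketch {

    // Hypothetical helper (not part of Nutch): gathers the text beneath a
    // node using only DOM Level 2 calls, so getTextContent() is not needed.
    static String textOf(Node node) {
        StringBuffer sb = new StringBuffer();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                sb.append(textOf(c)); // recurse into nested elements
            }
            // comments and processing instructions are skipped
        }
        return sb.toString();
    }

    // Hypothetical convenience method: parses a string and returns its root.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Sample input made up for illustration only.
        Element mapping = parse("<servlet-mapping>"
                + "<servlet-name> search </servlet-name>"
                + "<url-pattern>/search.jsp</url-pattern>"
                + "</servlet-mapping>");
        Node servlet = mapping.getElementsByTagName("servlet-name").item(0);
        Node pattern = mapping.getElementsByTagName("url-pattern").item(0);
        String servletName = textOf(servlet).trim();
        String urlPattern = textOf(pattern).trim();
        System.out.println(servletName + " -> " + urlPattern);
    }
}
```

With a helper like this, the two lines in question could read textOf(servlet).trim() and textOf(pattern).trim() without requiring a DOM Level 3 implementation.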
Bill
Re: Possible problem in WebAppModule
Posted by Sami Siren <ss...@gmail.com>.
William Surowiec wrote:
>I checked out a copy of nutch this morning from svn. After adding two
>jars referenced in an archived email, I still had a build problem. It
>occurs in org.apache.nutch.webapp.common.WebAppModule. Guessing, I
>changed lines 161/162 to:
>
> String servletName = ((CharacterData)servlet).getData().trim();
> String urlPattern = ((CharacterData)pattern).getData().trim();
>
>
>
What is the compilation error you are seeing and in what environment
(os, jvm)?
--
Sami Siren