Posted to dev@nutch.apache.org by an...@orbita1.ru on 2006/09/06 10:31:38 UTC

indexing problem

I've got the latest versions of Nutch (0.9-dev) and Hadoop (trunk) from svn.
When I try to index, I get the following error:

java.lang.ClassCastException: org.apache.nutch.parse.ParseData
     at org.apache.nutch.indexer.Indexer$InputFormat$1.next(Indexer.java:92)
     at org.apache.hadoop.mapred.MapTask$3.next(MapTask.java:184)
     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:44)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)

 
This exception is raised from the method next(Writable key, Writable value) of
class SequenceFileRecordReader.

The method 'next' is called with a 'value' parameter whose class differs from
call to call (the classes are CrawlDatum, ParseData or Inlinks).

When these values (CrawlDatum, ParseData or Inlinks) are cast, I get a
ClassCastException.
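
To illustrate the failure mode described above, here is a minimal sketch in
plain Java. It does not use the real Nutch or Hadoop classes: the nested
classes CrawlDatum, ParseData and Inlinks are empty, hypothetical stand-ins,
and MixedValueDemo, countWithFixedCast and describe are names made up for this
example. It only shows how a cast to one fixed class fails once values of
several classes arrive in the same stream, and how checking the runtime class
first avoids that. It is not a claim about what Indexer.java line 92 does.

```java
import java.util.Arrays;
import java.util.List;

public class MixedValueDemo {
    // Hypothetical, empty stand-ins for the three value classes named above.
    static class CrawlDatum {}
    static class ParseData {}
    static class Inlinks {}

    /** Mimics code that assumes every value it receives has one fixed class. */
    static int countWithFixedCast(List<Object> values) {
        int count = 0;
        for (Object v : values) {
            // Throws ClassCastException as soon as v is a ParseData or Inlinks.
            CrawlDatum datum = (CrawlDatum) v;
            count++;
        }
        return count;
    }

    /** Inspects the runtime class before using the value, so no cast can fail. */
    static String describe(Object v) {
        if (v instanceof CrawlDatum) {
            return "CrawlDatum";
        } else if (v instanceof ParseData) {
            return "ParseData";
        } else if (v instanceof Inlinks) {
            return "Inlinks";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        List<Object> values =
            Arrays.asList(new CrawlDatum(), new ParseData(), new Inlinks());

        try {
            countWithFixedCast(values);
        } catch (ClassCastException e) {
            System.out.println("fixed cast failed on a mixed stream");
        }

        for (Object v : values) {
            System.out.println(describe(v));
        }
    }
}
```

In this sketch the fixed cast succeeds for the first value and fails on the
second, which matches the symptom of an exception that names ParseData.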

Why do I get this exception? I looked at the older sources but didn't find any
differences in the algorithm. What am I missing?



RE: indexing problem

Posted by an...@orbita1.ru.
>>Nutch is not compatible with latest hadoop from svn.

Nutch works correctly with the latest Hadoop from svn after some small tuning ;-)



Re: indexing problem

Posted by Sami Siren <ss...@gmail.com>.
anton@orbita1.ru wrote:
> I've got latest versions of nutch (0.9-dev) and hadoop (Trunk) from svn.
> When I try to index I get the next error:
> 
> java.lang.ClassCastException: org.apache.nutch.parse.ParseData
>      at org.apache.nutch.indexer.Indexer$InputFormat$1.next(Indexer.java:92)
>      at org.apache.hadoop.mapred.MapTask$3.next(MapTask.java:184)
>      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:44)
>      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
>      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
> 
>  
> This exception is raised from method next(Writable key, Writable value) of
> class SequenceFileRecordReader. 
> 
> Method 'next' is called with 'value' parameter that have different class for
> each its call (classes are crawlDatum, ParseData or Inlinks). 
> 
> And when these classes (crawlDatum, ParseData or Inlinks) are cast I get
> classCastException.
> 
> Why do I get this exception? I looked at old sources but didn't find
> distinctions in algorithm. What do I miss?
> 
> 
Nutch is not compatible with the latest Hadoop from svn.

--
  Sami Siren