Posted to user@nutch.apache.org by Lauren Massa Lochridge <la...@ieee.org> on 2007/04/23 00:58:27 UTC
0.9 ClassCastException: org.apache.hadoop.io.Text
Hello,
Any opinions about a problem we have with 0.9 would be appreciated.
The problem is that hits are found via a command-line NutchBean
invocation (333 hits in this small test case), but the result set
comes back with zero hits because of the exception. Luke also accesses
these same indexes just fine.
I got the Hadoop patch that was referred to in the archives, because the
description seemed applicable; however, it appears to be the same version
of hadoop-core (12.2.2) that came with Nutch 0.9. Is that patch already
integrated into the most recent Nutch 0.9 release, or is it otherwise not
applicable? Can someone tell me what the problem is, given the exception
in the log below?
Thanks.
Lauren Massa-Lochridge
eXlr8, Inc.
$ bin/nutch org.apache.nutch.searcher.NutchBean news
Total hits: 333
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: org.apache.hadoop.io.Text
    at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:204)
    at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:344)
    at org.apache.nutch.searcher.NutchBean.main(NutchBean.java:395)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text
    at org.apache.hadoop.io.UTF8.compareTo(UTF8.java:123)
    at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:107)
    at org.apache.hadoop.io.MapFile$Reader.binarySearch(MapFile.java:369)
    at org.apache.hadoop.io.MapFile$Reader.seek(MapFile.java:338)
    at org.apache.hadoop.io.MapFile$Reader.get(MapFile.java:392)
    at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:86)
    at org.apache.nutch.searcher.FetchedSegments$Segment.getEntry(FetchedSegments.java:95)
    at org.apache.nutch.searcher.FetchedSegments$Segment.getParseText(FetchedSegments.java:86)
    at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:159)
    at org.apache.nutch.searcher.FetchedSegments$SummaryThread.run(FetchedSegments.java:177)
Re: 0.9 ClassCastException: org.apache.hadoop.io.Text
Posted by Lauren Massa Lochridge <la...@ieee.org>.
Ken,
Thanks very much, you were right. I'd never made this mistake before:
I copied the newly created (0.9) /crawl over the existing one, which
added the new segments to the existing 0.8.1 segments instead of
replacing /crawl entirely. I should have deleted all of the old 0.8.1
data first. Your response prompted me to look at that again, and sure
enough, that's what it was!
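For anyone hitting the same thing, here is a rough sketch of swapping in
a new crawl directory wholesale instead of copying it over the old one.
It is not from this thread: the class name CrawlDirReplacer and the
method replace are made up for illustration. It renames the old
directory to a backup rather than deleting it, so the old data can still
be recovered, and it assumes the backup path is on the same filesystem
as the live one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch.
class CrawlDirReplacer {

    /**
     * Moves the live crawl directory aside, then copies the new crawl into
     * its place, so no segments from the old crawl remain under the live path.
     */
    static void replace(Path newCrawl, Path liveCrawl, Path backup) throws IOException {
        if (Files.exists(liveCrawl)) {
            // Throws FileAlreadyExistsException if the backup path is already taken.
            Files.move(liveCrawl, backup);
        }
        // Files.walk visits each directory before its contents, so every
        // destination directory exists by the time its files are copied.
        try (Stream<Path> paths = Files.walk(newCrawl)) {
            paths.forEach(src -> {
                Path dst = liveCrawl.resolve(newCrawl.relativize(src).toString());
                try {
                    if (Files.isDirectory(src)) {
                        Files.createDirectories(dst);
                    } else {
                        Files.copy(src, dst);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Once the new crawl has been checked, the backup directory can be deleted
by hand.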
Thanks.
Lauren Massa-Lochridge
eXlr8, Inc.
Ken Krugler wrote:
> This looks similar to a problem I had when I was trying to use an
> older crawl (one generated by a version of Nutch in between 0.8.1 and
> 0.9) with the 0.9 distribution.
>
> E.g. if the page content was saved using an older version of Nutch,
> then when the summarizer tries to load the content, you can run into
> this exception.
>
> -- Ken
Re: 0.9 ClassCastException: org.apache.hadoop.io.Text
Posted by Ken Krugler <kk...@transpac.com>.
This looks similar to a problem I had when I was trying to use an
older crawl (one generated by a version of Nutch between 0.8.1 and
0.9) with the 0.9 distribution.
For example, if the page content was saved using an older version of
Nutch, then when the summarizer tries to load the content, you can run
into this exception.
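One way to check for this, as a rough sketch only: the trace shows
UTF8.compareTo being handed a Text key, which suggests the stored data
names org.apache.hadoop.io.UTF8 as its key class while the 0.9 code
looks it up with a Text key. The snippet below is not Nutch or Hadoop
code. It assumes the data file starts with the "SEQ" magic bytes and
lists the key class name before the value class name as plain bytes in
its header. The class name KeyClassSniffer, the method classify, and
the 256-byte window are illustrative choices.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic helper; it only inspects raw header bytes.
class KeyClassSniffer {
    static final String OLD_KEY = "org.apache.hadoop.io.UTF8";
    static final String NEW_KEY = "org.apache.hadoop.io.Text";

    /** Returns "UTF8", "Text", "unknown", or "not a SequenceFile". */
    static String classify(byte[] header) {
        if (header.length < 3 || header[0] != 'S' || header[1] != 'E' || header[2] != 'Q') {
            return "not a SequenceFile";
        }
        // ISO-8859-1 maps every byte to one char, so indexOf works on raw bytes.
        String s = new String(header, StandardCharsets.ISO_8859_1);
        int oldAt = s.indexOf(OLD_KEY);
        int newAt = s.indexOf(NEW_KEY);
        // Assumption: the key class name comes first, so the earlier match wins.
        if (oldAt >= 0 && (newAt < 0 || oldAt < newAt)) {
            return "UTF8";
        }
        if (newAt >= 0) {
            return "Text";
        }
        return "unknown";
    }

    /** Reads up to max bytes from the start of the file. */
    static byte[] readHeader(Path file, int max) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[max];
            int n = 0;
            int r;
            while (n < max && (r = in.read(buf, n, max - n)) > 0) {
                n += r;
            }
            return Arrays.copyOf(buf, n);
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + ": " + classify(readHeader(Paths.get(arg), 256)));
        }
    }
}
```

Run it against the data file of each segment's parse_text directory
(the exact path layout is an assumption here). A mix of "UTF8" and
"Text" results across segments would point to old and new crawl data
sitting side by side.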
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"