Posted to user@nutch.apache.org by MilleBii <mi...@gmail.com> on 2011/03/07 10:27:57 UTC
Urgent:FetchedSegments.getSummary generates NullPointerException
I now randomly get this error in production, where it had been working
fine for more than a year:
java.lang.NullPointerException
    at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:248)
    at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:63)
    at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:53)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
+ For some queries, the first pages of hits are fine, then it suddenly stops
and I get a blank page; for others I get it right away on the query.
+ I checked the query with Luke. It looked fine.
+ The preceding bean call in search.jsp (bean.search(query, start +
hitsToRetrieve, hitsPerSite, "site", sort, reverse);) did not generate any
exception, as far as I can judge.
What can be the cause of that? How can I debug it?
I'm using Nutch 1.0.
--
-MilleBii-
Re: Urgent:FetchedSegments.getSummary generates NullPointerException
Posted by MilleBii <mi...@gmail.com>.
Yes, I found two corrupted segments, but not with Luke, which did not give
any help on this one. Even the faulty segments could be opened without problems.
I logged the HitDetails to find out which segments were causing the
error.
An improvement could be to catch the exception and log the segment id so
that it is found quickly.
Thanks anyway.
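The improvement suggested above (catch the exception and log the segment id) could look roughly like the sketch below. It is plain Java 8+ and deliberately does not use the Nutch API: Hit, summarize and the summarizer function are hypothetical stand-ins for HitDetails and the real summary call, so treat it as an illustration of the idea rather than a patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class SafeSummaries {

    /** Hypothetical stand-in for a search hit: only the segment name and URL. */
    static final class Hit {
        final String segment;
        final String url;

        Hit(String segment, String url) {
            this.segment = segment;
            this.url = url;
        }
    }

    /**
     * Builds one summary per hit. If the summarizer throws (for example a
     * NullPointerException caused by a missing or broken segment), the segment
     * name is logged and recorded, and an empty summary is returned for that
     * hit instead of failing the whole result page.
     */
    static List<String> summarize(List<Hit> hits,
                                  Function<Hit, String> summarizer,
                                  List<String> failedSegments) {
        List<String> summaries = new ArrayList<>();
        for (Hit hit : hits) {
            try {
                summaries.add(summarizer.apply(hit));
            } catch (RuntimeException e) {
                System.err.println("Summary failed for segment " + hit.segment
                        + ", url " + hit.url + ": " + e);
                failedSegments.add(hit.segment);
                summaries.add("");
            }
        }
        return summaries;
    }
}
```

With something like this in place, the log names the offending segment directly, and the remaining hits on the page still render.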
--
-MilleBii-
Re: Urgent:FetchedSegments.getSummary generates NullPointerException
Posted by Andrzej Bialecki <ab...@getopt.org>.
One of your segments may be corrupt. Usually this means it was either not
fetched, not parsed, or is truly corrupt (or missing). The expected list
of valid segments is the list of segment names that was used to produce
the index; segment names are recorded in the Lucene indexes. You could
open all indexes (e.g. with Luke) and see what the top terms in the
"segment" field are.
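Once the segment names have been read from the index, a quick check like the following sketch can flag segments that are missing or were not fully fetched and parsed. It is an assumption-laden example: it expects the segments on the local filesystem (not HDFS), takes the list of names as given, and treats crawl_fetch, parse_data and parse_text as the parts to look for. SegmentCheck and findProblems are hypothetical names. It only detects missing parts; a segment whose files exist but are damaged would still pass.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SegmentCheck {

    // Parts assumed to exist in a fetched and parsed segment.
    static final List<String> REQUIRED =
            Arrays.asList("crawl_fetch", "parse_data", "parse_text");

    /**
     * Returns, for each indexed segment name with a problem, the list of
     * parts that are missing under segmentsDir. Segments with nothing
     * missing are not included in the result.
     */
    static Map<String, List<String>> findProblems(File segmentsDir,
                                                  List<String> indexedSegments) {
        Map<String, List<String>> problems = new LinkedHashMap<>();
        for (String name : indexedSegments) {
            File segment = new File(segmentsDir, name);
            List<String> missing = new ArrayList<>();
            if (!segment.isDirectory()) {
                // The whole segment is absent.
                missing.add("<segment directory>");
            } else {
                for (String part : REQUIRED) {
                    if (!new File(segment, part).isDirectory()) {
                        missing.add(part);
                    }
                }
            }
            if (!missing.isEmpty()) {
                problems.put(name, missing);
            }
        }
        return problems;
    }
}
```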
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com