Posted to user@nutch.apache.org by Doron Rosenberg <do...@gmail.com> on 2008/07/28 19:49:14 UTC

index-more plugin throwing exception on svn trunk

Using nutch-2008-07-28_04-01-14, if I enable the index-more plugin I get:

\hadoop-Doron\mapred\local\index\_-74404655 autoCommit=true
mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@330fb9mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@4d5575
ramBufferSizeMB=16.0 maxBuffereDocs=50 maxBuffereDeleteTerms=-1
maxFieldLength=10000 index=
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:311)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:145)

Removing the plugin makes the crawl work.

According to http://www.nabble.com/index-more-problem--td16757538.html, the
index= part is the issue, but no mention is made of how to fix it.  Also,
this used to work fine in nutch 0.9.

Any ideas how to fix this?

Re: index-more plugin throwing exception on svn trunk

Posted by ansi <my...@gmail.com>.
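Maybe you should check logs/hadoop.log for more detail. The "Job failed!"
message on the console only reports that the job died; the underlying
exception is usually written to that log.

I got the same error because of exceptions thrown in my own analysis tool.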
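
If the log is large, a short script can pull out just the error reports. The following is a minimal sketch in Python, not part of Nutch or Hadoop: the function name find_error_blocks and both regular expressions are assumptions made for illustration, and the path logs/hadoop.log is taken relative to the Nutch directory as mentioned above. It groups each line that looks like an error or exception with the Java stack-trace lines that follow it.

```python
import re
from pathlib import Path

# Assumption: an error report starts with a line containing an ERROR/FATAL
# level or a Java class name ending in "Exception" or "Error".
ERROR_START = re.compile(r"\b(ERROR|FATAL)\b|\b[\w.$]+(Exception|Error)\b")
# Continuation lines of a Java stack trace.
TRACE_LINE = re.compile(r"^\s+at |^Caused by: |^\s+\.\.\. \d+ more")


def find_error_blocks(lines):
    """Group each error line with the stack-trace lines that follow it."""
    blocks = []
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if TRACE_LINE.search(line):
            # Stack-trace lines only count when they follow an error line.
            if current is not None:
                current.append(line)
        elif ERROR_START.search(line):
            current = [line]
            blocks.append(current)
        else:
            current = None
    return blocks


if __name__ == "__main__":
    # Path as mentioned in this thread; adjust to your installation.
    log_path = Path("logs/hadoop.log")
    if log_path.is_file():
        with log_path.open(encoding="utf-8", errors="replace") as handle:
            for block in find_error_blocks(handle):
                print("\n".join(block))
                print("-" * 40)
```

Run it from the Nutch directory after a failed crawl. The first block that names a class from the plugin you enabled is the most likely place to start looking.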

