Posted to user@nutch.apache.org by Felix Zimmermann <fe...@gmx.de> on 2009/12/06 01:35:04 UTC

Indexing with solrindexer -> OutOfMemoryError

Hi,

When I try to index four segments (~5 GB) with solrindexer, I get the
error below in hadoop.log. The logs of Tomcat, where Solr is deployed,
show no error. I crawled with the "crawl" command.

I've read that increasing the Hadoop heap space will not help.
What can I do?

Thanks for your help!
Felix.


2009-12-06 00:21:51,061 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
        at java.lang.StringBuffer.append(StringBuffer.java:320)
        at java.io.StringWriter.write(StringWriter.java:60)
        at org.apache.solr.common.util.XML.escape(XML.java:180)
        at org.apache.solr.common.util.XML.escapeCharData(XML.java:78)
        at org.apache.solr.common.util.XML.writeXML(XML.java:148)
        at org.apache.solr.client.solrj.util.ClientUtils.writeXML(ClientUtils.java:117)
        at org.apache.solr.client.solrj.request.UpdateRequest.getXML(UpdateRequest.java:169)
        at org.apache.solr.client.solrj.request.UpdateRequest.getContentStreams(UpdateRequest.java:160)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:191)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
        at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
        at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:58)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
        at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:410)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:158)
        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
2009-12-06 00:21:51,650 FATAL solr.SolrIndexer - SolrIndexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
        at org.apache.nutch.indexer.solr.SolrIndexer.indexSolr(SolrIndexer.java:73)
        at org.apache.nutch.indexer.solr.SolrIndexer.run(SolrIndexer.java:95)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.indexer.solr.SolrIndexer.main(SolrIndexer.java:104)




RE: Indexing with solrindexer -> OutOfMemoryError

Posted by BELLINI ADAM <mb...@msn.com>.
Hi,
You have to make your segments smaller than that; just cut every segment into small pieces.
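
For context, the stack trace in the original post shows the heap running out while SolrJ serializes an update request into an in-memory XML string (UpdateRequest.getXML writing into a StringWriter). One general way to keep such a request small is to send documents in bounded batches. The sketch below only illustrates that idea: BatchingWriter, Sink, and the batch size are hypothetical names chosen for this example, not Nutch or SolrJ API, and whether it matches what SolrWriter does internally is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Nutch or SolrJ): forwards items in bounded batches. */
public class BatchingWriter<T> {

    /** Receives one batch at a time, e.g. a call that posts the batch to a server. */
    public interface Sink<T> {
        void send(List<T> batch);
    }

    private final int batchSize;
    private final Sink<T> sink;
    private final List<T> buffer = new ArrayList<>();

    public BatchingWriter(int batchSize, Sink<T> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes as soon as the buffer holds batchSize items. */
    public void write(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered (if anything) and empties the buffer. */
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand over a copy so the sink never sees the buffer being cleared.
        sink.send(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

With a batch size of 2 and five items written, followed by a final flush(), the sink receives three batches. The memory needed per request then depends on the batch size and the size of the individual documents, not on the total size of the segments.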




> Subject: Indexing with solrindexer -> OutOfMemoryError
> From: felizimm@gmx.de
> To: nutch-user@lucene.apache.org
> Date: Sun, 6 Dec 2009 01:35:04 +0100