Posted to user@nutch.apache.org by xiao yang <ya...@gmail.com> on 2010/06/12 03:55:37 UTC

Re: OutOfMemoryError when index

I found a very large web page in the segments, deleted it, and it's OK now.
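
In case anyone else hits this: a rough sketch of how such a page can be
located, and how over-large pages can be capped in the first place. The
segment path is just a placeholder, and the readseg options should be
checked against your Nutch version.

  # dump a segment to plain text and inspect it for unusually large Content records
  bin/nutch readseg -dump crawl/segments/<segment_dir> /tmp/segdump

  <!-- conf/nutch-site.xml: cap the bytes downloaded per page
       (the default is 65536; -1 disables the limit) -->
  <property>
    <name>http.content.limit</name>
    <value>65536</value>
  </property>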

On Sat, Jun 12, 2010 at 6:44 AM, Ted Yu <yu...@gmail.com> wrote:
> Have you found a solution to the problem below?
>
> On Thu, Mar 4, 2010 at 2:46 AM, xiao yang <ya...@gmail.com> wrote:
>>
>> Hi, all
>>
>> I get an OutOfMemoryError when indexing with bin/nutch index crawl/indexes
>> crawl/crawldb crawl/linkdb crawl/segments/*
>> I have configured HADOOP_HEAPSIZE in hadoop-env.sh and
>> mapred.child.java.opts in mapred-site.xml up to the hardware limit:
>>   <property>
>>     <name>mapred.child.java.opts</name>
>>     <value>-Xmx2600m</value>
>>   </property>
>> A larger value causes Hadoop to fail at startup.
>> What should I do?
>>
>> Thanks!
>> Xiao
>>
>> FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError
>>        at java.io.FileInputStream.readBytes(Native Method)
>>        at java.io.FileInputStream.read(FileInputStream.java:199)
>>        at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.read(RawLocalFileSystem.java:83)
>>        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:136)
>>        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>>        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>>        at java.io.DataInputStream.read(DataInputStream.java:132)
>>        at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:149)
>>        at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101)
>>        at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:328)
>>        at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:358)
>>        at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
>>        at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:404)
>>        at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>>        at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
>>        at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>>        at org.apache.hadoop.mapred.Task$ValuesIterator.readNextKey(Task.java:973)
>>        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:932)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:241)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:237)
>>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:79)
>>        at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
>>        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
>>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
>>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
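
Footnote for later readers (inferred from the trace, so treat it as a
sketch): the error happens while the reduce-side merger is reading a single
value (IFile$Reader.readData), which means the child JVM has to hold the
entire content of one page in memory at once. Raising the heap only helps
until the next oversized page, so capping or removing such pages, as above,
is the more reliable fix. Note also that HADOOP_HEAPSIZE in hadoop-env.sh
sizes the Hadoop daemons rather than the task JVMs; only
mapred.child.java.opts applies to the indexing reduce task. For reference,
assuming a 0.20-era layout:

  # conf/hadoop-env.sh: heap (in MB) for the daemons (JobTracker,
  # TaskTracker, NameNode, DataNode); separate from the per-task heap
  # set by mapred.child.java.opts in mapred-site.xml
  export HADOOP_HEAPSIZE=2000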