Posted to dev@nutch.apache.org by Armando Gonçalves <ma...@gmail.com> on 2009/03/19 02:57:46 UTC

MergeSegments Error.

When I try to merge the segments of two crawls, 2 GB and 1 GB each, I get a
very bizarre error:
Exception in thread "main" java.io.IOException: Job failed!
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
       at
org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:620)
       at
org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:665)

I was wondering why this happened, since I've already merged segments before.
I noticed two misbehaviors.
First odd thing: 24 hours of merging!
Second (and the reason the job failed): /tmp/hadoop-<user> ate ALL my
free storage space (100 GB).
Any clue how to fix this?

-- 
Armando Gonçalves
C.C 2005-2
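[Editor's note: the disk exhaustion described above can be confirmed while a merge is running. A minimal check, assuming a Unix shell and the default Hadoop temp location:]

```shell
# Report how much space the default Hadoop temp dir is using,
# and how much room is left on the filesystem holding /tmp.
du -sh "/tmp/hadoop-$USER" 2>/dev/null || echo "no Hadoop temp dir found"
df -h /tmp
```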

Re: MergeSegments Error.

Posted by Armando Gonçalves <ma...@gmail.com>.
I didn't get it... why should this solve the problem?
The current configuration is:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
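[Editor's note: the reason the suggestion helps is that SegmentMerger runs as a Hadoop MapReduce job, and the job's intermediate data is written under hadoop.tmp.dir, which defaults to /tmp/hadoop-${user.name} as shown above. A merge can need several times the size of the input segments in intermediate space, so if /tmp sits on a small partition the job fails once it fills up. A sketch of the override in conf/hadoop-site.xml, assuming /data is a hypothetical partition with enough free space:]

<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop-${user.name}</value>
  <description>Base for temporary directories, moved off the small /tmp
  partition (path is an example; pick any disk with room to spare).</description>
</property>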


On Thu, Mar 19, 2009 at 7:35 AM, vishal vachhani <vi...@gmail.com> wrote:

> Define the following property in $nutch_home/conf/hadoop-site.xml in order
> to change the tmp folder path:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value><any-path>/hadoop-${user.name}</value>
>   <description>Hadoop temp directory</description>
> </property>



-- 
Armando Gonçalves
C.C 2005-2

Re: MergeSegments Error.

Posted by vishal vachhani <vi...@gmail.com>.
Define the following property in $nutch_home/conf/hadoop-site.xml in order to
change the tmp folder path:

<property>
  <name>hadoop.tmp.dir</name>
  <value><any-path>/hadoop-${user.name}</value>
  <description>Hadoop temp directory</description>
</property>

