Posted to user@nutch.apache.org by Tomislav Poljak <tp...@gmail.com> on 2007/12/11 18:00:06 UTC

Re: Problem with partitioning

Hi,
I have the same problem (same exception) after the selecting phase of
generating. Sometimes it works fine and sometimes this exception occurs.
Why is that, and how can I fix it?

Thanks,
     Tomislav
  
On Tue, 2007-11-06 at 15:02 +0100, Karol Rybak wrote:
> ced that partitioning job generates a large number (150) of
> reduce tasks, while i have 


Re: Problem with partitioning

Posted by Mathijs Homminga <ma...@knowlogy.nl>.
We have had the same problem (I think it is not the partitioning but the
last part of the select that goes wrong). We solved it by turning off
speculative execution.
In hadoop-site.xml:

<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
  <description>If true, then multiple instances of some map and reduce
               tasks may be executed in parallel.</description>
</property>
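
To double-check that the override really is in the file you edited, something like the following could be used. This is only a sketch: the helper name get_property and the SAMPLE string are made up for illustration and are not part of Nutch or Hadoop, and it assumes the usual <configuration>/<property> layout of hadoop-site.xml. It returns the first matching entry it finds.

```python
import xml.etree.ElementTree as ET


def get_property(config_xml, name):
    """Return the <value> text of the first <property> with this <name>, or None.

    Hypothetical helper for illustration; it only parses the XML text and
    does not consult Hadoop's defaults or any other config file.
    """
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Assumed sample content, mirroring the property shown above.
SAMPLE = """<configuration>
<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
</property>
</configuration>"""

if __name__ == "__main__":
    print(get_property(SAMPLE, "mapred.speculative.execution"))
```

For a real installation, read the contents of your own hadoop-site.xml and pass them in instead of SAMPLE; a result of None would mean the property is not set in that file.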

From the stack trace, it looks like the tracker tries to copy the output
data from the task attempt that was killed (tip already completed) instead
of from the attempt of the same task that completed. Is this related to
the Generator job in Nutch, or is it an issue in Hadoop?

Mathijs


-- 
Knowlogy
Helperpark 290 C
9723 ZA Groningen
+31 (0)50 2103567
http://www.knowlogy.nl

mathijs.homminga@knowlogy.nl
+31 (0)6 15312977