Posted to user@nutch.apache.org by Stefan Scheffler <ss...@avantgarde-labs.de> on 2012/09/12 14:41:46 UTC

Hadoop and Nutch

Hi,
I am trying to run Nutch 2.0 on a Hadoop cluster and get the following
exception. I compiled Nutch from source and start it with:

HADOOP_CLASSPATH=lib/apache-nutch-1.6-SNAPSHOT.jar hadoop org.apache.nutch.crawl.Crawl urls -dir test -depth 2 -topN 5

12/09/12 14:34:20 INFO mapred.JobClient: Task Id : attempt_201208141240_0593_m_000001_2, Status : FAILED
java.lang.RuntimeException: Error in configuring object
     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:387)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
     at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.jav

The error occurs when Hadoop starts the inject step. I have no idea
where the error comes from.
Does anyone have a clue?

Kind regards
Stefan Scheffler



Re: Hadoop and Nutch

Posted by Walter Tietze <ti...@neofonie.de>.
Hi,

I have the same problem using Nutch 1.5.1 with
Cloudera CDH4 and YARN.

There are already entries in the mail archive about
this:

http://www.mail-archive.com/user@nutch.apache.org/msg07182.html
http://www.mail-archive.com/user@nutch.apache.org/msg07184.html

I thought the problem had to do with CDH4 and YARN.

Please let me know if you find a solution!

Cheers, Walter



On 12.09.2012 17:21, Stefan Scheffler wrote:
> Hey,
> Thank you for the reply.
> The error stays the same :(
> On 12.09.2012 15:10, Julien Nioche wrote:
>> Hi Stefan,
>>
>> you don't need to set HADOOP_CLASSPATH, just use the scripts provided
>> from runtime/deploy/bin
>>
>> ant job => runtime/deploy/bin  => nutch crawl
>>
>> J.


-- 

--------------------------------
Walter Tietze
Senior Software Engineer
Research

Neofonie GmbH
Robert-Koch-Platz 4
10115 Berlin

T +49.30 24627 318
F +49.30 24627 120

Walter.Tietze@neofonie.de
http://www.neofonie.de

Commercial register
Berlin-Charlottenburg: HRB 67460

Management:
Thomas Kitlitschko
--------------------------------


Re: Hadoop and Nutch

Posted by Stefan Scheffler <ss...@avantgarde-labs.de>.
Hey,
Thank you for the reply.
The error stays the same :(
On 12.09.2012 15:10, Julien Nioche wrote:
> Hi Stefan,
>
> you don't need to set HADOOP_CLASSPATH, just use the scripts provided
> from runtime/deploy/bin
>
> ant job => runtime/deploy/bin  => nutch crawl
>
> J.


-- 
Stefan Scheffler
Avantgarde Labs GmbH
Löbauer Straße 19, 01099 Dresden
Phone: + 49 (0) 351 21590834
Email: sscheffler@avantgarde-labs.de


Re: Hadoop and Nutch

Posted by Julien Nioche <li...@gmail.com>.
Hi Stefan,

you don't need to set HADOOP_CLASSPATH; just use the scripts provided
in runtime/deploy/bin:

ant job => runtime/deploy/bin => nutch crawl
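
Spelled out, the shorthand above might look like the following shell sketch. It assumes that the build leaves the job file and the scripts under runtime/deploy, and it reuses the crawl arguments from the original post. NUTCH_SRC, DRY_RUN, run and nutch_deploy_crawl are hypothetical names introduced only for illustration; they are not part of Nutch.

```shell
# Hypothetical location of the Nutch source checkout; adjust to your setup.
NUTCH_SRC="${NUTCH_SRC:-$PWD}"
# With DRY_RUN=1 (the default in this sketch) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The body runs in a subshell so the caller's working directory is untouched.
nutch_deploy_crawl() (
  # 1. Build the job file from the source checkout.
  run cd "$NUTCH_SRC" || exit 1
  run ant job || exit 1
  # 2. Use the bundled script from runtime/deploy instead of calling
  #    "hadoop" directly with HADOOP_CLASSPATH set.
  run cd runtime/deploy || exit 1
  run bin/nutch crawl "$@"
)
```

Calling `nutch_deploy_crawl urls -dir test -depth 2 -topN 5` prints the planned commands; setting DRY_RUN=0 runs them for real, which requires ant and a working Hadoop client on the PATH.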

J.


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: Hadoop and Nutch

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,

On Wed, Sep 12, 2012 at 1:41 PM, Stefan Scheffler
<ss...@avantgarde-labs.de> wrote:

> I try to run nutch 2.0 on a hadoop cluster and get the following exception.

>
> HADOOP_CLASSPATH=lib/apache-nutch-1.6-SNAPSHOT.jar hadoop
> org.apache.nutch.crawl.Crawl urls -dir test -depth 2 -topN 5

The Nutch versions are not consistent: you mention Nutch 2.0, but the jar on
your HADOOP_CLASSPATH is apache-nutch-1.6-SNAPSHOT.jar. Please check and get
back to us.
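
One way to catch this kind of mismatch is to compare the version embedded in the jar file name with the version you believe you are running. The following is a minimal shell sketch. The function names are hypothetical, and it assumes the jar follows the apache-nutch-<version>.jar naming seen in the command above; it compares major versions only.

```shell
# Extract the version from a Nutch jar file name, e.g.
# "lib/apache-nutch-1.6-SNAPSHOT.jar" yields "1.6-SNAPSHOT".
nutch_jar_version() {
  base=$(basename "$1" .jar)
  echo "${base#apache-nutch-}"
}

# Compare the major version in the jar name with the expected version.
# $1 = expected version (e.g. "2.0"), $2 = path to the jar.
check_nutch_version() {
  expected="$1"
  jar="$2"
  actual=$(nutch_jar_version "$jar")
  # "${var%%.*}" keeps everything before the first dot (the major version).
  if [ "${actual%%.*}" = "${expected%%.*}" ]; then
    echo "OK: $jar matches Nutch $expected"
  else
    echo "MISMATCH: expected Nutch $expected but jar is $actual"
    return 1
  fi
}
```

With the values from this thread, `check_nutch_version 2.0 lib/apache-nutch-1.6-SNAPSHOT.jar` reports a mismatch and returns a non-zero status.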

Lewis