Posted to user@nutch.apache.org by Chance Callahan <ch...@gmail.com> on 2011/07/20 03:47:24 UTC

Nutch fails with a NullPointerException on startup

Whenever I start the Nutch 1.2 fetcher, the job fails with the following NullPointerException:

2011-07-20 01:40:49,744 INFO   server Copying /user/hdfs/bin/nutch-1.2.jar->/tmp/jobsub-eAdLAn/work/tmp.jar
2011-07-20 01:40:50,179 INFO   server all_clusters: [<hadoop.job_tracker.LiveJobTracker object at 0x94237ec>, <hadoop.fs.hadoopfs.HadoopFileSystem object at 0x92ab7ec>]
2011-07-20 01:40:50,179 INFO   server Starting ['/usr/lib/hadoop/bin/hadoop', '--config', '/etc/alternatives/hadoop-0.20-conf/', 'jar', 'tmp.jar', 'org.apache.nutch.fetcher.Fetcher', '-conf', '/nutch-1.2/conf/nutch-site.xml', '-Dplugin.folders=/nutch-1.2/plugins/', '/nutch-1.2/urlsdir/seeds.txt', '-dir', 'crawldb/crawl'].  (Env: {'HADOOP_CLASSPATH': '/usr/share/hue/apps/jobsub/src/jobsub/../../java-lib/trace.jar:/usr/share/hue/desktop/libs/hadoop/src/hadoop/../../static-group-mapping/java-lib/static-group-mapping-1.2.0.jar', 'HUE_JOBTRACE_LOG': '/tmp/jobsub-eAdLAn/jobs', 'HUE_JOBSUB_USER': 'hdfs', 'HADOOP_OPTS': '-javaagent:/usr/share/hue/ext/thirdparty/java/aspectj-1.6.5/aspectjweaver.jar -Dhue.suffix=-via-hue -Duser.name=hdfs', 'HUE_JOBSUB_GROUPS': 'hdfs', 'HADOOP_HOME': '/usr/lib/hadoop'})
2011-07-20 01:40:50,179 INFO   server Running: /usr/lib/hadoop/bin/hadoop --config /etc/alternatives/hadoop-0.20-conf/ jar tmp.jar org.apache.nutch.fetcher.Fetcher -conf /nutch-1.2/conf/nutch-site.xml -Dplugin.folders=/nutch-1.2/plugins/ /nutch-1.2/urlsdir/seeds.txt -dir crawldb/crawl
11/07/19 21:40:57 WARN fetcher.Fetcher: Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
11/07/19 21:40:57 INFO fetcher.Fetcher: Fetcher: starting at 2011-07-19 21:40:57
11/07/19 21:40:57 INFO fetcher.Fetcher: Fetcher: segment: /nutch-1.2/urlsdir/seeds.txt
11/07/19 21:40:58 INFO security.UgiFixer: Hue UGI fixer aspect loaded.
11/07/19 21:41:03 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/07/19 21:41:03 INFO security.JobClientTracer: Hue job submission aspect loaded.
11/07/19 21:41:03 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/07/19 21:41:03 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs1105342640/.staging/job_local_0001
11/07/19 21:41:03 FATAL fetcher.Fetcher: Fetcher: java.lang.NullPointerException
	at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:49)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:874)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal_aroundBody0(JobClient.java:807)
	at org.apache.hadoop.mapred.JobClient$AjcClosure1.run(JobClient.java:1)
	at org.apache.hadoop.security.JobClientTrace.ajc$around$org_apache_hadoop_security_JobClientTrace$1$b9879daproceed(JobClientTrace.aj:1)
	at org.apache.hadoop.security.JobClientTrace.ajc$around$org_apache_hadoop_security_JobClientTrace$1$b9879da(JobClientTrace.aj:33)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
	at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1107)
	at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1145)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1116)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

Any ideas what it means and how to fix it?

Thanks,
Chance Callahan
KD0MXN