Posted to user@nutch.apache.org by Nicolas Bélisle <ni...@gmail.com> on 2007/02/06 06:48:03 UTC
Crawl on a multiprocessor system
Hi,
I'm trying to run Nutch 0.8.1 on a multiprocessor system as described in:
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg02394.html
However, the Injector stops at "Injector: Converting injected urls to crawl
db entries" and then hangs indefinitely.
Here's my hadoop-site.xml configuration:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>local</value>
    <description>
      The name of the default file system. Either the literal string
      "local" or a host:port for NDFS.
    </description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description>
      The host and port that the MapReduce job tracker runs at. If
      "local", then jobs are run in-process as a single map and
      reduce task.
    </description>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>1</value>
    <description>
      define mapred.map tasks to be number of slave hosts
    </description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
    <description>
      define mapred.reduce tasks to be number of slave hosts
    </description>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/nutch/filesystem/mapreduce/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/nutch/filesystem/mapreduce/local</value>
  </property>
</configuration>
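In case it's relevant: since mapred.job.tracker points at localhost:9001, here is a quick generic check (not Nutch-specific, just a plain socket probe I put together) to see whether anything is actually listening on that address. If the port is closed, the job submission having nothing to talk to could explain the indefinite wait.

```python
import socket

# Probe the JobTracker address configured in mapred.job.tracker
# (localhost:9001 in the hadoop-site.xml above). If nothing is
# listening there, a submitted job can simply block and wait.
def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("JobTracker port open:", port_open("localhost", 9001))
```

If this prints False, the JobTracker daemon is likely not running (or is bound to a different host/port), which would need fixing before the Injector can make progress.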
The "slaves" file contains:
localhost
Any ideas?
Regards,
Nicolas