Posted to user@nutch.apache.org by Nicolas Bélisle <ni...@gmail.com> on 2007/02/06 06:48:03 UTC

Crawl on a multiprocessor system

Hi,

I'm trying to run Nutch 0.8.1 on a multiprocessor system as described in:
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg02394.html

However, the Injector hangs at "Injector: Converting injected urls to crawl
db entries" and never progresses.

Here's my hadoop-site.xml configuration:

<configuration>
<property>
  <name>fs.default.name</name>
  <value>local</value>
  <description>
    The name of the default file system. Either the literal string
    "local" or a host:port for NDFS.
  </description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>1</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>1</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/nutch/filesystem/mapreduce/system</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce/local</value>
</property>
</configuration>

The "slaves" files contains :
localhost
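
In case it's useful for comparison: as far as I understand, a fully in-process setup (no JobTracker or DataNode daemons at all) would only need the two properties below. This is just a sketch of the single-process alternative, not something I've confirmed works around the hang:

```xml
<!-- Hypothetical fully local fragment for hadoop-site.xml: with both
     values set to "local", the filesystem and the MapReduce job run
     in-process, so no daemons need to be started. -->
<property>
  <name>fs.default.name</name>
  <value>local</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>local</value>
</property>
```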


Any ideas?


Regards,

Nicolas