Posted to user@nutch.apache.org by Dennis Kubes <nu...@dragonflymc.com> on 2006/03/17 00:32:06 UTC
Help Setting Up Nutch 0.8 Distributed
I am having trouble getting Nutch to work using the DFS. I pulled Nutch 0.8
from SVN and built it just fine using Eclipse. I was able to set it up on a
Whitebox Enterprise Linux 3 Respin 2 box (800 MHz, 512M RAM) and do a crawl
using the local filesystem. I was able to set up the war inside of Tomcat
and search the local index.
I then tried to switch to using the DFS. I was running everything as a
nutch user, and I have a password-less login to the local machine. I am
using the options below in my hadoop-site.xml file. When I run start-all.sh
I get some weird output, but doing a ps -ef | grep java shows 2 java
processes running. Then when I try to do a crawl it errors out.
Anybody got any ideas?
Dennis
hadoop-site.xml
------------------------------------------------------------------------------
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
  <name>fs.default.name</name>
  <value>localhost:9000</value>
  <description>
    The name of the default file system. Either the literal string
    "local" or a host:port for NDFS.
  </description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/nutch/filesystem/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/nutch/filesystem/data</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce</value>
</property>
</configuration>
log of startup
------------------------------------------------------------------------------
localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
starting namenode, logging to /nutch/search/bin/../logs/hadoop-nutch-namenode-devcluster01.visvo.com.log
: command not foundadoop: line 2:
: command not foundadoop: line 7:
: command not foundadoop: line 10:
: command not foundadoop: line 13:
: command not foundadoop: line 16:
: command not foundadoop: line 19:
: command not foundadoop: line 22:
: command not foundadoop: line 25:
: command not foundadoop: line 28:
: command not foundadoop: line 31:
starting jobtracker, logging to /nutch/search/bin/../logs/hadoop-nutch-jobtracker-devcluster01.visvo.com.log
: command not foundadoop: line 2:
: command not foundadoop: line 7:
: command not foundadoop: line 10:
: command not foundadoop: line 13:
: command not foundadoop: line 16:
: command not foundadoop: line 19:
: command not foundadoop: line 22:
: command not foundadoop: line 25:
: command not foundadoop: line 28:
: command not foundadoop: line 31:
localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
ps -ef | grep java
------------------------------------------------------------------------------
[nutch@devcluster01 search]$ ps -ef | grep java
nutch 9907 1 2 17:26 pts/0 00:00:02 /usr/java/jdk1.5.0_06/bin/java -Xmx1000m -classpath /nutch/search/conf:/usr/java/jdk1.5.0_06/lib/tools.jar:/nutch/search:/nutch/search/hadoop-*.jar:/nutch/search/lib/commons-lang-2.1.jar:/nutch/search/lib/commons-logging-api-1.0.4.jar:/nutch/search/lib/concurrent-1.3.4.ja
nutch 9945 1 8 17:27 pts/0 00:00:07 /usr/java/jdk1.5.0_06/bin/java -Xmx1000m -classpath /nutch/search/conf:/usr/java/jdk1.5.0_06/lib/tools.jar:/nutch/search:/nutch/search/hadoop-*.jar:/nutch/search/lib/commons-lang-2.1.jar:/nutch/search/lib/commons-logging-api-1.0.4.jar:/nutch/search/lib/concurrent-1.3.4.ja
nutch 10028 9771 0 17:28 pts/0 00:00:00 grep java
Errors when running crawl
------------------------------------------------------------------------------
[nutch@devcluster01 search]$ bin/nutch crawl urls -depth 3 -topN 50
060316 173158 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060316 173158 parsing file:/nutch/search/conf/nutch-default.xml
060316 173159 parsing file:/nutch/search/conf/crawl-tool.xml
060316 173159 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060316 173159 parsing file:/nutch/search/conf/nutch-site.xml
060316 173159 parsing file:/nutch/search/conf/hadoop-site.xml
060316 173159 Client connection to 127.0.0.1:9000: starting
060316 173159 crawl started in: crawl-20060316173159
060316 173159 rootUrlDir = urls
060316 173159 threads = 10
060316 173159 depth = 3
060316 173159 topN = 50
060316 173159 Injector: starting
060316 173159 Injector: crawlDb: crawl-20060316173159/crawldb
060316 173159 Injector: urlDir: urls
060316 173159 Injector: Converting injected urls to crawl db entries.
060316 173159 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060316 173159 parsing file:/nutch/search/conf/nutch-default.xml
060316 173159 parsing file:/nutch/search/conf/crawl-tool.xml
060316 173159 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060316 173159 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060316 173159 parsing file:/nutch/search/conf/nutch-site.xml
060316 173159 parsing file:/nutch/search/conf/hadoop-site.xml
060316 173200 Client connection to 127.0.0.1:9001: starting
060316 173200 Client connection to 127.0.0.1:9000: starting
060316 173200 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060316 173200 parsing file:/nutch/search/conf/hadoop-site.xml
Exception in thread "main" java.io.IOException: Cannot create file /tmp/hadoop/mapred/system/submit_wdapr7/job.jar on client DFSClient_1136455260
        at org.apache.hadoop.ipc.Client.call(Client.java:301)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
        at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:556)
        at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:99)
        at org.apache.hadoop.dfs.DistributedFileSystem.createRaw(DistributedFileSystem.java:71)
        at org.apache.hadoop.fs.FSDataOutputStream$Summer.<init>(FSDataOutputStream.java:39)
        at org.apache.hadoop.fs.FSDataOutputStream.<init>(FSDataOutputStream.java:128)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:180)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:168)
        at org.apache.hadoop.dfs.DistributedFileSystem.doFromLocalFile(DistributedFileSystem.java:156)
        at org.apache.hadoop.dfs.DistributedFileSystem.copyFromLocalFile(DistributedFileSystem.java:131)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:247)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:294)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
RE: Help Setting Up Nutch 0.8 Distributed
Posted by Dennis Kubes <nu...@dragonflymc.com>.
My problem was that I was using the distributed file system but
trying to start the crawl from a local directory instead of first
uploading the crawl list to the distributed filesystem. Once I
uploaded it, it worked fine.
Dennis
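For anyone who lands on this thread with the same symptom, the fix can be sketched as commands along these lines, run from the Nutch install directory. The `urls` directory name matches the crawl command earlier in the thread; the exact FsShell syntax may differ slightly between 0.8-era revisions, so treat this as a sketch rather than the definitive invocation:

```shell
# copy the local seed-URL directory into the DFS named by fs.default.name
bin/hadoop dfs -put urls urls
# confirm it arrived
bin/hadoop dfs -ls
# now the Injector's input path resolves inside the DFS, not on the local disk
bin/nutch crawl urls -depth 3 -topN 50
```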
-----Original Message-----
From: Marko Bauhardt [mailto:mb@media-style.com]
Sent: Saturday, March 18, 2006 5:48 AM
To: nutch-user@lucene.apache.org
Subject: Re: Help Setting Up Nutch 0.8 Distributed
On 17.03.2006, at 17:20, Dennis Kubes wrote:
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
The Injector fails.
> java.io.IOException: No input directories specified in: Configuration:
> defaults: hadoop-default.xml , mapred-default.xml
The Injector did not find the directory with the inject files.
Marko
Re: Help Setting Up Nutch 0.8 Distributed
Posted by Marko Bauhardt <mb...@media-style.com>.
On 17.03.2006, at 17:20, Dennis Kubes wrote:
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
The Injector fails.
> java.io.IOException: No input directories specified in: Configuration:
> defaults: hadoop-default.xml , mapred-default.xml
The Injector did not find the directory with the inject files.
Marko
RE: Help Setting Up Nutch 0.8 Distributed
Posted by Dennis Kubes <nu...@dragonflymc.com>.
That was just me specifying the command line wrong. It starts the crawl now,
but then it just stalls:
060317 101821 parsing file:/nutch/search/conf/hadoop-site.xml
060317 101829 Running job: job_1ko8i3
060317 101830 map 0% reduce 0%
I am seeing this a lot in the namenode log:
060317 102009 Zero targets found, forbidden1.size=1 forbidden2.size()=0
060317 102009 Zero targets found, forbidden1.size=1 forbidden2.size()=0
-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com]
Sent: Friday, March 17, 2006 9:55 AM
To: nutch-user@lucene.apache.org
Subject: RE: Help Setting Up Nutch 0.8 Distributed
Ok, the servers are starting now but when I try to do a crawl I am getting
an error like below. I think that I am missing a configuration option, but
I don't know which one. I have included my hadoop-site.xml as well.
error upon crawl:
060317 093312 Client connection to 127.0.0.1:9000: starting
060317 093312 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060317 093312 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 Running job: job_c78m3c
060317 093323 map 100% reduce 100%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
job tracker log file:
060317 093322 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060317 093322 parsing /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker
060317 093322 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 job init failed
java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker final: hadoop-site.xml
        at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
        at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:127)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:208)
        at java.lang.Thread.run(Thread.java:595)
Exception in thread "Thread-21" java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:437)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:212)
        at java.lang.Thread.run(Thread.java:595)
060317 093325 Server connection on port 9001 from 127.0.0.1: exiting
hadoop-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/nutch/filesystem/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/nutch/filesystem/data</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/nutch/filesystem/mapreduce/system</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce/local</value>
</property>
-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com]
Sent: Friday, March 17, 2006 9:05 AM
To: nutch-user@lucene.apache.org
Subject: RE: Help Setting Up Nutch 0.8 Distributed
I got one of the issues fixed. The output like below is caused by the
hadoop-env.sh file being in DOS format and not being executable. A dos2unix
and chmod 700 fixed the "command not found" output. Still working on why
the server won't start.
caused by hadoop-env.sh in dos format and not being executable:
: command not found line 2:
: command not found line 7:
: command not found line 10:
: command not found line 13:
: command not found line 16:
: command not found line 20:
: command not found line 23:
: command not found line 26:
: command not found line 29:
: command not found line 32:
Dennis
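For reference, the same cleanup can be done with tr when dos2unix is not installed. This is a minimal sketch against a throwaway copy under /tmp with hypothetical contents, so it doesn't touch a real conf/hadoop-env.sh:

```shell
# simulate a DOS-format hadoop-env.sh (hypothetical contents) with CRLF endings
printf '#!/bin/sh\r\nexport JAVA_HOME=/usr/java/jdk1.5.0_06\r\n' > /tmp/hadoop-env.sh
# strip the carriage returns (what dos2unix does), then make the script executable
tr -d '\r' < /tmp/hadoop-env.sh > /tmp/hadoop-env.sh.tmp
mv /tmp/hadoop-env.sh.tmp /tmp/hadoop-env.sh
chmod 700 /tmp/hadoop-env.sh
```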
-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Thursday, March 16, 2006 6:50 PM
To: nutch-user@lucene.apache.org
Subject: Re: Help Setting Up Nutch 0.8 Distributed
Dennis Kubes wrote:
> : command not foundlaves.sh: line 29:
> : command not foundlaves.sh: line 32:
> localhost: ssh: \015: Name or service not known
> devcluster02: ssh: \015: Name or service not known
>
> And still getting this error:
>
> 060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
> Exception in thread "main" java.io.IOException: Cannot create file
> /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client
> DFSClient_-913777457
> at org.apache.hadoop.ipc.Client.call(Client.java:301)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
> at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
> at org
>
> My ssh version is:
>
> openssh-clients-3.6.1p2-33.30.3
> openssh-server-3.6.1p2-33.30.3
> openssh-askpass-gnome-3.6.1p2-33.30.3
> openssh-3.6.1p2-33.30.3
> openssh-askpass-3.6.1p2-33.30.3
>
> Is it something to do with my slaves file?
The \015 looks like a file has a CR where perhaps an LF is expected?
What does 'od -c conf/slaves' print? What happens when you try
something like 'bin/slaves uptime'?
Doug
RE: Help Setting Up Nutch 0.8 Distributed
Posted by Dennis Kubes <nu...@dragonflymc.com>.
Ok, the servers are starting now but when I try to do a crawl I am getting
an error like below. I think that I am missing a configuration option, but
I don't know which one. I have included my hadoop-site.xml as well.
error upon crawl:
060317 093312 Client connection to 127.0.0.1:9000: starting
060317 093312 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060317 093312 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 Running job: job_c78m3c
060317 093323 map 100% reduce 100%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)
job tracker log file:
060317 093322 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060317 093322 parsing /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker
060317 093322 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 job init failed
java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker final: hadoop-site.xml
        at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
        at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:127)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:208)
        at java.lang.Thread.run(Thread.java:595)
Exception in thread "Thread-21" java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:437)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:212)
        at java.lang.Thread.run(Thread.java:595)
060317 093325 Server connection on port 9001 from 127.0.0.1: exiting
hadoop-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/nutch/filesystem/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/nutch/filesystem/data</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/nutch/filesystem/mapreduce/system</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce/local</value>
</property>
-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com]
Sent: Friday, March 17, 2006 9:05 AM
To: nutch-user@lucene.apache.org
Subject: RE: Help Setting Up Nutch 0.8 Distributed
I got one of the issues fixed. The output like below is caused by the
hadoop-env.sh file being in DOS format and not being executable. A dos2unix
and chmod 700 fixed the "command not found" output. Still working on why
the server won't start.
caused by hadoop-env.sh in dos format and not being executable:
: command not found line 2:
: command not found line 7:
: command not found line 10:
: command not found line 13:
: command not found line 16:
: command not found line 20:
: command not found line 23:
: command not found line 26:
: command not found line 29:
: command not found line 32:
Dennis
-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Thursday, March 16, 2006 6:50 PM
To: nutch-user@lucene.apache.org
Subject: Re: Help Setting Up Nutch 0.8 Distributed
Dennis Kubes wrote:
> : command not foundlaves.sh: line 29:
> : command not foundlaves.sh: line 32:
> localhost: ssh: \015: Name or service not known
> devcluster02: ssh: \015: Name or service not known
>
> And still getting this error:
>
> 060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
> Exception in thread "main" java.io.IOException: Cannot create file
> /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client
> DFSClient_-913777457
> at org.apache.hadoop.ipc.Client.call(Client.java:301)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
> at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
> at org
>
> My ssh version is:
>
> openssh-clients-3.6.1p2-33.30.3
> openssh-server-3.6.1p2-33.30.3
> openssh-askpass-gnome-3.6.1p2-33.30.3
> openssh-3.6.1p2-33.30.3
> openssh-askpass-3.6.1p2-33.30.3
>
> Is it something to do with my slaves file?
The \015 looks like a file has a CR where perhaps an LF is expected?
What does 'od -c conf/slaves' print? What happens when you try
something like 'bin/slaves uptime'?
Doug
RE: Help Setting Up Nutch 0.8 Distributed
Posted by Dennis Kubes <nu...@dragonflymc.com>.
I got one of the issues fixed. The output like below is caused by the
hadoop-env.sh file being in DOS format and not being executable. A dos2unix
and chmod 700 fixed the "command not found" output. Still working on why
the server won't start.
caused by hadoop-env.sh in dos format and not being executable:
: command not found line 2:
: command not found line 7:
: command not found line 10:
: command not found line 13:
: command not found line 16:
: command not found line 20:
: command not found line 23:
: command not found line 26:
: command not found line 29:
: command not found line 32:
Dennis
-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Thursday, March 16, 2006 6:50 PM
To: nutch-user@lucene.apache.org
Subject: Re: Help Setting Up Nutch 0.8 Distributed
Dennis Kubes wrote:
> : command not foundlaves.sh: line 29:
> : command not foundlaves.sh: line 32:
> localhost: ssh: \015: Name or service not known
> devcluster02: ssh: \015: Name or service not known
>
> And still getting this error:
>
> 060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
> Exception in thread "main" java.io.IOException: Cannot create file
> /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client
> DFSClient_-913777457
> at org.apache.hadoop.ipc.Client.call(Client.java:301)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
> at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
> at org
>
> My ssh version is:
>
> openssh-clients-3.6.1p2-33.30.3
> openssh-server-3.6.1p2-33.30.3
> openssh-askpass-gnome-3.6.1p2-33.30.3
> openssh-3.6.1p2-33.30.3
> openssh-askpass-3.6.1p2-33.30.3
>
> Is it something to do with my slaves file?
The \015 looks like a file has a CR where perhaps an LF is expected?
What does 'od -c conf/slaves' print? What happens when you try
something like 'bin/slaves uptime'?
Doug
Re: Help Setting Up Nutch 0.8 Distributed
Posted by Doug Cutting <cu...@apache.org>.
Dennis Kubes wrote:
> : command not foundlaves.sh: line 29:
> : command not foundlaves.sh: line 32:
> localhost: ssh: \015: Name or service not known
> devcluster02: ssh: \015: Name or service not known
>
> And still getting this error:
>
> 060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
> Exception in thread "main" java.io.IOException: Cannot create file
> /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client
> DFSClient_-913777457
> at org.apache.hadoop.ipc.Client.call(Client.java:301)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
> at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
> at org
>
> My ssh version is:
>
> openssh-clients-3.6.1p2-33.30.3
> openssh-server-3.6.1p2-33.30.3
> openssh-askpass-gnome-3.6.1p2-33.30.3
> openssh-3.6.1p2-33.30.3
> openssh-askpass-3.6.1p2-33.30.3
>
> Is it something to do with my slaves file?
The \015 looks like a file has a CR where perhaps an LF is expected?
What does 'od -c conf/slaves' print? What happens when you try
something like 'bin/slaves uptime'?
Doug
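As an illustration of what Doug's od check reveals, here is a sketch against a throwaway file whose hypothetical contents mirror the hosts in this thread:

```shell
# a slaves file saved with DOS (CRLF) line endings, like the one behind "ssh: \015"
printf 'localhost\r\ndevcluster02\r\n' > /tmp/slaves-demo
# od -c renders each carriage return as \r (octal 015, the \015 in the ssh error)
od -c /tmp/slaves-demo
```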
RE: Help Setting Up Nutch 0.8 Distributed
Posted by Dennis Kubes <nu...@dragonflymc.com>.
Now I am getting this:
...
: command not foundlaves.sh: line 29:
: command not foundlaves.sh: line 32:
localhost: ssh: \015: Name or service not known
devcluster02: ssh: \015: Name or service not known
And still getting this error:
060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
Exception in thread "main" java.io.IOException: Cannot create file /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client DFSClient_-913777457
        at org.apache.hadoop.ipc.Client.call(Client.java:301)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
        at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
        at org
My ssh version is:
openssh-clients-3.6.1p2-33.30.3
openssh-server-3.6.1p2-33.30.3
openssh-askpass-gnome-3.6.1p2-33.30.3
openssh-3.6.1p2-33.30.3
openssh-askpass-3.6.1p2-33.30.3
Is it something to do with my slaves file?
-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Thursday, March 16, 2006 5:46 PM
To: nutch-user@lucene.apache.org
Subject: Re: Help Setting Up Nutch 0.8 Distributed
Dennis Kubes wrote:
> localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
> devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
[ ... ]
> localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
> devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
The launch of the datanodes and tasktrackers failed, since your version
of ssh does not support the ConnectTimeout option. Edit
conf/nutch-env.sh, and add a 'export HADOOP_SSH_OPTS=' line to remove
this option.
Doug
Re: Help Setting Up Nutch 0.8 Distributed
Posted by Doug Cutting <cu...@apache.org>.
Dennis Kubes wrote:
> localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
> devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
[ ... ]
> localhost:9000: command-line: line 0: Bad configuration option: ConnectTimeout
> devcluster02:9000: command-line: line 0: Bad configuration option: ConnectTimeout
The launch of the datanodes and tasktrackers failed, since your version
of ssh does not support the ConnectTimeout option. Edit
conf/nutch-env.sh, and add a 'export HADOOP_SSH_OPTS=' line to remove
this option.
Doug
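For concreteness, the line Doug describes would look like the following. Note that he says conf/nutch-env.sh while earlier messages in the thread refer to conf/hadoop-env.sh, so check which env script your checkout's bin/start-all.sh actually sources:

```shell
# an empty HADOOP_SSH_OPTS drops the ssh "-o ConnectTimeout=..." option
# that older OpenSSH releases (3.6.x here) do not understand
export HADOOP_SSH_OPTS=
```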