Posted to user@nutch.apache.org by ozzy19 <di...@live.it> on 2013/10/09 16:32:46 UTC

Error execute nutch crawl in Eclipse

I have a problem when I run the Crawl class from Eclipse. I get this output:
<http://lucene.472066.n3.nabble.com/file/n4094376/Immagine.png> 

Can someone help me solve this? Thank you!



--
View this message in context: http://lucene.472066.n3.nabble.com/Error-execute-nutch-crawl-in-Eclipse-tp4094376.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Error execute nutch crawl in Eclipse

Posted by Talat Uyarer <ta...@agmlab.com>.
Hi again,

Don't use Crawl.java. If you want to run the whole crawl cycle, you can use the crawl shell script in Eclipse. At the moment I run the jobs individually: I start each job manually, in order.

See https://issues.apache.org/jira/browse/NUTCH-1621: Crawl.java will be removed in the next version.
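
As a rough sketch of what "running the jobs in order" could look like for one crawl round: the small helper below only prints the Nutch 1.x main classes and the arguments one might give each of them (for example as separate Eclipse run configurations). It does not call Nutch itself. The class name CrawlCycleSteps, the method steps, and the segment name SEGMENT are made up for illustration; the directories, topN and threads values are taken from the log posted in this thread, and a real segment name is the timestamped directory created by the generate step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the jobs of one crawl round in the order they are run.
class CrawlCycleSteps {

    // Each entry is {main class, arg1, arg2, ...} for one job.
    static List<String[]> steps(String crawlDir, String urlDir, String segment,
                                int topN, int threads) {
        String crawlDb = crawlDir + "/crawldb";
        String segments = crawlDir + "/segments";
        String segmentPath = segments + "/" + segment; // placeholder, see lead-in

        List<String[]> steps = new ArrayList<>();
        // 1. inject: seed URLs -> crawldb
        steps.add(new String[] {"org.apache.nutch.crawl.Injector", crawlDb, urlDir});
        // 2. generate: pick the top N URLs into a new segment
        steps.add(new String[] {"org.apache.nutch.crawl.Generator", crawlDb, segments,
                                "-topN", String.valueOf(topN)});
        // 3. fetch: download the pages of that segment
        steps.add(new String[] {"org.apache.nutch.fetcher.Fetcher", segmentPath,
                                "-threads", String.valueOf(threads)});
        // 4. parse: extract text and outlinks
        steps.add(new String[] {"org.apache.nutch.parse.ParseSegment", segmentPath});
        // 5. updatedb: merge the results back into the crawldb
        steps.add(new String[] {"org.apache.nutch.crawl.CrawlDb", crawlDb, segmentPath});
        return steps;
    }

    public static void main(String[] args) {
        // Values from the posted log: crawl dir "crawl", url dir "url", topN 5, 10 threads.
        for (String[] step : steps("crawl", "url", "SEGMENT", 5, 10)) {
            System.out.println(String.join(" ", step));
        }
    }
}
```

Steps 2 to 5 would be repeated once per depth level, each time with the newly generated segment.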

Talat  

----- Original Message -----
From: "ozzy19" <di...@live.it>
To: user@nutch.apache.org
Sent: Wednesday, 9 October 2013 17:50:50
Subject: Re: Error execute nutch crawl in Eclipse

The version of nutch is 1.7



--
View this message in context: http://lucene.472066.n3.nabble.com/Error-execute-nutch-crawl-in-Eclipse-tp4094376p4094381.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Error execute nutch crawl in Eclipse

Posted by ozzy19 <di...@live.it>.
The Nutch version is 1.7.




Re: Error execute nutch crawl in Eclipse

Posted by Talat UYARER <ta...@agmlab.com>.
What is your Nutch version?

On 09-10-2013 17:35, ozzy19 wrote:
> This is hadoop log:
>
> 2013-10-09 13:06:27,113 WARN  crawl.Crawl - solrUrl is not set, indexing will be skipped...
> 2013-10-09 13:06:27,464 INFO  crawl.Crawl - crawl started in: crawl
> 2013-10-09 13:06:27,464 INFO  crawl.Crawl - rootUrlDir = url
> 2013-10-09 13:06:27,464 INFO  crawl.Crawl - threads = 10
> 2013-10-09 13:06:27,464 INFO  crawl.Crawl - depth = 5
> 2013-10-09 13:06:27,464 INFO  crawl.Crawl - solrUrl=null
> 2013-10-09 13:06:27,465 INFO  crawl.Crawl - topN = 5
> 2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: starting at 2013-10-09 13:06:27
> 2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
> 2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: urlDir: url
> 2013-10-09 13:06:27,655 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
> 2013-10-09 13:06:29,436 INFO  regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
> 2013-10-09 13:06:29,466 WARN  mapred.LocalJobRunner - job_local_0001
> java.io.IOException: Cannot run program "bash": CreateProcess error=2, Impossibile trovare il file specificato
> 	at java.lang.ProcessBuilder.start(Unknown Source)
> 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
> 	at org.apache.hadoop.util.Shell.run(Shell.java:134)
> 	at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
> 	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
> 	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> 	at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.io.IOException: CreateProcess error=2, Impossibile trovare il file specificato
> 	at java.lang.ProcessImpl.create(Native Method)
> 	at java.lang.ProcessImpl.<init>(Unknown Source)
> 	at java.lang.ProcessImpl.start(Unknown Source)
> 	... 12 more
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Error-execute-nutch-crawl-in-Eclipse-tp4094376p4094377.html
> Sent from the Nutch - User mailing list archive at Nabble.com.


Re: Error execute nutch crawl in Eclipse

Posted by ozzy19 <di...@live.it>.
This is the Hadoop log:

2013-10-09 13:06:27,113 WARN  crawl.Crawl - solrUrl is not set, indexing will be skipped...
2013-10-09 13:06:27,464 INFO  crawl.Crawl - crawl started in: crawl
2013-10-09 13:06:27,464 INFO  crawl.Crawl - rootUrlDir = url
2013-10-09 13:06:27,464 INFO  crawl.Crawl - threads = 10
2013-10-09 13:06:27,464 INFO  crawl.Crawl - depth = 5
2013-10-09 13:06:27,464 INFO  crawl.Crawl - solrUrl=null
2013-10-09 13:06:27,465 INFO  crawl.Crawl - topN = 5
2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: starting at 2013-10-09 13:06:27
2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
2013-10-09 13:06:27,531 INFO  crawl.Injector - Injector: urlDir: url
2013-10-09 13:06:27,655 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
2013-10-09 13:06:29,436 INFO  regex.RegexURLNormalizer - can't find rules for scope 'inject', using default
2013-10-09 13:06:29,466 WARN  mapred.LocalJobRunner - job_local_0001
java.io.IOException: Cannot run program "bash": CreateProcess error=2, Impossibile trovare il file specificato
	at java.lang.ProcessBuilder.start(Unknown Source)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
	at org.apache.hadoop.util.Shell.run(Shell.java:134)
	at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
	at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.io.IOException: CreateProcess error=2, Impossibile trovare il file specificato
	at java.lang.ProcessImpl.create(Native Method)
	at java.lang.ProcessImpl.<init>(Unknown Source)
	at java.lang.ProcessImpl.start(Unknown Source)
	... 12 more
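
A note on reading the trace: the frames show Hadoop's DF.getAvailable going through Shell.runCommand, which tries to start "bash", and Windows answers with CreateProcess error=2 (the Italian text means "The system cannot find the file specified"). That normally points to no bash executable, for example one from Cygwin, being on the PATH that the Eclipse-launched JVM sees. Below is a minimal sketch for checking this from inside Eclipse; the class name BashOnPath and the helper findOnPath are made up for illustration, and it only inspects PATH, it does not change anything.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic: looks for a bash executable on the PATH of this JVM.
class BashOnPath {

    // Returns every existing file <dir>/<name> for the directories listed in pathValue.
    static List<File> findOnPath(String pathValue, String... names) {
        List<File> hits = new ArrayList<>();
        if (pathValue == null) {
            return hits;
        }
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : names) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    hits.add(candidate);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<File> hits = findOnPath(System.getenv("PATH"), "bash", "bash.exe");
        if (hits.isEmpty()) {
            System.out.println("No bash found on PATH; Hadoop's Shell.runCommand will fail.");
        } else {
            for (File hit : hits) {
                System.out.println("Found: " + hit.getAbsolutePath());
            }
        }
    }
}
```

If nothing is found, adding the directory that contains bash.exe to PATH (system-wide, or in the Environment tab of the Eclipse run configuration) and restarting Eclipse is the usual thing to try.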



