Posted to solr-user@lucene.apache.org by Jerritt Pace <je...@yahoo.ie> on 2015/12/27 21:56:04 UTC

I am having a problem running Nutch after trying several different versions of the program

I have posted this question, in various versions, on Stack Overflow and other related forums.
I am trying to integrate Nutch with Solr, but I am no longer convinced that the integration is the problem.

I am getting an error in Nutch 1.11, 1.5.1, and 2.3 when I try to execute a crawl command, such as:

bin/crawl C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls solr.server.url=http://localhost:8983/solr/collections1 urls/ 2

I have my Java classpath set, and Nutch is running, i.e. I get a response from $ bin/nutch. I have copied the Nutch schema.xml file to the Solr core conf directory, but I get the same error regardless of which version of Nutch I am using:
Error running:
 /cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/bin/nutch inject C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls -crawlId solr.server.url=http://localhost:8983/solr/collections1
Failed with exit value 127. 
This is the full output:
   
$ bin/crawl C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls solr.server.url=http://localhost:8983/solr/collections1 urls/ 2
Injecting seed URLs   
/cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/bin/nutch inject C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls -crawlId solr.server.url=http://localhost:8983/solr/collections1   
InjectorJob: starting at 2015-12-26 15:21:26   
InjectorJob: Injecting urlDir: C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls   
InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.   
InjectorJob: java.io.IOException: Failed to set permissions of path: \tmp\hadoop-User5\mapred\staging\User52078840406\.staging to 0700   
        at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)   
        at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)   
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)   
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)   
        at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)   
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)   
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)   
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)   
        at java.security.AccessController.doPrivileged(Native Method)   
        at javax.security.auth.Subject.doAs(Subject.java:422)   
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)   
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)   
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)   
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)   
        at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:50)   
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:231)   
        at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)   
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)   
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)   
        at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)   
   
Error running:   
  /cygdrive/c/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/bin/nutch inject C:/Users/User5/Documents/Nutch/apache-nutch-2.3/runtime/local/urls -crawlId solr.server.url=http://localhost:8983/solr/collections1   
Failed with exit value 127.


I get this error regardless of which version of Nutch or Solr I try, and I have spent most of a week looking for a fix, to no avail.
Any help with this problem, which I cannot seem to solve on my own, would be very much appreciated!
Thank you, 
Jerritt Pace
   


Re: I am having a problem running Nutch after trying several different versions of the program

Posted by Binoy Dalal <bi...@gmail.com>.
An exit value of 127 means 'command not found', which might suggest that
there is something wrong either with your path or with the command itself.
Additionally, based on the stack trace,
InjectorJob: java.io.IOException: Failed to set permissions of path:
\tmp\hadoop-User5\mapred\staging\User52078840406\.staging to 0700
are you sure you have the necessary permissions to make changes to the
directory in question?
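
Both checks above can be tried from the same Cygwin shell before running the crawl again. What follows is only a rough sketch, not part of Nutch or Hadoop: the helper names check_command and check_staging_dir are made up for illustration, and the scratch directory is an assumed stand-in for the staging path in the trace. The first helper reports whether the shell can find a command at all, the second tries to create a directory and set it to mode 0700, and the last lines show that a missing command really does yield exit status 127.

```shell
#!/bin/sh
# Hypothetical helper: report whether the shell can resolve a command name.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Hypothetical helper: create a directory and set it to mode 0700,
# the same mode the stack trace says could not be applied.
check_staging_dir() {
    dir="$1"
    if mkdir -p "$dir" && chmod 700 "$dir"; then
        echo "ok: $dir"
    else
        echo "cannot set 0700 on: $dir"
    fi
}

# bin/nutch and bin/crawl are shell scripts that start a JVM,
# so both of these need to be on the PATH.
check_command java
check_command bash

# Assumed scratch location, not the exact path from the trace.
check_staging_dir "${TMPDIR:-/tmp}/hadoop-staging-check"

# Show the meaning of 127 using a name that should not exist.
status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?
echo "exit status for a missing command: $status"
```

Note that the directory check is only a proxy: the failing call in the trace is made from Java (org.apache.hadoop.fs.FileUtil.setPermission), so a pass here does not guarantee that the JVM will succeed on the same path.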

Regards,
Binoy Dalal