Posted to user@nutch.apache.org by Ray Crawford <ra...@gmail.com> on 2017/08/14 04:14:07 UTC
Failing on Solr indexing

Solr 6.6.0
Nutch 2.3.1
HBase 0.98.0

*Command:* ./crawl ./urls nutch http://192.168.56.6:8983/solr/#/nutch 10

*Output:*

ParserJob: success

ParserJob: finished at 2017-08-14 04:09:33, time elapsed: 00:00:05

CrawlDB update for nutch

/opt/nutch/runtime/local/bin/nutch updatedb -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true 1502683473-11162 -crawlId nutch

DbUpdaterJob: starting at 2017-08-14 04:09:34

DbUpdaterJob: batchId: 1502683473-11162

DbUpdaterJob: finished at 2017-08-14 04:09:39, time elapsed: 00:00:05

Indexing nutch on SOLR index -> http://192.168.56.6:8983/solr/#/nutch

/opt/nutch/runtime/local/bin/nutch index -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D solr.server.url=http://192.168.56.6:8983/solr/#/nutch -all -crawlId nutch

IndexingJob: starting

SolrIndexerJob: java.lang.RuntimeException: job failed: name=[nutch]Indexer, jobid=job_local1761553160_0001

at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)

at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:154)

at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:176)

at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:202)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:211)


Error running:

  /opt/nutch/runtime/local/bin/nutch index -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -D solr.server.url=http://192.168.56.6:8983/solr/#/nutch -all -crawlId nutch

Failed with exit value 255.

-----

Solr is accessible.

A nutch core was created.

schema.xml and solrconfig.xml were updated.



*Any idea why Solr isn't allowing the data to flow in? I tried both
http://192.168.56.6:8983/solr/#/nutch and http://192.168.56.6:8983 as
the Solr URL. Both failed in a similar fashion.*
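
One thing that may be worth checking, offered as a guess rather than a diagnosis: the first URL contains "#/", which is the form the Solr admin UI shows in the browser address bar. The part after "#" is a URL fragment, and HTTP clients do not send fragments to the server. Assuming the core really is named nutch, the core endpoint would normally be http://192.168.56.6:8983/solr/nutch. The sketch below only shows that rewrite; the helper name admin_ui_to_core_url is hypothetical and is not part of Nutch or Solr.

```python
from urllib.parse import urlsplit, urlunsplit


def admin_ui_to_core_url(url: str) -> str:
    """Rewrite an admin-UI style URL (.../solr/#/core) to .../solr/core.

    Hypothetical helper for illustration only. URLs without a fragment
    are returned unchanged.
    """
    parts = urlsplit(url)
    # For ".../solr/#/nutch" the fragment is "/nutch"; strip the slashes.
    core = parts.fragment.strip("/")
    if not core:
        return url
    # Append the core name to the path and drop the fragment entirely.
    path = parts.path.rstrip("/") + "/" + core
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


if __name__ == "__main__":
    print(admin_ui_to_core_url("http://192.168.56.6:8983/solr/#/nutch"))
    # prints: http://192.168.56.6:8983/solr/nutch
```

If the rewritten URL is used as the Solr URL argument to ./crawl and the job still fails, the stack trace above does not show the underlying cause; logs/hadoop.log under the Nutch runtime directory is the usual place to look for it.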