Posted to dev@nutch.apache.org by Broham <ab...@yahoo.com> on 2012/11/16 22:52:11 UTC

Getting error trying to connect nutch and solr

I have been going through the Nutch and Solr tutorial
<http://wiki.apache.org/nutch/NutchTutorial>, but I have run into a problem
that I cannot get past. The "Integrate Solr with Nutch" section
<http://wiki.apache.org/nutch/NutchTutorial#A6._Integrate_Solr_with_Nutch>
tells me to run the following command:

bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

When I do this, I get the following output:

SolrIndexer: starting at 2012-11-16 13:14:52
2012-11-16 13:14:52.762 java[28342:1903] Unable to load realm info from SCDynamicStore
Indexing 41 documents
java.io.IOException: Job failed!

If I look in my hadoop.log file I see this chunk which seems related:

2012-11-16 13:43:32,511 INFO  solr.SolrWriter - Indexing 41 documents
2012-11-16 13:43:32,706 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: ERROR: [doc=http://minersfoundry.org/] unknown field 'content'

ERROR: [doc=http://minersfoundry.org/] unknown field 'content'

request: http://127.0.0.1:8983/solr/update?wt=javabin&version=2
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:427)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:142)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
	at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
2012-11-16 13:43:33,190 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

Both of my schema.xml files (for Nutch and for Solr) define the content
field, so I'm not sure whether this is the problem. Has anyone else run into
this situation before? Any suggestions for getting past this error?
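
A quick way to confirm that a given schema.xml really declares a field is to
parse the file and look the field up by name. The following is only a minimal
sketch and not part of the original thread: the function name and the trimmed
sample schema are hypothetical, and it assumes the usual layout in which
fields are declared as <field name="..."/> elements.

```python
import xml.etree.ElementTree as ET


def schema_defines_field(schema_xml: str, field_name: str) -> bool:
    """Return True if the schema text declares a <field> with this name."""
    root = ET.fromstring(schema_xml)
    # Inspect every <field> element anywhere in the document.
    return any(f.get("name") == field_name for f in root.iter("field"))


# Hypothetical, heavily trimmed schema used for illustration only.
SAMPLE_SCHEMA = """<schema name="nutch" version="1.5">
  <fields>
    <field name="id" type="string" stored="true" indexed="true"/>
    <field name="content" type="text" stored="false" indexed="true"/>
  </fields>
</schema>"""

print(schema_defines_field(SAMPLE_SCHEMA, "content"))
print(schema_defines_field(SAMPLE_SCHEMA, "anchor"))
```

To check a real file, read its text and pass it in. The file to check is the
schema.xml that the running Solr instance loads, and its location depends on
the installation.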



--
View this message in context: http://lucene.472066.n3.nabble.com/Getting-error-trying-to-connect-nutch-and-solr-tp4020814.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.

Re: Getting error trying to connect nutch and solr

Posted by Broham <ab...@yahoo.com>.
It looks like I had my schema.xml file in the wrong directory. I also had to
change the version in my schema.xml from 1.5.1 to 1.5; otherwise I got a
NumberFormatting error.
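
The version problem can be checked before restarting Solr. The sketch below
assumes (the thread does not state this) that Solr reads the schema's version
attribute as a floating-point number, which would explain why 1.5 is accepted
and 1.5.1 is rejected. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def schema_version_is_numeric(schema_xml: str) -> bool:
    """Return True if the root element's version attribute parses as a float."""
    root = ET.fromstring(schema_xml)
    version = root.get("version", "")
    try:
        float(version)
    except ValueError:
        # A missing attribute or a value such as "1.5.1" lands here.
        return False
    return True


print(schema_version_is_numeric('<schema name="nutch" version="1.5"/>'))
print(schema_version_is_numeric('<schema name="nutch" version="1.5.1"/>'))
```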

Thanks for your help!


Re: Getting error trying to connect nutch and solr

Posted by Broham <ab...@yahoo.com>.
Nutch - 1.5.1
Solr - 3.6.1

Thanks!


Re: Getting error trying to connect nutch and solr

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,

Let's start with the basics... Can you please state your Nutch version and
your Solr version?

Thank you very much

Lewis

-- 
Lewis