Posted to user@nutch.apache.org by Haya AL-Tuwaijri <ha...@hotmail.com> on 2012/03/04 07:18:52 UTC

Solrindex job failed !


Dear all,
I'm using Nutch 1.4, Solr 3.5, and Tomcat 7.0.23.

After the integration, crawling works well, but I get an error while indexing into Solr with the solrindex command.
The error is "missing required field: id". The schema.xml that I copied from Nutch and pasted into solr/conf does have an id field as the unique key.

Nutch log file (hadoop.log):

2012-03-04 08:57:24,370 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Bad Request

Bad Request

request: http://localhost:8080/solr/update?wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:432)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:246)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:69)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:93)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2012-03-04 08:57:24,724 ERROR solr.SolrIndexer - java.io.IOException: Job failed!

Tomcat log file (catalina.log):

SEVERE: org.apache.solr.common.SolrException: [doc=null] missing required field: id
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:346)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:928)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:539)
    at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1815)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)

Mar 4, 2012 8:57:24 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update params={wt=javabin&version=2} status=400 QTime=47

RE: Solrindex job failed !

Posted by HaYa aziz <ha...@hotmail.com>.
This is my solrindex-mapping.xml. I tried changing <field dest="id"
source="url"/> to <field dest="id" source="id"/>, but the
same error appears.

<?xml version="1.0" encoding="UTF-8"?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<mapping>
    <!-- Simple mapping of fields created by Nutch IndexingFilters
         to fields defined (and expected) in Solr schema.xml.

             Any fields in NutchDocument that match a name defined
             in field/@source will be renamed to the corresponding
             field/@dest.
             Additionally, if a field name (before mapping) matches
             a copyField/@source then its values will be copied to 
             the corresponding copyField/@dest.

             uniqueKey has the same meaning as in Solr schema.xml
             and defaults to "id" if not defined.
         -->
    <fields>
        <field dest="content" source="content"/>
        <field dest="site" source="site"/>
        <field dest="title" source="title"/>
        <field dest="host" source="host"/>
        <field dest="segment" source="segment"/>
        <field dest="boost" source="boost"/>
        <field dest="digest" source="digest"/>
        <field dest="tstamp" source="tstamp"/>
        <field dest="id" source="url"/>
        <copyField source="url" dest="url"/>
    </fields>
    <uniqueKey>id</uniqueKey>
</mapping>
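
For anyone checking their own mapping file: the comment in the file above says that fields are renamed from source to dest and that uniqueKey defaults to "id". That rule can be sanity-checked with a small script. This is only a sketch in Python, not part of Nutch or Solr. The function name unique_key_sources and the shortened sample mapping are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative copy of a solrindex-mapping.xml (assumption:
# the real file has more <field> entries, as shown above).
MAPPING_XML = """<mapping>
    <fields>
        <field dest="content" source="content"/>
        <field dest="id" source="url"/>
        <copyField source="url" dest="url"/>
    </fields>
    <uniqueKey>id</uniqueKey>
</mapping>"""


def unique_key_sources(mapping_xml):
    """Return (unique_key, sources) where sources are the Nutch field
    names that the mapping writes into the uniqueKey field."""
    root = ET.fromstring(mapping_xml)
    # Per the comment in the mapping file, uniqueKey defaults to "id".
    unique_key = (root.findtext("uniqueKey") or "id").strip()
    sources = []
    for tag in ("field", "copyField"):
        for elem in root.iter(tag):
            if elem.get("dest") == unique_key:
                sources.append(elem.get("source"))
    return unique_key, sources


if __name__ == "__main__":
    key, sources = unique_key_sources(MAPPING_XML)
    if sources:
        print(f"uniqueKey '{key}' is filled from: {', '.join(sources)}")
    else:
        print(f"no mapping writes to uniqueKey '{key}'")
```

If the script reports a source field, the next thing to confirm is that the documents produced by the crawl actually contain that source field. The script cannot tell you that, since it only reads the mapping.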



 

 



Re: Solrindex job failed !

Posted by shlomi java <sh...@gmail.com>.
What about your solrindex-mapping.xml?
Did you check that it is OK?

SJ


Re: Solrindex job failed !

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Please read the log message.
It needs to be a unique field.
Correct this in your schema and you should be good to go.
hth
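
One way to act on this advice is to compare the uniqueKey declared in Solr's schema.xml with the one in Nutch's solrindex-mapping.xml. The sketch below is in Python and is not a Nutch or Solr tool. The function check_unique_key and both sample documents are hypothetical, and a mismatch is only a hint about where to look, not a confirmed cause of the error.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened samples; real files contain many more fields.
SCHEMA_XML = """<schema name="example" version="1.4">
    <fields>
        <field name="id" type="string" stored="true" indexed="true"/>
        <field name="url" type="string" stored="true" indexed="true"/>
    </fields>
    <uniqueKey>url</uniqueKey>
</schema>"""

MAPPING_XML = """<mapping>
    <fields>
        <field dest="id" source="url"/>
    </fields>
    <uniqueKey>id</uniqueKey>
</mapping>"""


def check_unique_key(schema_xml, mapping_xml):
    """Return a list of human-readable findings; an empty list means the
    two files agree on the uniqueKey and the schema defines that field."""
    schema = ET.fromstring(schema_xml)
    mapping = ET.fromstring(mapping_xml)
    schema_key = (schema.findtext("uniqueKey") or "").strip()
    # The mapping file's own comment says uniqueKey defaults to "id".
    mapping_key = (mapping.findtext("uniqueKey") or "id").strip()
    declared = {field.get("name") for field in schema.iter("field")}

    problems = []
    if not schema_key:
        problems.append("schema.xml declares no uniqueKey")
    else:
        if schema_key not in declared:
            problems.append(
                f"uniqueKey '{schema_key}' has no <field> definition in schema.xml")
        if schema_key != mapping_key:
            problems.append(
                f"schema.xml uses uniqueKey '{schema_key}' "
                f"but the mapping uses '{mapping_key}'")
    return problems


if __name__ == "__main__":
    for line in check_unique_key(SCHEMA_XML, MAPPING_XML) or ["no mismatch found"]:
        print(line)
```

With the sample data the schema names url as its uniqueKey while the mapping names id, so the script reports that difference. With real files, read them from disk and pass their contents to check_unique_key.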


On Mon, Jun 17, 2013 at 6:35 AM, kamal11 <kk...@gmail.com> wrote:

> I am also facing the same error.
> The nutch log says "ERROR solr.SolrIndexer - java.io.IOException: Job
> failed!" and solr log says "SEVERE: org.apache.solr.common.SolrException:
> Document is missing mandatory uniqueKey field: url" , though I already have
> a field <url> in schema file of solr. please help me.I am stuck



-- 
*Lewis*

Re: Solrindex job failed !

Posted by kamal11 <kk...@gmail.com>.
I am also facing the same error.
The Nutch log says "ERROR solr.SolrIndexer - java.io.IOException: Job
failed!" and the Solr log says "SEVERE: org.apache.solr.common.SolrException:
Document is missing mandatory uniqueKey field: url", though I already have
a url field in my Solr schema file. Please help me, I am stuck.






--
View this message in context: http://lucene.472066.n3.nabble.com/Solrindex-job-failed-tp3797490p4070997.html
Sent from the Nutch - User mailing list archive at Nabble.com.